

Problem (1) in Python 

Before we can represent our training and testing images as bag-of-features histograms, we first need to establish a vocabulary of visual words. We will form this vocabulary by sampling many local features from our training set (tens or hundreds of thousands) and then clustering them with k-means. The number of k-means clusters determines both the size of our vocabulary and the dimensionality of our features. For example, you might start by clustering many SIFT descriptors into k=50 clusters. This partitions the continuous, 128-dimensional SIFT feature space into 50 regions. For any new SIFT feature we observe, we can figure out which region it belongs to as long as we save the centroids of our original clusters. Those centroids are our visual word vocabulary.
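As a concrete illustration, here is a minimal sketch of the vocabulary step (random data stands in for real SIFT descriptors; the variable names are illustrative, not part of the assignment):

from sklearn.cluster import KMeans
import numpy as np

# stand-in for tens of thousands of real 128-D SIFT descriptors
descriptors = np.random.rand(10000, 128).astype(np.float32)

# cluster into k=50 visual words; the centroids are the vocabulary
kmeans = KMeans(n_clusters=50, random_state=0).fit(descriptors)
vocabulary = kmeans.cluster_centers_        # shape (50, 128), one centroid per word

# any new descriptor is assigned to the region of its nearest centroid
word_id = kmeans.predict(descriptors[:1])   # index of the closest visual word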

Now we are ready to represent our training and testing images as histograms of visual words. For each image we will densely sample many SIFT descriptors. Instead of storing hundreds of SIFT descriptors, we simply count how many fall into each cluster of our visual word vocabulary, which we do by finding the nearest k-means centroid for every SIFT feature. Thus, if we have a vocabulary of 50 visual words and we detect 200 SIFT features in an image, our bag-of-SIFT representation is a 50-dimensional histogram in which each bin counts how many times a SIFT descriptor was assigned to that cluster, so the bin counts sum to 200. The histogram should be normalized so that image size does not dramatically change the magnitude of the bag-of-features representation.
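Continuing the sketch above (kmeans is the fitted KMeans object; desc is a placeholder for one image's M x 128 descriptor array), the histogram step could look like this:

# descriptors for one image (placeholder data)
desc = np.random.rand(200, 128).astype(np.float32)

k = kmeans.n_clusters
words = kmeans.predict(desc)                        # nearest centroid per descriptor
hist = np.bincount(words, minlength=k).astype(np.float32)
hist /= hist.sum()                                  # normalize so image size does not matter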

Note:

  • Instead of using SIFT to detect invariant keypoints, which is time-consuming, we recommend densely sampling keypoints on a grid with a certain step size (sampling density) and scale; a minimal grid-sampling sketch follows these notes.
  • There are many design decisions and free parameters for the bag-of-SIFT representation (number of clusters, sampling density, sampling scales, SIFT parameters, etc.), so accuracy may vary.
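One possible grid sampler (a sketch; the step and scale defaults are placeholders, not recommended settings):

import cv2

def dense_keypoints(h, w, step=8, size=8):
    # one keypoint every `step` pixels, each with a fixed SIFT scale `size`
    return [cv2.KeyPoint(float(x), float(y), float(size))
            for y in range(0, h, step)
            for x in range(0, w, step)]

kp = dense_keypoints(256, 256)   # keypoints for a 256x256 image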

Hints:

  • Use KMeans from scikit-learn to do the clustering and to find the nearest cluster centroid for each SIFT feature;

  • Use cv2.xfeatures2d.SIFT_create() to create a SIFT object (on OpenCV 4.4 and newer, SIFT lives in the main module, so use cv2.SIFT_create() instead);

  • Use cv2.KeyPoint() to generate keypoints;

  • Use sift.compute() to compute SIFT descriptors given densely sampled keypoints.

  • Be mindful of RAM usage. Try to make the code memory efficient; otherwise it can easily exceed the RAM limits in Colab, at which point your session will crash.

  • If you are about to run out of RAM, call gc.collect() so the garbage collector can reclaim unused objects and free some space.

  • Store data and features as NumPy arrays instead of lists; computation on NumPy arrays is much more efficient than on lists. A short sketch combining these memory hints follows this list.
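One memory-conscious pattern (a sketch only; sift from cv2.SIFT_create(), kp, kmeans, k, and images are assumed to be defined as in the snippets above):

import gc
import numpy as np

feat = np.zeros((len(images), k), dtype=np.float32)   # preallocate instead of growing a list
for i, img in enumerate(images):
    desc = sift.compute(img, kp)[1]                   # descriptors for this image only
    feat[i] = np.bincount(kmeans.predict(desc), minlength=k)
    del desc                                          # drop per-image descriptors immediately
gc.collect()                                          # let the garbage collector free memory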

CodeText:
 
import gc
import cv2
import numpy as np
from sklearn.cluster import KMeans

np.random.seed(56)

##########--WRITE YOUR CODE HERE--##########
# The steps below follow the reference skeleton; train_data and test_data are
# assumed to be provided by earlier cells as collections of equally sized
# grayscale uint8 images.

# create one SIFT object; on OpenCV builds older than 4.4, use
# cv2.xfeatures2d.SIFT_create() instead
sift = cv2.SIFT_create()

# densely sample keypoints on a regular grid
def sample_kp(shape, stride, size):
    return [cv2.KeyPoint(float(x), float(y), float(size))
            for y in range(0, shape[0], stride[0])
            for x in range(0, shape[1], stride[1])]

# extract vocabulary of SIFT features: cluster descriptors from the whole
# training set and keep the fitted k-means model (its centroids are the words)
def extract_vocabulary(raw_data, key_point, k=50):
    descs = np.concatenate([sift.compute(img, key_point)[1] for img in raw_data])
    vocabulary = KMeans(n_clusters=k, random_state=56).fit(descs)
    del descs
    gc.collect()  # free the stacked descriptors once the vocabulary is built
    return vocabulary

# extract the bag-of-SIFT representation of each image
def extract_feat(raw_data, vocabulary, key_point):
    k = vocabulary.n_clusters
    feat = np.zeros((len(raw_data), k), dtype=np.float32)  # preallocated array
    for i, img in enumerate(raw_data):
        desc = sift.compute(img, key_point)[1]
        words = vocabulary.predict(desc)  # nearest centroid per descriptor
        hist = np.bincount(words, minlength=k).astype(np.float32)
        feat[i] = hist / max(hist.sum(), 1.0)  # normalize away image size
    return feat

# sample dense keypoints, build the vocabulary, then featurize both splits
skp = sample_kp((train_data[0].shape[0], train_data[0].shape[1]), (64, 64), 8)
vocabulary = extract_vocabulary(train_data, skp)
train_feat = extract_feat(train_data, vocabulary, skp)
test_feat = extract_feat(test_data, vocabulary, skp)

##########-------END OF CODE-------##########
# this block should generate
# train_feat and test_feat corresponding to train_data and test_data