

Problem (1) in Python 

Before we can represent our training and testing images as bag-of-features histograms, we first need to establish a vocabulary of visual words. We will form this vocabulary by sampling many local features from our training set (tens or hundreds of thousands) and then clustering them with k-means. The number of k-means clusters determines both the size of our vocabulary and the dimensionality of our features. For example, you might start by clustering many SIFT descriptors into k=50 clusters. This partitions the continuous, 128-dimensional SIFT feature space into 50 regions. For any new SIFT feature we observe, we can figure out which region it belongs to as long as we save the centroids of our original clusters. Those centroids are our visual word vocabulary.
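As a concrete illustration, here is a minimal sketch of the vocabulary step (random data stands in for real SIFT descriptors; the variable names are illustrative, not part of the assignment):

from sklearn.cluster import KMeans
import numpy as np

# stand-in for tens of thousands of real 128-D SIFT descriptors
descriptors = np.random.rand(10000, 128).astype(np.float32)

# cluster into k=50 visual words; the centroids are the vocabulary
kmeans = KMeans(n_clusters=50, random_state=0).fit(descriptors)
vocabulary = kmeans.cluster_centers_        # shape (50, 128), one centroid per word

# any new descriptor is assigned to the region of its nearest centroid
word_id = kmeans.predict(descriptors[:1])   # index of the closest visual word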

Now we are ready to represent our training and testing images as histograms of visual words. For each image we will densely sample many SIFT descriptors. Instead of storing hundreds of SIFT descriptors, we simply count how many fall into each cluster of our visual word vocabulary, which we do by finding the nearest k-means centroid for every SIFT feature. Thus, if we have a vocabulary of 50 visual words and we detect 200 SIFT features in an image, our bag-of-SIFT representation is a 50-dimensional histogram in which each bin counts how many times a SIFT descriptor was assigned to that cluster, so the bin counts sum to 200. The histogram should be normalized so that image size does not dramatically change the magnitude of the bag-of-features representation.
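Continuing the sketch above (kmeans is the fitted KMeans object; desc is a placeholder for one image's M x 128 descriptor array), the histogram step could look like this:

# descriptors for one image (placeholder data)
desc = np.random.rand(200, 128).astype(np.float32)

k = kmeans.n_clusters
words = kmeans.predict(desc)                        # nearest centroid per descriptor
hist = np.bincount(words, minlength=k).astype(np.float32)
hist /= hist.sum()                                  # normalize so image size does not matter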

Note:

  • Instead of using SIFT to detect invariant keypoints, which is time-consuming, we recommend densely sampling keypoints on a grid with a certain step size (sampling density) and scale; a minimal grid-sampling sketch follows these notes.
  • There are many design decisions and free parameters for the bag-of-SIFT representation (number of clusters, sampling density, sampling scales, SIFT parameters, etc.), so accuracy may vary.
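One possible grid sampler (a sketch; the step and scale defaults are placeholders, not recommended settings):

import cv2

def dense_keypoints(h, w, step=8, size=8):
    # one keypoint every `step` pixels, each with a fixed SIFT scale `size`
    return [cv2.KeyPoint(float(x), float(y), float(size))
            for y in range(0, h, step)
            for x in range(0, w, step)]

kp = dense_keypoints(256, 256)   # keypoints for a 256x256 image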

Hints:

  • Use KMeans from scikit-learn to do the clustering and to find the nearest cluster centroid for each SIFT feature;

  • Use cv2.xfeatures2d.SIFT_create() to create a SIFT object (on OpenCV 4.4 and newer, SIFT lives in the main module, so use cv2.SIFT_create() instead);

  • Use cv2.KeyPoint() to generate keypoints;

  • Use sift.compute() to compute SIFT descriptors given densely sampled keypoints.

  • Be mindful of RAM usage. Try to make the code memory efficient; otherwise it can easily exceed the RAM limits in Colab, at which point your session will crash.

  • If you are about to run out of RAM, call gc.collect() so the garbage collector can reclaim unused objects and free some space.

  • Store data and features as NumPy arrays instead of lists; computation on NumPy arrays is much more efficient than on lists. A short sketch combining these memory hints follows this list.
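One memory-conscious pattern (a sketch only; sift from cv2.SIFT_create(), kp, kmeans, k, and images are assumed to be defined as in the snippets above):

import gc
import numpy as np

feat = np.zeros((len(images), k), dtype=np.float32)   # preallocate instead of growing a list
for i, img in enumerate(images):
    desc = sift.compute(img, kp)[1]                   # descriptors for this image only
    feat[i] = np.bincount(kmeans.predict(desc), minlength=k)
    del desc                                          # drop per-image descriptors immediately
gc.collect()                                          # let the garbage collector free memory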

CodeText:
 
import gc
import cv2
import numpy as np
from sklearn.cluster import KMeans

np.random.seed(56)

##########--WRITE YOUR CODE HERE--##########
# The steps below follow the reference skeleton; train_data and test_data are
# assumed to be provided by earlier cells as collections of equally sized
# grayscale uint8 images.

# create one SIFT object; on OpenCV builds older than 4.4, use
# cv2.xfeatures2d.SIFT_create() instead
sift = cv2.SIFT_create()

# densely sample keypoints on a regular grid
def sample_kp(shape, stride, size):
    return [cv2.KeyPoint(float(x), float(y), float(size))
            for y in range(0, shape[0], stride[0])
            for x in range(0, shape[1], stride[1])]

# extract vocabulary of SIFT features: cluster descriptors from the whole
# training set and keep the fitted k-means model (its centroids are the words)
def extract_vocabulary(raw_data, key_point, k=50):
    descs = np.concatenate([sift.compute(img, key_point)[1] for img in raw_data])
    vocabulary = KMeans(n_clusters=k, random_state=56).fit(descs)
    del descs
    gc.collect()  # free the stacked descriptors once the vocabulary is built
    return vocabulary

# extract the bag-of-SIFT representation of each image
def extract_feat(raw_data, vocabulary, key_point):
    k = vocabulary.n_clusters
    feat = np.zeros((len(raw_data), k), dtype=np.float32)  # preallocated array
    for i, img in enumerate(raw_data):
        desc = sift.compute(img, key_point)[1]
        words = vocabulary.predict(desc)  # nearest centroid per descriptor
        hist = np.bincount(words, minlength=k).astype(np.float32)
        feat[i] = hist / max(hist.sum(), 1.0)  # normalize away image size
    return feat

# sample dense keypoints, build the vocabulary, then featurize both splits
skp = sample_kp((train_data[0].shape[0], train_data[0].shape[1]), (64, 64), 8)
vocabulary = extract_vocabulary(train_data, skp)
train_feat = extract_feat(train_data, vocabulary, skp)
test_feat = extract_feat(test_data, vocabulary, skp)

##########-------END OF CODE-------##########
# this block should generate
# train_feat and test_feat corresponding to train_data and test_data