hw1 (pdf, 4 pages)

School: University of California, Riverside
Course: EE 240
Subject: Electrical Engineering
Date: Apr 3, 2024

EE 240: Pattern Recognition and Machine Learning
Homework 1
Due date: April 19, 2023

Description: These questions explore different aspects of Bayes' theorem, maximum likelihood estimation, and linear regression.

Reading assignment: AML: Ch. 1 & 5; ESL: Ch. 1-2

Homework and lab assignment submission policy: All homework must be submitted online via https://eLearn.ucr.edu. Submit your homework as a single Python notebook that is free of typos and errors. Check with your TA to make sure that your Python version matches. Homework solutions should be written and submitted individually, but discussion among students is encouraged. All assignments should be submitted by the due date; you will incur a 25% penalty for every late day.

H1.1 Exercise 1.10 in AMLbook: Here is an experiment that illustrates the difference between a single bin and multiple bins. Run a computer simulation of flipping 1,000 fair coins. Flip each coin independently 10 times. Let's focus on 3 coins as follows: c_1 is the first coin flipped; c_rand is a coin you choose at random; c_min is the coin that had the minimum frequency of heads (pick the earlier one in case of a tie). Let ν_1, ν_rand, and ν_min be the fractions of heads you obtain for the respective three coins. For a coin, let μ be its probability of heads. (10 pts)

(a) What is μ for the three coins selected?

(b) Repeat this entire experiment a large number of times (e.g., 100,000 runs of the entire experiment) to get several instances of ν_1, ν_rand, and ν_min, and plot the histograms of the distributions of ν_1, ν_rand, and ν_min. Notice that which coins end up being c_rand and c_min may differ from one run to another.

(c) Using part (b), plot estimates for P[|ν - μ| > ε] as a function of ε, together with the Hoeffding bound 2e^(-2ε²N), on the same graph.

(d) Which coins obey the Hoeffding bound, and which do not? Explain why.

H1.2 Posterior probability estimation for bin selection problem.
(Feel free to write a script for this.)

(a) Suppose we have ten bins (four labeled A, six labeled B). Each bin has balls of two colors (red and blue). The distribution of red and blue balls in bin A is (0.3, 0.7); the distribution in bin B is (0.7, 0.3). We randomly select a bin and draw two balls with replacement. That is, we select a bin, pick one ball, put it back, and pick another ball from the same bin. Estimate the probability that we selected bin A given that the selected balls are red and blue. (10 pts)

(b) Suppose we have ten bins (four labeled A, six labeled B). Each bin has balls of four colors (red, blue, white, black). The distribution of balls in bin A is (0.1, 0.3, 0.2, 0.4); the distribution in bin B is (0.4, 0.2, 0.3, 0.1). We randomly select a bin and draw two balls with replacement. Estimate the probability that we selected bin A given that the selected balls are red and blue. (10 pts)

H1.3 Let us consider the problem of the nearest-mean classifier. Suppose we are given N training samples (x_1, y_1), ..., (x_N, y_N) from two classes with y_n ∈ {+1, -1}. We saw in lecture 2 that we can decide a label for a test vector x as g(x) = sign(w^T x + b), where w = 2(μ_+ - μ_-) and b = ||μ_-||_2^2 - ||μ_+||_2^2. Here μ_+ is the mean vector for samples in the +ve class and μ_- is the mean vector for samples in the -ve class. Show that w^T x + b = Σ_{n=1}^{N} α_n ⟨x_n, x⟩ + b, and calculate the values of the α_n. (10 pts)
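As a starting point for H1.1, one run of the coin-flipping experiment can be sketched as follows. This is a minimal sketch, not a required implementation; the function name `one_run` and the run count are illustrative, and part (b) simply repeats `one_run` many times before plotting histograms.

```python
import numpy as np

rng = np.random.default_rng(0)

def one_run(n_coins=1000, n_flips=10):
    """Flip n_coins fair coins n_flips times; return (nu_1, nu_rand, nu_min)."""
    flips = rng.integers(0, 2, size=(n_coins, n_flips))  # 1 = heads
    nu = flips.mean(axis=1)                  # fraction of heads per coin
    nu_1 = nu[0]                             # first coin flipped
    nu_rand = nu[rng.integers(n_coins)]      # a coin chosen at random
    nu_min = nu[np.argmin(nu)]               # fewest heads (argmin picks the earliest tie)
    return nu_1, nu_rand, nu_min

# Repeat the experiment; use ~100,000 runs for smooth histograms
runs = np.array([one_run() for _ in range(1000)])
```

Each column of `runs` is then one of the three empirical distributions to histogram in part (b) and to compare against the Hoeffding bound in part (c).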
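Since H1.2 invites a script, the posterior can be computed exactly from Bayes' theorem. The sketch below is one way to organize it (the function name and argument layout are illustrative): the prior comes from the bin counts (4 of 10 are A), and because draws are with replacement, the likelihood of seeing one red and one blue is the product of the two color probabilities times 2 for the two possible orderings.

```python
def posterior_A(prior_A, p_A, p_B):
    """P(bin A | one red and one blue drawn with replacement).

    p_A, p_B: (P(red), P(blue)) for bins A and B. Extra colors, as in
    part (b), are fine: only the red and blue probabilities enter the
    likelihood of this particular observation.
    """
    prior_B = 1.0 - prior_A
    like_A = 2 * p_A[0] * p_A[1]   # two orderings: red-blue, blue-red
    like_B = 2 * p_B[0] * p_B[1]
    return prior_A * like_A / (prior_A * like_A + prior_B * like_B)

print(posterior_A(0.4, (0.3, 0.7), (0.7, 0.3)))  # part (a)
print(posterior_A(0.4, (0.1, 0.3), (0.4, 0.2)))  # part (b): red/blue entries only
```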
H1.4 K-nearest neighbor classification for MNIST data: In this problem set, we consider the problem of handwritten digit recognition. We will use a subset of the MNIST database, which has become a benchmark for testing a wide range of classification algorithms. See http://yann.lecun.com/exdb/mnist/ if you'd like to read more about it. You may want to import the MNIST dataset using sklearn: the sklearn.datasets package is able to download data sets directly from the repository using the function fetch_openml. For example, to download the MNIST digit recognition database:

>>> from sklearn.datasets import fetch_openml
>>> mnist = fetch_openml('mnist_784')

The MNIST database contains a total of 70,000 examples of handwritten digits of size 28x28 pixels, labeled from 0 to 9:

>>> mnist.data.shape
(70000, 784)
>>> mnist.target.shape
(70000,)
>>> np.unique(mnist.target)
array([ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])

In the MNIST database, each training or test example is a 28 × 28 grayscale image. To ease programming of learning algorithms, these images have been converted to vectors of length 28² = 784 by storing the pixels in raster-scan (row-by-row) order. In this question, we explore the performance of K-nearest neighbor (K-NN) classifiers at distinguishing handwritten digits. Pick images corresponding to three digits in your student ID, say "1", "2", and "7". To determine neighborhoods, use the Euclidean distance between pairs of vector-encoded digits x_i and x_j:

d(x_i, x_j) = ||x_i - x_j|| = ( Σ_{l=1}^{784} [x_i(l) - x_j(l)]² )^{1/2}.

(30 pts)

(a) Implement a function that finds the K nearest neighbors of any given test digit, and classifies it according to a majority vote of their class labels. Construct a training set with 200 examples of each class (N = 600 total examples). What is the empirical accuracy (fraction of data classified correctly) of 1-NN and 3-NN classifiers on the test examples from these classes?
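For orientation on part (a), a minimal K-NN classifier built from the Euclidean distance above might look like the sketch below. It is not the required implementation; `X_train`, `y_train`, and `x` are assumed to be NumPy arrays (training matrix of row vectors, label vector, and one test vector).

```python
import numpy as np
from collections import Counter

def knn_classify(x, X_train, y_train, k=1):
    """Majority vote among the k training points nearest to x (Euclidean)."""
    dists = np.linalg.norm(X_train - x, axis=1)   # distance to every training point
    nearest = np.argsort(dists)[:k]               # indices of the k closest
    return Counter(y_train[nearest]).most_common(1)[0][0]
```

Applying this to every test digit and comparing against the true labels gives the empirical accuracy asked for in part (a).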
(b) Plot 5 test digits that are correctly classified by the 1-NN classifier, and 5 that are incorrectly classified. Do you see any patterns?

H1.5 Linear regression: Implement a solution for house price prediction using Python. Data and starter code in Python can be found at Python linear regression. You should have a few modules installed:

sudo pip install scipy
sudo pip install scikit-learn

The commands above work in Linux. We start by loading the modules and the dataset; without data we cannot make good predictions. The data will be loaded using Python Pandas, a data analysis module, into a structure known as a Pandas DataFrame, which allows for easy manipulation of the rows and columns.
(a) Let us first fit the price based on size, because we expect to find some correlation between the two. We create two arrays: X (size) and Y (price). The data will be split into a training set and a test set. We will use the training data to find a best-fit line and make predictions on the test data.

import matplotlib
import matplotlib.pyplot as plt
import numpy as np
from sklearn import datasets, linear_model
import pandas as pd

# Load CSV and columns
df = pd.read_csv("Housing.csv")
Y = df['price']
X = df['lotsize']
X = X.values

# Split the data into training/testing sets
X_train = X[:-250]
X_test = X[-250:]

# Split the targets into training/testing sets
Y_train = Y[:-250]
Y_test = Y[-250:]

# Plot outputs
plt.scatter(X_test, Y_test, color='black')
plt.title('Test Data')
plt.xlabel('Size')
plt.ylabel('Price')
plt.xticks(())
plt.yticks(())
plt.show()

Write code to find the best linear fit y = Xw + b by minimizing the least-squares cost ||Xw + b - y||_2^2. You can either write the explicit solution or use a gradient descent method, but you must write the code yourself. Recall that the solution of the least-squares problem is

[ŵ; b̂] = ([X 1]^T [X 1])^{-1} [X 1]^T y,

where 1 is a vector of all ones. Compute the predicted price (Y) using the regression coefficients w, b, and plot the best fit.

# Write your code for linear regression (solving least-squares problem)
...

# Prediction on the test data
Y_predict = np.dot(X_test, w) + b

# Plot outputs
plt.plot(X_test, Y_predict, color='red', linewidth=3)
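One possible shape for the missing fitting step, assuming the variable names from the snippet above: append a column of ones to X and solve the normal equations for [ŵ; b̂]. This sketch uses np.linalg.lstsq rather than an explicit matrix inverse, which computes the same closed-form least-squares solution but is more numerically stable; the function name `fit_least_squares` is illustrative.

```python
import numpy as np

def fit_least_squares(X, y):
    """Solve min_{w,b} ||Xw + b - y||_2^2 via the augmented system [X 1]."""
    X = np.asarray(X, dtype=float).reshape(len(y), -1)
    A = np.hstack([X, np.ones((len(y), 1))])       # [X 1]
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)   # ([X 1]^T [X 1])^{-1} [X 1]^T y
    return coef[:-1], coef[-1]                     # (w, b)
```

Calling `w, b = fit_least_squares(X_train, Y_train)` then feeds directly into the `Y_predict = np.dot(X_test, w) + b` line above.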
(b) Next you will fit the price based on size, number of bedrooms, and number of bathrooms. We create two arrays: X (size, bedrooms, baths) and Y (price). The data will be split into training and test sets as before. We will use the training data to find a best-fit "plane" and make predictions on the test data.

# Select regression variables
X = df[['lotsize', 'bedrooms', 'bathrms']]
X = X.values

# fill the remaining lines same as before and write your code for least-squares
...

(30 pts)

Maximum points: 100