ECE1513 Introduction to Machine Learning, Winter 2024
Assignment 2: Linear Regression
This assignment is adapted from ECE 421 assignments prepared by Nicholas Papernot.

Submission
This assignment has three parts. For each part, you need to upload one file on Crowdmark. If a question asks for your code, copy and paste your code into the file that you submit. Graphs produced should be clearly labelled. The assignment is due February 6, 2024 at 11:30 pm. No late assignments will be accepted.

Grading
This assignment is graded out of 36 points. It counts for 10% of your final grade.

Part 1: Linear Regression with Scalar Inputs and Outputs

Assume we collected a dataset $D = \{(x_i, t_i)\}_{i \in 1..7}$ of $N = 7$ points (i.e., observations), with inputs $\{x_i\}_{i \in 1..7} = (1, 2, 3, 4, 5, 6, 7)$ and outputs $\{t_i\}_{i \in 1..7} = (6, 4, 2, 1, 3, 6, 10)$, for a regression problem with both scalar inputs and outputs.

1. (1 point) Draw a scatter plot of the dataset using matplotlib in Python.

2. (6 points) Let us use a linear regression model $g_{w,b}(x) = wx + b$ to model this data. Write down the analytical expression of the mean squared error of this model on dataset $D$. Your loss should take the form

$\frac{1}{2N} \sum_{i \in 1..N} \left( A_i w^2 + B_i b^2 + C_i wb + D_i w + E_i b + F_i \right)$

where $A_i$, $B_i$, $C_i$, $D_i$, $E_i$, and $F_i$ are expressed only as functions of $x_i$ and $t_i$ or constants. Do not fill in any numerical values yet.

3. (4 points) Derive the analytical expressions of $w$ and $b$ by minimizing the mean squared loss from the previous question. Your expressions for the parameters $w$ and $b$ should depend only on $A = \sum_i A_i$, $B = \sum_i B_i$, $C = \sum_i C_i$, $D = \sum_i D_i$, and $E = \sum_i E_i$. Do not fill in any numerical values yet.

4. (2 points) Give approximate numerical values for $w$ and $b$ by plugging in numerical values from the dataset $D$.

5. (1 point) Use NumPy's polyfit to double-check your solution against the scatter plot from question 1; this yields the values of $w$ and $b$. Paste your lines of code for this question and show that you obtained the correct solution in the previous questions. (See the sketch after this part.)
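For questions 1 and 5, a minimal sketch along the following lines (assuming NumPy and matplotlib are available) is sufficient; polyfit with degree 1 returns the least-squares slope and intercept, which should match the analytical values of $w$ and $b$ from questions 3 and 4.

    import numpy as np
    import matplotlib.pyplot as plt

    # The dataset D from Part 1.
    x = np.array([1, 2, 3, 4, 5, 6, 7], dtype=float)
    t = np.array([6, 4, 2, 1, 3, 6, 10], dtype=float)

    # Question 1: scatter plot with labelled axes.
    plt.scatter(x, t)
    plt.xlabel("input x")
    plt.ylabel("output t")
    plt.title("Dataset D")
    plt.show()

    # Question 5: fit t ~ w*x + b; polyfit returns [w, b] for deg=1.
    w, b = np.polyfit(x, t, deg=1)
    print("w =", w, "b =", b)  # compare with questions 3 and 4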
Part 2: Linear Regression in Matrix Form

The goal of this part is to revisit Part 1, but to solve it with a different technique. This will serve as a warm-up for Part 3. In the rest of this problem, any reference to a dataset refers to the dataset described in Part 1.

1. (1 point) Verify that one can rewrite the linear regression model $g_{w,b}(x) = wx + b$ in the simpler form $g_{\vec{w}}(\vec{x}) = \vec{x}\vec{w}$ if one assumes each input $\vec{x}$ is a two-dimensional row vector, such that a point in our dataset is now $\vec{x}_i = (x_i, 1)$, where $x_i$ is the scalar input described in Part 1. Write the components of the new column vector $\vec{w}$ as a function of $w$ and $b$ from Part 1.

2. (4 points) Derive analytically $\nabla_{\vec{w}} \| X\vec{w} - \vec{t} \|_2^2$, where $X$ is an $N \times 2$ matrix such that each row of $X$ is a vector $\vec{x}_i$ as described in the previous question, and $\vec{t} = \{t_i\}_{i \in 1..7}$.

3. (1 point) Conclude that the model's weight vector $\vec{w}$ which minimizes the mean squared error must satisfy

$2 X^\top X \vec{w} - 2 X^\top \vec{t} = 0$

4. (1 point) Assuming that $X^\top X$ is invertible, derive analytically the value of $\vec{w}$.

5. (2 points) Using NumPy, implement the solution you found in the previous question and verify that you obtain the same results for $w$ and $b$ as in Part 1. You may find dot, matmul, transpose, and linalg.inv helpful. Paste your lines of code for this question and show that you obtained the correct solution in the previous questions. (See the sketch after this part.)
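One possible implementation for question 5, assuming your question 4 answer takes the closed form $\vec{w} = (X^\top X)^{-1} X^\top \vec{t}$ and using only the functions suggested above:

    import numpy as np

    x = np.array([1, 2, 3, 4, 5, 6, 7], dtype=float)
    t = np.array([6, 4, 2, 1, 3, 6, 10], dtype=float)

    # N x 2 design matrix whose i-th row is (x_i, 1).
    X = np.stack([x, np.ones_like(x)], axis=1)

    # w_vec = (X^T X)^{-1} X^T t.
    w_vec = np.matmul(np.linalg.inv(np.matmul(X.T, X)), np.matmul(X.T, t))
    w, b = w_vec
    print("w =", w, "b =", b)  # should agree with Part 1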
Part 3: Regularization and Vectorized Input

Let us now assume that $D$ is a dataset with $d$ features per input and $N > 0$ inputs. We have $D = \{ ((x_{ij})_{j \in 1..d}, t_i) \}_{i \in 1..N}$. In other words, each $\vec{x}_i$ is a column vector with $d$ components indexed by $j$, such that $x_{ij}$ is the $j$-th component of $\vec{x}_i$. The output $t_i$ remains a scalar (real value). Let us assume for simplicity that we have the simplified linear regression model presented in question 1 of Part 2. We would like to train a regularized linear regression model, where the mean squared loss is augmented with an $\ell_2$ regularization penalty $\frac{1}{2}\|\vec{w}\|_2^2$ on the weight parameter $\vec{w}$:

$\varepsilon(\vec{w}, D) = \frac{1}{2N} \sum_{i \in 1..N} \left( g_{\vec{w}}(\vec{x}_i) - t_i \right)^2 + \frac{\lambda}{2} \|\vec{w}\|_2^2$

where $\lambda > 0$ is a hyperparameter that controls how much importance is given to the penalty.

1. (3 points) Let $A = \sum_{i \in 1..N} \vec{x}_i (\vec{x}_i)^\top$. Give a simple analytical expression for the components of $A$; in other words, write an expression for $A_{jk}$, where $j$ is the row and $k$ is the column.

2. (6 points) Writing $\vec{b} = \sum_{i \in 1..N} t_i \vec{x}_i$, prove that the following holds:

$\nabla_{\vec{w}} \, \varepsilon(\vec{w}, D) = \frac{1}{N} \left( A\vec{w} - \vec{b} \right) + \lambda \vec{w}$

3. (2 points) Write down the matrix equation that $\vec{w}^*$ should satisfy, where

$\vec{w}^* = \arg\min_{\vec{w}} \varepsilon(\vec{w}, D)$

Your equation should only involve $A$, $\vec{b}$, $\lambda$, $N$, and $\vec{w}^*$.

4. (2 points) Given that $A + \lambda N I_d$ is invertible, solve the equation stated in question 3 and deduce an analytical solution for $\vec{w}^*$. You have obtained a linear regression model regularized with an $\ell_2$ penalty. (See the sketch below for a numerical check.)
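For a numerical sanity check of the closed form implied by question 4, a sketch along these lines can be used. The helper name ridge_weights and the variable lam are illustrative, not part of the handout; the identities $A = X^\top X$ and $\vec{b} = X^\top \vec{t}$ hold when the rows of $X$ are the transposed vectors $(\vec{x}_i)^\top$.

    import numpy as np

    def ridge_weights(X, t, lam):
        """Solve (A + lam*N*I_d) w = b with A = X^T X and b = X^T t."""
        N, d = X.shape
        A = X.T @ X   # sum_i x_i x_i^T
        b = X.T @ t   # sum_i t_i x_i
        return np.linalg.solve(A + lam * N * np.eye(d), b)

    # With lam = 0 this reduces to the unregularized solution of Part 2
    # (here checked on the Part 1 design matrix).
    X = np.stack([np.arange(1.0, 8.0), np.ones(7)], axis=1)
    t = np.array([6, 4, 2, 1, 3, 6, 10], dtype=float)
    print(ridge_weights(X, t, lam=0.0))

Calling np.linalg.solve rather than explicitly inverting $A + \lambda N I_d$ is numerically preferable, though both match the analytical solution here.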