ECE1513 Introduction to Machine Learning, Winter 2024
Assignment 2: Linear Regression
This assignment is adapted from ECE 421 assignments prepared by Nicholas Papernot.

Submission
This assignment has three parts. For each part, you need to upload one file on Crowdmark. If a question asks for your code, copy and paste your code into the file that you submit. Graphs produced should be clearly labelled. The assignment is due February 6, 2024 at 11:30 pm. No late assignments will be accepted.

Grading
This assignment is graded out of 36 points. It counts for 10% of your final grade.

Part 1: Linear Regression with Scalar Inputs and Outputs

Assume we collected a dataset $D = \{(x_i, t_i)\}_{i \in 1..7}$ of $N = 7$ points (i.e., observations), with inputs $\{x_i\}_{i \in 1..7} = (1, 2, 3, 4, 5, 6, 7)$ and outputs $\{t_i\}_{i \in 1..7} = (6, 4, 2, 1, 3, 6, 10)$, for a regression problem with both scalar inputs and outputs.

1. (1 point) Draw a scatter plot of the dataset using matplotlib in Python.

2. (6 points) Let us use a linear regression model $g_{w,b}(x) = wx + b$ to model this data. Write down the analytical expression of the mean squared error of this model on dataset $D$. Your loss should take the form

$\frac{1}{2N} \sum_{i \in 1..N} \left( A_i w^2 + B_i b^2 + C_i wb + D_i w + E_i b + F_i \right)$

where $A_i$, $B_i$, $C_i$, $D_i$, $E_i$, and $F_i$ are expressed only as functions of $x_i$ and $t_i$ or constants. Do not fill in any numerical values yet.

3. (4 points) Derive the analytical expressions of $w$ and $b$ by minimizing the mean squared loss from the previous question. Your expressions for the parameters $w$ and $b$ should depend only on $A = \sum_i A_i$, $B = \sum_i B_i$, $C = \sum_i C_i$, $D = \sum_i D_i$, and $E = \sum_i E_i$. Do not fill in any numerical values yet.

4. (2 points) Give approximate numerical values for $w$ and $b$ by plugging in numerical values from the dataset $D$.

5. (1 point) Use NumPy's polyfit to double-check your solution against the scatter plot from question 1; this yields the values of $w$ and $b$. Paste your lines of code for this question and show that you obtained the correct solution in the previous questions. (See the sketch after this part.)
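For questions 1 and 5, a minimal sketch along the following lines (assuming NumPy and matplotlib are available) is sufficient; polyfit with degree 1 returns the least-squares slope and intercept, which should match the analytical values of $w$ and $b$ from questions 3 and 4.

    import numpy as np
    import matplotlib.pyplot as plt

    # The dataset D from Part 1.
    x = np.array([1, 2, 3, 4, 5, 6, 7], dtype=float)
    t = np.array([6, 4, 2, 1, 3, 6, 10], dtype=float)

    # Question 1: scatter plot with labelled axes.
    plt.scatter(x, t)
    plt.xlabel("input x")
    plt.ylabel("output t")
    plt.title("Dataset D")
    plt.show()

    # Question 5: fit t ~ w*x + b; polyfit returns [w, b] for deg=1.
    w, b = np.polyfit(x, t, deg=1)
    print("w =", w, "b =", b)  # compare with questions 3 and 4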
Part 2: Linear Regression in Matrix Form

The goal of this part is to revisit Part 1, but to solve it with a different technique. This will serve as a warm-up for Part 3. In the rest of this problem, any reference to a dataset refers to the dataset described in Part 1.

1. (1 point) Verify that one can rewrite the linear regression model $g_{w,b}(x) = wx + b$ in the simpler form $g_{\vec{w}}(\vec{x}) = \vec{x}\vec{w}$ if one assumes each input $\vec{x}$ is a two-dimensional row vector, such that a point in our dataset is now $\vec{x}_i = (x_i, 1)$, where $x_i$ is the scalar input described in Part 1. Write the components of the new column vector $\vec{w}$ as a function of $w$ and $b$ from Part 1.

2. (4 points) Derive analytically $\nabla_{\vec{w}} \| X\vec{w} - \vec{t} \|_2^2$, where $X$ is an $N \times 2$ matrix such that each row of $X$ is a vector $\vec{x}_i$ as described in the previous question, and $\vec{t} = \{t_i\}_{i \in 1..7}$.

3. (1 point) Conclude that the model's weight vector $\vec{w}$ which minimizes the mean squared error must satisfy

$2 X^\top X \vec{w} - 2 X^\top \vec{t} = 0$

4. (1 point) Assuming that $X^\top X$ is invertible, derive analytically the value of $\vec{w}$.

5. (2 points) Using NumPy, implement the solution you found in the previous question and verify that you obtain the same results for $w$ and $b$ as in Part 1. You may find dot, matmul, transpose, and linalg.inv helpful. Paste your lines of code for this question and show that you obtained the correct solution in the previous questions. (See the sketch after this part.)
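One possible implementation for question 5, assuming your question 4 answer takes the closed form $\vec{w} = (X^\top X)^{-1} X^\top \vec{t}$ and using only the functions suggested above:

    import numpy as np

    x = np.array([1, 2, 3, 4, 5, 6, 7], dtype=float)
    t = np.array([6, 4, 2, 1, 3, 6, 10], dtype=float)

    # N x 2 design matrix whose i-th row is (x_i, 1).
    X = np.stack([x, np.ones_like(x)], axis=1)

    # w_vec = (X^T X)^{-1} X^T t.
    w_vec = np.matmul(np.linalg.inv(np.matmul(X.T, X)), np.matmul(X.T, t))
    w, b = w_vec
    print("w =", w, "b =", b)  # should agree with Part 1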
Part 3: Regularization and Vectorized Input

Let us now assume that $D$ is a dataset with $d$ features per input and $N > 0$ inputs. We have $D = \{ ((x_{ij})_{j \in 1..d}, t_i) \}_{i \in 1..N}$. In other words, each $\vec{x}_i$ is a column vector with $d$ components indexed by $j$, such that $x_{ij}$ is the $j$-th component of $\vec{x}_i$. The output $t_i$ remains a scalar (real value). Let us assume for simplicity that we have the simplified linear regression model presented in question 1 of Part 2. We would like to train a regularized linear regression model, where the mean squared loss is augmented with an $\ell_2$ regularization penalty $\frac{1}{2}\|\vec{w}\|_2^2$ on the weight parameter $\vec{w}$:

$\varepsilon(\vec{w}, D) = \frac{1}{2N} \sum_{i \in 1..N} \left( g_{\vec{w}}(\vec{x}_i) - t_i \right)^2 + \frac{\lambda}{2} \|\vec{w}\|_2^2$

where $\lambda > 0$ is a hyperparameter that controls how much importance is given to the penalty.

1. (3 points) Let $A = \sum_{i \in 1..N} \vec{x}_i (\vec{x}_i)^\top$. Give a simple analytical expression for the components of $A$; in other words, write an expression for $A_{jk}$, where $j$ is the row and $k$ is the column.

2. (6 points) Writing $\vec{b} = \sum_{i \in 1..N} t_i \vec{x}_i$, prove that the following holds:

$\nabla_{\vec{w}} \, \varepsilon(\vec{w}, D) = \frac{1}{N} \left( A\vec{w} - \vec{b} \right) + \lambda \vec{w}$

3. (2 points) Write down the matrix equation that $\vec{w}^*$ should satisfy, where

$\vec{w}^* = \arg\min_{\vec{w}} \varepsilon(\vec{w}, D)$

Your equation should only involve $A$, $\vec{b}$, $\lambda$, $N$, and $\vec{w}^*$.

4. (2 points) Given that $A + \lambda N I_d$ is invertible, solve the equation stated in question 3 and deduce an analytical solution for $\vec{w}^*$. You have obtained a linear regression model regularized with an $\ell_2$ penalty. (See the sketch below for a numerical check.)
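For a numerical sanity check of the closed form implied by question 4, a sketch along these lines can be used. The helper name ridge_weights and the variable lam are illustrative, not part of the handout; the identities $A = X^\top X$ and $\vec{b} = X^\top \vec{t}$ hold when the rows of $X$ are the transposed vectors $(\vec{x}_i)^\top$.

    import numpy as np

    def ridge_weights(X, t, lam):
        """Solve (A + lam*N*I_d) w = b with A = X^T X and b = X^T t."""
        N, d = X.shape
        A = X.T @ X   # sum_i x_i x_i^T
        b = X.T @ t   # sum_i t_i x_i
        return np.linalg.solve(A + lam * N * np.eye(d), b)

    # With lam = 0 this reduces to the unregularized solution of Part 2
    # (here checked on the Part 1 design matrix).
    X = np.stack([np.arange(1.0, 8.0), np.ones(7)], axis=1)
    t = np.array([6, 4, 2, 1, 3, 6, 10], dtype=float)
    print(ridge_weights(X, t, lam=0.0))

Calling np.linalg.solve rather than explicitly inverting $A + \lambda N I_d$ is numerically preferable, though both match the analytical solution here.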