hw05b_blank

pdf

School

Santa Barbara City College

Course

W54

Subject

Computer Science

Date

Oct 30, 2023

Type

pdf

Pages

Uploaded by sarkarved

Data 100, Fall 2023
Homework #5B
Total Points: 24

Submission Instructions

You must submit this assignment to Gradescope by the on-time deadline, Thursday, October 5th at 11:59 PM Pacific. Please read the syllabus for the grace period policy. No late submissions beyond the grace period will be accepted. While course staff is happy to help you if you encounter difficulties with submission, we may not be able to respond to last-minute requests for assistance (TAs need to sleep, after all!). We strongly encourage you to plan to submit your work to Gradescope several hours before the stated deadline. This way, you will have ample time to reach out to staff for submission support.

There are two parts to this assignment listed on Gradescope:

• Homework 05 Coding: Submit your Jupyter notebook zip file for Homework 5A, which can be generated and downloaded from DataHub by using the grader.export() cell provided.
• Homework 05 Written: Submit a single PDF to Gradescope that contains both (1) your answers to all manually graded questions from the Homework 5A Jupyter Notebook and (2) your answers to all questions in this Homework 5B document.

To receive credit on this assignment, you must submit both your coding and written portions to their respective Gradescope portals.

Your written submission (a single PDF) can be generated as follows:

1. Access your answers to manually graded Homework 5A questions in one of three ways:
   • Automatically create PDF (recommended): We have provided a cell to generate your written response in the Homework 5A notebook for you. Run the cell and click to download the generated PDF. This function will extract your response to the manually-graded questions and put them on separate pages. This process may fail if your answer is not properly formatted; if this is the case, check out common errors and solutions described on Ed or follow either of the two ways described below.

   • Manually download PDF: If there are issues with automatically generating the PDF, on DataHub, you can try downloading the PDF by clicking on File -> Save and Export Notebook As... -> PDF. If you choose to go this route, you must take special care to ensure all appropriate pages are chosen for each question on Gradescope.
   • Take screenshots: If that doesn’t work either, you can take screenshots of your answers (and your code if present) to manually-graded questions and include them as images in a PDF. The manually-graded questions are listed at the top of the Homework 5A notebook.

2. Answer the below Homework 5B written questions in one of many ways:
   • You can type your answers. We recommend LaTeX, the math typesetting language. Overleaf is a great tool to type in LaTeX.
   • Download this PDF, print it out, and write directly on these pages. If you have a tablet, you may save this PDF and write directly on it.
   • Write your answers on a blank sheet of physical or digital paper.
   • Note: If you write your answers on physical paper, use a scanning application (e.g., CamScanner, Apple Notes) to generate a PDF.

3. Combine these two sets of answers together into one PDF document and submit it to the appropriate Gradescope written portal. You can use PDF merging tools, e.g., Adobe Reader, Smallpdf (https://smallpdf.com/merge-pdf) or Apple Preview (https://support.apple.com/en-us/HT202945).

4. Important: When submitting on Gradescope, you must tag pages to each question correctly (it prompts you to do this after submitting your work). This significantly streamlines the grading process for our readers. Failure to do this may result in a score of 0 for untagged questions.

You are responsible for ensuring your submission follows our requirements. We will not be granting regrade requests nor extensions to submissions that don’t follow instructions. If you encounter any difficulties with submission, please don’t hesitate to reach out to staff prior to the deadline.

Collaborators

Data science is a collaborative activity. While you may talk with others about the homework, we ask that you write your solutions individually. If you do discuss the assignments with others, please include their names at the top of your submission.

Properties of Linear Regression Residuals

1. (10 points) In the lecture, we spent a great deal of time talking about Simple Linear Regression (SLR), which you also saw in Data 8. To briefly summarize, the simple linear regression model assumes that given a single observation $x$, our predicted response for this observation is $\hat{y} = \theta_0 + \theta_1 x$.

In Lecture 10, we saw that the $\theta_0 = \hat{\theta}_0$ and $\theta_1 = \hat{\theta}_1$ that minimize the average $L_2$ loss (or Mean Squared Error - MSE) for the simple linear regression model are:

$$\hat{\theta}_0 = \bar{y} - \hat{\theta}_1 \bar{x}, \qquad \hat{\theta}_1 = r \frac{\sigma_y}{\sigma_x}$$

Or, rearranging terms, our predictions $\hat{y}$ are:

$$\hat{y} = \bar{y} + r \sigma_y \frac{x - \bar{x}}{\sigma_x}$$

(a) (3 points) As we saw in the lecture, a residual $e_i$, for data point $i \in \{1, \dots, n\}$, is defined to be the difference between a true response $y_i$ and predicted response $\hat{y}_i$. Specifically, $e_i = y_i - \hat{y}_i$. Note that there are $n$ data points, and each data point is denoted by $(x_i, y_i)$. Prove, using the equation for $\hat{y}$ above, that $\sum_{i=1}^{n} e_i = 0$.

(b) (2 points) Prove that $\bar{y} = \bar{\hat{y}}$. You may use your result from part (a).

(c) (2 points) Show that $(\bar{x}, \bar{y})$ is on the simple linear regression line.
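The three properties in parts (a) to (c) can be sanity-checked numerically before attempting the proofs. The following is a minimal sketch, not a proof: the data values and the helper names `mean` and `sd` are made up for illustration, and the fit uses the formulas for $\hat{\theta}_0$ and $\hat{\theta}_1$ stated above with population standard deviations (dividing by $n$).

```python
import math

# Hypothetical data, chosen only for illustration (not from the assignment).
x = [1.0, 2.0, 4.0, 5.0, 8.0]
y = [2.1, 2.9, 5.2, 5.8, 9.5]
n = len(x)


def mean(values):
    return sum(values) / len(values)


def sd(values):
    # Population standard deviation (divide by n), matching the sigma notation.
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


x_bar, y_bar = mean(x), mean(y)
sigma_x, sigma_y = sd(x), sd(y)

# Correlation r as the average product of standardized values.
r = sum(((xi - x_bar) / sigma_x) * ((yi - y_bar) / sigma_y)
        for xi, yi in zip(x, y)) / n

# SLR estimates from the problem statement.
theta1_hat = r * sigma_y / sigma_x
theta0_hat = y_bar - theta1_hat * x_bar

y_hat = [theta0_hat + theta1_hat * xi for xi in x]
residuals = [yi - yh for yi, yh in zip(y, y_hat)]

residual_sum = sum(residuals)                           # part (a)
mean_of_predictions = mean(y_hat)                       # part (b)
prediction_at_x_bar = theta0_hat + theta1_hat * x_bar   # part (c)

print(f"sum of residuals:     {residual_sum:.2e}")
print(f"mean of predictions:  {mean_of_predictions:.6f} (y_bar = {y_bar:.6f})")
print(f"prediction at x_bar:  {prediction_at_x_bar:.6f} (y_bar = {y_bar:.6f})")
```

Up to floating-point rounding, the residual sum should be zero and both printed predictions should agree with `y_bar`. A numerical check on one dataset does not replace the algebraic argument the question asks for.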


(d) (3 points) Show that the residuals are uncorrelated with the predictor variable, that is

$$\frac{1}{n} \sum_{i=1}^{n} \left( \frac{e_i - \bar{e}}{\sigma_e} \right) \left( \frac{x_i - \bar{x}}{\sigma_x} \right) = 0,$$

where $\bar{e} = \frac{1}{n} \sum_{i=1}^{n} e_i$, $\sigma_e^2 = \frac{1}{n} \sum_{i=1}^{n} (e_i - \bar{e})^2$, and $\sigma_x^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$. You may assume that $\sigma_e$, $\sigma_x$, and at least one residual are not exactly zero. Use the properties of estimating equations derived in the lecture.

[Handwritten response, only partly legible in this copy: "... residuals are uncorrelated with the predictor variable."]
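As with the earlier parts, the claim in (d) can be checked numerically. This is a sketch under assumptions: the data are hypothetical, the helpers `mean`, `sd`, and `correlation` are illustrative names, and the correlation is computed exactly as the average product of standardized values given in the question.

```python
import math

# Hypothetical data for illustration only (not from the assignment).
x = [1.0, 2.0, 4.0, 5.0, 8.0]
y = [2.1, 2.9, 5.2, 5.8, 9.5]


def mean(values):
    return sum(values) / len(values)


def sd(values):
    # Population standard deviation (divide by n).
    m = mean(values)
    return math.sqrt(mean([(v - m) ** 2 for v in values]))


def correlation(a, b):
    # Average product of standardized values, as in the question.
    a_bar, b_bar = mean(a), mean(b)
    sd_a, sd_b = sd(a), sd(b)
    return mean([((ai - a_bar) / sd_a) * ((bi - b_bar) / sd_b)
                 for ai, bi in zip(a, b)])


# Fit SLR with the formulas from the problem statement.
theta1_hat = correlation(x, y) * sd(y) / sd(x)
theta0_hat = mean(y) - theta1_hat * mean(x)

residuals = [yi - (theta0_hat + theta1_hat * xi) for xi, yi in zip(x, y)]

# The question assumes sigma_e is not zero; that holds for this data.
sigma_e = sd(residuals)
corr_e_x = correlation(residuals, x)

print(f"sigma_e = {sigma_e:.4f}")
print(f"correlation(residuals, x) = {corr_e_x:.2e}")
```

The printed correlation should be zero up to rounding error, which is consistent with the statement to be proved.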

Properties of a Linear Model With No Constant Term

2. (4 points) Suppose that we don’t include an intercept term in our model. That is, our model is now

$$\hat{y} = \theta x,$$

where $\theta$ is the single parameter for our model that we need to optimize. (In this equation, $x$ is a scalar, corresponding to a single observation.)

As usual, we are looking to find the value $\hat{\theta}$ that minimizes the average $L_2$ loss (MSE) across our observed data $\{(x_i, y_i)\}$, for $i \in \{1, \dots, n\}$:

$$R(\theta) = \frac{1}{n} \sum_{i=1}^{n} (y_i - \theta x_i)^2$$

The estimating equations derived in the lecture no longer hold. In this problem, we’ll derive a solution to this simpler model. We’ll see that the least squares estimate of the slope in this model differs from the simple linear regression model, and we’ll also explore whether or not our properties from the previous problem still hold.

Use calculus to find the minimizing $\hat{\theta}$. That is, you may prove that:

$$\hat{\theta} = \frac{\sum x_i y_i}{\sum x_i^2}$$

Hint: You may start by following the format of SLR in lecture 10 and replace the SLR model with the model defined above.
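A quick numerical experiment can make the target formula and the remark about the earlier properties concrete. The sketch below uses hypothetical data and an arbitrary search grid; it compares the closed-form $\hat{\theta}$ against a brute-force grid search over $R(\theta)$ and then inspects the residuals of the no-intercept fit.

```python
# Hypothetical data for illustration only (not from the assignment).
x = [1.0, 2.0, 4.0, 5.0, 8.0]
y = [2.1, 2.9, 5.2, 5.8, 9.5]
n = len(x)


def mse(theta):
    # R(theta) = (1/n) * sum of (y_i - theta * x_i)^2
    return sum((yi - theta * xi) ** 2 for xi, yi in zip(x, y)) / n


# Closed-form estimate from the problem statement.
theta_hat = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)

# Brute-force check on a grid of candidate slopes (step 0.001, range chosen arbitrarily).
grid = [k / 1000 for k in range(0, 3001)]
best_on_grid = min(grid, key=mse)

residuals = [yi - theta_hat * xi for xi, yi in zip(x, y)]
sum_resid = sum(residuals)                                        # need not be zero without an intercept
sum_x_resid = sum(xi * ei for xi, ei in zip(x, residuals))        # zero at the minimizer

print(f"theta_hat (closed form): {theta_hat:.5f}")
print(f"best theta on grid:      {best_on_grid:.3f}")
print(f"sum of residuals:        {sum_resid:.4f}")
print(f"sum of x_i * e_i:        {sum_x_resid:.2e}")
```

For this dataset the residuals do not sum to zero, which illustrates that the property from problem 1(a) relied on the intercept term, while the sum of $x_i e_i$ still vanishes because it is the first-order condition for $\theta$.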

MSE “Minimizer”

3. (10 points) Recall from calculus that given some function $g(x)$, the $x$ you get from solving $\frac{dg(x)}{dx} = 0$ is called a critical point of $g$; this means it could be a minimizer or a maximizer for $g$. In this question, we will explore some basic properties and build some intuition on why, for certain loss functions such as squared $L_2$ loss, the critical point of the empirical risk function (defined as an average loss on the observed data) will always be the minimizer.

Given some linear model $f(x) = \theta x$ for some real scalar $\theta$, we can write the empirical risk of the model $f$ given the observed data $\{x_i, y_i\}$, for $i \in \{1, \dots, n\}$, as the average $L_2$ loss (MSE):

$$\frac{1}{n} \sum_{i=1}^{n} (y_i - \theta x_i)^2 = \sum_{i=1}^{n} \frac{1}{n} (y_i - \theta x_i)^2$$

(a) (3 points) Let’s investigate one of the $n$ functions in the summation in the MSE. Define $g_i(\theta) = \frac{1}{n} (y_i - \theta x_i)^2$ for $i \in \{1, \dots, n\}$. In this case, note that the MSE can be written as $\sum_{i=1}^{n} g_i(\theta)$.

Recall from calculus that we can use the 2nd derivative of a function to describe its curvature about a certain point (if it is facing concave up, down, or possibly a point of inflection). You can take the following as a fact: A function is convex if and only if the function’s 2nd derivative is non-negative on its domain. Based on this property, verify that $g_i(\theta)$ is a convex function.

(b) (2 points) Briefly explain intuitively in words why given a convex function $g(\theta)$, the critical point we get by solving $\frac{dg(\theta)}{d\theta} = 0$ minimizes $g$. You can assume that $\frac{dg(\theta)}{d\theta}$ is a function of $\theta$ (and not a constant).

[Handwritten response, only partly legible in this copy: "... $g_i$ is convex. For a convex function $g(\theta)$ its 2nd derivative is non-negative, so the slope is always increasing."]

(c) (3 points) Now that we have shown that each term in the summation of the MSE is a convex function, one might wonder if the entire summation is convex, given that it is a sum of convex functions.

Let’s look at the formal definition of a convex function. Algebraically speaking, a function $g(\theta)$ is convex if for any two points $(\theta_i, g(\theta_i))$ and $(\theta_j, g(\theta_j))$ on the function,

$$g(c \times \theta_i + (1 - c) \times \theta_j) \le c \times g(\theta_i) + (1 - c) \times g(\theta_j)$$

for any real constant $0 \le c \le 1$.

The function $g$ evaluated on any point between $\theta_i$ and $\theta_j$ will always lie at or below the secant line connecting $g(\theta_i)$ and $g(\theta_j)$. See a graph in this Wikipedia article: https://en.wikipedia.org/wiki/Convex_function.

Intuitively, the above definition says that, given the plot of a convex function $g(\theta)$, if you connect 2 randomly chosen points on the function, the line segment will always lie on or above $g(\theta)$ (try this with the graph of $g(\theta) = \theta^2$).

i. (2 points) Using the definition above, show that if $g(\theta)$ and $h(\theta)$ are both convex functions, their sum $g(\theta) + h(\theta)$ will also be a convex function.

ii. (1 point) Based on what you have shown in the previous part, explain intuitively why a (finite) sum of $n$ convex functions is still a convex function when $n > 2$.
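The secant-line definition can be probed numerically on a finite set of points. The sketch below is only an illustration: the functions `g` and `h` are arbitrary convex examples chosen here (not taken from the assignment), `satisfies_convexity` is a hypothetical helper, and passing the check on a grid suggests but does not prove convexity.

```python
def g(theta):
    # One squared-loss-style term: convex in theta.
    return (1.0 - 2.0 * theta) ** 2


def h(theta):
    # Absolute value: convex, though not differentiable at 0.
    return abs(theta)


def g_plus_h(theta):
    return g(theta) + h(theta)


def satisfies_convexity(f, points, weights, tol=1e-9):
    """Check f(c*ti + (1-c)*tj) <= c*f(ti) + (1-c)*f(tj) on a finite grid."""
    for ti in points:
        for tj in points:
            for c in weights:
                lhs = f(c * ti + (1 - c) * tj)
                rhs = c * f(ti) + (1 - c) * f(tj)
                if lhs > rhs + tol:
                    return False
    return True


points = [k / 2 for k in range(-10, 11)]    # theta values in [-5, 5]
weights = [k / 10 for k in range(11)]       # c values in [0, 1]

sum_is_convex_on_grid = satisfies_convexity(g_plus_h, points, weights)
concave_passes = satisfies_convexity(lambda t: -t * t, points, weights)

print("g + h passes the grid check:", sum_is_convex_on_grid)
print("-theta^2 passes the grid check:", concave_passes)
```

The concave function $-\theta^2$ is included as a contrast case: it violates the inequality, so the check can fail for functions that are not convex.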

(d) (2 points) Remember from part (a) that the MSE can be written as:

$$\frac{1}{n} \sum_{i=1}^{n} (y_i - \theta x_i)^2 = \sum_{i=1}^{n} \frac{1}{n} (y_i - \theta x_i)^2 = \sum_{i=1}^{n} g_i(\theta)$$

Explain why solving for the critical point of the MSE, by taking the gradient with respect to the parameter $\theta$ and setting that expression to 0, is guaranteed to yield a solution that minimizes the MSE.

Closing note: In this question, we have discussed only the simple linear model with no constant term, which is a single-variable function. However, the above properties extend more generally to all multivariable linear regression models; this proof is beyond the scope of this course and is left to a future you.

Congratulations! You have finished Homework 5B!
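The curvature argument behind part (d) can be illustrated with a few lines of arithmetic. This sketch assumes hypothetical data and uses the derivatives obtained by differentiating $g_i(\theta) = \frac{1}{n}(y_i - \theta x_i)^2$ directly: the first derivative of the MSE is $-\frac{2}{n}\sum x_i (y_i - \theta x_i)$ and each $g_i$ has second derivative $\frac{2 x_i^2}{n}$.

```python
# Hypothetical data for illustration only (not from the assignment).
x = [1.0, 2.0, 4.0, 5.0, 8.0]
y = [2.1, 2.9, 5.2, 5.8, 9.5]
n = len(x)


def mse_derivative(theta):
    # d/dtheta of (1/n) * sum (y_i - theta * x_i)^2
    return sum(-2.0 * xi * (yi - theta * xi) for xi, yi in zip(x, y)) / n


# Second derivative of each g_i; it does not depend on theta.
second_derivs = [2.0 * xi ** 2 / n for xi in x]
mse_second_derivative = sum(second_derivs)

# Critical point of the MSE for the no-intercept model.
theta_hat = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

slope_left = mse_derivative(theta_hat - 0.5)    # slope to the left of the critical point
slope_right = mse_derivative(theta_hat + 0.5)   # slope to the right of the critical point

print("each g_i'' >= 0:", all(d >= 0 for d in second_derivs))
print(f"MSE'' = {mse_second_derivative:.3f}")
print(f"MSE' at theta_hat = {mse_derivative(theta_hat):.2e}")
print(f"MSE' left / right = {slope_left:.3f} / {slope_right:.3f}")
```

The slope is negative to the left of $\hat{\theta}$ and positive to the right, which is the behavior expected of a convex function at its critical point.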

0.0.1 Question 2b

Create a plot using any seaborn and/or matplotlib.pyplot functions of your choice to visualize samples, which is the simulated distribution of Pishi votes using a sample of size 50. Include descriptive titles and labels. An example is included below. The total area under the plot must be normalized to 1. Your plot may not exactly match ours due to the randomness of the data generating process in np.random.multinomial.

Hint: use plt.xlim(left, right) (documentation) to specify the left and right limits of the x-axis.

In [22]:
    import matplotlib.pyplot as plt
    import seaborn as sns

    # `samples` is assumed to have been computed in an earlier notebook cell.
    sns.set_style("whitegrid")
    plt.figure(figsize=(10, 6))
    # stat='density' normalizes the histogram so its total area is 1.
    sns.histplot(samples, bins=20, kde=True, stat='density', color='skyblue')
    plt.title('Simulated Distribution of Pishi Votes (Sample Size = 50)', fontsize=16)
    plt.xlabel('Proportion of Votes for Pishi', fontsize=14)
    plt.ylabel('Density', fontsize=14)
    plt.xlim(0, 1)
    plt.show()
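For readers without the notebook, an array like `samples` could be produced roughly as follows. This is a sketch under loud assumptions: the vote proportions in `true_props`, the number of candidates, the seed, and the number of simulations are all placeholders invented here, since the assignment's actual values are not shown in this document. Only the sample size of 50 and the use of `np.random.multinomial` come from the question text.

```python
import numpy as np

np.random.seed(42)  # seeded only so this sketch is reproducible

# Hypothetical true vote proportions for three candidates; Pishi is index 0.
# These numbers are placeholders, not values from the assignment.
true_props = [0.40, 0.35, 0.25]
sample_size = 50          # from the question text
num_simulations = 10_000  # arbitrary choice for illustration

# Each row holds the vote counts per candidate in one simulated sample of 50.
counts = np.random.multinomial(sample_size, true_props, size=num_simulations)

# Proportion of votes for Pishi in each simulated sample.
samples = counts[:, 0] / sample_size

print("shape:", samples.shape)
print("mean simulated proportion:", samples.mean())
```

Because each simulated sample has 50 voters, the proportions can only take values that are multiples of 0.02, which is worth keeping in mind when choosing histogram bins.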


0.0.2 Question 2c

According to Shiny’s 50-person sample, 20% of her discussion section reported that they would vote for Pishi in the end-of-semester contest.

In the cell below, create a plot using any seaborn and/or matplotlib.pyplot functions of your choice to visualize Shiny’s sample statistic superimposed on the simulated sample distribution you plotted in the previous part. In other words, include
- a vertical line that passes through 20%,
- a vertical line that passes through the mean of the simulated sample distribution, and
- the simulated sample distribution itself.

You should choose contrasting colors and include a descriptive title, labels, and a legend if needed. An example is included below.

In [24]:
    import matplotlib.pyplot as plt
    import numpy as np
    import seaborn as sns

    # `samples` is assumed to have been computed in an earlier notebook cell.
    sns.set_style("whitegrid")
    plt.figure(figsize=(10, 6))
    # stat='density' keeps the y-axis consistent with the 'Density' label.
    sns.histplot(samples, bins=20, kde=True, stat='density', color='skyblue', label='Simulated Distribution')
    plt.axvline(0.20, color='red', linestyle='--', label="Shiny's Sample Statistic (20%)")
    # The next line was truncated in the source after 'Distributi'; the rest of the label is assumed.
    plt.axvline(np.mean(samples), color='green', linestyle='-', label='Mean of Simulated Distribution')
    plt.title('Shiny\'s Sample Statistic vs. Simulated Distribution of Pishi Votes', fontsize=16)
    plt.xlabel('Proportion of Votes for Pishi', fontsize=14)
    plt.ylabel('Density', fontsize=14)
    plt.xlim(0, 1)
    plt.legend()
    plt.show()


0.0.3 Question 2d

Based on your analysis above, could Shiny’s result have arisen due to chance alone? If not, what could be a potential source of bias?

Type your answer here, replacing this text.