CL9 - Linear Regression

Lesson 9: Linear Least Squares

Linear Least Squares and Normal Equations

Consider a set of $m$ data points $\{(t_1,y_1),(t_2,y_2),\dots,(t_m,y_m)\}$. Suppose we want to find a straight line that best fits these data points. Mathematically, we want to find the slope and intercept ($x_1$ and $x_2$, respectively) such that

$$y_i = x_1 t_i + x_2, \qquad i \in \{1,2,\dots,m\}$$

We can write this more explicitly as ${\bf y} = {\bf t}\cdot x_1 + 1\cdot x_2$, which lets us separate the coefficients we are trying to solve for ($x_1$ and $x_2$) from the input data that we have (${\bf y}$ and ${\bf t}$). Now we can put these values into matrix form:

$$\begin{bmatrix}y_1 \\ y_2 \\ \vdots \\ y_m\end{bmatrix} = \begin{bmatrix}t_1 & 1 \\ t_2 & 1 \\ \vdots & \vdots \\ t_m & 1\end{bmatrix} \begin{bmatrix}x_1 \\ x_2\end{bmatrix} \qquad \Leftrightarrow \qquad {\bf y} = {\bf A}{\bf x}$$

where ${\bf A}$ is the design matrix built from the $t_i$ data, ${\bf y}$ is the vector with the $y_i$ data, and ${\bf x}$ is the vector with the coefficients $x_i$. Notice how the values that multiply our coefficients $x_i$ are the columns of ${\bf A}$.

Generally, if we have a linear system where ${\bf A}$ is an $m \times n$ matrix with $m > n$, we call the system overdetermined. For these systems, the equality is usually not exactly satisfiable, as ${\bf y}$ may not lie in the column space of ${\bf A}$. Therefore, an overdetermined system is better written as

$${\bf A}{\bf x} \cong {\bf y}$$

For an overdetermined system ${\bf A}{\bf x} \cong {\bf y}$, we are typically looking for a solution ${\bf x}$ that minimizes the squared norm of the residual vector ${\bf r} = {\bf y}-{\bf A x}$,

$$\min_{\bf x} \|{\bf r}\|^2 = \min_{\bf x} \|{\bf y} - {\bf A x}\|^2$$

This problem is called a linear least-squares problem, and the solution ${\bf x}$ is called the least-squares solution. You learned during lecture that the solution of the least squares problem (obtained by setting the gradient of the squared residual norm with respect to ${\bf x}$ to zero) is also the solution of the system of normal equations:

$${\bf A}^T {\bf A}{\bf x} = {\bf A}^T {\bf y}$$

Example 1 - fitting a line

In [1]:
    import numpy as np
    import numpy.linalg as la
    import scipy.linalg as sla
    import matplotlib.pyplot as plt
Consider the data set given by pts:

In [2]:
    pts = np.array([[2, 4], [3, 2], [5, 1], [6, 0]])
    t_line = pts[:, 0]
    y_line = pts[:, 1]
    plt.plot(t_line, y_line, 'o')
    print(t_line)

    [2 3 5 6]

Check your answers:

Define the matrix ${\bf A}$ that can be used to solve for the coefficients $x_1$ and $x_2$. Store the value of it in variable A_line.

Hint: Try to define the design matrix programmatically (i.e. not hard-coded). Rather than iterating through a loop, you can directly construct ${\bf A}^T$ as a NumPy array whose first entry is t_line and whose second entry is a vector of 1s, then apply the transpose with .T to obtain ${\bf A}$ as desired.

Hint: To obtain the vector of 1s with appropriate length, you may use t_line**0 or np.ones_like(t_line). Recalling the syntax for power in Python and broadcasting, can you reason why the first option works?

In [19]:
    #grade (enter your code in this cell - DO NOT DELETE THIS LINE)
    # Define A_line here
    A_line = np.ones(pts.shape)
    width, length = pts.shape
    for i in range(width):
        A_line[i][0] = pts[i][0]
        A_line[i][1] = 1
    print(A_line)

    [[2. 1.]
     [3. 1.]
     [5. 1.]
     [6. 1.]]
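The loop above works, but the construction the hints describe avoids it entirely. A minimal sketch of the equivalent one-liner (A_line_alt is just an illustrative name, not part of the exercise):

    # Rows of A^T are t_line and a vector of ones; transpose to obtain A.
    A_line_alt = np.array([t_line, t_line**0]).T
    print(np.allclose(A_line_alt, A_line))  # expected: True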
Check your answers:

Use the normal equations to find the coefficients in ${\bf x}$. Store the value of it in variable x_line.

In [4]:
    #grade (enter your code in this cell - DO NOT DELETE THIS LINE)
    x_line = np.linalg.solve(A_line.T @ A_line, A_line.T @ y_line)

Plot the points and fit curve:

In [5]:
    plt.plot(t_line, y_line, 'o')
    plt.plot(t_line, t_line * x_line[0] + x_line[1])

Out[5]: [<matplotlib.lines.Line2D at 0x7f3b849b5d60>]

Linear Least Squares and QR

While easy to understand and remember, using the normal equations can be prone to numerical instabilities (you will learn more about this if you take a Numerical Methods or Analysis class! Hint hint!). Instead of solving the normal equations using the design matrix ${\bf A}$, we will instead use its QR decomposition

$${\bf A} = {\bf QR}$$

where ${\bf Q}$ is an orthogonal matrix and ${\bf R}$ is an upper triangular matrix. Substituting ${\bf A} = {\bf QR}$ into the normal equations gives ${\bf R}^T{\bf Q}^T{\bf Q}{\bf R}{\bf x} = {\bf R}^T{\bf Q}^T{\bf y}$; since ${\bf Q}^T{\bf Q} = {\bf I}$ and ${\bf R}^T$ is invertible when ${\bf A}$ has full column rank, the least squares solution ${\bf x}$ can be obtained as:

$${\bf A}^T{\bf Ax} = {\bf A}^T{\bf y} \quad \Leftrightarrow \quad {\bf Rx} = {\bf Q}^T{\bf y}$$
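In practice, NumPy also ships a ready-made least-squares solver. As a sanity check (not part of the original exercise), it should agree with the normal-equations solution computed above; a minimal sketch:

    # np.linalg.lstsq minimizes ||y - Ax||^2 directly and should
    # reproduce the normal-equations solution x_line.
    x_check, residual, rank, sv = np.linalg.lstsq(A_line, y_line, rcond=None)
    print(np.allclose(x_check, x_line))  # expected: True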
Check your answers:

Write the least_sq function that uses QR decomposition to find the least squares solution ${\bf x}$ given a matrix ${\bf A}$ and a vector ${\bf y}$. Do not use the numpy.linalg.inv function.

Hint: Try using the functions numpy.linalg.qr and scipy.linalg.solve_triangular.

In [6]:
    #grade (enter your code in this cell - DO NOT DELETE THIS LINE)
    def least_sq(A, y):
        # Find the least squares solution to Ax = y
        # by solving Rx = Q^T y for x, where A = QR
        Q, R = np.linalg.qr(A)
        # sla.solve_triangular(R, Q.T @ y) would also work here and
        # exploits the triangular structure of R
        x = np.linalg.solve(R, Q.T @ y)
        return x

Let's confirm that we get the same answer as we did above with the normal equations.

In [7]:
    xqr = least_sq(A_line, y_line)

In [8]:
    plt.plot(t_line, y_line, 'o')
    plt.plot(t_line, t_line * xqr[0] + xqr[1])
    plt.plot(t_line, t_line * x_line[0] + x_line[1], 'x')

Out[8]: [<matplotlib.lines.Line2D at 0x7f3b84930640>]

Example 2: quadratic curve

Let's look at another set of points:

In [9]:
    n_quad = 50
    t_quad = np.linspace(-1, 1, n_quad)
    y_quad = t_quad**2 + np.random.randn(n_quad) * 0.05 + 5
    plt.plot(t_quad, y_quad, 'o')

Out[9]: [<matplotlib.lines.Line2D at 0x7f3b84910dc0>]
This looks a lot like a quadratic function! Let's try to fit a curve to this:

$$y = x_1 \,t^2 + x_2\,t + x_3$$

Check your answers:

We want to find the coefficients $x_1$, $x_2$, and $x_3$ that solve the least squares problem ${\bf A x} \cong {\bf y}$; first, we need to construct the design matrix. Taking ${\bf x} = [x_1, x_2, x_3]^T$, define the design matrix ${\bf A}$ corresponding to the model written above. Store the value of it in variable A_quad.

Hint: As with Example 1, try defining the matrix ${\bf A}^T$ and then applying the transpose to obtain ${\bf A}$ in a single line of code. This time, what will the first entry (row of the transpose / column of the design matrix) be?

In [10]:
    #grade (enter your code in this cell - DO NOT DELETE THIS LINE)
    A_quad = np.ones((50, 3))
    for i in range(50):
        A_quad[i][0] = t_quad[i]**2
        A_quad[i][1] = t_quad[i]
        A_quad[i][2] = 1
    print(A_quad)

    [[ 1.00000000e+00 -1.00000000e+00  1.00000000e+00]
     [ 9.20033319e-01 -9.59183673e-01  1.00000000e+00]
     [ 8.43398584e-01 -9.18367347e-01  1.00000000e+00]
     ...
     [ 8.43398584e-01  9.18367347e-01  1.00000000e+00]
     [ 9.20033319e-01  9.59183673e-01  1.00000000e+00]
     [ 1.00000000e+00  1.00000000e+00  1.00000000e+00]]
    (output truncated)

Now we solve for the coefficients ${\bf x}$ using the least_sq function you have defined above. Store the value of it in variable x_quad.

In [11]:
    x_quad = least_sq(A_quad, y_quad)
    x_quad

Out[11]: array([9.35089178e-01, 1.33628054e-03, 5.01979902e+00])

In [12]:
    plt.plot(t_quad, y_quad, 'o')
    plt.plot(t_quad, x_quad[0] * t_quad**2 + x_quad[1] * t_quad + x_quad[2])

Out[12]: [<matplotlib.lines.Line2D at 0x7f3b84876880>]
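As an aside before moving on: NumPy's np.vander builds polynomial design matrices like this one directly (a sketch, not part of the original exercise; A_quad_alt is an illustrative name):

    # np.vander(t_quad, 3) has columns [t^2, t, 1] (decreasing powers),
    # the same layout as A_quad built by the loop above.
    A_quad_alt = np.vander(t_quad, 3)
    print(np.allclose(A_quad_alt, A_quad))  # expected: True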
As you can see, even though the resulting function does not appear linear, we are still able to use linear least squares here. This is because the model only needs to be linear with respect to the coefficients $x_i$ (e.g. it cannot involve terms like $x_i^2$). In fact, what we are doing here is simply taking linear combinations of some basis functions:

$$y = x_1 f_1(t) + x_2 f_2(t) + x_3 f_3(t)$$

$$\begin{align*} f_1(t) &= t^2 \\ f_2(t) &= t \\ f_3(t) &= 1 \end{align*}$$

As long as we can write the function we are trying to fit in this manner (or somehow get it into that form), we can use least squares, as the sketch below illustrates.
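To make this concrete, a design matrix can be assembled from any list of basis functions, one column per function. A minimal sketch (the helper build_design is hypothetical, not from the original notebook):

    # Build a design matrix whose j-th column is f_j(t) for basis functions f_j.
    def build_design(t, basis):
        return np.array([f(t) for f in basis]).T

    # The basis from Example 2 reproduces A_quad:
    basis = [lambda t: t**2, lambda t: t, lambda t: np.ones_like(t)]
    # np.allclose(build_design(t_quad, basis), A_quad) should be True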
Example 3: exponential fit

Assume the following set of data points:

In [13]:
    n_exp = 50
    t_exp = np.linspace(0.1, np.e * 4, n_exp)
    y_exp = np.exp(t_exp * 0.5) * 3 + np.random.randn(n_exp) * 2
    y_exp = np.where(y_exp < 1e-6, 1e-6, y_exp)
    plt.plot(t_exp, y_exp, 'o')

Out[13]: [<matplotlib.lines.Line2D at 0x7f3b8485d3a0>]

Let's say we want to fit the function

$$y = x_1\,e^{x_2 \, t}$$

to the above plot. Notice that we can't directly use linear least squares, since the function is not linear with respect to every coefficient $x_i$. What if we take the natural log of each side?

$$\ln y = \ln x_1 + x_2 t$$

With the change of variables $\bar y = \ln y$ and $\bar x_1 = \ln x_1$, we can rewrite the above equation as:

$$\bar y = \bar x_1 + x_2 t$$

Check your answers:

We need to write the design matrix ${\bf A}$ and the right-hand side vector ${\bf \bar y}$ needed to obtain the least squares solution ${\bf \bar x} = [\bar x_1, x_2]^T$. Store the value of ${\bf A}$ in variable A_exp and the value of ${\bf \bar y}$ in variable ybar.

Hint: Recall that numpy.log computes the natural logarithm by default.

In [15]:
    #grade (enter your code in this cell - DO NOT DELETE THIS LINE)
    # Rows of A^T are a vector of ones (t_exp**0) and t_exp; transpose to get A.
    A_exp = np.array([t_exp**0, t_exp]).T
    print(A_exp)
    ybar = np.log(y_exp)

    [[ 1.          0.1       ]
     [ 1.          0.31985974]
     [ 1.          0.53971948]
     ...
     [ 1.         10.65326757]
     [ 1.         10.87312731]]
    (output truncated)

Now, solve for the coefficients ${\bf \bar x}$ using the least_sq function you have defined above. Store the value of it in variable xbar.

In [47]:
    xbar = least_sq(A_exp, ybar)
    xbar

Out[47]: array([1.08057653, 0.50197379])

Check your answers:

You can now use the change of variables to determine the coefficients $x_1$ and $x_2$ from the equation $y = x_1\,e^{x_2 \, t}$. Store their values in variables x1_exp and x2_exp respectively.

Hint: You will need to use numpy.exp.

In [54]:
    #grade (enter your code in this cell - DO NOT DELETE THIS LINE)
    x1_exp = np.exp(xbar[0])  # undo the change of variables: x1 = e^{xbar_1}
    x2_exp = xbar[1]

Plot the points and the fit together:

In [55]:
    plt.plot(t_exp, y_exp, 'o')
    plt.plot(t_exp, x1_exp * np.exp(t_exp * x2_exp))

Out[55]: [<matplotlib.lines.Line2D at 0x7f202ddf52e0>]
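Since the data in the In [13] cell were generated from $y = 3e^{0.5t}$ plus noise, the recovered coefficients can be sanity-checked against those values (a sketch, not part of the original exercise):

    # The data were generated as 3*exp(0.5*t) + noise, so x1_exp should be
    # close to 3 and x2_exp close to 0.5 (up to the noise).
    print(x1_exp, x2_exp)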
Using real data: climate change
In addition to fitting functions to data points, we can use least squares to make predictions about future events. In this example, we have a dataset containing the extent of arctic sea ice for each year since 1979 (given as the average during each October). We will fit least squares models to some of this data, using them to predict the extent of arctic sea ice in future years. This is based on "Using Data from Climate Science to Teach Introductory Statistics" and uses NOAA data contributed by Florence Fetterer, et al.

In [56]:
    import pandas as pd
    ice_data = pd.read_csv("N_10_extent_v3.0.csv", dtype={'year': np.int64, 'extent': np.double})
    ice_data.head()  # print out first 5 rows of the data

Out[56]:
       year  mo data-type region  extent  area
    0  1979  10   Goddard      N    8.75  6.19
    1  1980  10   Goddard      N    9.18  6.50
    2  1981  10   Goddard      N    8.86  6.27
    3  1982  10   Goddard      N    9.42  6.67
    4  1983  10   Goddard      N    9.33  6.79
We have imported a data set for you using Pandas, a library that provides routines for manipulating data tables. You do not need to know how to use Pandas for this activity or course. Generally, Pandas is useful for organizing large, heterogeneous data sets (for which NumPy falls short).

In [57]:
    year = ice_data['year'].values
    year

Out[57]:
    array([1979, 1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989,
           1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000,
           2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011,
           2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021])

In [58]:
    extent = ice_data[' extent'].values
    extent

Out[58]:
    array([8.75, 9.18, 8.86, 9.42, 9.33, 8.56, 8.55, 9.48, 9.05, 9.13, 8.83,
           8.48, 8.54, 9.32, 8.79, 8.92, 7.83, 9.16, 8.34, 8.45, 8.6 , 8.38,
           8.3 , 8.16, 7.85, 7.93, 7.35, 7.54, 6.04, 7.35, 6.92, 6.98, 6.46,
           5.89, 7.45, 7.23, 6.97, 6.08, 6.77, 6.13, 5.73, 5.33, 6.77])

We can use the first 22 years of the ice data to fit a linear function, and then predict future values of ice extent. Below, we plot the years 1979 to 2000 with the corresponding ice extents.

In [59]:
    n_ice = 22
    plt.xlabel("Year")
    plt.ylabel("Arctic Sea Ice Extent (1,000,000 sq km)")
    plt.plot(year[:n_ice], extent[:n_ice], 'o')

Out[59]: [<matplotlib.lines.Line2D at 0x7f202c1bb370>]

Check your answers:

Using your least_sq function and general knowledge of least squares, fit a linear model to the ice extent:

$$y = x_1 \,t + x_2$$

Store your coefficients in the array x_ice_linear.
In [64]:
    #grade (enter your code in this cell - DO NOT DELETE THIS LINE)
    t_ice = year[:n_ice]    # definition of t
    y_ice = extent[:n_ice]  # definition of y
    # Determine x_ice_linear using your least_sq() function
    A_ice = np.ones((n_ice, 2))
    for i in range(n_ice):
        A_ice[i][0] = t_ice[i]
        A_ice[i][1] = 1
    x_ice_linear = least_sq(A_ice, y_ice)

Below is the plot of the 22 given points together with the corresponding values from the least squares model:

In [65]:
    line_model = x_ice_linear[0] * t_ice + x_ice_linear[1]
    plt.xlabel("Year")
    plt.ylabel("Arctic Sea Ice Extent (1,000,000 sq km)")
    plt.plot(year[:n_ice], extent[:n_ice], 'o')
    plt.plot(t_ice, line_model, 's')

Out[65]: [<matplotlib.lines.Line2D at 0x7f202c0f4fd0>]

Suppose we want to use the same model (linear least squares fitted to years 1979 to 2000) to predict the ice extent beyond year 2000. For example, let's consider data from years 1979 to 2021:

In [66]:
    years_plus = np.linspace(1979, 2021, 43)
    years_plus

Out[66]:
    array([1979., 1980., 1981., 1982., 1983., 1984., 1985., 1986., 1987.,
           1988., 1989., 1990., 1991., 1992., 1993., 1994., 1995., 1996.,
           1997., 1998., 1999., 2000., 2001., 2002., 2003., 2004., 2005.,
           2006., 2007., 2008., 2009., 2010., 2011., 2012., 2013., 2014.,
           2015., 2016., 2017., 2018., 2019., 2020., 2021.])
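As a cross-check (not part of the original exercise), NumPy's polynomial fitting routine solves the same least-squares problem; a minimal sketch:

    # np.polyfit returns [slope, intercept] for a degree-1 fit and should
    # match x_ice_linear computed with least_sq above.
    print(np.polyfit(t_ice, y_ice, 1))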
Check your answers:

Determine the predicted ice extent values for the years 1979 to 2021 according to your x_ice_linear model from above. Store these values in the array ice_plus.

Hint: Do not recompute the model using the additional ice extent data. We can think of the data from years 1979 to 2000 as our training set and the data from years 2001 to 2021 as our testing/validation set. (You can read more elsewhere about training, testing, and validation sets in data analysis.)

In [68]:
    #grade (enter your code in this cell - DO NOT DELETE THIS LINE)
    ice_plus = x_ice_linear[0] * years_plus + x_ice_linear[1]

Below, we plot the actual and predicted ice extent for the years 1979 to 2021.

In [69]:
    plt.xlabel("Year")
    plt.ylabel("Arctic Sea Ice Extent (1,000,000 sq km)")
    plt.plot(year, extent, 'o')
    plt.plot(years_plus, ice_plus)

Out[69]: [<matplotlib.lines.Line2D at 0x7f202c053c10>]

How does the predicted ice extent differ from the actual ice extent for each year? What is the estimated ice extent in 2021? What was the actual value?

In [70]:
    ice_plus[-1]

Out[70]: 7.847075098814301

In [71]:
    extent[-1]

Out[71]: 6.77
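One way to quantify how the fit degrades outside the training window (not part of the original exercise) is to compare the root-mean-square error on the training years against the testing years; a minimal sketch, assuming extent, ice_plus, and n_ice from above:

    # RMSE over training years (1979-2000) vs. testing years (2001-2021);
    # ice_plus holds the linear model's predictions for all 43 years.
    rmse_train = np.sqrt(np.mean((ice_plus[:n_ice] - extent[:n_ice])**2))
    rmse_test = np.sqrt(np.mean((ice_plus[n_ice:] - extent[n_ice:])**2))
    print(rmse_train, rmse_test)  # the testing error should be noticeably larger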
The linear model does well until year 2000 (i.e. on the training set), but then it fails to account for an apparent increase in the rate of ice melt. Let's use a quadratic model to try improving this:

$$y = x_1 \, t^2 + x_2 \,t + x_3$$

Check your answers:

Using the same y_ice and t_ice as before (i.e. the training set of years 1979 to 2000), fit the quadratic model above to the ice extent (that is, solve for $x_1$, $x_2$, and $x_3$). Store your coefficients in the array x_ice_quad.

In [74]:
    #grade (enter your code in this cell - DO NOT DELETE THIS LINE)
    A_ice2 = np.ones((n_ice, 3))
    for i in range(n_ice):
        A_ice2[i][0] = t_ice[i]**2
        A_ice2[i][1] = t_ice[i]
        A_ice2[i][2] = 1
    x_ice_quad = least_sq(A_ice2, y_ice)

We plot the quadratic model below.

In [75]:
    quad_model = x_ice_quad[0] * years_plus**2 + x_ice_quad[1] * years_plus + x_ice_quad[2]
    plt.xlabel("Year")
    plt.ylabel("Arctic Sea Ice Extent (1,000,000 sq km)")
    plt.plot(year, extent, 'o')
    plt.plot(years_plus, quad_model)

Out[75]: [<matplotlib.lines.Line2D at 0x7f202c02be20>]

This model better captures the ice extent beyond year 2000.

Try this: In what year does this quadratic model predict that the ice extent will reach 0 (i.e. the arctic ice will have completely melted)?
Solve $x_1\,t^2 + x_2\,t + x_3 = 0$ with your group to compute the answer below.

Hint: You seek the solutions to a quadratic equation, the analytical formula for which you have likely memorized.

In [36]:
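The final cell was left empty in the workspace; a minimal sketch of one possible completion, applying the quadratic formula to the fitted coefficients (np.roots(x_ice_quad) would give the same two roots):

    # Roots of x1*t^2 + x2*t + x3 = 0 for the fitted coefficients;
    # the root after year 2000 is the model's predicted melt year.
    a, b, c = x_ice_quad
    disc = np.sqrt(b**2 - 4*a*c)
    print((-b - disc) / (2*a), (-b + disc) / (2*a))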