Assignment 6: Linear Model Selection
SDS293 - Machine Learning
Due: 1 November 2017 by 11:59pm
Conceptual Exercises
6.8.2 (p. 259 ISLR)
For each of the following, indicate whether the method is more or less flexible than least squares. Describe how each method's trade-off between bias and variance impacts its prediction accuracy. Justify your answers.
(a) The lasso
Solution:
The lasso puts a budget constraint on the least-squares coefficients, and is therefore less flexible. The lasso will have improved prediction accuracy when its increase in bias is less than its decrease in variance.
(b) Ridge regression
Solution:
For the same reason as above, this method is also less flexible. Ridge regression
will have improved prediction accuracy when its increase in bias is less than its decrease in
variance.
(c) Non-linear methods (PCR and PLS)
Solution:
Non-linear methods are more flexible and will give improved prediction accuracy when their increase in variance is less than their decrease in bias.
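To make the shrinkage side of this trade-off concrete, here is a small numeric sketch (not part of the original assignment; Python rather than the course's R, with synthetic data) using the closed-form ridge solution. As the penalty λ grows, the fitted coefficients are pulled toward zero, which lowers variance at the cost of added bias:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 5
X = rng.normal(size=(n, p))
beta_true = np.array([3.0, -2.0, 1.5, 0.0, 0.5])  # illustrative coefficients
y = X @ beta_true + rng.normal(scale=1.0, size=n)

def ridge_coefs(X, y, lam):
    # Closed-form ridge solution (no intercept): (X'X + lam I)^{-1} X'y
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Coefficient norm shrinks as lambda grows: less flexibility, more bias, less variance
norms = [np.linalg.norm(ridge_coefs(X, y, lam)) for lam in (0.0, 1.0, 10.0, 100.0)]
print(norms)
```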
6.8.5 (p. 261)
Ridge regression tends to give similar coefficient values to correlated variables, whereas the lasso
may give quite different coefficient values to correlated variables. We will now explore this property
in a very simple setting.
Suppose that $n = 2$, $p = 2$, $x_{11} = x_{12}$, $x_{21} = x_{22}$. Furthermore, suppose that $y_1 + y_2 = 0$, $x_{11} + x_{21} = 0$, and $x_{12} + x_{22} = 0$, so that the estimate for the intercept in a least squares, ridge regression, or lasso model is zero: $\hat{\beta}_0 = 0$.
(a) Write out the ridge regression optimization problem in this setting.
Solution:
In general, the ridge regression optimization problem looks like:

$$\min \sum_{i=1}^{n} \Big( y_i - \hat{\beta}_0 - \sum_{j=1}^{p} \hat{\beta}_j x_{ij} \Big)^2 + \lambda \sum_{j=1}^{p} \hat{\beta}_j^2$$

In this case, $\hat{\beta}_0 = 0$ and $n = p = 2$. So, the optimization simplifies to:

$$\min \Big[ (y_1 - \hat{\beta}_1 x_{11} - \hat{\beta}_2 x_{12})^2 + (y_2 - \hat{\beta}_1 x_{21} - \hat{\beta}_2 x_{22})^2 + \lambda (\hat{\beta}_1^2 + \hat{\beta}_2^2) \Big]$$
(b) Argue that in this setting, the ridge coefficient estimates satisfy $\hat{\beta}_1 = \hat{\beta}_2$.
Solution:
We know the following: $x_{11} = x_{12}$, so we'll call that $x_1$, and $x_{21} = x_{22}$, so we'll call that $x_2$. Plugging this into the above, we get:

$$\min \Big[ (y_1 - \hat{\beta}_1 x_1 - \hat{\beta}_2 x_1)^2 + (y_2 - \hat{\beta}_1 x_2 - \hat{\beta}_2 x_2)^2 + \lambda (\hat{\beta}_1^2 + \hat{\beta}_2^2) \Big]$$
Taking the partial derivatives of the above with respect to $\hat{\beta}_1$ and $\hat{\beta}_2$ and setting them equal to 0 will give us the point at which the function is minimized. Doing this, we find:

$$\hat{\beta}_1 (x_1^2 + x_2^2 + \lambda) + \hat{\beta}_2 (x_1^2 + x_2^2) - y_1 x_1 - y_2 x_2 = 0$$

and

$$\hat{\beta}_1 (x_1^2 + x_2^2) + \hat{\beta}_2 (x_1^2 + x_2^2 + \lambda) - y_1 x_1 - y_2 x_2 = 0$$
Since the right-hand side of both equations is identical, we can set the two left-hand sides equal to one another:

$$\hat{\beta}_1 (x_1^2 + x_2^2 + \lambda) + \hat{\beta}_2 (x_1^2 + x_2^2) - y_1 x_1 - y_2 x_2 = \hat{\beta}_1 (x_1^2 + x_2^2) + \hat{\beta}_2 (x_1^2 + x_2^2 + \lambda) - y_1 x_1 - y_2 x_2$$

and then cancel out common terms:

$$\hat{\beta}_1 (x_1^2 + x_2^2) + \hat{\beta}_1 \lambda + \hat{\beta}_2 (x_1^2 + x_2^2) = \hat{\beta}_1 (x_1^2 + x_2^2) + \hat{\beta}_2 (x_1^2 + x_2^2) + \hat{\beta}_2 \lambda$$

$$\hat{\beta}_1 \lambda + \hat{\beta}_2 (x_1^2 + x_2^2) = \hat{\beta}_2 (x_1^2 + x_2^2) + \hat{\beta}_2 \lambda$$

$$\hat{\beta}_1 \lambda = \hat{\beta}_2 \lambda$$

Thus, since $\lambda > 0$, $\hat{\beta}_1 = \hat{\beta}_2$.
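This result can also be checked numerically. The sketch below (Python rather than the course's R, with made-up values chosen only to satisfy the problem's constraints) solves the ridge problem in closed form and confirms that the two coefficients come out equal:

```python
import numpy as np

# Hypothetical data satisfying the problem's constraints:
# x11 = x12, x21 = x22, y1 + y2 = 0, x11 + x21 = 0, x12 + x22 = 0
X = np.array([[1.0, 1.0],
              [-1.0, -1.0]])
y = np.array([2.0, -2.0])

lam = 1.0
# Closed-form ridge estimate (no intercept): (X'X + lam I)^{-1} X'y
beta = np.linalg.solve(X.T @ X + lam * np.eye(2), X.T @ y)
print(beta)  # both coordinates equal 4 / (4 + lam) = 0.8
```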
(c) Write out the lasso optimization problem in this setting.
Solution:
$$\min \Big[ (y_1 - \hat{\beta}_1 x_{11} - \hat{\beta}_2 x_{12})^2 + (y_2 - \hat{\beta}_1 x_{21} - \hat{\beta}_2 x_{22})^2 + \lambda (|\hat{\beta}_1| + |\hat{\beta}_2|) \Big]$$
(d) Argue that in this setting, the lasso coefficients $\hat{\beta}_1$ and $\hat{\beta}_2$ are not unique; in other words, there are many possible solutions to the optimization problem in (c). Describe these solutions.
Solution:
One way to demonstrate that these solutions are not unique is to make a geometric argument. To make things easier, we'll use the alternate form of the lasso constraint that we saw in class, namely $|\hat{\beta}_1| + |\hat{\beta}_2| \le s$. If we were to plot this constraint, it takes the familiar shape of a diamond centered at the origin $(0, 0)$.
Next we'll consider the squared error term in the objective, namely:

$$(y_1 - \hat{\beta}_1 x_{11} - \hat{\beta}_2 x_{12})^2 + (y_2 - \hat{\beta}_1 x_{21} - \hat{\beta}_2 x_{22})^2$$
Using the facts we were given regarding the equivalence of many of the variables, we can simplify down to the following optimization:

$$\min \Big[ 2 \big( y_1 - (\hat{\beta}_1 + \hat{\beta}_2) x_{11} \big)^2 \Big]$$

This optimization problem has a minimum at $\hat{\beta}_1 + \hat{\beta}_2 = \frac{y_1}{x_{11}}$, which defines a line parallel to one edge of the lasso diamond $\hat{\beta}_1 + \hat{\beta}_2 = s$.
As $\hat{\beta}_1$ and $\hat{\beta}_2$ vary along the line $\hat{\beta}_1 + \hat{\beta}_2 = \frac{y_1}{x_{11}}$, these contours touch the lasso-diamond edge $\hat{\beta}_1 + \hat{\beta}_2 = s$ at different points. As a result, the entire edge $\hat{\beta}_1 + \hat{\beta}_2 = s$ is a potential solution to the lasso optimization problem!
A similar argument holds for the opposite lasso-diamond edge, defined by $\hat{\beta}_1 + \hat{\beta}_2 = -s$.
Thus, the lasso coefficients are not unique. The general form of the solution is given by two line segments:

$$\hat{\beta}_1 + \hat{\beta}_2 = s, \quad \hat{\beta}_1 \ge 0, \; \hat{\beta}_2 \ge 0 \qquad \text{and} \qquad \hat{\beta}_1 + \hat{\beta}_2 = -s, \quad \hat{\beta}_1 \le 0, \; \hat{\beta}_2 \le 0$$
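A quick numeric check of the non-uniqueness argument (a sketch with hypothetical values chosen to satisfy the problem's constraints, not data from the original solution): along the segment $\hat{\beta}_1 + \hat{\beta}_2 = s$ with both coefficients nonnegative, the lasso objective is constant, so every point on the segment is an equally good solution:

```python
import numpy as np

# Hypothetical data satisfying x11 = x12, x21 = x22,
# y1 + y2 = 0, x11 + x21 = 0, x12 + x22 = 0
x11, x12 = 1.0, 1.0
x21, x22 = -1.0, -1.0
y1, y2 = 2.0, -2.0
lam = 0.5

def lasso_objective(b1, b2):
    # RSS plus L1 penalty, as in part (c)
    rss = (y1 - b1 * x11 - b2 * x12) ** 2 + (y2 - b1 * x21 - b2 * x22) ** 2
    return rss + lam * (abs(b1) + abs(b2))

s = 1.5  # any fixed nonnegative sum b1 + b2 = s
vals = [lasso_objective(b1, s - b1) for b1 in np.linspace(0.0, s, 7)]
print(vals)  # identical values: the objective depends only on b1 + b2 when both are >= 0
```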
Applied Exercises
6.8.9 (p. 263 ISLR)
In this exercise, we will predict the number of applications received using the other variables in the College data set. For consistency, please use set.seed(11) before beginning.

(a) Split the data set into a training set and a test set.

(b) Fit a linear model using least squares on the training set, and report the test error obtained.

(c) Fit a ridge regression model on the training set, with $\lambda$ chosen by cross-validation. Report the test error obtained.

(d) Fit a lasso model on the training set, with $\lambda$ chosen by cross-validation. Report the test error obtained, along with the number of non-zero coefficient estimates.

(e) Fit a PCR model on the training set, with $M$ chosen by cross-validation. Report the test error obtained, along with the value of $M$ selected by cross-validation.

(f) Fit a PLS model on the training set, with $M$ chosen by cross-validation. Report the test error obtained, along with the value of $M$ selected by cross-validation.

(g) Comment on the results you obtained. How accurately can we predict the number of college applications received? Is there much difference among the test errors resulting from these five approaches?
A6 Applied Solutions
6.8.9 (a)

library(ISLR)
library(dplyr)

Check to make sure we don't have any null values:

sum(is.na(College))
## [1] 0

Split the data set into a training set and a test set.

set.seed(1)
train = College %>% sample_frac(0.5)
test  = College %>% setdiff(train)
6.8.9 (b)

Fit a linear model using least squares on the training set, and report the test error obtained.

lm_fit = lm(Apps ~ ., data = train)
lm_pred = predict(lm_fit, test)
mean((test[, "Apps"] - lm_pred)^2)
## [1] 1108531
6.8.9 (c)

Fit a ridge regression model on the training set, with $\lambda$ chosen by cross-validation. Report the test error obtained.

library(glmnet)

# Build model matrices for test and training data
train_mat = model.matrix(Apps ~ ., data = train)
test_mat = model.matrix(Apps ~ ., data = test)

# Find best lambda using cross-validation,
# alpha = 0 --> use ridge regression
grid = 10^seq(4, -2, length = 100)
mod_ridge = cv.glmnet(train_mat, train[, "Apps"], alpha = 0, lambda = grid, thresh = 1e-12)
lambda_best_ridge = mod_ridge$lambda.min

# Predict on test data, report error
ridge_pred = predict(mod_ridge, newx = test_mat, s = lambda_best_ridge)
mean((test[, "Apps"] - ridge_pred)^2)
## [1] 1108512
6.8.9 (d)

Fit a lasso model on the training set, with $\lambda$ chosen by cross-validation. Report the test error obtained, along with the number of non-zero coefficient estimates.

# Find best lambda using cross-validation,
# alpha = 1 --> use lasso
mod_lasso = cv.glmnet(train_mat, train[, "Apps"], alpha = 1, lambda = grid, thresh = 1e-12)
lambda_best_lasso = mod_lasso$lambda.min

# Predict on test data, report error
lasso_pred = predict(mod_lasso, newx = test_mat, s = lambda_best_lasso)
mean((test[, "Apps"] - lasso_pred)^2)
## [1] 1028718

predict(mod_lasso, newx = test_mat, s = lambda_best_lasso, type = "coefficients")
## 19 x 1 sparse Matrix of class "dgCMatrix"
##                            1
## (Intercept) -4.248125e+02
## (Intercept)  .
## PrivateYes  -4.955003e+02
## Accept       1.540306e+00
## Enroll      -3.900157e-01
## Top10perc    4.779689e+01
## Top25perc   -7.926581e+00
## F.Undergrad -9.846932e-03
## P.Undergrad  .
## Outstate    -5.231286e-02
## Room.Board   1.880308e-01
## Books        1.265938e-03
## Personal     .
## PhD         -4.137294e+00
## Terminal    -3.184316e+00
## S.F.Ratio    .
## perc.alumni -2.181304e+00
## Expend       3.193679e-02
## Grad.Rate    2.877667e+00
6.8.9 (e)

Results for OLS, lasso, and ridge are comparable. The lasso reduces the P.Undergrad, Personal, and S.F.Ratio coefficients to zero and shrinks the coefficients of the other variables. Below are the test $R^2$ values for all models.
test_avg = mean(test[, "Apps"])
lm_test_r2 = 1 - mean((test[, "Apps"] - lm_pred)^2) / mean((test[, "Apps"] - test_avg)^2)
ridge_test_r2 = 1 - mean((test[, "Apps"] - ridge_pred)^2) / mean((test[, "Apps"] - test_avg)^2)
lasso_test_r2 = 1 - mean((test[, "Apps"] - lasso_pred)^2) / mean((test[, "Apps"] - test_avg)^2)

barplot(c(lm_test_r2, ridge_test_r2, lasso_test_r2),
        ylim = c(0, 1),
        names.arg = c("OLS", "Ridge", "Lasso"),
        main = "Test R-squared")
abline(h = 0.9, col = "red")
[Figure: bar plot "Test R-squared" with bars for OLS, Ridge, and Lasso on a 0 to 1 axis; all three bars exceed the red reference line at 0.9.]
Since the test $R^2$ values for all three models are above 0.90, they all predict the number of college applications with high accuracy.
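The test $R^2$ computed above is 1 minus the ratio of the model's test MSE to the MSE of always predicting the test-set mean. A minimal Python analogue of that computation (synthetic numbers standing in for the Apps data; nothing here comes from the actual College fit):

```python
import numpy as np

rng = np.random.default_rng(11)
y_test = rng.normal(loc=3000, scale=1000, size=200)  # synthetic stand-in for test Apps
y_pred = y_test + rng.normal(scale=300, size=200)    # predictions with moderate error

# Test R^2 = 1 - MSE(model) / MSE(predicting the test mean)
mse_model = np.mean((y_test - y_pred) ** 2)
mse_baseline = np.mean((y_test - np.mean(y_test)) ** 2)
r2 = 1 - mse_model / mse_baseline
print(round(r2, 3))  # high, since prediction noise is small relative to the spread of y
```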