Assignment 5: Linear Model Selection
SDS293 - Machine Learning
Due: 24 Oct 2017 by 11:59pm
Conceptual Exercises
6.8.1 (p. 259 ISLR)
We perform best subset, forward stepwise, and backward stepwise selection on a single data set.
For each approach, we obtain p + 1 models, containing 0, 1, 2, ..., p predictors. Explain your answers:
(a) Which of the three models with k predictors has the smallest training RSS?
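Solution:
Best subset selection has the smallest training RSS. It searches every model with k
predictors, including whichever models forward and backward selection arrive at, so its training
RSS can be no larger than theirs. Forward and backward selection are greedy: the kth model
depends on which predictors were added or dropped at earlier steps, meaning that a poor choice
early on cannot be undone.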
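The claim in (a) can be checked numerically. The sketch below is illustrative only: the simulated data set (8 correlated predictors, 100 observations, made-up coefficients) is an assumption and not part of the assignment. It fits all three methods with `regsubsets` from the `leaps` package and compares the training RSS at each model size.

```r
library(leaps)

# Hypothetical simulated data (not from the assignment): 8 correlated predictors.
set.seed(1)
n <- 100
p <- 8
Z <- matrix(rnorm(n * p), n, p)
Xmat <- Z + 0.8 * Z[, 1]            # add a shared component to every column
colnames(Xmat) <- paste0("x", 1:p)
true_beta <- c(2, -1, 0.5, 0, 0, 1, 0, 0)
dat <- data.frame(y = drop(Xmat %*% true_beta) + rnorm(n), Xmat)

# Training RSS of the k-variable model, k = 1..p, for each selection method.
rss_by_method <- sapply(c("exhaustive", "forward", "backward"), function(m) {
  fit <- regsubsets(y ~ ., data = dat, nvmax = p, method = m)
  summary(fit)$rss
})
rownames(rss_by_method) <- paste0("k=", 1:p)
print(round(rss_by_method, 3))

# Best subset should never have a larger training RSS than a stepwise method.
tol <- 1e-6
best_never_worse <-
  all(rss_by_method[, "exhaustive"] <= rss_by_method[, "forward"] + tol) &&
  all(rss_by_method[, "exhaustive"] <= rss_by_method[, "backward"] + tol)
print(best_never_worse)
```

On any single data set the three columns may coincide for some or all k; the inequality only rules out a stepwise method beating best subset on training RSS.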
(b) Which of the three models with k predictors has the smallest test RSS?
Solution:
Best subset selection may have the smallest test RSS because it considers more
models than the other methods. However, searching more models also makes it more prone to
overfitting the training data, so forward or backward selection might happen to pick a model
that fits the test data better. The outcome will depend more heavily on the choice of test set /
validation method than on the selection method.
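A validation-set comparison makes the point in (b) concrete. This is a minimal sketch under stated assumptions: the simulated data, the 50/50 split, and the helper name `test_mse` are all hypothetical. Because `leaps` provides no `predict` method for `regsubsets` objects, predictions are formed by multiplying the test design matrix by the selected coefficients.

```r
library(leaps)

# Hypothetical simulated data (not from the assignment).
set.seed(2)
n <- 200
p <- 8
Xmat <- matrix(rnorm(n * p), n, p)
colnames(Xmat) <- paste0("x", 1:p)
true_beta <- c(2, -1, 0.5, 0, 0, 1, 0, 0)
dat <- data.frame(y = drop(Xmat %*% true_beta) + rnorm(n), Xmat)
train <- sample(n, n / 2)           # indices of the training half

# Test-set MSE of the k-variable model that `method` selects on the training half.
test_mse <- function(method, k) {
  fit <- regsubsets(y ~ ., data = dat[train, ], nvmax = p, method = method)
  beta <- coef(fit, id = k)
  X_test <- model.matrix(y ~ ., data = dat[-train, ])
  pred <- drop(X_test[, names(beta), drop = FALSE] %*% beta)
  mean((dat$y[-train] - pred)^2)
}

methods <- c("exhaustive", "forward", "backward")
mse <- sapply(methods, function(m) sapply(1:p, function(k) test_mse(m, k)))
rownames(mse) <- paste0("k=", 1:p)
print(round(mse, 3))
```

Re-running with a different seed or split can change which method has the lowest test error at a given k, which is the sense in which the answer depends on the validation scheme.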
(c) True or False: the predictors in Model 1 are a subset of the predictors in Model 2:

      Model 1                               Model 2                                   T/F
i.    Forward selection, k variables        Forward selection, k + 1 variables        True
ii.   Backward selection, k variables       Backward selection, k + 1 variables       True
iii.  Backward selection, k variables       Forward selection, k + 1 variables        False
iv.   Forward selection, k variables        Backward selection, k + 1 variables       False
v.    Best subset selection, k variables    Best subset selection, k + 1 variables    False

Explain your reasoning.
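The nesting relations in the table can be inspected from the `which` matrix returned by `summary()` on a `regsubsets` fit. The sketch below uses hypothetical simulated data and hypothetical helper names (`selected`, `is_nested`). Rows i and ii hold by construction, since each stepwise method adds or removes one predictor at a time. "False" in rows iii to v means the nesting is not guaranteed, so on an easy data set those checks may still come out TRUE.

```r
library(leaps)

# Hypothetical simulated data (not from the assignment).
set.seed(3)
n <- 100
p <- 6
Xmat <- matrix(rnorm(n * p), n, p)
colnames(Xmat) <- paste0("x", 1:p)
dat <- data.frame(y = drop(Xmat %*% c(3, -2, 1, 0.5, 0, 0)) + rnorm(n), Xmat)

# Logical matrix: row k marks the predictors in the k-variable model.
selected <- function(method) {
  fit <- regsubsets(y ~ ., data = dat, nvmax = p, method = method)
  summary(fit)$which[, -1, drop = FALSE]    # drop the intercept column
}

# TRUE if, for every k, the k-variable model in `a` is contained in the
# (k + 1)-variable model in `b`. Columns are matched by name.
is_nested <- function(a, b) {
  all(sapply(1:(nrow(a) - 1), function(k) {
    vars_a <- colnames(a)[a[k, ]]
    all(b[k + 1, vars_a])
  }))
}

fwd  <- selected("forward")
bwd  <- selected("backward")
best <- selected("exhaustive")

nested <- c(i   = is_nested(fwd, fwd),
            ii  = is_nested(bwd, bwd),
            iii = is_nested(bwd, fwd),
            iv  = is_nested(fwd, bwd),
            v   = is_nested(best, best))
print(nested)
```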
Applied Exercises
6.8.8 parts a-d (p. 262-263 ISLR)
In this exercise, we will generate simulated data, and will then use this data to perform best subset
selection.
(a) Generate a predictor $X$ of length $n = 100$, as well as a noise vector $\epsilon$ of length $n = 100$.
Solution:
> set.seed(1)
> X=rnorm(100)
> eps=rnorm(100)
(b) Generate a response vector $Y$ of length $n = 100$ according to the model
$$Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \beta_3 X^3 + \epsilon$$
where $\beta_0$, $\beta_1$, $\beta_2$, and $\beta_3$ are constants of your choice.
Solution:
Selecting $\beta_0 = 3$, $\beta_1 = 2$, $\beta_2 = -3$ and $\beta_3 = 0.3$:
> beta0=3
> beta1=2
> beta2=-3
> beta3=0.3
> Y=beta0 + beta1 * X + beta2 * X^2 + beta3 * X^3 + eps
(c) Perform best subset selection in order to choose the best model containing the predictors
$X, X^2, \ldots, X^{10}$. What is the best model obtained according to $C_p$, BIC, and adjusted $R^2$?
Show some plots to provide evidence for your answer, and report the coefficients of the best
model obtained.
Solution:
> library(leaps)
> data.full=data.frame(y=Y, x=X)
> mod.full=regsubsets(y ~ poly(x, 10, raw=T), data=data.full, nvmax=10)
> mod.summary=summary(mod.full)
# Find the model size for best cp, BIC and adjr2
> min.cp=which.min(mod.summary$cp)
> min.bic=which.min(mod.summary$bic)
> max.adjr2=which.max(mod.summary$adjr2)
# Plot cp, BIC and adjr2
> plot(mod.summary$cp, xlab="Subset Size", ylab="Cp", pch=20, type="l")
> points(min.cp, mod.summary$cp[min.cp], pch=4, col="red", lwd=7)
> plot(mod.summary$bic, xlab="Subset Size", ylab="BIC", pch=20, type="l")
> points(min.bic, mod.summary$bic[min.bic], pch=4, col="red", lwd=7)
> plot(mod.summary$adjr2, xlab="Subset Size", ylab="adjr2", pch=20, type="l")
> points(max.adjr2, mod.summary$adjr2[max.adjr2], pch=4, col="red", lwd=7)
We find that all three criteria ($C_p$, BIC and adjusted $R^2$) select 3-variable models.
The coefficients of the best 3-variable model are:
> coefficients(mod.full, id=3)
        (Intercept) poly(x, 10, raw=T)1 poly(x, 10, raw=T)2 poly(x, 10, raw=T)7
         3.07627412          2.35623596         -3.16514887          0.01046843
(d) Repeat (c), using forward stepwise selection and also using backward stepwise selection. How
does your answer compare to the results in (c)?
Solution:
> mod.fwd=regsubsets(y ~ poly(x, 10, raw=T), data=data.full, nvmax=10, method="forward")
> mod.bwd=regsubsets(y ~ poly(x, 10, raw=T), data=data.full, nvmax=10, method="backward")
> fwd.summary=summary(mod.fwd)
> bwd.summary=summary(mod.bwd)
# Find best forward-selected model size
> min.cp.f=which.min(fwd.summary$cp)
> min.bic.f=which.min(fwd.summary$bic)
> max.adjr2.f=which.max(fwd.summary$adjr2)
# Find best backward-selected model size
> min.cp.b=which.min(bwd.summary$cp)
> min.bic.b=which.min(bwd.summary$bic)
> max.adjr2.b=which.max(bwd.summary$adjr2)
# Plot the statistics
> par(mfrow=c(3, 2))
# Forward Cp
> plot(fwd.summary$cp, xlab="Subset Size", ylab="Fwd Cp", pch=20, type="l")
> points(min.cp.f, fwd.summary$cp[min.cp.f], pch=4, col="red", lwd=7)
# Backward Cp
> plot(bwd.summary$cp, xlab="Subset Size", ylab="Bwd Cp", pch=20, type="l")
> points(min.cp.b, bwd.summary$cp[min.cp.b], pch=4, col="red", lwd=7)
# Forward BIC
> plot(fwd.summary$bic, xlab="Subset Size", ylab="Fwd BIC", pch=20, type="l")
> points(min.bic.f, fwd.summary$bic[min.bic.f], pch=4, col="red", lwd=7)
# Backward BIC
> plot(bwd.summary$bic, xlab="Subset Size", ylab="Bwd BIC", pch=20, type="l")
> points(min.bic.b, bwd.summary$bic[min.bic.b], pch=4, col="red", lwd=7)
# Forward Adj R^2
> plot(fwd.summary$adjr2, xlab="Subset Size", ylab="Fwd adjr2", pch=20, type="l")
> points(max.adjr2.f, fwd.summary$adjr2[max.adjr2.f], pch=4, col="red", lwd=7)
# Backward Adj R^2
> plot(bwd.summary$adjr2, xlab="Subset Size", ylab="Bwd adjr2", pch=20, type="l")
> points(max.adjr2.b, bwd.summary$adjr2[max.adjr2.b], pch=4, col="red", lwd=7)
We see that all statistics pick 3-variable models except backward selection with adjusted $R^2$.
Here are the coefficients:
> coefficients(mod.fwd, id = 3)
 (Intercept) poly(x, 10)1 poly(x, 10)2 poly(x, 10)7
  3.07627412   2.35623596  -3.16514887   0.01046843
> coefficients(mod.bwd, id = 3)
 (Intercept) poly(x, 10)1 poly(x, 10)2 poly(x, 10)9
 3.078881355  2.419817953 -3.177235617  0.001870457
> coefficients(mod.bwd, id = 4)
 (Intercept) poly(x, 10)1 poly(x, 10)2 poly(x, 10)4 poly(x, 10)5
  3.12902640   2.27105667  -3.32284363   0.04320229   0.05388957
Here forward stepwise picks $X^7$ over $X^3$. Backward stepwise with 3 variables picks $X^9$, while
backward stepwise with 4 variables picks $X^4$ and $X^5$.
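The six plot/points pairs in part (d) follow one pattern, so they could be collapsed into a small helper. The sketch below is one possible arrangement, not the handout's code: the function name `plot_criterion` and its arguments are hypothetical. It regenerates the data from parts (a) and (b) so that it runs on its own, and returns the selected model sizes as a matrix.

```r
library(leaps)

# Data regenerated as in parts (a) and (b).
set.seed(1)
X <- rnorm(100)
eps <- rnorm(100)
Y <- 3 + 2 * X - 3 * X^2 + 0.3 * X^3 + eps
data.full <- data.frame(y = Y, x = X)

# Hypothetical helper: plot one criterion against subset size, mark its
# optimum with a red cross, and return the optimal size invisibly.
plot_criterion <- function(values, label, best = which.min) {
  k <- best(values)
  plot(values, xlab = "Subset Size", ylab = label, type = "l")
  points(k, values[k], pch = 4, col = "red", lwd = 7)
  invisible(k)
}

par(mfcol = c(3, 2))   # one column of panels per method, filled top to bottom
best_sizes <- sapply(c(forward = "forward", backward = "backward"), function(m) {
  s <- summary(regsubsets(y ~ poly(x, 10, raw = TRUE), data = data.full,
                          nvmax = 10, method = m))
  c(cp    = plot_criterion(s$cp, paste(m, "Cp")),
    bic   = plot_criterion(s$bic, paste(m, "BIC")),
    adjr2 = plot_criterion(s$adjr2, paste(m, "adjr2"), best = which.max))
})
print(best_sizes)
```

Passing `which.max` for adjusted $R^2$ and defaulting to `which.min` for the other two keeps the direction of each criterion explicit at the call site.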