MATH 4322 - Homework 2
Instructions
• Due September 14, 2023 at 11:59 pm
• Answer all questions fully
• Submit the answers in one file, preferably PDF, then upload in Canvas.
• These questions are from Introduction to Statistical Learning, 2nd edition, chapters 3 and 6.
Problem 1
The following output is based on predicting sales based on three media budgets: TV, radio, and newspaper.
Call:
lm(formula = sales ~ TV + radio + newspaper, data = Advertising)

Residuals:
    Min      1Q  Median      3Q     Max
-8.8277 -0.8908  0.2418  1.1893  2.8292

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  2.938889   0.311908   9.422   <2e-16 ***
TV           0.045765   0.001395  32.809   <2e-16 ***
radio        0.188530   0.008611  21.893   <2e-16 ***
newspaper   -0.001037   0.005871  -0.177     0.86
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.686 on 196 degrees of freedom
Multiple R-squared: 0.8972, Adjusted R-squared: 0.8956
F-statistic: 570.3 on 3 and 196 DF, p-value: < 2.2e-16
a. Give the estimated model to predict sales.
sales = 2.938889 + 0.045765*TV + 0.188530*radio - 0.001037*newspaper
b. Describe the null hypothesis to which the p-values given in the Coefficients table correspond. Explain this in terms of sales, TV, radio, and newspaper, rather than in terms of the coefficients of the linear model.
For each coefficient, the null hypothesis is that the corresponding advertising budget has no effect on sales when the other two budgets are held fixed; for example, the p-value for TV tests the null hypothesis that TV advertising has no association with sales, given fixed radio and newspaper budgets. We reject a null hypothesis when its p-value is small, here at or below 0.05 (5%). In this output, all coefficients except newspaper have p-values below 0.05, which means only newspaper does not significantly contribute to predicting sales. The overall model's p-value is also well below 0.05, indicating that at least one of the media budgets is related to sales. The multiple R-squared of 0.8972 means that roughly 89.72% of the variation in sales is explained by the predictors, so the model fits the data well.
c. Are there any variables that may not be significant in predicting sales?
Only the p-value of "newspaper" is greater than 0.05, hence "newspaper" is not significant in predicting "sales".
Problem 2
Based on the previous problem, the following is the output from the full model:
a) Determine the AIC for all three models.
Model 1: 2*(3+1) + 200*ln(556.8/200) = 212.778
Model 2: 2*(2+1) + 200*ln(556.9/200) = 210.814
Model 3: 2*(1+1) + 200*ln(2102.5/200) = 474.513
b) Determine the Cp for all three models.
Model 1: 556.8/2.8 + 2*(3+1) – 200 = 6.857
Model 2: 556.9/2.8 + 2*(2+1) – 200 = 4.893
Model 3: 2102.5/2.8 + 2*(1+1) – 200 = 554.893
c) Determine the adjusted R^2 for all three models.
SST = 3314.6 + 1545.6 + 0.1 + 556.8 = 5417.1
Model 1: 1 - (556.8/(200-3-1))/(5417.1/199) = 0.8956
Model 2: 1 - (556.9/(200-2-1))/(5417.1/199) = 0.8962
Model 3: 1 - (2102.5/(200-1-1))/(5417.1/199) = 0.6099
d) Determine the RSE for all three models.
Model 1: 1.686
Model 2: 1.681
Model 3: 3.259
e) Which model best fits to predict sales based on these statistics?
Model 2 best fits to predict "sales" based on these statistics: it has the lowest AIC and Cp, the highest adjusted R^2, and the smallest RSE.
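The hand calculations above can be checked in R; a minimal sketch, using the RSS values and the estimate sigma^2 = 2.8 from the output (variable names are ours):
> rss <- c(556.8, 556.9, 2102.5)                 # RSS for Models 1-3
> d   <- c(3, 2, 1)                              # number of predictors in each model
> n   <- 200
> 2 * (d + 1) + n * log(rss / n)                 # AIC, as in part (a)
> rss / 2.8 + 2 * (d + 1) - n                    # Cp, as in part (b)
> 1 - (rss / (n - d - 1)) / (5417.1 / (n - 1))   # adjusted R^2, as in part (c)
> sqrt(rss / (n - d - 1))                        # RSE, as in part (d)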
Problem 3
(a) Which answer is correct, and why?
i. For a fixed value of IQ and GPA, males earn more on average than females.
ii. For a fixed value of IQ and GPA, females earn more on average than males.
iii. For a fixed value of IQ and GPA, males earn more on average than females provided that the GPA is high enough.
iv. For a fixed value of IQ and GPA, females earn more on average than males provided that the GPA is high enough.
iii is correct. Least squares line: y = 50 + 20*GPA + 0.07*IQ + 35*Gender + 0.01*GPA*IQ - 10*GPA*Gender.
For males (Gender = 0): y = 50 + 20*GPA + 0.07*IQ + 0.01*GPA*IQ
For females (Gender = 1): y = 85 + 10*GPA + 0.07*IQ + 0.01*GPA*IQ
Males earn more on average when 50 + 20*GPA >= 85 + 10*GPA, which is equivalent to GPA >= 3.5. That is, for a fixed IQ and GPA, males earn more than females provided the GPA is high enough.
(b) Predict the salary of a female with IQ of 110 and a GPA of 4.0.
For females: y = 85 + 10*GPA + 0.07*IQ + 0.01*GPA*IQ
Y = 85 + 10*4 + 0.07*110 + 0.01*4*110 = 137.1
Starting salary of $137,100
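As a quick arithmetic check in R (the salary() helper is ours, not part of the assignment):
> salary <- function(gpa, iq, gender)   # gender = 1 for female, 0 for male
+   50 + 20*gpa + 0.07*iq + 35*gender + 0.01*gpa*iq - 10*gpa*gender
> salary(4.0, 110, 1)                   # 137.1, i.e. $137,100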
(c) True or false: Since the coefficient for the GPA/IQ interaction term is very small, there is very little evidence of an interaction effect. Justify your answer.
False. The coefficient value of an interaction term alone does not provide conclusive evidence for or against the presence of an interaction effect. To draw meaningful conclusions about the existence or absence of an interaction effect, it is essential to examine the p-value (or standard error) associated with the coefficient of the interaction term.
Problem 4
We perform best subset, forward stepwise, and backward stepwise selection on a single
data set. For each approach, we obtain p + 1 models, containing 0, 1, 2, …, p predictors.
Answer true or false to the following statements.
(a) The predictors in the k-variable model identified by forward stepwise are a subset of the predictors in the (k + 1)-variable model identified by forward stepwise selection.
True
(b) The predictors in the k-variable model identified by backward stepwise are a subset of the predictors in the (k + 1)-variable model identified by backward stepwise selection.
True
(c) The predictors in the k-variable model identified by backward stepwise are a subset of the predictors in the (k + 1)-variable model identified by forward stepwise selection.
False
(d) The predictors in the k-variable model identified by forward stepwise are a subset of the predictors in the (k + 1)-variable model identified by backward stepwise selection.
False
(e) The predictors in the k-variable model identified by best subset are a subset of the predictors in the (k + 1)-variable model identified by best subset selection.
False
Forward and backward stepwise each build a nested sequence of models by adding or dropping one variable at a time, so (a) and (b) hold; the other pairings carry no nesting guarantee, since best subset re-searches all subsets at each size and the two stepwise directions need not agree.
Problem 5
This question involves the use of simple linear regression on the Auto data set. This can be found in the ISLR2 package in R.
(a) Use the lm() function to perform a simple linear regression with mpg as the response and horsepower (hp) as the predictor. Use the summary() function to print the results. Comment on the output. For example:
i. Is there a relationship between the predictor and the response?
Yes; since the p-value is less than 2.2e-16, there is strong evidence of a relationship between the predictor and the response.
ii. How strong is the relationship between the predictor and the response?
The R-squared is 0.6049, which means that about 60.49% of the variation in mpg (the response) is explained by horsepower (the predictor), a moderately strong relationship.
iii. Is the relationship between the predictor and the response positive or negative?
Negative; the estimated slope for horsepower is negative, so mpg decreases as horsepower increases.
iv. What is the predicted mpg associated with a horsepower of 98? What are the associated 95% confidence and prediction intervals? Give an interpretation of these intervals.
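The intervals below assume the model was fit along these lines:
> library(ISLR2)
> auto.lm <- lm(mpg ~ horsepower, data = Auto)
> summary(auto.lm)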
> predict(auto.lm, data.frame(horsepower = 98), interval = "prediction")
        fit        lwr      upr
1 24.151388 14.4938885 33.80889
> predict(auto.lm, data.frame(horsepower = 98), interval = "confidence")
        fit       lwr       upr
1 24.151388 23.660958 24.641817
The predicted mpg associated with a horsepower of 98 is about 24.15, with a 95% confidence interval of (23.660958, 24.641817) and a 95% prediction interval of (14.4938885, 33.80889). The confidence interval covers the mean mpg of all cars with a horsepower of 98, while the wider prediction interval covers the mpg of an individual car with a horsepower of 98.
(b) Plot the response and the predictor. Use the abline() function to display the least squares regression line.
> plot(Auto$horsepower,Auto$mpg)
> abline(auto.lm)
(c) Use the plot() function to produce diagnostic plots of the least squares regression fit. Comment on any problems you see with the fit.
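The diagnostic plots can be produced with, for example:
> par(mfrow = c(2, 2))   # show all four diagnostic plots at once
> plot(auto.lm)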
Problem 6
This question involves the use of multiple linear regression on the Auto data set.
(a) Produce a scatterplot matrix which includes all of the variables in the data set.
> pairs(Auto)
(b) Compute the matrix of correlations between the variables using the function cor(). You will need to exclude the name variable, which is qualitative.
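A minimal sketch of the computation:
> cor(subset(Auto, select = -name))   # drop the qualitative name column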
(c) Use the lm() function to perform a multiple linear regression with mpg as the response and all other variables except name as the predictors. Use the summary() function to print the results. Comment on the output. For instance:
i. Is there a relationship between the predictors and the response?
The F-statistic's p-value is very small, which shows that there is a relationship between at least some of the predictors and mpg.
ii. Which predictors appear to have a statistically significant relationship to the response?
Displacement, weight, year, and origin
iii. What does the coefficient for the year variable suggest?
The coefficient for the year variable suggests that mpg increases by the value of that coefficient for each additional model year, holding the other predictors fixed.
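The comments above assume a full fit along these lines (a sketch; the object name mpg.lm is ours):
> mpg.lm <- lm(mpg ~ . - name, data = Auto)   # all predictors except name
> summary(mpg.lm)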
(d) Use the plot() function to produce diagnostic plots of the linear regression fit based on the predictors that appear to have a statistically significant relationship to the response. Comment on any problems you see with the fit. Do the residual plots suggest any unusually large outliers? Does the leverage plot identify any observations with unusually high leverage?
The residuals-vs-fitted plot shows a non-linear relationship between the response and the predictors. There do appear to be some outliers, and one observation stands out in the leverage plot as a potential high-leverage point.
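A sketch of the diagnostics, refitting on the predictors found significant in (c) (the object name sig.lm is ours):
> sig.lm <- lm(mpg ~ displacement + weight + year + origin, data = Auto)
> par(mfrow = c(2, 2))
> plot(sig.lm)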
(e) Use the * and/or : symbols to fit linear regression models with interaction effects. Do any interactions appear to be statistically significant?
The interaction between "displacement" and "weight" appears to be statistically significant.
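One way to check, as a sketch:
> summary(lm(mpg ~ displacement * weight, data = Auto))   # * includes the main effects and the interaction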
(f) Try a few different transformations of the variables, such as log(X), √X, X^2. Comment on your findings.
log(acceleration) is less significant than acceleration in the model.
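For example, comparing the two fits (a sketch):
> summary(lm(mpg ~ acceleration, data = Auto))        # compare the slope t statistics
> summary(lm(mpg ~ log(acceleration), data = Auto))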
Problem 7
This problem involves the Boston data set, from the ISLR2 package. We will now try to predict per capita crime rate using the other variables in this data set. In other words, per capita crime rate is the response, and the other variables are the predictors.
(a) For each predictor, fit a simple linear regression model to predict the response. Describe your results. In which of the models is there a statistically significant association between the predictor and the response? Create some plots to back up your assertions.
Except for crim vs. chas, all of the other simple regressions show a statistically significant association between the predictor and the response.
(b) Fit a multiple regression model to predict the response using all of the predictors. Describe your results. For which predictors can we reject the null hypothesis H0: βj = 0?
Only zn, dis, rad, and medv have a significant association with crim; each has a p-value below 0.05, so for these predictors we can reject the null hypothesis.
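A sketch of the full fit (crim.lm is our name):
> crim.lm <- lm(crim ~ ., data = Boston)
> summary(crim.lm)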
(c) How do your results from (a) compare to your results from (b)? Create a plot displaying the univariate regression coefficients from (a) on the x-axis, and the multiple regression coefficients from (b) on the y-axis. That is, each predictor is displayed as a single point in the plot. Its coefficient in a simple linear regression model is shown on the x-axis, and its coefficient estimate in the multiple linear regression model is shown on the y-axis.
Univariate and multiple regression coefficients exhibit a clear distinction. In a
simple regression model, the slope signifies the average impact of a predictor's
increase, disregarding other predictors in the dataset. Conversely, in multiple
regression, the slope represents the average impact of a predictor's increase
while holding all other predictors constant.
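The plot can be built along these lines (a sketch; preds is the vector of predictor names from the sketch in (a), and uni/multi are our names):
> uni <- sapply(preds, function(p)
+   coef(lm(reformulate(p, "crim"), data = Boston))[2])   # univariate slopes
> multi <- coef(lm(crim ~ ., data = Boston))[-1]          # multiple regression slopes
> plot(uni, multi, xlab = "Univariate coefficient", ylab = "Multiple regression coefficient")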
(d) Is there evidence of non-linear association between any of the predictors and the response? To answer this question, for each predictor X, fit a model of the form Y = β0 + β1X + β2X^2 + β3X^3 + ε.
There is no evidence of non-linear association between any of the predictors and the response.
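The cubic fits can be examined one predictor at a time, e.g. (a sketch):
> summary(lm(crim ~ poly(zn, 3), data = Boston))   # repeat for each quantitative predictor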
Problem 8
This problem focuses on the collinearity problem.
(a) Perform the following commands in R:
set.seed(1)
x1 = runif(100)
x2 = 0.5 * x1 + rnorm(100) / 10
y = 2 + 2 * x1 + 0.3 * x2 + rnorm(100)
The last line corresponds to creating a linear model in which y is a function of x1 and x2. Write out the form of the linear model. What are the regression coefficients?
Linear model: y = β0 + β1*x1 + β2*x2 + ε
Regression coefficients: β0 = 2, β1 = 2, β2 = 0.3
(b) What is the correlation between x1 and x2? Create a scatterplot displaying the relationship between the variables.
> cor(x1,x2)
[1] 0.8351212
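The scatterplot can be drawn with:
> plot(x1, x2)   # the points fall along an upward-sloping band, consistent with the high correlation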
(c) Using this data, fit a least squares regression to predict y using x1 and x2. Describe the results obtained. What are β̂0, β̂1, and β̂2? How do these relate to the true β0, β1, and β2? Can you reject the null hypothesis H0: β1 = 0? How about the null hypothesis H0: β2 = 0?
β̂0 = 2.1305, β̂1 = 1.4396, and β̂2 = 1.0097. Only β̂0 is close to its true value.
The p-value for β1 is 0.0487 < 0.05, so we can (just barely) reject the null hypothesis.
The p-value for β2 is 0.3754 > 0.05, so we cannot reject the null hypothesis.
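The estimates above come from a fit along these lines (a sketch):
> summary(lm(y ~ x1 + x2))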
(d) Now fit a least squares regression to predict y using only x1. Comment on your results. Can you reject the null hypothesis H0: β1 = 0?
We can reject the null hypothesis because the p-value is very low.
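A sketch:
> summary(lm(y ~ x1))   # similarly, summary(lm(y ~ x2)) for part (e)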
(e) Now fit a least squares regression to predict y using only x2. Comment on your results. Can you reject the null hypothesis H0: β1 = 0?
We can reject the null hypothesis because the p-value is very low.
(f) Do the results obtained in (c)-(e) contradict each other? Explain your answer.
No, the results obtained in (c)-(e) do not contradict each other. When both variables, x1 and x2, are included in the model, the coefficient of x2 does not show statistical significance because the two predictors are highly collinear. However, when either x1 or x2 is used individually to fit the model, each coefficient exhibits statistical significance.
(g) Now suppose we obtain one additional observation, which was unfortunately mismeasured.
x1 = c(x1, 0.1)
x2 = c(x2, 0.8)
y = c(y, 6)
Re-fit the linear models from (c) to (e) using this new data. What effect does this new observation have on each of the models? In each model, is this observation an outlier? A high-leverage point? Both? Explain your answers.
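A sketch of the refits and diagnostics:
> summary(lm(y ~ x1 + x2))   # model from (c), refit
> summary(lm(y ~ x1))        # model from (d), refit
> summary(lm(y ~ x2))        # model from (e), refit
> par(mfrow = c(2, 2))
> plot(lm(y ~ x1 + x2))      # residual and leverage plots; repeat for each refit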
In the refit of the model from (c), we now fail to reject the null hypothesis H0: β1 = 0 but we do reject H0: β2 = 0. In this model the new point is a high-leverage point but not an outlier.
Both (d) and its refit reject the null hypothesis H0: β1 = 0. R^2 is smaller and σ^2 is larger than in (d), which suggests the new observation is an outlier, but it does not show high leverage in the residual plots.
Both (e) and its refit reject the null hypothesis H0: β1 = 0. R^2 is not smaller and σ^2 is not larger than in (e), so the observation is not an outlier, but it does have high leverage.
Problem 9
In this exercise, we will generate simulated data, and will then use this data to perform best subset selection.
(a) Use the rnorm() function to generate a predictor of length n = 100, as well as a noise vector of length n = 100.
> set.seed(1)
> x = rnorm(100)
> noise = rnorm(100)
(b) Generate a response vector Y of length n = 100 according to the model Y = β0 + β1X + β2X^2 + β3X^3 + ε, where β0, β1, β2, and β3 are constants of your choice.
> b0 = 14
> b1 = 2
> b2 = -3
> b3 = 5
> Y = b0+b1*x+b2*x^2+b3*x^3+noise
(c) Use the regsubsets() function to perform best subset selection in order to choose the best model containing the predictors X, X^2, ..., X^10. What is the best model obtained according to Cp, BIC, and adjusted R^2?
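The summaries below assume a best subset fit along these lines (a sketch; data.f and mod.s match the names used in part (d)):
> library(leaps)
> data.f <- data.frame(y = Y, x = x)
> mod.full <- regsubsets(y ~ poly(x, 10, raw = TRUE), data = data.f, nvmax = 10)
> mod.s <- summary(mod.full)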
> which.min(mod.s$cp)
[1] 4
> which.min(mod.s$bic)
[1] 3
> which.max(mod.s$adjr2)
[1] 4
The best model according to Cp has 4 variables, the best model according to BIC has 3 variables, and the best model according to adjusted R^2 has 4 variables.
(d) Repeat (c), using forward stepwise selection and also using backwards stepwise selection. How does your answer compare to the results in (c)?
> mod.fss=regsubsets(y~poly(x,10,raw=TRUE),data=data.f,nvmax=10,method="forward")
> mod.bss=regsubsets(y~poly(x,10,raw=TRUE),data=data.f,nvmax=10,method="backward")
> fss.s=summary(mod.fss)
> bss.s=summary(mod.bss)
> min.cp.fss=which.min(fss.s$cp)
> min.bic.fss=which.min(fss.s$bic)
> max.adjr2.fss=which.max(fss.s$adjr2)
> min.cp.bss=which.min(bss.s$cp)
> min.bic.bss=which.min(bss.s$bic)
> max.adjr2.bss=which.max(bss.s$adjr2)
> plot(fss.s$cp,main="Forward Stepwise Selection Of Cp",pch=20)
> points(min.cp.fss,fss.s$cp[min.cp.fss],pch=4,col="blue", lwd=7)
> plot(bss.s$cp,main="Backward Stepwise Selection Of Cp",pch=20)
> points(min.cp.bss,bss.s$cp[min.cp.bss],pch=4,col="blue", lwd=7)
> plot(fss.s$bic,main="Forward Stepwise Selection Of BIC",pch=20)
> points(min.bic.fss,fss.s$bic[min.bic.fss],pch=4,col="blue", lwd=7)
> plot(bss.s$bic,main="Backward Stepwise Selection Of BIC",pch=20)
> points(min.bic.bss,bss.s$bic[min.bic.bss],pch=4,col="blue", lwd=7)
> plot(fss.s$adjr2,main="Forward Stepwise Selection Of Adjr2",pch=20)
> points(max.adjr2.fss,fss.s$adjr2[max.adjr2.fss],pch=4,col="blue", lwd=7)
> plot(bss.s$adjr2,main="Backward Stepwise Selection Of Adjr2",pch=20)
> points(max.adjr2.bss,bss.s$adjr2[max.adjr2.bss],pch=4,col="blue", lwd=7)
> which.min(fss.s$cp)
[1] 4
> which.max(fss.s$adjr2)
[1] 4
> which.min(bss.s$cp)
[1] 4
> which.max(bss.s$adjr2)
[1] 4
> which.min(fss.s$bic)
[1] 3
> which.min(bss.s$bic)
[1] 3
The best models according to Cp, BIC, and adjusted R^2 are the same as in (c).