Lichen_Jiang_3

School

Johns Hopkins University

Course

510.650

Subject

Industrial Engineering

Date

Dec 6, 2023

Type

docx

Uploaded by DoctorArt6178

#Q1(a)
library(leaps)  # provides regsubsets()
set.seed(100)
x = rnorm(100)
x
set.seed(200)
e = rnorm(100)
e

#Q1(b)
y = 3+2*x+x^2+0.5*x^3+e
y

#Q1(c)
x2 = x^2
x3 = x^3
x4 = x^4
x5 = x^5
x6 = x^6
x7 = x^7
x8 = x^8
x9 = x^9
x10 = x^10
data_1 = data.frame(x, x2, x3, x4, x5, x6, x7, x8, x9, x10, y)
best_q1 = regsubsets(y ~ ., data = data_1, nvmax = 10)

# Cp
coef(best_q1, which.min(summary(best_q1)$cp))

Cp is stored in summary(best_q1)$cp. Applying which.min() to summary(best_q1)$rsq always returns the one-variable model, because R^2 never decreases as variables are added; that model is y = 3.957865+1.088975*x^3. The Cp choice is the model printed by the call above.

plot(summary(best_q1)$cp)

# BIC
coef(best_q1, which.min(summary(best_q1)$bic))

According to BIC, the best model is:
y = 2.9949297+2.2118204*x+1.0370719*x^2+0.4715355*x^3

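plot(summary(best_q1)$bic)

# Adjusted R^2
coef(best_q1, which.max(summary(best_q1)$adjr2))
plot(summary(best_q1)$adjr2)

#Q1(d)forward
best_q1_f = regsubsets(y ~ ., data = data_1, nvmax = 10, method = "forward")

# Cp
coef(best_q1_f, which.min(summary(best_q1_f)$cp))

In my run there was no difference from the original (best subset) method. That run compared the minimum-R^2 models, which are the one-variable fits and always coincide; the Cp-selected models are the ones printed by the call above.

plot(summary(best_q1_f)$cp)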

# BIC
coef(best_q1_f, which.min(summary(best_q1_f)$bic))

There is no difference between the forward method and the original (best subset) method.

plot(summary(best_q1_f)$bic)

# Adjusted R^2
coef(best_q1_f, which.max(summary(best_q1_f)$adjr2))

There are some differences between the forward method and the original (best subset) method: the best model here has only 8 parameters.

plot(summary(best_q1_f)$adjr2)
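As a compact way to view the three criteria used above, the sketch below rebuilds the simulated data and collects the model size each criterion selects. The helper name chosen_sizes and the column names x1 to x10 are illustrative assumptions, not part of the original script, and the leaps package is assumed to be installed. It also checks that R^2 never decreases with model size under best subset selection, which is why R^2 by itself cannot pick a model size.

```r
library(leaps)  # assumed installed; provides regsubsets()

# Rebuild the simulated data from Q1(a)-(c)
set.seed(100)
x <- rnorm(100)
set.seed(200)
e <- rnorm(100)
y <- 3 + 2 * x + x^2 + 0.5 * x^3 + e

powers <- sapply(1:10, function(k) x^k)   # columns x, x^2, ..., x^10
colnames(powers) <- paste0("x", 1:10)     # hypothetical column names
dat <- data.frame(powers, y)

# Hypothetical helper: model size selected by each criterion
chosen_sizes <- function(fit) {
  s <- summary(fit)
  c(cp    = as.integer(which.min(s$cp)),     # smallest Mallows' Cp
    bic   = as.integer(which.min(s$bic)),    # smallest BIC
    adjr2 = as.integer(which.max(s$adjr2)))  # largest adjusted R^2
}

fit <- regsubsets(y ~ ., data = dat, nvmax = 10)
sizes <- chosen_sizes(fit)
print(sizes)

# R^2 is non-decreasing in model size, so its minimum is always size 1
rsq_monotone <- all(diff(summary(fit)$rsq) >= -1e-8)
print(rsq_monotone)
```

Calling chosen_sizes() on the forward or backward fit gives the corresponding sizes for those methods, and coef(fit, sizes[["bic"]]) prints the coefficients of the BIC choice.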


#Q1(d)backward
best_q1_b = regsubsets(y ~ ., data = data_1, nvmax = 10, method = "backward")

# Cp
coef(best_q1_b, which.min(summary(best_q1_b)$cp))
plot(summary(best_q1_b)$cp)

# BIC
coef(best_q1_b, which.min(summary(best_q1_b)$bic))

The selected parameters have changed, and the number of parameters has increased.

plot(summary(best_q1_b)$bic)

# Adjusted R^2
coef(best_q1_b, which.max(summary(best_q1_b)$adjr2))

There is no difference between the backward method and the original (best subset) method.

plot(summary(best_q1_b)$adjr2)
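The comparisons between best subset, forward and backward selection can also be made directly on the sets of selected variables rather than by reading the coefficient printouts. The following is a minimal sketch under the same simulated setup; selected_vars and the column names x1 to x10 are hypothetical names introduced here, and the leaps package is assumed to be installed.

```r
library(leaps)  # assumed installed; provides regsubsets()

# Rebuild the simulated data from Q1
set.seed(100)
x <- rnorm(100)
set.seed(200)
e <- rnorm(100)
y <- 3 + 2 * x + x^2 + 0.5 * x^3 + e

powers <- sapply(1:10, function(k) x^k)
colnames(powers) <- paste0("x", 1:10)   # hypothetical column names
dat <- data.frame(powers, y)

# Hypothetical helper: names of the predictors in the size-k model
selected_vars <- function(fit, k) {
  w <- summary(fit)$which[k, ]          # logical row: which terms are included
  setdiff(names(w)[w], "(Intercept)")
}

fit_best <- regsubsets(y ~ ., data = dat, nvmax = 10)
fit_fwd  <- regsubsets(y ~ ., data = dat, nvmax = 10, method = "forward")
fit_bwd  <- regsubsets(y ~ ., data = dat, nvmax = 10, method = "backward")

# Compare the three methods at the size chosen by BIC for best subset
k <- as.integer(which.min(summary(fit_best)$bic))
vars_best <- selected_vars(fit_best, k)
vars_fwd  <- selected_vars(fit_fwd, k)
vars_bwd  <- selected_vars(fit_bwd, k)

print(vars_best)
print(setequal(vars_best, vars_fwd))    # TRUE when forward agrees at size k
print(setequal(vars_best, vars_bwd))    # TRUE when backward agrees at size k
```

Comparing at a common size k separates two questions that the printouts mix together: whether the methods pick the same variables at a given size, and whether a criterion then picks the same size for each method.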

#Q2(a)
library("MASS")
library(leaps)  # provides regsubsets()
head(Boston)
reg.boston = regsubsets(medv~., data = Boston, nvmax = 13)
summary(reg.boston)

#Q2(b)
Based on the output of summary(reg.boston), we should choose rm, ptratio and lstat.

lm.boston = lm(medv~rm+ptratio+lstat, data = Boston)
summary(lm.boston)

The model is significant: about 68% of the variability in medv is explained by this linear model.

#Q2(c)
reg.boston.b = regsubsets(medv~., data = Boston, nvmax = 13, method = "backward")
reg.boston.f = regsubsets(medv~., data = Boston, nvmax = 13, method = "forward")
coef(reg.boston.b,7)

coef(reg.boston.f,7)
coef(reg.boston,7)
summary(reg.boston)$rsq[7]
summary(reg.boston.f)$rsq[7]
summary(reg.boston.b)$rsq[7]

The best seven-variable subsets from the three methods are the same, while the R^2 of the backward method is lower than that of the others, meaning that it explains a smaller proportion of the variability.
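The seven-variable comparison above can be gathered into one table-like result. The sketch below fits all three methods in a loop and collects R^2 and the selected variables at size 7; the object names fits, rsq7 and vars7 are illustrative assumptions, and both MASS and leaps are assumed to be installed. For any fixed size, best subset ("exhaustive") selection has the highest R^2 by construction, so forward and backward can at most tie it.

```r
library(MASS)   # provides the Boston data set
library(leaps)  # assumed installed; provides regsubsets()

methods <- c("exhaustive", "forward", "backward")

# One regsubsets fit per selection method
fits <- lapply(methods, function(m) {
  regsubsets(medv ~ ., data = Boston, nvmax = 13, method = m)
})
names(fits) <- methods

# R^2 of the seven-variable model under each method
rsq7 <- sapply(fits, function(f) summary(f)$rsq[7])
print(rsq7)

# Predictors in the seven-variable model under each method
vars7 <- lapply(fits, function(f) names(coef(f, 7))[-1])  # drop "(Intercept)"
print(vars7)

# Do forward and backward pick the same seven predictors as best subset?
same_as_best <- sapply(vars7, function(v) setequal(v, vars7[["exhaustive"]]))
print(same_as_best)
```

If same_as_best is TRUE for a method, its R^2 at size 7 equals the best subset value; a lower R^2 implies that the method selected a different set of seven predictors.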

