School: University of Oregon
Course: 370
Subject: Statistics
Date: Jan 9, 2024

Lab Assignment 9

Directions: Keep in mind the following instructions while working on the exercises.

  • Type your answers in a separate document and submit to Gradescope as a .pdf file.
  • You will be graded on the proper use of notation, terminology, and the clarity of your work.
  • Properly format any tables and figures that you include in your answer sheet.
  • You are encouraged to work with your classmates; however, the work you submit must be your own.
  • If an assignment is turned in up to 24 hours past due, 20% of the total number of possible points will be deducted. Late assignments will not be accepted more than 24 hours past due.

Multiple Linear Regression

The dataset StudentSurvey.xls contains data on introductory statistics students from an in-class survey over several years. For exercises 1–11, we are interested in estimating the GPA of students using a multiple regression model.

1. (1 point) Find a multiple linear regression model to predict GPA of students in college based on SAT and VerbalSAT scores and number of hours of TV per week. Write the estimated model.

    GPA <- StudentSurvey$GPA
    SAT <- StudentSurvey$SAT
    VerbalSAT <- StudentSurvey$VerbalSAT
    TV <- StudentSurvey$TV
    modelGPA <- lm(GPA ~ SAT + VerbalSAT + TV)
    summary(modelGPA)

2. (2 points) What is the predicted GPA for a student who got 1140 on the SAT, 580 on the VerbalSAT, and regularly watches 6 hours of TV per week?

3. (2 points) Compute the residual of the second student that appears in this dataset, the female sophomore.

4. (2 points) Give the ANOVA table and interpret what the ANOVA says about the effectiveness of this model. Since anova() in R does not provide the necessary ANOVA table, you may use the following steps to accomplish this.

    1. Copy the following code for the simpleAnova function into the console or a script file and execute it.
    2. Execute the command simpleAnova(modelGPA).
Format the ANOVA table in your answer sheet using a table and include the row of totals.

    simpleAnova <- function(object, ...) {
      # Compute anova table
      tab <- anova(object, ...)
      # Obtain number of predictors
      p <- nrow(tab) - 1
      # Add predictors row
      predictorsRow <- colSums(tab[1:p, 1:2])
      predictorsRow <- c(predictorsRow, predictorsRow[2] / predictorsRow[1])
      # F-quantities
      Fval <- predictorsRow[3] / tab[p + 1, 3]
      pval <- pf(Fval, df1 = p, df2 = tab$Df[p + 1], lower.tail = FALSE)
      predictorsRow <- c(predictorsRow, Fval, pval)
      # Simplified table
      tab <- rbind(predictorsRow, tab[p + 1, ])
      row.names(tab)[1] <- "Predictors"
      return(tab)
    }

5. (2 points) Compute and interpret in context the coefficient of determination, R².

6. (2 points) Include the table of coefficients with their corresponding standard errors, t-statistics, and p-values.

    summary(modelGPA)

7. (2 points) At the 10% level, which variables are significant predictors of GPA in this model?

8. (2 points) At the 5% level, which variables are significant predictors of GPA in this model? Which is the most significant?

9. (1 point) Interpret the coefficient of SAT in context.

10. (1 point) Interpret the intercept in context.

11. (2 points) Include a plot of residuals versus fitted values and a histogram of residuals. Use these plots to assess the conditions for inference on this regression model.

    plot(modelGPA$residuals ~ modelGPA$fitted.values, xlab = "Fitted Values", ylab = "Residuals")
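The prediction, residual, and diagnostic-plot steps asked for in exercises 2, 3, and 11 can be sketched roughly as follows. This is only an illustration on simulated data, not the StudentSurvey dataset: the data frame demo, its columns x1, x2, and y, and every number in it are made up for the example, and the same pattern would need to be adapted to modelGPA.

```r
# Simulated stand-in data (NOT StudentSurvey); all values are invented.
set.seed(1)
n <- 50
demo <- data.frame(
  x1 = rnorm(n, mean = 100, sd = 15),
  x2 = rnorm(n, mean = 50, sd = 10)
)
demo$y <- 2 + 0.03 * demo$x1 - 0.02 * demo$x2 + rnorm(n, sd = 0.5)

# Fit a multiple linear regression model
demoModel <- lm(y ~ x1 + x2, data = demo)

# Predicted response for one new observation; the column names in
# newdata must match the predictor names used in the model formula
newObs <- data.frame(x1 = 110, x2 = 45)
predicted <- predict(demoModel, newdata = newObs)

# Residual of the second observation: observed value minus fitted value
resid2 <- demo$y[2] - fitted(demoModel)[2]

# Residuals versus fitted values, with a reference line at zero
plot(fitted(demoModel), residuals(demoModel),
     xlab = "Fitted Values", ylab = "Residuals")
abline(h = 0, lty = 2)

# Histogram of residuals
hist(residuals(demoModel), xlab = "Residuals",
     main = "Histogram of Residuals")
```

In the residual plot, a roughly even band around zero is consistent with linearity and constant variance; in the histogram, a roughly symmetric, bell-shaped pattern is consistent with normally distributed errors.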
The dataset NutritionStudy.xls contains information on calories consumed in a day, fat grams consumed in a day, cholesterol consumed in mg per day, and age in years of 315 patients. Use this dataset for exercises 12–18.

12. (2 points) Create a model to predict calories consumed in a day based on fat grams consumed in a day, cholesterol consumed in mg per day, and age in years. Write the estimated model.

13. (2 points) What daily calorie consumption does the model predict for a 37-year-old person who eats 35 grams of fat in a day and 200 mg of cholesterol?

14. (2 points) In this model, which variable is least significant? Which is most significant?

15. (1 point) Interpret the coefficient of Fat in context.

16. (1 point) Interpret the coefficient of Age in context.

17. (2 points) Create the ANOVA table, including the row of totals, and interpret what the ANOVA output says about the effectiveness of this model.

18. (1 point) Interpret R-squared for this model.
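One possible way to organize the work for exercises 12–18 is sketched below. It runs on simulated data rather than NutritionStudy.xls: the data frame sim and the column names Calories, Fat, Cholesterol, and Age are assumptions chosen for illustration, and the generated values have no connection to the real dataset. The sketch shows how the coefficient table, R-squared, and the totals row of the ANOVA table relate to one another.

```r
# Simulated stand-in data (NOT NutritionStudy); column names are assumed
# and all values are invented for illustration.
set.seed(2)
n <- 60
sim <- data.frame(
  Fat = runif(n, 20, 150),
  Cholesterol = runif(n, 50, 600),
  Age = round(runif(n, 19, 80))
)
sim$Calories <- 500 + 12 * sim$Fat + 0.5 * sim$Cholesterol -
  2 * sim$Age + rnorm(n, sd = 150)

# Fit the three-predictor model
simModel <- lm(Calories ~ Fat + Cholesterol + Age, data = sim)

# Coefficient table: estimates, standard errors, t-statistics, p-values
coefTable <- summary(simModel)$coefficients

# Least and most significant predictors (intercept row excluded)
pvals <- coefTable[-1, "Pr(>|t|)"]
leastSignificant <- names(which.max(pvals))
mostSignificant <- names(which.min(pvals))

# Pieces of the ANOVA table, including the row of totals
tab <- anova(simModel)
p <- nrow(tab) - 1                    # number of predictor rows
ssModel <- sum(tab[1:p, "Sum Sq"])    # sum of squares explained by the model
ssError <- tab[p + 1, "Sum Sq"]       # residual sum of squares
ssTotal <- ssModel + ssError          # total sum of squares
dfTotal <- sum(tab$Df)                # total degrees of freedom, n - 1

# R-squared is the share of total variability explained by the model
rSquared <- ssModel / ssTotal
```

The value of rSquared computed this way should agree with summary(simModel)$r.squared, which is a quick consistency check on a hand-built ANOVA table.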