Data Analysis 8

School

Kauai Community College

Course

MISC

Subject

Statistics

Date

Feb 20, 2024

ST 314 Data Analysis 08

The dataset ST314ExamData_DA8.csv represents the midterm and final exam grades for students in the ST314 online and campus courses for two previous terms. Use this data to complete a multiple linear regression analysis in R and answer the following questions.

Note: The data set used in this assignment is different from last week's. Please make sure you are loading the csv file listed on the Data Analysis 8 Canvas page.

Final = Final exam score out of 100
Midterm = Midterm exam score out of 100
Term = Has two levels, Fall 2021 and Spring 2022
Format = Has two levels, Campus and Online

Part 1. (6 points) Multivariate Visualization
It is reasonable to consider that more than just midterm score may influence final exam score. Investigate the individual relationships between final exam score and the above explanatory variables. Use the R script Multivariate_Exam_Analysis.R to help you get started with the code.

A. Construct a scatterplot matrix including Final and each of the explanatory variables.
  a. (1 point) Paste the plot.
  b. (1 point) Do any of the variables have a visual relationship with Final?
  The scatterplot matrix is hard to read because in most panels the points are squeezed against one side or the other. Even so, Midterm shows a slight positive relationship with Final.

B. The scatterplot matrix is not all that helpful for the categorical variables Term and Format.
  a. Create a side-by-side boxplot that looks at the relationship between Term and Final.
  (1 point) Paste your plot.
ST 314 Data Analysis 08 Anakin Zingray

  (1 point) Describe the relationship. Visually, does Term seem to have a relationship with Final?
  No. Both terms have very similar minimums, maximums, quartiles, medians, and outliers.
  b. Create a side-by-side boxplot that looks at the relationship between Format and Final.
  (1 point) Paste your plot.
  (1 point) Describe the relationship. Visually, does Format seem to have a relationship with Final?
  Format appears to have more of a relationship with Final: the medians differ between formats, and the campus format has more low scores and outliers. Most campus students performed better, as shown by the higher median, but many campus students also scored well below the rest, as shown by the outliers.

Part 2. (7 points) Fit a Model
Fit a model that includes Term, Format, and Midterm as explanatory variables for the response variable Final.

A. (1 point) Provide the R output of the model.

B. (2 points) State the least squares regression equation of your model.
$\text{Final} = 70.065 + 0.2211 \cdot \text{Midterm} + 1.2171 \cdot \text{TermSpring22} - 1.6413 \cdot \text{FormatOnline}$

C. The model without the variables Term and Format has an adjusted R² value of 0.1274.
  a. (1 point) Does including the variables Term and Format improve the fit of the model?
  Yes, slightly. The adjusted R² of 0.1274 for the model without Term and Format is lower than the adjusted R² of 0.1358 from my model output.
  b. (1 point) Interpret the adjusted R² value for the model that includes all three explanatory variables.
  The adjusted R² value of 0.1358 means that about 13.58% of the variation in Final exam scores is explained by these three variables collectively, after adjusting for the number of explanatory variables.

Part 3. (10 points) Model Interpretation
Note: Model interpretation can get tricky when there are more than two levels in a factor, for example if Term had three levels instead of two. The R output will designate this as VariableLevel, like "TermSpring 22". In the model, the coefficient for TermSpring 22 is 1.2171. This means that, while the other variables in the model are held constant, a student taking the exam in Spring 2022 will score 1.2171 points more on average than a Fall 2021 student. We know that TermSpring 22 is compared to fall because Fall 2021 is the level not included in the output; that is, fall is represented when the spring indicator is 0.

A. (2 points) Interpret each of the individual t tests by stating which variables are significant at 0.05, when the other variables are in the model.
All variables except Term are significant at the 0.05 level, since their p-values are less than 0.05.

B. (2 points) Interpret in context the $\beta_{\text{FormatOnline}}$ coefficient for Format.
The coefficient for the Format variable is -1.6413. This means that, on average, students in the Online format score 1.6413 points lower on the final exam than Campus students, holding the other variables constant.

C. (2 points) Interpret in context the $\beta_{\text{TermSpring22}}$ coefficient for TermSpring22.
As explained above, the coefficient for TermSpring 22 is 1.2171. This means that, while the other variables in the model are held constant, a student taking the exam in Spring 2022 will, on average, score 1.2171 points more than a Fall 2021 student.

D. (2 points) Interpret in context the $\beta_{\text{Midterm}}$ coefficient for the Midterm variable.
The coefficient for the Midterm variable is 0.2211, which means that a one-point increase in the midterm exam score is associated with a 0.2211-point increase in the final exam score on average, holding the other variables constant.

E. (2 points) Calculate the 95% confidence interval for $\beta_{\text{Midterm}}$. Show work. Interpret the interval.
[0.17808609, 0.2640902]

Part 4. (2 points) Prediction

A. (2 points) Use the least squares regression equation to predict final exam score for a fall, online student with a midterm score of 85.
$\text{Final} = 70.065 + 0.2211 \cdot \text{Midterm} + 1.2171 \cdot \text{TermSpring22} - 1.6413 \cdot \text{FormatOnline}$
$\text{Final} = 70.065 + 0.2211 \cdot (85) + 1.2171 \cdot (0) - 1.6413 \cdot (1)$
$\text{Final} = 87.2172$

Gradescope Page Matching (2 points)
When you upload your PDF file to Gradescope, you will need to match each question on this assignment to the correct pages. Video instructions for doing this are available in the Start Here module on Canvas on the page "Submitting Assignments in Gradescope". Failure to follow these instructions will result in a 2-point deduction on your assignment grade. Match this page to outline item "Gradescope Page Matching".
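
As an arithmetic check on the Part 4 prediction and the Part 3E interval, here is a minimal sketch in Python (not the R workflow the assignment asks for). The coefficients and interval endpoints are copied from the answers above; the function names and the string labels "Spring 2022" and "Online" are assumptions made for illustration. It only re-does the arithmetic with 0/1 indicator variables and does not refit the model.

```python
# Coefficients copied from the least squares equation in Part 2B.
INTERCEPT = 70.065
B_MIDTERM = 0.2211
B_TERM_SPRING22 = 1.2171
B_FORMAT_ONLINE = -1.6413


def predict_final(midterm, term, fmt):
    """Predicted final exam score from the fitted equation.

    term and fmt are converted to 0/1 indicators. The label strings
    are hypothetical; Fall 2021 and Campus are the baseline levels.
    """
    term_spring22 = 1 if term == "Spring 2022" else 0
    format_online = 1 if fmt == "Online" else 0
    return (INTERCEPT
            + B_MIDTERM * midterm
            + B_TERM_SPRING22 * term_spring22
            + B_FORMAT_ONLINE * format_online)


def interval_center_and_margin(lower, upper):
    """Return the midpoint and half-width of an interval."""
    return (lower + upper) / 2, (upper - lower) / 2


# Fall, online student with a midterm score of 85 (Part 4A).
prediction = predict_final(85, "Fall 2021", "Online")

# Endpoints of the reported 95% interval for the Midterm slope (Part 3E).
center, margin = interval_center_and_margin(0.17808609, 0.2640902)

print(round(prediction, 4))
print(round(center, 4), round(margin, 4))
```

The prediction reproduces the 87.2172 found in Part 4, and the interval's midpoint rounds to 0.2211, the Midterm coefficient, as expected for an interval of the form estimate ± margin of error. The half-width is the margin of error (critical t value times standard error). The sketch does not split it into those two factors, because the standard error and degrees of freedom come from the R output, which is not reproduced here.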