Beginning Statistics, 2nd Edition
ISBN: 9781932628678
Author: Carolyn Warren; Kimberly Denley; Emily Atchley
Publisher: Hawkes Learning Systems
Question
Chapter 12.4, Problem 8E
To determine

The multiple regression equation, with each of the variables used in it defined, and whether any one of the explanatory variables can be eliminated from the model (and, if so, which one).

Expert Solution & Answer

Answer to Problem 8E

Solution:

The multiple regression equation for the given table is $\hat{y} = -0.616 + 0.975x_1 + 0.015x_2$, where the estimated y-intercept is $b_0 = -0.616$, the estimated coefficient of the first explanatory variable, college freshman’s high school GPAs ($x_1$), is $b_1 = 0.975$, and the estimated coefficient of the second explanatory variable, college freshman’s ACT scores ($x_2$), is $b_2 = 0.015$. The second explanatory variable, college freshman’s ACT scores, can be eliminated from the model, as it is not statistically significant.

Explanation of Solution

Formula Used:

Multiple Regression Model:

A multiple regression model is a linear regression model that uses two or more explanatory variables to predict a response variable, given by

$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_k x_k$

where $x_1, x_2, \ldots, x_k$ are the explanatory variables in the model, and the coefficients $b_1, b_2, \ldots, b_k$ of the explanatory variables are the sample estimates of the corresponding population parameters $\beta_1, \beta_2, \ldots, \beta_k$. As before, the y-intercept of the multiple regression equation is $b_0$, which is the sample estimate of the population parameter $\beta_0$.

Null and Alternative Hypotheses for an ANOVA test:

The null and alternative hypotheses for an ANOVA test of the statistical significance of the linear relationship between the variables in a multiple regression model are as follows.

$H_0: \beta_1 = \beta_2 = \cdots = \beta_k = 0$

$H_a$: At least one coefficient does not equal 0.

Here $\beta_1, \beta_2, \ldots, \beta_k$ are the coefficients of the explanatory variables, and

$k$ is the number of explanatory variables in the model.

Conclusions Using p-value:

If p-value $\le \alpha$, then reject the null hypothesis.

If p-value $> \alpha$, then fail to reject the null hypothesis.

The relevant block of the ANOVA table is given as follows:

| | Coefficients | Standard Error | t Stat | P-value |
|---|---|---|---|---|
| Intercept | -0.616112354 | 0.345507101 | -1.783211841 | 0.092415337 |
| High School GPA | 0.975068267 | 0.186279465 | 5.234437779 | 6.7297E-05 |
| ACT Score | 0.015000337 | 0.019048101 | 0.787497806 | 0.441832554 |

This block of information gives the y-intercept of the multiple regression equation and the coefficients of the explanatory variables.

The first row of the Coefficients column gives the estimated value of the y-intercept, $b_0 = -0.616$. The second row of that column gives the estimated coefficient of the first explanatory variable, freshman’s high school GPAs, $b_1 = 0.975$. The third row of that column gives the estimated coefficient of the second explanatory variable, freshman’s ACT scores, $b_2 = 0.015$.

Substituting the estimated coefficients into the multiple regression model, we have the following equation for predicting the college freshman GPA after their first year based on their high school GPAs, x1, and their ACT scores, x2.

$\hat{y} = -0.616 + 0.975x_1 + 0.015x_2$
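The fitted equation can be evaluated directly for a given student. The following is a minimal sketch using the rounded coefficients; the function name `predict_freshman_gpa` and the example inputs (a high school GPA of 3.0 and an ACT score of 20) are illustrative assumptions and do not come from the problem's data.

```python
# Rounded coefficient estimates from the regression output.
B0 = -0.616  # y-intercept
B1 = 0.975   # coefficient of high school GPA (x1)
B2 = 0.015   # coefficient of ACT score (x2)


def predict_freshman_gpa(hs_gpa: float, act_score: float) -> float:
    """Evaluate y-hat = b0 + b1*x1 + b2*x2 for one student."""
    return B0 + B1 * hs_gpa + B2 * act_score


# Hypothetical student, chosen only to illustrate the arithmetic.
predicted = predict_freshman_gpa(hs_gpa=3.0, act_score=20)
print(f"Predicted first-year GPA: {predicted:.3f}")
```

Because the ACT coefficient is so small, a one-point change in ACT score moves the prediction by only 0.015 grade points, whereas a one-point change in high school GPA moves it by 0.975.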

Let’s consider the individual explanatory variables more closely. Look at the p-values for the explanatory variables.

| | P-value |
|---|---|
| Intercept | 0.092415337 |
| High School GPA | 6.7297E-05 |
| ACT Score | 0.441832554 |

The p-values test the null hypothesis that the coefficient of a particular explanatory variable equals 0.

The null and alternative hypotheses for the first explanatory variable would be as follows.

$H_0: \beta_1 = 0$

$H_a: \beta_1 \neq 0$

The null and alternative hypotheses for the other explanatory variables are similar. A small p-value (such as one less than 0.05) indicates that there is sufficient evidence to support the claim that the coefficient is not 0, and therefore the linear relationship between this particular variable and the response variable is statistically significant.

If the p-value is not small (for instance, greater than 0.05), then this particular variable may not be useful in predicting the value of the response variable.
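Each of these p-values comes from a t statistic, which is the estimated coefficient divided by its standard error. As a sanity check, the sketch below recomputes the t Stat column from the Coefficients and Standard Error columns of the regression output. The helper name `t_statistic` and the tolerance are assumptions made for illustration; the p-values themselves are not recomputed, because the degrees of freedom are not stated in this solution.

```python
import math

# (coefficient, standard error, reported t Stat) copied from the regression output.
rows = {
    "Intercept": (-0.616112354, 0.345507101, -1.783211841),
    "High School GPA": (0.975068267, 0.186279465, 5.234437779),
    "ACT Score": (0.015000337, 0.019048101, 0.787497806),
}


def t_statistic(coefficient: float, standard_error: float) -> float:
    """Test statistic for H0: coefficient = 0."""
    return coefficient / standard_error


for name, (coef, se, reported) in rows.items():
    t = t_statistic(coef, se)
    # The table values are rounded, so compare with a loose relative tolerance.
    agrees = math.isclose(t, reported, rel_tol=1e-6)
    print(f"{name}: t = {t:.6f}, matches reported value: {agrees}")
```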

In this case of predicting the college freshman GPA after the first year, notice that the p-value for the explanatory variable of college freshman’s high school GPAs is approximately 6.7297E-05, that is, 0.0000673, which is smaller than 0.05. This indicates that this explanatory variable is very likely to be effective in predicting the response variable.

Similarly, the p-value for the explanatory variable of college freshman’s ACT scores is approximately 0.4418, which is much higher than 0.05.

Thus, including freshman’s ACT scores in the multiple regression model may not be useful, and the multiple regression model should be recalculated without this variable. Note, however, that the p-values calculated in the ANOVA table measure the influence of each explanatory variable with all other variables taken into account.

Therefore, the second explanatory variable, college freshman’s ACT scores, can be eliminated from the model, as it is not statistically significant.
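The comparison of each p-value against the 0.05 level can be written out as a short check. This is only a sketch of the decision rule stated earlier (reject the null hypothesis when the p-value is at most $\alpha$); the names `ALPHA`, `is_significant`, and `candidates_to_drop` are assumptions introduced for illustration.

```python
ALPHA = 0.05  # significance level used in the discussion

# P-values of the explanatory variables, copied from the regression output.
p_values = {
    "High School GPA": 6.7297e-05,
    "ACT Score": 0.441832554,
}


def is_significant(p_value: float, alpha: float = ALPHA) -> bool:
    """Reject H0 (coefficient = 0) when the p-value is at most alpha."""
    return p_value <= alpha


# Variables whose coefficients are not significantly different from 0.
candidates_to_drop = [
    name for name, p in p_values.items() if not is_significant(p)
]
print(candidates_to_drop)  # prints ['ACT Score']
```

A variable flagged this way is only a candidate for removal; the model would still need to be refitted without it, since each p-value is computed with the other variables present.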

Final Statement:

The multiple regression equation for the given table is $\hat{y} = -0.616 + 0.975x_1 + 0.015x_2$, where the estimated y-intercept is $b_0 = -0.616$, the estimated coefficient of the first explanatory variable, college freshman’s high school GPAs ($x_1$), is $b_1 = 0.975$, and the estimated coefficient of the second explanatory variable, college freshman’s ACT scores ($x_2$), is $b_2 = 0.015$. The second explanatory variable, college freshman’s ACT scores, can be eliminated from the model, as it is not statistically significant.
