Loose Leaf for Statistical Techniques in Business and Economics
17th Edition
ISBN: 9781260152647
Author: Douglas A. Lind
Publisher: McGraw-Hill Education

Chapter 14, Problem 34DA

a.

To determine

Make the correlation matrix.

Identify which independent variables have strong or weak correlations with the dependent variable.

Explain whether there are any problems with multicollinearity.

Also explain whether it is surprising that the correlation coefficient for ERA is negative.

a.


Answer to Problem 34DA

The correlation matrix is obtained as,

[Image: MINITAB correlation matrix output (not reproduced)]

There are no problems of multicollinearity.

Explanation of Solution

Multiple linear regression model:

A multiple linear regression model is given as $\hat{y} = a + b_1 x_1 + b_2 x_2 + b_3 x_3 + \dots + b_k x_k$, where y is the response or dependent variable, and $x_1, x_2, \dots, x_k$ are the k quantitative independent variables, with k a positive integer.

Here, a is the intercept of the regression model, that is, the predicted value of y when all of the x's are 0. The $b_i$'s are the slopes, that is, the change in the predicted value of y for a one-unit increase in $x_i$ when all other independent variables are held constant.

In the given problem, the dependent variable y is the number of games won. The team batting average (BA), the team earned run average (ERA), the number of home runs (HR), and whether the team plays in the American or the National League are denoted as $x_1$, $x_2$, $x_3$ and $x_4$, respectively.

Step by step procedure to obtain the correlation matrix using MINITAB software is given below:

  • Choose Stat > Basic Statistics > Correlation.
  • Select the columns of Wins, BA, ERA, HR and League under Variables tab.
  • Click OK.

The MINITAB output is obtained.
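
For readers without MINITAB, the same kind of correlation matrix can be sketched in plain Python. This is a minimal illustration, not the textbook's procedure: the helper names `pearson` and `correlation_matrix` are made up for this example, and the five rows of data are hypothetical values, not the team data from the problem.

```python
from math import sqrt

def pearson(x, y):
    """Sample Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def correlation_matrix(columns):
    """Return {name: {name: r}} for every pair of named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Hypothetical illustrative values, NOT the textbook's team data.
data = {
    "Wins": [95, 88, 81, 74, 67],
    "BA":   [0.270, 0.262, 0.255, 0.258, 0.247],
    "ERA":  [3.40, 3.75, 4.10, 4.30, 4.80],
    "HR":   [210, 185, 170, 160, 150],
}

matrix = correlation_matrix(data)
for row in data:
    print(row.ljust(5), "  ".join(f"{matrix[row][col]:6.3f}" for col in data))
```

Each row of the printed table corresponds to one row of the MINITAB correlation matrix; the first row gives the correlation of Wins with every other variable.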

According to the obtained output, there is a strong negative correlation (–0.818) between the independent variable “ERA” and the dependent variable “Wins”. Hence, it can be said that if the team earned run average decreases because of better pitching, then the number of wins increases, and vice versa. The negative sign is therefore not surprising.

Multicollinearity:

In a multiple regression model, when there is high correlation between two or more independent variables, then multicollinearity occurs.

The correlation coefficients between the independent variables are moderate, which does not indicate any presence of multicollinearity.
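
The screening step described above can be sketched as a small check over the predictor-to-predictor correlations. The 0.70 cutoff is an assumption here (a common rule of thumb, not a value stated in this solution), the function name `flag_collinear_pairs` is hypothetical, and the correlations below are made-up numbers rather than the MINITAB output.

```python
from itertools import combinations

def flag_collinear_pairs(corr, predictors, threshold=0.70):
    """Return predictor pairs whose absolute correlation exceeds the threshold."""
    return [
        (a, b, corr[a][b])
        for a, b in combinations(predictors, 2)
        if abs(corr[a][b]) > threshold
    ]

# Hypothetical correlations among predictors, NOT the textbook's MINITAB values.
corr = {
    "BA":  {"BA": 1.00, "ERA": -0.20, "HR": 0.35},
    "ERA": {"BA": -0.20, "ERA": 1.00, "HR": 0.10},
    "HR":  {"BA": 0.35, "ERA": 0.10, "HR": 1.00},
}

flagged = flag_collinear_pairs(corr, ["BA", "ERA", "HR"])
print(flagged or "no predictor pair exceeds the threshold")
```

Only correlations between independent variables are screened; a strong correlation between a predictor and Wins is desirable and is not a sign of multicollinearity.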

b.

To determine

Find the regression equation and explain the procedure of the selection of the variables to include in the equation.

Explain the correlation analysis.

Prove that the regression equation shows a significant relationship.

Give the regression equation and interpret it in practical terms.

Find and interpret $R^2$.

Explain whether the number of wins is affected by whether the team plays in the National or the American League.

b.


Explanation of Solution

Calculation:

Step by step procedure to obtain the regression equation using MINITAB software:

  • Choose Stat > Regression > Regression > Fit Regression Model.
  • Under Responses, enter the column of Wins.
  • Under Continuous predictors, enter the columns of BA, ERA, HR, and League.
  • Click OK.

Output using MINITAB software is given below:

[Image: MINITAB regression output with BA, ERA, HR and League as predictors (not reproduced)]

Now, it is known that if the p-value of an independent variable is less than the level of significance then there is significant relation between the dependent variable and that independent variable. Otherwise, there is no significant relationship.

Consider a level of significance of α=0.05. The p-value corresponding to “League” is 0.416, which is greater than 0.05. Hence, it can be concluded that there is no significant relationship between the dependent variable “Wins” and the independent variable “League”.

The correlation analysis also suggests that the independent variable with the lowest correlation with the dependent variable is a candidate for omission.

Hence, in this case one can omit “League” from the regression analysis. Moreover, it can be said that the number of wins is not affected by whether the team plays in the National or the American League.

Thus, for revised regression analysis the dependent variable is “Wins” (y) and the independent variables are “BA”(x1), “ERA”(x2) and “HR”(x3).

Step by step procedure to obtain the regression equation using MINITAB software:

  • Choose Stat > Regression > Regression > Fit Regression Model.
  • Under Responses, enter the column of Wins.
  • Under Continuous predictors, enter the columns of BA, ERA, and HR.
  • Choose Graphs.
  • Under Residual plots, select Histogram of residuals and Residuals versus fits.
  • Click OK.
  • Click OK.

Output using MINITAB software is given below:

[Images: MINITAB output for the revised regression with BA, ERA and HR as predictors (not reproduced)]

Hence, the revised regression equation is $\widehat{\text{Wins}} = 58.4 + 337\,\text{BA} - 19.42\,\text{ERA} + 0.0838\,\text{HR}$.

Holding the other variables constant, each increase of 0.001 (one point) in team batting average is associated with about 0.337 more wins, because the BA coefficient is 337 per full unit. Each additional team home run is associated with only about 0.0838 more wins. In addition, a decrease of 1 unit in team earned run average is associated with about 19.42 more wins.

The coefficient of determination ($R^2$) is 79.21%. Hence, it can be said that 79.21% of the variation in the number of wins can be explained by the team earned run average, the team batting average and the team home runs.
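
As a worked illustration of the fitted equation, the sketch below plugs a team's statistics into the revised model. The coefficients are the ones reported above; the function name `predict_wins` and the example team (BA 0.260, ERA 4.00, 160 home runs) are hypothetical and chosen only for the arithmetic.

```python
# Coefficients of the revised regression equation reported in this solution.
INTERCEPT = 58.4
B_BA = 337.0      # wins per full unit of batting average
B_ERA = -19.42    # wins per unit of earned run average
B_HR = 0.0838     # wins per home run

def predict_wins(ba, era, hr):
    """Predicted number of wins from the revised regression equation."""
    return INTERCEPT + B_BA * ba + B_ERA * era + B_HR * hr

# Hypothetical team, used only to illustrate the calculation.
base = predict_wins(ba=0.260, era=4.00, hr=160)
print(f"predicted wins: {base:.1f}")

# Effect of raising the batting average by one point (0.001), other inputs fixed.
one_point = predict_wins(ba=0.261, era=4.00, hr=160) - base
print(f"change for +0.001 BA: {one_point:.3f}")

# Effect of lowering ERA by one full unit, other inputs fixed.
era_drop = predict_wins(ba=0.260, era=3.00, hr=160) - base
print(f"change for -1 ERA: {era_drop:.2f}")
```

The one-point change in BA reproduces the 0.337 figure, which makes clear why a "one-unit" change in batting average is not a realistic scale for interpretation.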

c.

To determine

Perform a global test on the set of the independent variables and interpret.

c.


Explanation of Solution

Let y be the dependent variable and let the $x_i$'s be the independent variables, where $\beta_i$ is the population regression coefficient corresponding to $x_i$ for i=1,2,3.

State the hypotheses:

Null hypothesis:

H0:β1=β2=β3=0.

That is, the model is not significant.

Alternative hypothesis:

H1:At least one βi is not equal to 0.

That is, the model is significant.

In case of global test the F test statistic is defined as,

$F = \dfrac{SSR/k}{SSE/(n-k-1)}$, where SSR, SSE, n and k are the regression sum of squares, the error sum of squares, the sample size and the number of independent variables, respectively.

According to the output in Part (b), the value of the F statistic is 33.02, with 3 numerator degrees of freedom and 26 denominator degrees of freedom.

Consider, the level of significance is α=0.05.

Decision rule:

  • If p-value ≤ α, then reject the null hypothesis.
  • Otherwise, fail to reject the null hypothesis.

Conclusion:

Here, p-value corresponding to the global test is 0.

Hence, p-value(=0)<α(=0.05).

That is, the p-value is less than the level of significance.

Therefore, reject the null hypothesis.

Hence, it can be concluded that at least one of the regression coefficients differs from 0 at the 0.05 significance level.
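
The global test can also be checked by hand. The sketch below computes F from the sums of squares and recovers $R^2$ from F, using the identity $R^2 = kF / (kF + n - k - 1)$. The function names are hypothetical, n = 30 is inferred from the 26 denominator degrees of freedom with k = 3, and the critical value 2.98 is taken from a standard F table for (3, 26) degrees of freedom at the 0.05 level rather than from this solution.

```python
def f_statistic(ssr, sse, n, k):
    """Global F statistic: (SSR / k) / (SSE / (n - k - 1))."""
    return (ssr / k) / (sse / (n - k - 1))

def r_squared_from_f(f, n, k):
    """Recover R^2 from the global F statistic, using R^2 = SSR / (SSR + SSE)."""
    return (k * f) / (k * f + (n - k - 1))

# Values reported in this solution: F = 33.02 with 3 and 26 degrees of freedom.
F_OBSERVED = 33.02
K = 3
N = 30            # assumption: inferred from n - k - 1 = 26
F_CRITICAL = 2.98  # assumption: 5% critical value for (3, 26) df from an F table

r2 = r_squared_from_f(F_OBSERVED, N, K)
print(f"R^2 implied by F: {r2:.4f}")
print("reject H0" if F_OBSERVED > F_CRITICAL else "fail to reject H0")

# Hypothetical sums of squares, used only to exercise the formula.
example_f = f_statistic(ssr=300.0, sse=260.0, n=30, k=3)
print(f"example F: {example_f:.2f}")
```

The implied $R^2$ agrees with the 79.21% reported in Part (b), which is a useful consistency check on the degrees of freedom.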

d.

To determine

Perform a hypothesis test on individual variables.

Explain whether any of the independent variables will be deleted.

d.


Answer to Problem 34DA

There is no need to delete any independent variables.

Explanation of Solution

For independent variable x1:

Consider that β1 is the population regression coefficient of independent variable x1.

State the hypotheses:

Null hypothesis:

H0:β1=0.

That is, there is no significant relationship between y and x1.

Alternative hypothesis:

H1:β1≠0.

That is, there is significant relationship between y and x1.

In case of individual regression coefficient test the t test statistic is defined as,

$t = \dfrac{b_i - 0}{s_{b_i}}$, where $b_i$ is the ith regression coefficient and $s_{b_i}$ is its standard error (the standard deviation of its sampling distribution).

According to the output in Part (b), the t statistic corresponding to x1 is 2.88 with 26 degrees of freedom.

Conclusion:

Here, p-value corresponding to the “BA”(x1) is 0.008.

Hence, p-value(=0.008)<α(=0.05).

That is, the p-value is less than the level of significance.

Therefore, reject the null hypothesis.

Hence, it can be concluded that there is significant relationship between y and x1.

For independent variable x2:

Consider that β2 is the population regression coefficient of independent variable x2.

State the hypotheses:

Null hypothesis:

H0:β2=0.

That is, there is no significant relationship between y and x2.

Alternative hypothesis:

H1:β2≠0.

That is, there is significant relationship between y and x2.

According to the output in Part (b), the value of the t test statistic corresponding to x2 is –9.25 with 26 degrees of freedom.

Conclusion:

Here, p-value corresponding to the “ERA”(x2) is 0.

Hence, p-value(=0)<α(=0.05).

That is, the p-value is less than the level of significance.

Therefore, reject the null hypothesis.

Hence, it can be concluded that there is significant relationship between y and x2.

For independent variable x3:

Consider that β3 is the population regression coefficient of independent variable x3.

State the hypotheses:

Null hypothesis:

H0:β3=0.

That is, there is no significant relationship between y and x3.

Alternative hypothesis:

H1:β3≠0.

That is, there is significant relationship between y and x3.

According to the output in Part (b), the value of the t test statistic corresponding to x3 is 2.83 with 26 degrees of freedom.

Conclusion:

Here, p-value corresponding to the “HR”(x3) is 0.009.

Hence, p-value(=0.009)<α(=0.05).

That is, the p-value is less than the level of significance.

Therefore, reject the null hypothesis.

Hence, it can be concluded that there is significant relationship between y and x3.

As there is a significant relationship between the dependent variable and each of the independent variables, there is no need to delete any independent variable.
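
The three individual tests can be summarized with the equivalent critical-value approach. The t statistics below are the ones reported in this solution; the two-tailed critical value 2.056 for 26 degrees of freedom at α=0.05 is taken from a standard t table (an assumption, since the solution itself uses p-values), and the function name `reject_null` is hypothetical.

```python
# t statistics reported in this solution for the revised model (26 df).
T_STATS = {"BA": 2.88, "ERA": -9.25, "HR": 2.83}

# Assumption: two-tailed critical value for 26 df at alpha = 0.05, from a t table.
T_CRITICAL = 2.056

def reject_null(t_value, t_critical=T_CRITICAL):
    """Two-tailed test of H0: beta_i = 0; reject when |t| exceeds the critical value."""
    return abs(t_value) > t_critical

decisions = {name: reject_null(t) for name, t in T_STATS.items()}
for name, rejected in decisions.items():
    verdict = "reject H0 (keep variable)" if rejected else "fail to reject H0"
    print(f"{name}: t = {T_STATS[name]:6.2f} -> {verdict}")
```

Using the absolute value of t matters for ERA: its coefficient is negative, so the test statistic falls in the lower rejection region.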

e.

To determine

Draw a histogram or a stem-and-leaf display of the residuals from the final regression equation developed in Part (b).

Explain whether it is reasonable to conclude that the normality assumption has been met.

e.


Explanation of Solution

Histogram:

From Part (b), the histogram is obtained as,

[Image: histogram of the residuals (not reproduced)]

Assumption of normality from histogram:

  • The majority of the observations lie in the middle and are centered on the mean of 0.
  • There are lower frequencies in the tails of the distribution.

According to the given histogram, most of the observations are centered on the mean of 0 and there are lower frequencies in the tails of the distribution. The histogram can be considered roughly symmetric.

Hence, it is reasonable to conclude that the normality assumption has been met.

f.

To determine

Plot the residuals against the fitted values.

f.

Explanation of Solution

From Part (b), the residual plot is obtained as,

[Image: plot of the residuals versus the fitted values (not reproduced)]

Assumption for residual analysis for the regression model:

  • The plot of the residuals against the fitted values should fall roughly in a horizontal band that is symmetric about the x-axis.
  • For a normal probability plot, the residuals should be roughly linear.
  • There should not be any observable pattern.

According to the given residual plot, the points fall roughly in a horizontal band and are more or less symmetric about the x-axis. Moreover, there is no particular pattern in the residual plot; the points appear haphazard and random. The variability among the residuals is more or less the same throughout the whole plot.
