Statistical Techniques in Business and Economics
16th Edition
ISBN: 9780077639723
Author: Lind
Publisher: Mcgraw-Hill Course Content Delivery

Textbook Question
Chapter 14, Problem 18CE

Suppose that the sales manager of a large automotive parts distributor wants to estimate the total annual sales for each of the company’s regions. Five factors appear to be related to regional sales: the number of retail outlets in the region, the number of automobiles in the region registered as of April 1, the total personal income recorded in the first quarter of the year, the average age of the automobiles (years), and the number of sales supervisors in the region. The data for each region were gathered for last year. For example, see the following table. In region 1 there were 1,739 retail outlets stocking the company’s automotive parts, there were 9,270,000 registered automobiles in the region as of April 1, and so on. The region’s sales for that year were $37,702,000.

[Table: regional data for last year; image not reproduced]

  a. Consider the following correlation matrix. Which single variable has the strongest correlation with the dependent variable? The correlations between the independent variables outlets and income and between outlets and number of automobiles are fairly strong. Could this be a problem? What is this condition called?

[Correlation matrix; image not reproduced]

  b. The output for all five variables is shown below. What percent of the variation is explained by the regression equation?

[Regression output for all five independent variables; image not reproduced]

  c. Conduct a global test of hypothesis to determine whether any of the regression coefficients are not zero. Use the .05 significance level.
  d. Conduct a test of hypothesis on each of the independent variables. Would you consider eliminating “outlets” and “bosses”? Use the .05 significance level.
  e. The regression has been rerun below with “outlets” and “bosses” eliminated. Compute the coefficient of determination. How much has $R^2$ changed from the previous analysis?

[Regression output with “outlets” and “bosses” eliminated; image not reproduced]

  f. Following is a histogram of the residuals. Does the normality assumption appear reasonable? Why?

[Histogram of the residuals; image not reproduced]

  g. Following is a plot of the fitted values of y (i.e., $\hat{y}$) and the residuals. What do you observe? Do you see any violations of the assumptions?

[Plot of the residuals against the fitted values; image not reproduced]

a.

Expert Solution
To determine

Find the single variable that has the strongest correlation with the dependent variable.

Explain whether the fairly strong correlations between outlets and income and between outlets and number of automobiles could be a problem.

Provide the name of the condition.

Answer to Problem 18CE

The single variable that has the strongest correlation with the dependent variable is “income”.

The name of the condition is multicollinearity.

Explanation of Solution

Multiple linear regression model:

A multiple linear regression model is given as $\hat{y} = a + b_1 x_1 + b_2 x_2 + b_3 x_3 + \dots + b_k x_k$, where y is the response or dependent variable and $x_1, x_2, \dots, x_k$ are the k quantitative independent variables, with k a positive integer.

Here, a is the intercept of the regression model, that is, the predicted value of y when all the x's are 0, and the $b_i$'s are the slopes, that is, the change in the predicted value of y for a one-unit increase in $x_i$ when all other independent variables are held constant.

In the given problem, the dependent variable y is the annual sales. The number of retail outlets, the number of automobiles registered, personal income, the average age of automobiles, and the number of supervisors are denoted $x_1, x_2, x_3, x_4$, and $x_5$, respectively.

Correlation:

The correlation between two variables measures the strength and direction of the linear relationship between those two variables.

According to the given output, the independent variable “income” has the strongest correlation with the dependent variable “sales”. The correlation coefficient between “income” and “sales” is 0.964.

Thus, as personal income increases, annual sales also tend to increase.

Multicollinearity:

In a multiple regression model, multicollinearity occurs when two or more independent variables are highly correlated.

The correlations between the independent variables outlets and income and between outlets and number of automobiles are fairly strong: 0.825 and 0.775, respectively.

These correlations can cause multicollinearity in the regression model.

Because of multicollinearity, the standard errors of the coefficients are inflated and the partial regression coefficients cannot be estimated precisely. Moreover, it becomes difficult to assess the relative importance of the independent variables.
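One common way to quantify this inflation, not used in the solution above, is the variance inflation factor, VIF = 1 / (1 − R_j²). The sketch below is only a rough illustration: it assumes a model with just two predictors, so that R_j² reduces to the squared pairwise correlation. In the full five-variable model the VIF for “outlets” would come from regressing it on all the other predictors and could be larger. The function name is hypothetical.

```python
# Pairwise correlations taken from the correlation matrix in the problem.
r_outlets_income = 0.825
r_outlets_automobiles = 0.775


def vif_from_pairwise_r(r: float) -> float:
    """VIF = 1 / (1 - R_j^2).

    Assumption: only one other predictor is in the model, so R_j^2 = r^2.
    """
    return 1.0 / (1.0 - r ** 2)


vif_income = vif_from_pairwise_r(r_outlets_income)
vif_automobiles = vif_from_pairwise_r(r_outlets_automobiles)

print(f"VIF implied by r = {r_outlets_income}: {vif_income:.2f}")
print(f"VIF implied by r = {r_outlets_automobiles}: {vif_automobiles:.2f}")
```

A frequently quoted rule of thumb treats a VIF above 10 as a sign of serious multicollinearity; under the two-predictor assumption neither pair reaches that level on its own.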

b.

Expert Solution
To determine

Find the percent of the variation that is explained by the regression equation.

Answer to Problem 18CE

The approximate value of coefficient of multiple determination is 99.43%, that is, 99.43% of the variation is explained by the regression equation.

Explanation of Solution

Calculation:

From the ANOVA table, the coefficient of multiple determination is defined as

$$R^2 = \frac{SSR}{SS\ total},$$

where SSR is the regression sum of squares and SS total is the total sum of squares.

According to the output, SSR and SS total are 1,593.81 and 1,602.89, respectively.

Hence, the coefficient of multiple determination is

$$R^2 = \frac{1{,}593.81}{1{,}602.89} = 0.9943.$$

Thus, the approximate value of coefficient of multiple determination is 99.43%.

Hence, 99.43% of the variation is explained by the regression equation.
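The arithmetic can be checked with a few lines of Python. This is a minimal sketch that uses only the two sums of squares quoted from the output; the variable names are illustrative.

```python
# Sums of squares quoted from the regression output for all five variables.
ssr = 1593.81       # regression sum of squares
ss_total = 1602.89  # total sum of squares

# Coefficient of multiple determination: share of variation explained.
r_squared = ssr / ss_total

print(f"R^2 = {r_squared:.4f} ({r_squared:.2%})")
```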

c.

Expert Solution
To determine

Perform a global hypothesis test to check whether any of the regression coefficients is not zero at 0.05 significance level.

Answer to Problem 18CE

There is strong evidence that at least one of the regression coefficients is not 0 at the 0.05 significance level.

Explanation of Solution

Calculation:

Consider that y is the dependent variable and the $x_i$'s are the independent variables, where the $\beta_i$'s are the corresponding population regression coefficients for i = 1, 2, 3, 4, 5.

State the hypotheses:

Null hypothesis:

$H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = \beta_5 = 0$.

That is, the model is not significant.

Alternative hypothesis:

$H_1$: At least one $\beta_i$ is not equal to 0.

That is, the model is significant.

For the global test, the F test statistic is defined as

$$F = \frac{SSR/k}{SSE/(n-k-1)},$$

where SSR, SSE, n, and k are the regression sum of squares, the error sum of squares, the sample size, and the number of independent variables, respectively.

According to the output, the value of the F statistic is 140.36 with 5 numerator degrees of freedom and 4 denominator degrees of freedom.

The level of significance is α=0.05.

Decision rule:

  • If p-value ≤ α, then reject the null hypothesis.
  • Otherwise, fail to reject the null hypothesis.

Conclusion:

Here, the p-value corresponding to the global test is reported as 0.

Hence, p-value (= 0) < α (= 0.05).

That is, the p-value is less than the level of significance.

Therefore, reject the null hypothesis.

Hence, it can be concluded that at least one of the regression coefficients is not 0 at the 0.05 significance level.
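As a cross-check, the F statistic can be recomputed from the sums of squares quoted in part b. The sketch below rests on two assumptions that are not stated in the output itself: n = 10 regions (inferred from the 4 degrees of freedom used in the individual t tests, since n − k − 1 = 4 with k = 5), and a critical value of 6.26 read from a standard F table for α = 0.05 with 5 and 4 degrees of freedom.

```python
# Sums of squares quoted from the regression output (part b).
ssr = 1593.81
ss_total = 1602.89

k = 5   # number of independent variables
n = 10  # assumption: inferred from the 4 error degrees of freedom (n - k - 1)

sse = ss_total - ssr     # error (residual) sum of squares
msr = ssr / k            # mean square regression
mse = sse / (n - k - 1)  # mean square error
f_stat = msr / mse

# Assumption: critical value taken from a standard F table,
# alpha = 0.05, numerator df = 5, denominator df = 4.
f_critical = 6.26
reject_h0 = f_stat > f_critical

print(f"F = {f_stat:.2f}, critical value = {f_critical}, reject H0: {reject_h0}")
```

The recomputed value is close to, but not exactly, the 140.36 in the output; the small gap presumably reflects rounding of the printed sums of squares. Either way the statistic is far above the critical value, so the decision is the same as the one reached with the p-value.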

d.

Expert Solution
To determine

Perform individual tests of each independent variable at 0.05 significance level.

Explain whether the independent variables “outlets” and “bosses” will be eliminated.

Answer to Problem 18CE

There is no significant relationship between y and $x_1$, $x_4$, and $x_5$, whereas there is a significant relationship between y and $x_2$ and $x_3$.

The independent variables “the number of retail outlets”, “average age of automobiles”, and “number of supervisors” can be eliminated.

Explanation of Solution

Calculation:

For independent variable x1:

Consider that β1 is the population regression coefficient of independent variable x1.

State the hypotheses:

Null hypothesis:

$H_0: \beta_1 = 0$.

That is, there is no significant relationship between y and $x_1$.

Alternative hypothesis:

$H_1: \beta_1 \neq 0$.

That is, there is significant relationship between y and x1.

For the test of an individual regression coefficient, the t test statistic is defined as

$$t = \frac{b_i}{s_{b_i}},$$

where $b_i$ and $s_{b_i}$ are the ith regression coefficient and its standard error (the estimated standard deviation of the ith regression coefficient).

According to the given output, the t statistic corresponding to $x_1$ is -0.24 with 4 degrees of freedom.

The level of significance is α = 0.05.

Decision rule:

  • If p-value ≤ α, then reject the null hypothesis.
  • Otherwise, fail to reject the null hypothesis.

Conclusion:

Here, the p-value corresponding to outlets ($x_1$) is 0.823.

Hence, p-value(=0.823)>α(=0.05).

That is, the p-value is greater than the level of significance.

Therefore, fail to reject the null hypothesis.

Hence, it can be concluded that there is no significant relationship between y and x1.

For independent variable x2:

Consider that β2 is the population regression coefficient of independent variable x2.

State the hypotheses:

Null hypothesis:

$H_0: \beta_2 = 0$.

That is, there is no significant relationship between y and $x_2$.

Alternative hypothesis:

$H_1: \beta_2 \neq 0$.

That is, there is a significant relationship between y and $x_2$.

According to the given regression output, the value of the t test statistic corresponding to $x_2$ is 3.15 with 4 degrees of freedom.

Conclusion:

Here, p-value corresponding to the automobiles (x2) is 0.035.

Hence, p-value(=0.035)<α(=0.05).

That is, the p-value is less than the level of significance.

Therefore, reject the null hypothesis.

Hence, it can be concluded that there is significant relationship between y and x2.

For independent variable x3:

Consider that β3 is the population regression coefficient of independent variable x3.

State the hypotheses:

Null hypothesis:

$H_0: \beta_3 = 0$.

That is, there is no significant relationship between y and $x_3$.

Alternative hypothesis:

$H_1: \beta_3 \neq 0$.

That is, there is a significant relationship between y and $x_3$.

According to the given regression output, the value of the t test statistic corresponding to $x_3$ is 9.35 with 4 degrees of freedom.

Conclusion:

Here, p-value corresponding to the income (x3) is 0.001.

Hence, p-value(=0.001)<α(=0.05).

That is, the p-value is less than the level of significance.

Therefore, reject the null hypothesis.

Hence, it can be concluded that there is significant relationship between y and x3.

For independent variable x4:

Consider that β4 is the population regression coefficient of independent variable x4.

State the hypotheses:

Null hypothesis:

$H_0: \beta_4 = 0$.

That is, there is no significant relationship between y and $x_4$.

Alternative hypothesis:

$H_1: \beta_4 \neq 0$.

That is, there is a significant relationship between y and $x_4$.

According to the given regression output, the value of the t test statistic corresponding to $x_4$ is 2.32 with 4 degrees of freedom.

Conclusion:

Here, p-value corresponding to the age (x4) is 0.081.

Hence, p-value(=0.081)>α(=0.05).

That is, the p-value is greater than the level of significance.

Therefore, fail to reject the null hypothesis.

Hence, it can be concluded that there is no significant relationship between y and x4.

For independent variable x5:

Consider that $\beta_5$ is the population regression coefficient of independent variable $x_5$.

State the hypotheses:

Null hypothesis:

$H_0: \beta_5 = 0$.

That is, there is no significant relationship between y and $x_5$.

Alternative hypothesis:

$H_1: \beta_5 \neq 0$.

That is, there is a significant relationship between y and $x_5$.

According to the given regression output, the t test statistic corresponding to $x_5$ is about 0.18 in absolute value (consistent with the reported p-value of 0.864) with 4 degrees of freedom.

Conclusion:

Here, p-value corresponding to the bosses (x5) is 0.864.

Hence, p-value(=0.864)>α(=0.05).

That is, the p-value is greater than the level of significance.

Therefore, fail to reject the null hypothesis.

Hence, it can be concluded that there is no significant relationship between y and x5.

As there is no significant relationship between the dependent variable and the independent variables $x_1$ and $x_5$, it is better to eliminate these variables.

Hence, there is no significant relationship between the annual sales and either the number of retail outlets or the number of supervisors. Thus, it is better to omit the independent variables “the number of retail outlets” and “the number of supervisors”.

Moreover, as there is no significant relationship between the dependent variable and the independent variable $x_4$ at the 0.05 level, this variable can also be considered for elimination.

Hence, there is no significant relationship between the annual sales and the average age of automobiles. Thus, the independent variable “average age of automobiles” may be omitted as well.
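The five individual decisions above all apply the same rule to the p-values from the output, so they can be summarized in a short loop. This is a minimal sketch; the dictionary keys are just labels for the five variables.

```python
alpha = 0.05

# p-values for the individual t tests, as quoted from the regression output.
p_values = {
    "outlets": 0.823,      # x1
    "automobiles": 0.035,  # x2
    "income": 0.001,       # x3
    "age": 0.081,          # x4
    "bosses": 0.864,       # x5
}

# Decision rule: reject H0 (coefficient = 0) when the p-value is at most alpha.
significant = [name for name, p in p_values.items() if p <= alpha]
not_significant = [name for name, p in p_values.items() if p > alpha]

print("Significant at 0.05:", significant)
print("Candidates for elimination:", not_significant)
```

Note that “age” (p-value 0.081) misses the 0.05 cutoff by a much smaller margin than “outlets” and “bosses”, which is in line with the problem rerunning the regression with only those two variables removed.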

e.

Expert Solution
To determine

Find the coefficient of determination.

Find the change of R2 from the previous analysis.

Answer to Problem 18CE

The approximate value of the coefficient of multiple determination is 99.42%, that is, 99.42% of the variation is explained by the reduced regression equation. $R^2$ has changed by only about 0.01 percentage points from the previous analysis.

Explanation of Solution

Calculation:

From the ANOVA table, the coefficient of multiple determination is defined as

$$R^2 = \frac{SSR}{SS\ total},$$

where SSR is the regression sum of squares and SS total is the total sum of squares.

According to the output after eliminating “outlets” and “bosses”, SSR and SS total are 1,593.66 and 1,602.89, respectively.

Hence, the coefficient of multiple determination is

$$R^2 = \frac{1{,}593.66}{1{,}602.89} = 0.9942.$$

Thus, the approximate value of the coefficient of multiple determination is 99.42%.

Hence, $R^2$ has changed by only 0.01 percentage points (= 99.43% − 99.42%) from the previous analysis.
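The comparison of the two models can be sketched as follows, using the regression sums of squares quoted from the two outputs (1,593.81 for all five variables, 1,593.66 after dropping “outlets” and “bosses”). The variable names are illustrative.

```python
ss_total = 1602.89     # total sum of squares (same data, so unchanged)
ssr_full = 1593.81     # SSR with all five independent variables
ssr_reduced = 1593.66  # SSR after eliminating "outlets" and "bosses"

r2_full = ssr_full / ss_total
r2_reduced = ssr_reduced / ss_total

# Difference expressed in percentage points.
change_points = (r2_full - r2_reduced) * 100

print(f"Full model R^2:    {r2_full:.4f}")
print(f"Reduced model R^2: {r2_reduced:.4f}")
print(f"Change: {change_points:.2f} percentage points")
```

Dropping two of the five variables costs almost none of the explained variation, which supports leaving them out of the model.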

f.

Expert Solution
To determine

Explain whether the normality assumption appears reasonable.

Explanation of Solution

Assumption of normality from a histogram:

  • The majority of the observations are in the middle, centered on the mean of 0.
  • There are lower frequencies in the tails of the distribution.

According to the given histogram, most of the residuals are centered on the mean of 0, and the frequencies are lower in the tails of the distribution.

Hence, the normality assumption appears reasonable.

g.

Expert Solution
To determine

Describe the residual plot and explain whether any assumptions are violated.

Explanation of Solution

Assumptions checked in a residual analysis of the regression model:

  • The plot of the residuals vs. the fitted values should fall roughly in a horizontal band, symmetric about the horizontal axis.
  • In a normal probability plot, the residuals should fall roughly along a straight line.
  • There should not be any observable pattern.

According to the given residual plot, the points fall roughly in a horizontal band and are more or less symmetric about the horizontal axis. Moreover, there is no particular pattern in the residual plot; the scatter appears haphazard and random.

Hence, the regression assumptions do not appear to be violated.
