STATISTICAL TECHNIQUES-ACCESS ONLY
16th Edition
ISBN: 9780077639648
Author: Lind
Publisher: MCG
Chapter 14, Problem 20CE

a.

To determine

Make the correlation matrix.

Find the independent variable that has the strongest correlation with the dependent variable.

Explain whether a strong correlation between some of the independent variables indicates any problem.


Answer to Problem 20CE

The correlation matrix is obtained as,

[Figure: MINITAB correlation matrix output]

The independent variable that has the strongest correlation with the dependent variable is “Years of experience”.

Explanation of Solution

Multiple linear regression model:

A multiple linear regression model is given as $\hat{y} = a + b_1x_1 + b_2x_2 + b_3x_3 + \dots + b_kx_k$, where $\hat{y}$ is the predicted value of the response (dependent) variable $y$, and $x_1, x_2, \dots, x_k$ are the $k$ independent variables, with $k$ a positive integer.

Here, $a$ is the intercept of the regression model, that is, the predicted value of $y$ when all the $x_i$ are 0. Each $b_i$ is a slope, that is, the change in the predicted value of $y$ for a one-unit increase in $x_i$ when all other independent variables are held constant.

Dummy variable:

A dummy variable is a dichotomous variable in which one outcome is coded as 1 and the other as 0.

In the given problem, the dependent variable $y$ is the salary. The years of experience, the principal’s rating, and the master’s degree indicator are defined as $x_1$, $x_2$, and $x_3$, respectively.

The independent variable $x_3$ is defined as a dummy variable.

Hence,

$x_3 = \begin{cases} 1, & \text{master's degree} \\ 0, & \text{no master's degree} \end{cases}$

Step by step procedure to obtain the correlation matrix using MINITAB software is given below:

  • Choose Stat > Basic Statistics > Correlation.
  • Select Salary, Years, Rating and Masters under Variables.
  • Click OK.

The MINITAB output is obtained.

According to the output, the independent variable most strongly correlated with the dependent variable “Salary” is “Years”, with a correlation coefficient of 0.868.

This positive value implies that salary tends to increase as the years of experience increase.

Multicollinearity:

In a multiple regression model, multicollinearity occurs when two or more independent variables are highly correlated with each other.

The correlation between the independent variables “Rating” and “Masters” is 0.458, which is the largest correlation among the pairs of independent variables.

This value is only moderate (a common rule of thumb treats correlations between –0.70 and 0.70 among independent variables as unproblematic), so the correlations do not indicate a multicollinearity problem in the regression model.
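
The correlation screening described above can also be reproduced outside MINITAB. The following is a minimal Python sketch, not the textbook's procedure: the teacher data set is not reproduced in this solution, so the columns below are made-up toy values, and the names `pearson_r`, `correlation_matrix`, `flag_high_pairs` and the 0.70 cutoff (a common rule of thumb) are illustrative assumptions.

```python
import math
from itertools import combinations

def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(columns):
    """Return {name: {name: r}} for every pair of columns."""
    return {a: {b: pearson_r(columns[a], columns[b]) for b in columns}
            for a in columns}

def flag_high_pairs(matrix, predictors, cutoff=0.70):
    """List predictor pairs whose |r| exceeds the (assumed) cutoff."""
    return [(a, b) for a, b in combinations(predictors, 2)
            if abs(matrix[a][b]) > cutoff]

# Made-up toy values for illustration only; NOT the textbook data.
toy = {
    "y": [1.0, 2.0, 3.0, 4.0],
    "u": [2.0, 4.0, 6.0, 8.0],
    "v": [1.0, -1.0, 1.0, -1.0],
}
matrix = correlation_matrix(toy)
flagged = flag_high_pairs(matrix, ["u", "v"])
print(matrix["y"]["u"], matrix["y"]["v"], flagged)
```

Applied to the value reported in the solution, 0.458 for Rating and Masters, the same cutoff would flag no pair.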

b.

To determine

Find the regression equation.

Find the estimated salary for a teacher with 5 years’ experience, a rating by the principal of 60 and no master’s degree.


Answer to Problem 20CE

The regression equation is $\hat{y} = 19.92 + 0.8994x_1 + 0.1539x_2 - 0.67x_3$.

The estimated salary for a teacher with 5 years’ experience, a rating by the principal of 60 and no master’s degree is $33,651.

Explanation of Solution

Calculation:

Step by step procedure to obtain the regression equation using MINITAB software:

  • Choose Stat > Regression > Regression > Fit Regression Model.
  • Under Responses, enter the column of Salary.
  • Under Continuous predictors, enter the columns of Years, Rating and Masters.
  • Click OK.

Output using MINITAB software is given below:

[Figure: MINITAB regression output for Salary versus Years, Rating and Masters]

Thus, the regression equation is $\hat{y} = 19.92 + 0.8994x_1 + 0.1539x_2 - 0.67x_3$.

Now, substitute $x_1 = 5$, $x_2 = 60$ and $x_3 = 0$ in the obtained regression equation.

Hence,

$\hat{y} = 19.92 + 0.8994(5) + 0.1539(60) - 0.67(0) = 19.92 + 4.497 + 9.234 - 0 = 33.651$

Thus, with salary expressed in thousands of dollars, the estimated salary for a teacher with 5 years’ experience, a rating by the principal of 60 and no master’s degree is $33,651.
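
As a quick check of the substitution above, here is a minimal Python sketch. The coefficients are the ones reported in the regression equation; the function name `predict_salary` and the reading of salary in thousands of dollars are illustrative assumptions.

```python
def predict_salary(years, rating, masters):
    """Predicted salary (assumed to be in $ thousands) from the fitted equation."""
    return 19.92 + 0.8994 * years + 0.1539 * rating - 0.67 * masters

# Teacher with 5 years' experience, rating 60, no master's degree (dummy = 0).
estimate = predict_salary(years=5, rating=60, masters=0)
print(round(estimate, 3))  # 33.651
```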

c.

To determine

Perform a global hypothesis test to check whether any of the regression coefficients differ from zero at the 0.05 level of significance.


Answer to Problem 20CE

There is strong evidence that at least one of the regression coefficients differs from 0 at the 0.05 significance level.

Explanation of Solution

Calculation:

Consider that $y$ is the dependent variable, the $x_i$ are the independent variables, and the $\beta_i$ are the corresponding population regression coefficients, for $i = 1, 2, 3$.

State the hypotheses:

Null hypothesis:

$H_0: \beta_1 = \beta_2 = \beta_3 = 0$.

That is, the model is not significant.

Alternative hypothesis:

$H_1$: At least one $\beta_i$ is not equal to 0.

That is, the model is significant.

For the global test, the F test statistic is defined as

$F = \dfrac{SSR/k}{SSE/(n-k-1)}$, where SSR, SSE, $n$ and $k$ are the regression sum of squares, the error sum of squares, the sample size and the number of independent variables, respectively.

According to the output in Part (b), the value of the F statistic is 52.72, with 3 numerator degrees of freedom and 16 denominator degrees of freedom.

The level of significance is α=0.05.

Decision rule:

  • If p-value $\le \alpha$, then reject the null hypothesis.
  • Otherwise, fail to reject the null hypothesis.
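
The decision rule can be written as a single comparison. This is a minimal Python sketch; the function name `decide` is hypothetical, and the p-value passed in is simply the value read from the software output.

```python
def decide(p_value, alpha=0.05):
    """Apply the p-value decision rule for a hypothesis test."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"

# Global test: the p-value is reported as 0 in the output.
print(decide(0.0))  # reject H0
```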

Conclusion:

Here, the p-value corresponding to the global test is 0.

Hence, p-value $(= 0) < \alpha\ (= 0.05)$.

That is, the p-value is less than the level of significance.

Therefore, reject the null hypothesis.

Hence, it can be concluded that at least one of the regression coefficients differs from 0 at the 0.05 significance level.

d.

To determine

Perform individual tests of each independent variable at the 0.05 significance level.

Explain whether any of the independent variables should be eliminated.


Answer to Problem 20CE

There is no significant relationship between $y$ and $x_3$, whereas there is a significant relationship between $y$ and each of $x_1$ and $x_2$.

The independent variable “Master’s degree” can be eliminated.

Explanation of Solution

Calculation:

For independent variable $x_1$:

Consider that $\beta_1$ is the population regression coefficient of the independent variable $x_1$.

State the hypotheses:

Null hypothesis:

$H_0: \beta_1 = 0$.

That is, there is no significant relationship between $y$ and $x_1$.

Alternative hypothesis:

$H_1: \beta_1 \ne 0$.

That is, there is a significant relationship between $y$ and $x_1$.

For the test of an individual regression coefficient, the t test statistic is defined as

$t = \dfrac{b_i}{s_{b_i}}$, where $b_i$ is the $i$th estimated regression coefficient and $s_{b_i}$ is the standard error (estimated standard deviation) of $b_i$.

According to the output in Part (b), the t statistic corresponding to $x_1$ is 10.26, with 16 degrees of freedom.

Conclusion:

Here, the p-value corresponding to “Years of experience” ($x_1$) is 0.

Hence, p-value $(= 0) < \alpha\ (= 0.05)$.

That is, the p-value is less than the level of significance.

Therefore, reject the null hypothesis.

Hence, it can be concluded that there is a significant relationship between $y$ and $x_1$.

For independent variable $x_2$:

Consider that $\beta_2$ is the population regression coefficient of the independent variable $x_2$.

State the hypotheses:

Null hypothesis:

$H_0: \beta_2 = 0$.

That is, there is no significant relationship between $y$ and $x_2$.

Alternative hypothesis:

$H_1: \beta_2 \ne 0$.

That is, there is a significant relationship between $y$ and $x_2$.

According to the output in Part (b), the value of the t test statistic corresponding to $x_2$ is 4.9, with 16 degrees of freedom.

Conclusion:

Here, the p-value corresponding to “Principal’s rating” ($x_2$) is 0.

Hence, p-value $(= 0) < \alpha\ (= 0.05)$.

That is, the p-value is less than the level of significance.

Therefore, reject the null hypothesis.

Hence, it can be concluded that there is a significant relationship between $y$ and $x_2$.

For independent variable $x_3$:

Consider that $\beta_3$ is the population regression coefficient of the independent variable $x_3$.

State the hypotheses:

Null hypothesis:

$H_0: \beta_3 = 0$.

That is, there is no significant relationship between $y$ and $x_3$.

Alternative hypothesis:

$H_1: \beta_3 \ne 0$.

That is, there is a significant relationship between $y$ and $x_3$.

According to the output in Part (b), the value of the t test statistic corresponding to $x_3$ is –0.59, with 16 degrees of freedom.

Conclusion:

Here, the p-value corresponding to “Master’s degree” ($x_3$) is 0.59.

Hence, p-value $(= 0.59) > \alpha\ (= 0.05)$.

That is, the p-value is greater than the level of significance.

Therefore, fail to reject the null hypothesis.

Hence, it can be concluded that there is no significant relationship between $y$ and $x_3$.

As there is no significant relationship between the dependent variable and the independent variable $x_3$, it is better to eliminate this variable.

In other words, there is no significant relationship between the salary and the master’s degree, so the independent variable “Master’s degree” can be omitted from the model.
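
The three individual tests can also be summarized with the equivalent critical-value comparison. This is a sketch under stated assumptions: the t statistics are the ones quoted above, 2.120 is the tabulated two-tailed critical value of t for 16 degrees of freedom at the 0.05 level, and the names `T_STATS` and `split_by_significance` are hypothetical.

```python
T_CRITICAL = 2.120  # two-tailed t critical value, df = 16, alpha = 0.05 (t table)

# t statistics quoted in the solution for the three-predictor model.
T_STATS = {"Years": 10.26, "Rating": 4.9, "Masters": -0.59}

def split_by_significance(t_stats, critical):
    """Separate predictors into significant and non-significant lists."""
    keep = [name for name, t in t_stats.items() if abs(t) > critical]
    drop = [name for name, t in t_stats.items() if abs(t) <= critical]
    return keep, drop

keep, drop = split_by_significance(T_STATS, T_CRITICAL)
print(keep, drop)  # ['Years', 'Rating'] ['Masters']
```

The comparison reaches the same decisions as the p-value approach: Years and Rating are retained and Masters is the candidate for removal.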

e.

To determine

Perform the regression analysis after omitting the insignificant independent variable.


Explanation of Solution

Calculation:

The regression analysis is performed after omitting the independent variable Master’s degree.

Step by step procedure to obtain the regression equation using MINITAB software:

  • Choose Stat > Regression > Regression > Fit Regression Model.
  • Under Responses, enter the column of Salary.
  • Under Continuous predictors, enter the columns of Years and Rating.
  • Choose Graphs.
  • Under Residual plots, select Histogram of residuals and Residuals versus fits.
  • Click OK.
  • Choose Storage.
  • Under Regression storage, select Residuals.
  • Click OK.
  • Click OK.

Output using MINITAB software is given below:

[Figure: MINITAB regression output for Salary versus Years and Rating]

Residuals
–1.2802
2.726697
–0.0664
0.711748
–2.0206
0.676788
2.725551
2.531216
0.795113
–4.26789
0.300234
–0.41968
–1.14257
0.284365
–2.38135
–2.67229
1.368873
–2.06188
–0.81143
5.00372

[Figure: histogram of residuals]

[Figure: plot of residuals versus fitted values]

Thus, the regression equation is $\hat{y} = 20.12 + 0.8926x_1 + 0.1464x_2$.

Global test:

State the hypotheses:

Null hypothesis:

$H_0: \beta_1 = \beta_2 = 0$.

That is, the model is not significant.

Alternative hypothesis:

$H_1$: At least one $\beta_i$ is not equal to 0.

That is, the model is significant.

For the global test, the F test statistic is defined as

$F = \dfrac{SSR/k}{SSE/(n-k-1)}$, where SSR, SSE, $n$ and $k$ are the regression sum of squares, the error sum of squares, the sample size and the number of independent variables, respectively.

According to the output, the value of the F statistic is 82.31, with 2 numerator degrees of freedom and 17 denominator degrees of freedom.
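
Part of this F calculation can be checked from the residuals listed above. The sketch below is illustrative only: it computes SSE and the degrees of freedom from the 20 residuals and, because SSR is not quoted in the text, it backs out the SSR implied by the reported F rather than reading it from the ANOVA table. The function name `f_statistic` is hypothetical.

```python
# Residuals of the two-predictor model, copied from the list above.
RESIDUALS = [
    -1.2802, 2.726697, -0.0664, 0.711748, -2.0206,
    0.676788, 2.725551, 2.531216, 0.795113, -4.26789,
    0.300234, -0.41968, -1.14257, 0.284365, -2.38135,
    -2.67229, 1.368873, -2.06188, -0.81143, 5.00372,
]

def f_statistic(ssr, sse, n, k):
    """Global F statistic: (SSR / k) / (SSE / (n - k - 1))."""
    return (ssr / k) / (sse / (n - k - 1))

n = len(RESIDUALS)              # sample size (one residual per observation)
k = 2                           # predictors kept: Years and Rating
df_num, df_den = k, n - k - 1   # numerator and denominator degrees of freedom
sse = sum(r * r for r in RESIDUALS)
mse = sse / df_den

# SSR is not quoted in the text; this is the value implied by the reported F.
REPORTED_F = 82.31
implied_ssr = REPORTED_F * k * mse

print(df_num, df_den)  # 2 17
print(round(sse, 2), round(mse, 3))
print(round(f_statistic(implied_ssr, sse, n, k), 2))
```

The degrees of freedom agree with the 2 and 17 stated above; with the actual ANOVA table at hand, SSR would be passed to `f_statistic` directly.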

Conclusion:

Here, the p-value corresponding to the global test is 0.

Hence, p-value $(= 0) < \alpha\ (= 0.05)$.

That is, the p-value is less than the level of significance.

Therefore, reject the null hypothesis.

Hence, it can be concluded that at least one of the regression coefficients differs from 0 at the 0.05 significance level.

Individual test:

For independent variable $x_1$:

Null hypothesis:

$H_0: \beta_1 = 0$.

That is, there is no significant relationship between $y$ and $x_1$.

Alternative hypothesis:

$H_1: \beta_1 \ne 0$.

That is, there is a significant relationship between $y$ and $x_1$.

According to the output, the t statistic corresponding to $x_1$ is 10.5, with 17 degrees of freedom.

Conclusion:

Here, the p-value corresponding to “Years of experience” ($x_1$) is 0.

Hence, p-value $(= 0) < \alpha\ (= 0.05)$.

That is, the p-value is less than the level of significance.

Therefore, reject the null hypothesis.

Hence, it can be concluded that there is a significant relationship between $y$ and $x_1$.

For independent variable $x_2$:

Null hypothesis:

$H_0: \beta_2 = 0$.

That is, there is no significant relationship between $y$ and $x_2$.

Alternative hypothesis:

$H_1: \beta_2 \ne 0$.

That is, there is a significant relationship between $y$ and $x_2$.

According to the output, the value of the t test statistic corresponding to $x_2$ is 5.28, with 17 degrees of freedom.

Conclusion:

Here, the p-value corresponding to “Principal’s rating” ($x_2$) is 0.

Hence, p-value $(= 0) < \alpha\ (= 0.05)$.

That is, the p-value is less than the level of significance.

Therefore, reject the null hypothesis.

Hence, it can be concluded that there is a significant relationship between $y$ and $x_2$.

f.

To determine

Find the residuals for the equation of Part (e).

Find a stem-and-leaf plot or a histogram to verify whether the distribution of the residuals is approximately normal.


Explanation of Solution

Calculation:

From Part (e), the residuals are obtained as,

Residuals
–1.2802
2.726697
–0.0664
0.711748
–2.0206
0.676788
2.725551
2.531216
0.795113
–4.26789
0.300234
–0.41968
–1.14257
0.284365
–2.38135
–2.67229
1.368873
–2.06188
–0.81143
5.00372

Histogram:

From Part (e), the histogram is obtained as,

[Figure: histogram of residuals]

Assessing normality from the histogram:

  • The majority of the observations should lie in the middle, centered on a mean of 0.
  • There should be lower frequencies in the tails of the distribution.

According to the histogram, most of the observations are centered on the mean of 0, there are lower frequencies in the tails, and the distribution is roughly symmetric.

Hence, the normality assumption appears to be reasonably satisfied.
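
Since the residuals are listed in full, the histogram counts can be tallied directly. The following minimal Python sketch is illustrative: the bin width of 2 and the range from -6 to 6 are assumptions chosen for this example and need not match the bins MINITAB used, and the function name `bin_counts` is hypothetical.

```python
import math

# Residuals copied from the list in Part (e).
RESIDUALS = [
    -1.2802, 2.726697, -0.0664, 0.711748, -2.0206,
    0.676788, 2.725551, 2.531216, 0.795113, -4.26789,
    0.300234, -0.41968, -1.14257, 0.284365, -2.38135,
    -2.67229, 1.368873, -2.06188, -0.81143, 5.00372,
]

def bin_counts(values, low, width, n_bins):
    """Count values falling in each half-open bin [low + i*width, low + (i+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        index = math.floor((v - low) / width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts

LOW, WIDTH, N_BINS = -6.0, 2.0, 6   # assumed bins: [-6,-4), [-4,-2), ..., [4,6)
counts = bin_counts(RESIDUALS, LOW, WIDTH, N_BINS)

# Print a simple text histogram, one row of asterisks per bin.
for i, c in enumerate(counts):
    left = LOW + i * WIDTH
    print(f"[{left:5.1f}, {left + WIDTH:5.1f}) {'*' * c}")

mean_residual = sum(RESIDUALS) / len(RESIDUALS)
print(round(mean_residual, 4))
```

With these bins the counts are 1, 4, 5, 6, 3 and 1: they peak around 0 and thin out in both tails, which is consistent with the description of the histogram, and the mean of the residuals is essentially 0.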

g.

To determine

Plot the residual plot.

Explain whether the plot reveals any violations of assumptions of regression.


Explanation of Solution

From Part (e), the residual plot is obtained as,

[Figure: plot of residuals versus fitted values]

Assumptions checked in the residual analysis of the regression model:

  • The plot of the residuals versus the fitted values should fall roughly in a horizontal band that is symmetric about the horizontal line at 0.
  • In a normal probability plot, the residuals should fall roughly along a straight line.
  • There should not be any observable pattern.

According to the residual plot, the points lie roughly in a horizontal band and are more or less symmetric about the horizontal line at 0. Moreover, there is no particular pattern in the residual plot; the points appear haphazard and random.

Hence, the residual plot does not reveal any violation of the regression assumptions.
