
Question

Answer 1

Textbook Question

Chapter 13, Problem 38CE

Note: Exercises marked * are based on optional material.

Instructions for Data Sets: Choose one of the data sets A–K below or as assigned by your instructor. Only the first three and last three observations are shown for each data set. In each data set, the dependent variable (response) is the first variable. Choose the independent variables (predictors) as you judge appropriate. Use a spreadsheet or a statistical package (e.g., MegaStat or Minitab) to perform the necessary regression calculations and to obtain the required graphs. Write a concise report answering questions 13.25 through 13.41 (or a subset of these questions assigned by your instructor). Label sections of your report to correspond to the questions. Insert tables and graphs in your report as appropriate. You may work with a partner if your instructor allows it.

If you did not already do so, request leverage statistics. Are any observations influential? Explain.

Expert Solution & Answer

To determine

Find leverage statistics.

Identify whether any of the observations are influential.

Answer to Problem 38CE

The leverage statistics are:

Observation	Sales/SqFt	Predicted	Residual	Leverage
1	702	505.5378	196.4622	0.0659
2	210	388.0933	–178.093	0.0818
3	365	419.7986	–54.7986	0.0789
4	443	419.2017	23.79828	0.0458
5	399	336.1339	62.86605	0.0798
6	265	524.3932	–259.393	0.0816
7	572	365.8614	206.1386	0.0655
8	642	491.9392	150.0608	0.0935
9	461	422.9892	38.0108	0.0225
10	639	458.365	180.635	0.0703
11	484	502.2794	–18.2794	0.0715
12	581	466.1341	114.8659	0.0478
13	268	432.9745	–164.974	0.0586
14	573	497.5596	75.44042	0.2737
15	586	525.2306	60.76944	0.1168
16	369	398.5007	–29.5007	0.0584
17	351	498.6047	–147.605	0.0985
18	458	429.3871	28.61286	0.0535
19	987	614.5091	372.4909	0.1820
20	357	454.3592	–97.3592	0.0429
21	406	417.2942	–11.2942	0.0250
22	681	391.1612	289.8388	0.0493
23	368	492.4983	–124.498	0.2093
24	304	460.1672	–156.167	0.0604
25	394	415.2689	–21.2689	0.0913
26	562	486.68	75.31997	0.0580
27	495	423.7816	71.21836	0.0942
28	310	388.496	–78.496	0.1363
29	373	422.8679	–49.8679	0.0227
30	236	345.16	–109.16	0.1516
31	413	406.2904	6.709589	0.1565
32	625	543.4075	81.59252	0.1197
33	274	397.4102	–123.41	0.0526
34	543	558.9323	–15.9323	0.1372
35	179	297.105	–118.105	0.0794
36	375	361.7308	13.26922	0.0837
37	329	433.9038	–104.904	0.0659
38	297	430.0182	–133.018	0.0682
39	323	455.7566	–132.757	0.0800
40	469	404.899	64.101	0.0291
41	353	497.4495	–144.449	0.0837
42	380	491.0586	–111.059	0.0696
43	398	408.7628	–10.7628	0.0353
44	312	318.6083	–6.60827	0.0574
45	452	432.4409	19.55915	0.0731
46	699	362.4679	336.5321	0.0617
47	367	347.5704	19.42961	0.0801
48	432	380.8856	51.11438	0.0736
49	367	355.4863	11.51368	0.0922
50	401	381.559	19.44102	0.0432
51	414	481.2256	–67.2256	0.0375
52	481	428.1006	52.89939	0.0183
53	538	415.7548	122.2452	0.0271
54	330	359.279	–29.279	0.0356
55	250	438.5112	–188.511	0.0532
56	292	396.9591	–104.959	0.0582
57	517	411.7635	105.2365	0.0231
58	552	470.1005	81.89945	0.0275
59	387	361.7699	25.23009	0.0832
60	427	408.3022	18.69777	0.0631
61	454	497.6884	–43.6884	0.0887
62	512	441.1052	70.89483	0.0793
63	345	375.7731	–30.7731	0.1071
64	234	334.17	–100.17	0.0622
65	348	333.4539	14.54613	0.1051
66	348	458.6665	–110.666	0.1285
67	295	315.655	–20.655	0.1077
68	361	376.5859	–15.5859	0.0450
69	468	232.9942	235.0058	0.2319
70	404	393.7594	10.24059	0.1052
71	246	373.6202	–127.62	0.1022
72	340	403.9505	–63.9505	0.1144
73	401	413.2786	–12.2786	0.0619
74	327	316.5622	10.43785	0.1045

Observations 14, 19, 23, and 69 have high leverage: each leverage value exceeds the threshold 2(k+1)/n = 0.1622.

The influential observation is 23.

Explanation of Solution

Calculation

The given data set, ‘Noodles & Company Sales, Seating, and Demographic data’, contains n=74 observations. The response variable is ‘annual sales per square foot’, and there are k=5 predictor variables: ‘Interior Seat Count, Patio Seat Count, Median HH Income, Median Age of Population, % with Bachelor's Degree’. The level of significance is 0.05.

Software procedure:

The step-by-step procedure to obtain the regression output using MegaStat software is:

• Choose MegaStat > Correlation/Regression > Regression Analysis.
• Select Input ranges, and enter the variable range for ‘Seats-Inside, Seats-Patio, MedIncome, MedAge, BachDeg%’ as the column of X, Independent variable(s).
• Enter the variable range for ‘Sales/SqFt’ as the column of Y, Dependent variable.
• In Options > Residuals, choose Diagnostics and influential residuals.
• Click OK.

Output using MegaStat software is given below:

[Image not reproduced: MegaStat regression output, Chapter 13, Problem 38CE, output 1]

Influential observation:

An influential observation is one whose removal from the data set has a large effect on the estimated parameters of the regression.

Influential observations can be identified with the help of the leverage values. An observation has high leverage if its leverage statistic is greater than 2(k+1)/n, where k denotes the number of predictors and n denotes the number of observations. For each high-leverage observation, remove it from the data set and redo the regression analysis; if the regression statistics change significantly, the observation is considered influential.
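The check-then-refit idea can be sketched on a toy example. The data below are hypothetical (not the Noodles & Company data), and the sketch uses a single predictor (k=1) so that the textbook closed form for leverage in simple regression, h_i = 1/n + (x_i − x̄)²/Sxx, applies; the function names are illustrative only.

```python
# Hypothetical data: one predictor (k = 1), six observations.
# The last x value lies far from the others, so it should have high leverage.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 20.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 15.0]


def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1*x; returns (b0, b1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


def leverages(x):
    """Leverage in simple regression: h_i = 1/n + (x_i - x_bar)^2 / Sxx."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]


k = 1
n = len(xs)
threshold = 2 * (k + 1) / n  # same rule as in the text: 2(k+1)/n

h = leverages(xs)
# Observation numbers (1-based) whose leverage exceeds the threshold.
high = [i + 1 for i, hi in enumerate(h) if hi > threshold]

# Refit without each high-leverage observation and record the new slope.
_, slope_full = fit_line(xs, ys)
slope_without = {}
for obs in high:
    x_red = xs[: obs - 1] + xs[obs:]
    y_red = ys[: obs - 1] + ys[obs:]
    _, slope_without[obs] = fit_line(x_red, y_red)

print("high-leverage observations:", high)
print("slope with all data:", round(slope_full, 3))
for obs, slope in slope_without.items():
    print(f"slope without observation {obs}:", round(slope, 3))
```

In this toy case the slope changes substantially once the flagged point is dropped, which is the kind of change the procedure looks for. With five predictors, as in this problem, the leverages come from the software output rather than from this one-predictor formula.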

Substitute n=74, k=5 in the formula:

2(k+1)/n = 2(5+1)/74 = 12/74 = 0.1622

The leverage statistics greater than 0.1622 are 0.2737 (observation 14), 0.1820 (observation 19), 0.2093 (observation 23), and 0.2319 (observation 69).

Observations 14, 19, 23, and 69 therefore have high leverage.
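As a cross-check, the threshold and the flagged observations can be reproduced with a few lines of Python. This is only a sketch: the leverage values are copied by hand from the Leverage column of the table above, and the variable names are illustrative.

```python
# Leverage values copied from the table above, in observation order 1..74.
leverage = [
    0.0659, 0.0818, 0.0789, 0.0458, 0.0798, 0.0816, 0.0655, 0.0935, 0.0225, 0.0703,
    0.0715, 0.0478, 0.0586, 0.2737, 0.1168, 0.0584, 0.0985, 0.0535, 0.1820, 0.0429,
    0.0250, 0.0493, 0.2093, 0.0604, 0.0913, 0.0580, 0.0942, 0.1363, 0.0227, 0.1516,
    0.1565, 0.1197, 0.0526, 0.1372, 0.0794, 0.0837, 0.0659, 0.0682, 0.0800, 0.0291,
    0.0837, 0.0696, 0.0353, 0.0574, 0.0731, 0.0617, 0.0801, 0.0736, 0.0922, 0.0432,
    0.0375, 0.0183, 0.0271, 0.0356, 0.0532, 0.0582, 0.0231, 0.0275, 0.0832, 0.0631,
    0.0887, 0.0793, 0.1071, 0.0622, 0.1051, 0.1285, 0.1077, 0.0450, 0.2319, 0.1052,
    0.1022, 0.1144, 0.0619, 0.1045,
]

k = 5               # number of predictors
n = len(leverage)   # number of observations (74)

threshold = 2 * (k + 1) / n

# Observation numbers are 1-based, so add 1 to the list index.
high_leverage = [i + 1 for i, h in enumerate(leverage) if h > threshold]

print(f"threshold = {threshold:.4f}")
print("high-leverage observations:", high_leverage)
```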

Regression conclusion including all observations:

Let β1 be the parameter for the predictor seats-inside, β2 for seats-patio, β3 for median income, β4 for median age of population, and β5 for % with Bachelor's Degree.

The p-value for predictor seats-inside is 0.0733.

The p-value for predictor seats-patio is 0.2350.

The p-value for predictor MedIncome is 0.0589.

The p-value for predictor MedAge is 0.9972.

The p-value for predictor BachDeg% is 0.0015.

Null hypothesis:

H0: βj = 0, j = 1, 2, 3, 4, 5

The predictor variable j is not related to annual sales.

Alternative hypothesis:

H1: βj ≠ 0

The predictor variable j is related to annual sales.

Rejection rules:

• If p-value is less than the level of significance then the null hypothesis is rejected. The predictor is significant.
• If p-value is greater than the level of significance then the null hypothesis is not rejected. The predictor is not significant.
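The rejection rule can be applied mechanically to the five p-values reported above. The short Python sketch below does only that; the dictionary keys reuse the MegaStat column labels, and the variable names are illustrative.

```python
alpha = 0.05  # level of significance used in this solution

# p-values reported for the regression on all 74 observations.
p_values = {
    "Seats-Inside": 0.0733,
    "Seats-Patio": 0.2350,
    "MedIncome": 0.0589,
    "MedAge": 0.9972,
    "BachDeg%": 0.0015,
}

# Reject H0: beta_j = 0 when the p-value is below alpha.
decisions = {
    name: "reject H0" if p < alpha else "do not reject H0"
    for name, p in p_values.items()
}
significant = [name for name, p in p_values.items() if p < alpha]

for name, decision in decisions.items():
    print(f"{name}: p = {p_values[name]:.4f} -> {decision}")
print("significant predictors:", significant)
```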

Conclusion for seats-inside:

The p-value for predictor seats-inside is 0.0733.

The level of significance is 0.05.

The p-value is greater than the level of significance.

That is, p-value(=0.0733)>α(=0.05).

The null hypothesis is not rejected.

There is not sufficient evidence that the predictor variable seats-inside is related to annual sales.

The predictor seats-inside is not significant.

Conclusion for seats-patio:

The p-value for predictor seats-patio is 0.2350.

The level of significance is 0.05.

The p-value is greater than the level of significance.

That is, p-value(=0.2350)>α(=0.05).

The null hypothesis is not rejected.

There is not sufficient evidence that the predictor variable seats-patio is related to annual sales.

The predictor seats-patio is not significant.

Conclusion for median income:

The p-value for predictor median income is 0.0589.

The level of significance is 0.05.

The p-value is greater than the level of significance.

That is, p-value(=0.0589)>α(=0.05).

The null hypothesis is not rejected.

There is not sufficient evidence that the predictor variable median income is related to annual sales.

The predictor median income is not significant.

Conclusion for median age:

The p-value for predictor median age of population is 0.9972.

The level of significance is 0.05.

The p-value is greater than the level of significance.

That is, p-value(=0.9972)>α(=0.05).

The null hypothesis is not rejected.

There is not sufficient evidence that the predictor variable median age of population is related to annual sales.

The predictor median age of population is not significant.

Conclusion for ‘% with Bachelor's Degree’:

The p-value for predictor ‘% with Bachelor's Degree’ is 0.0015.

The level of significance is 0.05.

The p-value is less than the level of significance.

That is, p-value(=0.0015)<α(=0.05).

The null hypothesis is rejected.

The predictor variable ‘% with Bachelor's Degree’ is related to annual sales.

The predictor ‘% with Bachelor's Degree’ is significant.

The p-value for ‘% with Bachelor's Degree’ indicates predictor significance at α=0.05. All the other predictor variables are not significant.

Regression analysis by removing the observation 14:

Software procedure:

With observation 14 excluded from the data, the step-by-step procedure to obtain the regression equation using MegaStat software is:

• Choose MegaStat > Correlation/Regression > Regression Analysis.
• Select Input ranges, and enter the variable range for ‘Seats-Inside, Seats-Patio, MedIncome, MedAge, BachDeg%’ as the column of X, Independent variable(s).
• Enter the variable range for ‘Sales/SqFt’ as the column of Y, Dependent variable.
• Click OK.

Output using MegaStat software is given below:

[Image not reproduced: MegaStat regression output, Chapter 13, Problem 38CE, output 2]

It is clear that the predictor variable ‘BachDeg%’, with p-value 0.0015, is significant at α=0.05 after removing observation 14. All the remaining predictors are not significant. Hence removing observation 14 did not change the significance of the predictors.

Regression analysis by removing the observation 19:

Software procedure:

With observation 19 excluded from the data, the step-by-step procedure to obtain the regression equation using MegaStat software is:

• Choose MegaStat > Correlation/Regression > Regression Analysis.
• Select Input ranges, and enter the variable range for ‘Seats-Inside, Seats-Patio, MedIncome, MedAge, BachDeg%’ as the column of X, Independent variable(s).
• Enter the variable range for ‘Sales/SqFt’ as the column of Y, Dependent variable.
• Click OK.

Output using MegaStat software is given below:

[Image not reproduced: MegaStat regression output, Chapter 13, Problem 38CE, output 3]

It is clear that the predictor variable ‘BachDeg%’, with p-value 0.0016, is significant at α=0.05 after removing observation 19. All the remaining predictors are not significant. Hence removing observation 19 did not change the significance of the predictors.

Regression analysis by removing the observation 23:

Software procedure:

With observation 23 excluded from the data, the step-by-step procedure to obtain the regression equation using MegaStat software is:

• Choose MegaStat > Correlation/Regression > Regression Analysis.
• Select Input ranges, and enter the variable range for ‘Seats-Inside, Seats-Patio, MedIncome, MedAge, BachDeg%’ as the column of X, Independent variable(s).
• Enter the variable range for ‘Sales/SqFt’ as the column of Y, Dependent variable.
• Click OK.

Output using MegaStat software is given below:

[Image not reproduced: MegaStat regression output, Chapter 13, Problem 38CE, output 4]

It is clear that the predictor variables ‘MedIncome’ (p-value 0.0496) and ‘BachDeg%’ (p-value 0.0016) are significant at α=0.05 after removing observation 23, because each p-value is less than the level of significance. All the remaining predictors are not significant. Hence removing observation 23 has changed the significance of the predictor ‘MedIncome’.

Regression analysis by removing the observation 69:

Software procedure:

With observation 69 excluded from the data, the step-by-step procedure to obtain the regression equation using MegaStat software is:

• Choose MegaStat > Correlation/Regression > Regression Analysis.
• Select Input ranges, and enter the variable range for ‘Seats-Inside, Seats-Patio, MedIncome, MedAge, BachDeg%’ as the column of X, Independent variable(s).
• Enter the variable range for ‘Sales/SqFt’ as the column of Y, Dependent variable.
• Click OK.

Output using MegaStat software is given below:

[Image not reproduced: MegaStat regression output, Chapter 13, Problem 38CE, output 5]

It is clear that the predictor variable ‘BachDeg%’, with p-value 0.0017, is significant at α=0.05 after removing observation 69. All the remaining predictors are not significant. Hence removing observation 69 did not change the significance of the predictors.

The set of significant predictors changes only when observation 23 is removed from the data set: the p-value for MedIncome falls from 0.0589 to 0.0496, which crosses α=0.05. Hence, the influential observation is 23.
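The comparison across the five fits can be summarized in a short Python sketch. It encodes only which predictors the solution reports as significant at α=0.05 in each fit (it does not refit anything), and the names used are illustrative.

```python
# Predictors reported as significant at alpha = 0.05 in each fit.
# Key None is the regression on all 74 observations; the other keys are
# the observation removed before refitting.
significant_by_fit = {
    None: {"BachDeg%"},
    14: {"BachDeg%"},
    19: {"BachDeg%"},
    23: {"MedIncome", "BachDeg%"},
    69: {"BachDeg%"},
}

baseline = significant_by_fit[None]

# Under the criterion used in this solution, an observation is influential
# if removing it changes which predictors are significant.
influential = sorted(
    obs
    for obs, significant in significant_by_fit.items()
    if obs is not None and significant != baseline
)

print("influential observations:", influential)
```

Note that this criterion depends on the chosen α: the MedIncome p-value moves only from 0.0589 to 0.0496, so a different significance level could lead to a different conclusion.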

Answer 2

Textbook Question

Chapter 13, Problem 38CE

Note: Exercises marked * are based on optional material.

Instructions for Data Sets: Choose one of the data sets A–K below or as assigned by your instructor. Only the first three and last three observations are shown for each data set. In each data set, the dependent variable (response) is the first variable. Choose the independent variables (predictors) as you judge appropriate. Use a spreadsheet or a statistical package (e.g., MegaStat or Minitab) to perform the necessary regression calculations and to obtain the required graphs. Write a concise report answering questions 13.25 through 13.41 (or a subset of these questions assigned by your instructor). Label sections of your report to correspond to the questions. Insert tables and graphs in your report as appropriate. You may work with a partner if your instructor allows it.

If you did not already do so, request leverage statistics. Are any observations influential? Explain.

Expert Solution & Answer

To determine

Find leverage statistics.

Identify any of the observations are influential.

Answer to Problem 38CE

The leverage statistics are,

Observation	Sales/SqFt	Predicted	Residual	Leverage
1	702	505.5378	196.4622	0.0659
2	210	388.0933	–178.093	0.0818
3	365	419.7986	–54.7986	0.0789
4	443	419.2017	23.79828	0.0458
5	399	336.1339	62.86605	0.0798
6	265	524.3932	–259.393	0.0816
7	572	365.8614	206.1386	0.0655
8	642	491.9392	150.0608	0.0935
9	461	422.9892	38.0108	0.0225
10	639	458.365	180.635	0.0703
11	484	502.2794	–18.2794	0.0715
12	581	466.1341	114.8659	0.0478
13	268	432.9745	–164.974	0.0586
14	573	497.5596	75.44042	0.2737
15	586	525.2306	60.76944	0.1168
16	369	398.5007	–29.5007	0.0584
17	351	498.6047	–147.605	0.0985
18	458	429.3871	28.61286	0.0535
19	987	614.5091	372.4909	0.1820
20	357	454.3592	–97.3592	0.0429
21	406	417.2942	–11.2942	0.0250
22	681	391.1612	289.8388	0.0493
23	368	492.4983	–124.498	0.2093
24	304	460.1672	–156.167	0.0604
25	394	415.2689	–21.2689	0.0913
26	562	486.68	75.31997	0.0580
27	495	423.7816	71.21836	0.0942
28	310	388.496	–78.496	0.1363
29	373	422.8679	–49.8679	0.0227
30	236	345.16	–109.16	0.1516
31	413	406.2904	6.709589	0.1565
32	625	543.4075	81.59252	0.1197
33	274	397.4102	–123.41	0.0526
34	543	558.9323	–15.9323	0.1372
35	179	297.105	–118.105	0.0794
36	375	361.7308	13.26922	0.0837
37	329	433.9038	–104.904	0.0659
38	297	430.0182	–133.018	0.0682
39	323	455.7566	–132.757	0.0800
40	469	404.899	64.101	0.0291
41	353	497.4495	–144.449	0.0837
42	380	491.0586	–111.059	0.0696
43	398	408.7628	–10.7628	0.0353
44	312	318.6083	–6.60827	0.0574
45	452	432.4409	19.55915	0.0731
46	699	362.4679	336.5321	0.0617
47	367	347.5704	19.42961	0.0801
48	432	380.8856	51.11438	0.0736
49	367	355.4863	11.51368	0.0922
50	401	381.559	19.44102	0.0432
51	414	481.2256	–67.2256	0.0375
52	481	428.1006	52.89939	0.0183
53	538	415.7548	122.2452	0.0271
54	330	359.279	–29.279	0.0356
55	250	438.5112	–188.511	0.0532
56	292	396.9591	–104.959	0.0582
57	517	411.7635	105.2365	0.0231
58	552	470.1005	81.89945	0.0275
59	387	361.7699	25.23009	0.0832
60	427	408.3022	18.69777	0.0631
61	454	497.6884	–43.6884	0.0887
62	512	441.1052	70.89483	0.0793
63	345	375.7731	–30.7731	0.1071
64	234	334.17	–100.17	0.0622
65	348	333.4539	14.54613	0.1051
66	348	458.6665	–110.666	0.1285
67	295	315.655	–20.655	0.1077
68	361	376.5859	–15.5859	0.0450
69	468	232.9942	235.0058	0.2319
70	404	393.7594	10.24059	0.1052
71	246	373.6202	–127.62	0.1022
72	340	403.9505	–63.9505	0.1144
73	401	413.2786	–12.2786	0.0619
74	327	316.5622	10.43785	0.1045

The observations 14, 19, 23 and 69 are considered to have higher leverage values.

The influential observation is 23.

Explanation of Solution

Calculation

The given information is that, the dataset of ‘Noodles & Company Sales, Seating, and Demographic data’ contains n=74 observations. The response variable is ‘annual sales per square foot’, there are k=5 predictor variables ‘Interior Seat Count, Patio Seat Count, Median HH Income, Median Age of Population, % with Bachelor's Degree’. The considered level of significance is 0.05.

Software procedure:

Step by step procedure to obtain regression output using MegaStat software is given as,

• Choose MegaStat >Correlation/Regression>Regression Analysis.
• SelectInput ranges, enter the variable range for ‘Seats-Inside, Seats-Patio, MedIncome, MedAge, BachDeg%’ as the column of X, Independent variable(s)
• Enter the variable range for ‘Sales/SqFt’ as the column of Y, Dependent variable.
• In Options> Residuals chooseDiagnostics and influential residuals.
• Click OK.

Output using MegaStatsoftware is given below:

Loose-leaf For Applied Statistics In Business And Economics, Chapter 13, Problem 38CE , additional homework tip 1

Influential observation:

The influential observation has a great effect on the parameters of the regression line when it is removed from the data set.

The influential observations can be identified using the leverage values. If the observation have the high leverage value, that is any leverage statistic is greater than value of 2(k+1)n, k denotes the number of predictors and n denotes the number of observations, then remove the observation from the data set redo the regression analysis, if the regression statistic changes significantly then the observation is considered as influential observation.

Substitute, n=74,k=5 in the formula,

2(k+1)n=2(5+1)74=1274=0.1622

The leverage statistics greater than 0.1622 are, 0.274 corresponding to observation 14, 0.182 corresponding to observation 19, 0.209 corresponding to observation 23 and 0.232 corresponding to observation 69

The observations 14, 19, 23 and 69 are considered to have higher leverage values.

Regression conclusion including all observations:

Let β1 is the parameter for the predictor seats-inside, β2 is the parameter for the predictor seats-Patio, β3 is the parameter for the predictor median income, β4 is the parameter for the predictor median age of population, Let β5 is the parameter for the predictor % with Bachelor's Degree.

The p-value for predictor seats-inside is 0.0733.

The p-value for predictor seats-patio is 0.2350.

The p-value for predictor MedIncome is 0.0589.

The p-value for predictor MedAge is 0.9972.

The p-value for predictor BachDeg% is 0.0015.

Null hypothesis:

H0:βj=0 ; j=1,2,3,4,5

The predictor variable j is not related to annual sales.

Alternative hypothesis:

H1:βj≠0

The predictor variable j is related to annual sales.

Rejection rules:

• If p-value is less than the level of significance then the null hypothesis is rejected. The predictor is significant.
• If p-value is greater than the level of significance then the null hypothesis is not rejected. The predictor is not significant.

Conclusion for seats-inside:

The p-value for predictor seats-inside is 0.0733.

The level of significance is 0.05.

The p-value is greater than the level of significance.

That is, p-value(=0.0733)>α(=0.05).

The null hypothesis is not rejected.

The predictor variable seats-inside is not related to annual sales.

The predictor seats-inside is not significant.

Conclusion for seats-patio:

The p-value for predictor seats-patio is 0.2350.

The level of significance is 0.05.

The p-value is greater than the level of significance.

That is, p-value(=0.2350)>α(=0.05).

The null hypothesis is not rejected.

The predictor variable seats-patio is not related to annual sales.

The predictor seats-patio is not significant.

Conclusion for median income:

The p-value for predictor median income is 0.0589.

The level of significance is 0.05.

The p-value is greater than the level of significance.

That is, p-value(=0.0589)>α(=0.05).

The null hypothesis is not rejected.

The predictor variable median income is not related to annual sales.

The predictor median income is not significant.

Conclusion for median age:

The p-value for predictor median age of population is 0.9972.

The level of significance is 0.05.

The p-value is greater than the level of significance.

That is, p-value(=0.9972)>α(=0.05).

The null hypothesis is not rejected.

The predictor variable median age of population is not related to annual sales.

The predictor median age of population is not significant.

Conclusion for ‘% with Bachelor's Degree’:

The p-value for predictor ‘% with Bachelor's Degree’ is 0.0015.

The level of significance is 0.05.

The p-value is less than the level of significance.

That is, p-value(=0.0015)<α(=0.05).

The null hypothesis is rejected.

The predictor variable ‘% with Bachelor's Degree’ is related to annual sales.

The predictor ‘% with Bachelor's Degree’of population is significant.

The p-value for ‘% with Bachelor's Degree’ indicates predictor significance at α=0.05. All the other predictor variables are not significant.

Regression analysis by removing the observation 14:

Software procedure:

Step by step procedure to obtain regression equation using MegaStat software is given as,

• Choose MegaStat >Correlation/Regression>Regression Analysis.
• SelectInput ranges, enter the variable range for ‘Seats-Inside, Seats-Patio, MedIncome, MedAge, BachDeg%’ as the column of X, Independent variable(s)
• Enter the variable range for ‘Sales/SqFt’ as the column of Y, Dependent variable.
• Click OK.

Output using MegaStatsoftware is given below:

Loose-leaf For Applied Statistics In Business And Economics, Chapter 13, Problem 38CE , additional homework tip 2

It is clear that the predictor variable ‘BachDeg%’ with p-value 0.0015 is significant at α=0.05 by removing observation 13. All the remaining predictors are not significant. Hence removing 14 did not change the significance of the predictors.

Regression analysis by removing the observation 19:

Software procedure:

Step by step procedure to obtain regression equation using MegaStat software is given as,

• Choose MegaStat >Correlation/Regression>Regression Analysis.
• SelectInput ranges, enter the variable range for ‘Seats-Inside, Seats-Patio, MedIncome, MedAge, BachDeg%’ as the column of X, Independent variable(s)
• Enter the variable range for ‘Sales/SqFt’ as the column of Y, Dependent variable.
• Click OK.

Output using MegaStatsoftware is given below:

Loose-leaf For Applied Statistics In Business And Economics, Chapter 13, Problem 38CE , additional homework tip 3

It is clear that the predictor variable ‘BachDeg%’ with p-value 0.0016 is significant at α=0.05 by removing observation 19. All the remaining predictors are not significant. Hence removing 19 did not change the significance of the predictors.

Regression analysis by removing the observation 23:

Software procedure:

Step by step procedure to obtain regression equation using MegaStat software is given as,

• Choose MegaStat >Correlation/Regression>Regression Analysis.
• SelectInput ranges, enter the variable range for ‘Seats-Inside, Seats-Patio, MedIncome, MedAge, BachDeg%’ as the column of X, Independent variable(s)
• Enter the variable range for ‘Sales/SqFt’ as the column of Y, Dependent variable.
• Click OK.

Output using MegaStatsoftware is given below:

Loose-leaf For Applied Statistics In Business And Economics, Chapter 13, Problem 38CE , additional homework tip 4

It is clear that the predictor variables ‘MedIncome’ with p-value 0.0496 and ‘BachDeg%’ with p-value 0.0016 are significant at α=0.05 because p-value is less than level of significance, by removing observation 23. All the remaining predictors are not significant. Hence removing 23has changed the significance for predictor ‘Median income’.

Regression analysis by removing the observation 69:

Software procedure:

Step by step procedure to obtain regression equation using MegaStat software is given as,

• Choose MegaStat >Correlation/Regression>Regression Analysis.
• SelectInput ranges, enter the variable range for ‘Seats-Inside, Seats-Patio, MedIncome, MedAge, BachDeg%’ as the column of X, Independent variable(s)
• Enter the variable range for ‘Sales/SqFt’ as the column of Y, Dependent variable.
• Click OK.

Output using MegaStatsoftware is given below:

Loose-leaf For Applied Statistics In Business And Economics, Chapter 13, Problem 38CE , additional homework tip 5

It is clear that the predictor variable ‘BachDeg%’ with p-value 0.0017 is significant at α=0.05 by removing observation 69. All the remaining predictors are not significant. Hence removing 69 did not change the significance of the predictors.

The significance for the regression statistics has changed when the observation 23 is removed from the data set. Hence, the influential observation is 23.

Want to see more full solutions like this?

Subscribe now to access step-by-step solutions to millions of textbook problems written by subject matter experts!

Answer 3

Textbook Question

Chapter 13, Problem 38CE

Note: Exercises marked * are based on optional material.

Instructions for Data Sets: Choose one of the data sets A–K below or as assigned by your instructor. Only the first three and last three observations are shown for each data set. In each data set, the dependent variable (response) is the first variable. Choose the independent variables (predictors) as you judge appropriate. Use a spreadsheet or a statistical package (e.g., MegaStat or Minitab) to perform the necessary regression calculations and to obtain the required graphs. Write a concise report answering questions 13.25 through 13.41 (or a subset of these questions assigned by your instructor). Label sections of your report to correspond to the questions. Insert tables and graphs in your report as appropriate. You may work with a partner if your instructor allows it.

If you did not already do so, request leverage statistics. Are any observations influential? Explain.

Expert Solution & Answer

To determine

Find leverage statistics.

Identify any of the observations are influential.

Answer to Problem 38CE

The leverage statistics are,

Observation	Sales/SqFt	Predicted	Residual	Leverage
1	702	505.5378	196.4622	0.0659
2	210	388.0933	–178.093	0.0818
3	365	419.7986	–54.7986	0.0789
4	443	419.2017	23.79828	0.0458
5	399	336.1339	62.86605	0.0798
6	265	524.3932	–259.393	0.0816
7	572	365.8614	206.1386	0.0655
8	642	491.9392	150.0608	0.0935
9	461	422.9892	38.0108	0.0225
10	639	458.365	180.635	0.0703
11	484	502.2794	–18.2794	0.0715
12	581	466.1341	114.8659	0.0478
13	268	432.9745	–164.974	0.0586
14	573	497.5596	75.44042	0.2737
15	586	525.2306	60.76944	0.1168
16	369	398.5007	–29.5007	0.0584
17	351	498.6047	–147.605	0.0985
18	458	429.3871	28.61286	0.0535
19	987	614.5091	372.4909	0.1820
20	357	454.3592	–97.3592	0.0429
21	406	417.2942	–11.2942	0.0250
22	681	391.1612	289.8388	0.0493
23	368	492.4983	–124.498	0.2093
24	304	460.1672	–156.167	0.0604
25	394	415.2689	–21.2689	0.0913
26	562	486.68	75.31997	0.0580
27	495	423.7816	71.21836	0.0942
28	310	388.496	–78.496	0.1363
29	373	422.8679	–49.8679	0.0227
30	236	345.16	–109.16	0.1516
31	413	406.2904	6.709589	0.1565
32	625	543.4075	81.59252	0.1197
33	274	397.4102	–123.41	0.0526
34	543	558.9323	–15.9323	0.1372
35	179	297.105	–118.105	0.0794
36	375	361.7308	13.26922	0.0837
37	329	433.9038	–104.904	0.0659
38	297	430.0182	–133.018	0.0682
39	323	455.7566	–132.757	0.0800
40	469	404.899	64.101	0.0291
41	353	497.4495	–144.449	0.0837
42	380	491.0586	–111.059	0.0696
43	398	408.7628	–10.7628	0.0353
44	312	318.6083	–6.60827	0.0574
45	452	432.4409	19.55915	0.0731
46	699	362.4679	336.5321	0.0617
47	367	347.5704	19.42961	0.0801
48	432	380.8856	51.11438	0.0736
49	367	355.4863	11.51368	0.0922
50	401	381.559	19.44102	0.0432
51	414	481.2256	–67.2256	0.0375
52	481	428.1006	52.89939	0.0183
53	538	415.7548	122.2452	0.0271
54	330	359.279	–29.279	0.0356
55	250	438.5112	–188.511	0.0532
56	292	396.9591	–104.959	0.0582
57	517	411.7635	105.2365	0.0231
58	552	470.1005	81.89945	0.0275
59	387	361.7699	25.23009	0.0832
60	427	408.3022	18.69777	0.0631
61	454	497.6884	–43.6884	0.0887
62	512	441.1052	70.89483	0.0793
63	345	375.7731	–30.7731	0.1071
64	234	334.17	–100.17	0.0622
65	348	333.4539	14.54613	0.1051
66	348	458.6665	–110.666	0.1285
67	295	315.655	–20.655	0.1077
68	361	376.5859	–15.5859	0.0450
69	468	232.9942	235.0058	0.2319
70	404	393.7594	10.24059	0.1052
71	246	373.6202	–127.62	0.1022
72	340	403.9505	–63.9505	0.1144
73	401	413.2786	–12.2786	0.0619
74	327	316.5622	10.43785	0.1045

The observations 14, 19, 23 and 69 are considered to have higher leverage values.

The influential observation is 23.

Explanation of Solution

Calculation

The given information is that, the dataset of ‘Noodles & Company Sales, Seating, and Demographic data’ contains n=74 observations. The response variable is ‘annual sales per square foot’, there are k=5 predictor variables ‘Interior Seat Count, Patio Seat Count, Median HH Income, Median Age of Population, % with Bachelor's Degree’. The considered level of significance is 0.05.

Software procedure:

The step-by-step procedure to obtain the regression output using MegaStat is:

• Choose MegaStat > Correlation/Regression > Regression Analysis.
• Select Input ranges and enter the variable range for ‘Seats-Inside, Seats-Patio, MedIncome, MedAge, BachDeg%’ as X, Independent variable(s).
• Enter the variable range for ‘Sales/SqFt’ as Y, Dependent variable.
• In Options > Residuals, choose Diagnostics and influential residuals.
• Click OK.

The MegaStat output is given below:

[Image: MegaStat output, Loose-leaf for Applied Statistics in Business and Economics, Chapter 13, Problem 38CE, additional homework tip 1]

Influential observation:

An influential observation is one whose removal from the data set substantially changes the estimated parameters of the regression.

Influential observations can be identified starting from the leverage values. An observation has high leverage if its leverage statistic exceeds 2(k+1)/n, where k is the number of predictors and n is the number of observations. For each high-leverage observation, remove it from the data set and rerun the regression analysis; if the regression results change materially, the observation is considered influential.
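As a rough illustration of what a leverage statistic measures, the sketch below computes leverage by hand for a simple regression with one predictor, using the standard formula h_i = 1/n + (x_i − x̄)² / S_xx. The x values are hypothetical and are not taken from the Noodles & Company data; MegaStat reports the corresponding statistic for the five-predictor model directly.

```python
# Hypothetical predictor values (NOT from the data set in this problem).
# The last point lies far from the others in x, so it should have high leverage.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 20.0]
n = len(x)
k = 1  # one predictor in this simplified illustration

x_bar = sum(x) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)

# Leverage for simple regression: h_i = 1/n + (x_i - x_bar)^2 / S_xx
leverage = [1 / n + (xi - x_bar) ** 2 / s_xx for xi in x]

# Rule of thumb used in this solution: flag h_i > 2(k+1)/n
threshold = 2 * (k + 1) / n
high_leverage = [i + 1 for i, h in enumerate(leverage) if h > threshold]  # 1-based

for i, h in enumerate(leverage, start=1):
    print(f"observation {i}: leverage = {h:.4f}")
print("threshold:", round(threshold, 4))
print("high leverage:", high_leverage)
```

The leverages always sum to k + 1, which is why the cutoff 2(k+1)/n amounts to "twice the average leverage".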

Substituting n = 74 and k = 5 into the formula gives

2(k+1)/n = 2(5+1)/74 = 12/74 ≈ 0.1622

The leverage statistics greater than 0.1622 are 0.274 (observation 14), 0.182 (observation 19), 0.209 (observation 23), and 0.232 (observation 69).

Therefore, observations 14, 19, 23, and 69 have high leverage.
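The threshold check above can be reproduced in a few lines. This is a minimal sketch that hard-codes a handful of leverage values copied from the table (the four flagged observations plus a few unflagged ones for contrast); the variable names are illustrative only.

```python
n = 74  # observations
k = 5   # predictors

# High-leverage cutoff used in this solution: 2(k+1)/n
threshold = 2 * (k + 1) / n

# A subset of the leverage column from the table above (observation -> leverage).
leverage = {
    14: 0.2737,
    19: 0.1820,
    23: 0.2093,
    31: 0.1565,
    66: 0.1285,
    69: 0.2319,
    72: 0.1144,
    74: 0.1045,
}

flagged = sorted(obs for obs, h in leverage.items() if h > threshold)

print(f"threshold = {threshold:.4f}")
print("high-leverage observations:", flagged)
```

Observation 31 (0.1565) is the closest value below the cutoff in this subset, so it is not flagged.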

Regression conclusion including all observations:

Let β1 be the parameter for the predictor seats-inside, β2 the parameter for seats-patio, β3 the parameter for median income, β4 the parameter for median age of population, and β5 the parameter for % with Bachelor's Degree.

The p-value for predictor seats-inside is 0.0733.

The p-value for predictor seats-patio is 0.2350.

The p-value for predictor MedIncome is 0.0589.

The p-value for predictor MedAge is 0.9972.

The p-value for predictor BachDeg% is 0.0015.

Null hypothesis:

H0:βj=0 ; j=1,2,3,4,5

The predictor variable j is not related to annual sales.

Alternative hypothesis:

H1:βj≠0

The predictor variable j is related to annual sales.

Rejection rules:

• If the p-value is less than the level of significance, the null hypothesis is rejected and the predictor is significant.
• If the p-value is greater than the level of significance, the null hypothesis is not rejected and the predictor is not significant.

Conclusion for seats-inside:

The p-value for predictor seats-inside is 0.0733.

The level of significance is 0.05.

The p-value is greater than the level of significance.

That is, p-value(=0.0733)>α(=0.05).

The null hypothesis is not rejected.

There is not sufficient evidence that the predictor variable seats-inside is related to annual sales.

The predictor seats-inside is not significant.

Conclusion for seats-patio:

The p-value for predictor seats-patio is 0.2350.

The level of significance is 0.05.

The p-value is greater than the level of significance.

That is, p-value(=0.2350)>α(=0.05).

The null hypothesis is not rejected.

There is not sufficient evidence that the predictor variable seats-patio is related to annual sales.

The predictor seats-patio is not significant.

Conclusion for median income:

The p-value for predictor median income is 0.0589.

The level of significance is 0.05.

The p-value is greater than the level of significance.

That is, p-value(=0.0589)>α(=0.05).

The null hypothesis is not rejected.

There is not sufficient evidence that the predictor variable median income is related to annual sales.

The predictor median income is not significant.

Conclusion for median age:

The p-value for predictor median age of population is 0.9972.

The level of significance is 0.05.

The p-value is greater than the level of significance.

That is, p-value(=0.9972)>α(=0.05).

The null hypothesis is not rejected.

There is not sufficient evidence that the predictor variable median age of population is related to annual sales.

The predictor median age of population is not significant.

Conclusion for ‘% with Bachelor's Degree’:

The p-value for predictor ‘% with Bachelor's Degree’ is 0.0015.

The level of significance is 0.05.

The p-value is less than the level of significance.

That is, p-value(=0.0015)<α(=0.05).

The null hypothesis is rejected.

The predictor variable ‘% with Bachelor's Degree’ is related to annual sales.

The predictor ‘% with Bachelor's Degree’ is significant.

The p-value for ‘% with Bachelor's Degree’ indicates predictor significance at α=0.05. All the other predictor variables are not significant.
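The five decisions above all apply the same rejection rule, so they can be checked together. The sketch below uses the p-values reported for the regression with all observations; the dictionary keys follow the MegaStat column names, and the rest of the naming is illustrative.

```python
alpha = 0.05  # level of significance

# p-values from the regression including all 74 observations
p_values = {
    "Seats-Inside": 0.0733,
    "Seats-Patio": 0.2350,
    "MedIncome": 0.0589,
    "MedAge": 0.9972,
    "BachDeg%": 0.0015,
}

# Rejection rule: reject H0 (beta_j = 0) when the p-value is below alpha.
decisions = {
    name: "reject H0" if p < alpha else "do not reject H0"
    for name, p in p_values.items()
}
significant = [name for name, p in p_values.items() if p < alpha]

for name, decision in decisions.items():
    print(f"{name}: p = {p_values[name]:.4f} -> {decision}")
print("significant predictors:", significant)
```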

Regression analysis after removing observation 14:

Software procedure:

The step-by-step procedure to obtain the regression equation using MegaStat is:

• Choose MegaStat > Correlation/Regression > Regression Analysis.
• Select Input ranges and enter the variable range for ‘Seats-Inside, Seats-Patio, MedIncome, MedAge, BachDeg%’, with the row for observation 14 removed, as X, Independent variable(s).
• Enter the matching variable range for ‘Sales/SqFt’ as Y, Dependent variable.
• Click OK.

The MegaStat output is given below:

[Image: MegaStat output, Loose-leaf for Applied Statistics in Business and Economics, Chapter 13, Problem 38CE, additional homework tip 2]

With observation 14 removed, the predictor ‘BachDeg%’ (p-value 0.0015) is significant at α = 0.05, and none of the remaining predictors is significant. Hence removing observation 14 did not change the significance of any predictor.

Regression analysis after removing observation 19:

Software procedure:

The step-by-step procedure to obtain the regression equation using MegaStat is:

• Choose MegaStat > Correlation/Regression > Regression Analysis.
• Select Input ranges and enter the variable range for ‘Seats-Inside, Seats-Patio, MedIncome, MedAge, BachDeg%’, with the row for observation 19 removed, as X, Independent variable(s).
• Enter the matching variable range for ‘Sales/SqFt’ as Y, Dependent variable.
• Click OK.

The MegaStat output is given below:

[Image: MegaStat output, Loose-leaf for Applied Statistics in Business and Economics, Chapter 13, Problem 38CE, additional homework tip 3]

With observation 19 removed, the predictor ‘BachDeg%’ (p-value 0.0016) is significant at α = 0.05, and none of the remaining predictors is significant. Hence removing observation 19 did not change the significance of any predictor.

Regression analysis after removing observation 23:

Software procedure:

The step-by-step procedure to obtain the regression equation using MegaStat is:

• Choose MegaStat > Correlation/Regression > Regression Analysis.
• Select Input ranges and enter the variable range for ‘Seats-Inside, Seats-Patio, MedIncome, MedAge, BachDeg%’, with the row for observation 23 removed, as X, Independent variable(s).
• Enter the matching variable range for ‘Sales/SqFt’ as Y, Dependent variable.
• Click OK.

The MegaStat output is given below:

[Image: MegaStat output, Loose-leaf for Applied Statistics in Business and Economics, Chapter 13, Problem 38CE, additional homework tip 4]

With observation 23 removed, the predictors ‘MedIncome’ (p-value 0.0496) and ‘BachDeg%’ (p-value 0.0016) are both significant at α = 0.05, because each p-value is less than the level of significance. None of the remaining predictors is significant. Hence removing observation 23 changed the significance of the predictor ‘MedIncome’, whose p-value was 0.0589 with all observations included.

Regression analysis after removing observation 69:

Software procedure:

The step-by-step procedure to obtain the regression equation using MegaStat is:

• Choose MegaStat > Correlation/Regression > Regression Analysis.
• Select Input ranges and enter the variable range for ‘Seats-Inside, Seats-Patio, MedIncome, MedAge, BachDeg%’, with the row for observation 69 removed, as X, Independent variable(s).
• Enter the matching variable range for ‘Sales/SqFt’ as Y, Dependent variable.
• Click OK.

The MegaStat output is given below:

[Image: MegaStat output, Loose-leaf for Applied Statistics in Business and Economics, Chapter 13, Problem 38CE, additional homework tip 5]

With observation 69 removed, the predictor ‘BachDeg%’ (p-value 0.0017) is significant at α = 0.05, and none of the remaining predictors is significant. Hence removing observation 69 did not change the significance of any predictor.

The significance of the predictors changed only when observation 23 was removed from the data set. Hence, the influential observation is 23.
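The comparison across the four reruns can be summarized as follows. This is a small sketch that encodes only what the solution reports, namely which predictors were significant at α = 0.05 in each regression, and applies the criterion used here: an observation is treated as influential if removing it changes that set. The names are illustrative.

```python
# Predictors significant at alpha = 0.05 with all 74 observations included.
baseline = {"BachDeg%"}

# Predictors significant after removing each high-leverage observation,
# as reported in the reruns above.
after_removal = {
    14: {"BachDeg%"},
    19: {"BachDeg%"},
    23: {"MedIncome", "BachDeg%"},
    69: {"BachDeg%"},
}

# Criterion used in this solution: the set of significant predictors changes.
influential = [obs for obs, sig in after_removal.items() if sig != baseline]

for obs, sig in after_removal.items():
    changed = sorted(sig ^ baseline)  # predictors whose significance flipped
    print(f"remove {obs}: changed = {changed if changed else 'none'}")
print("influential observations:", influential)
```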





Answer 8

Expert Solution & Answer

To determine

Find leverage statistics.

Identify whether any of the observations are influential.

Answer to Problem 38CE

The leverage statistics are,

Observation	Sales/SqFt	Predicted	Residual	Leverage
1	702	505.5378	196.4622	0.0659
2	210	388.0933	–178.093	0.0818
3	365	419.7986	–54.7986	0.0789
4	443	419.2017	23.79828	0.0458
5	399	336.1339	62.86605	0.0798
6	265	524.3932	–259.393	0.0816
7	572	365.8614	206.1386	0.0655
8	642	491.9392	150.0608	0.0935
9	461	422.9892	38.0108	0.0225
10	639	458.365	180.635	0.0703
11	484	502.2794	–18.2794	0.0715
12	581	466.1341	114.8659	0.0478
13	268	432.9745	–164.974	0.0586
14	573	497.5596	75.44042	0.2737
15	586	525.2306	60.76944	0.1168
16	369	398.5007	–29.5007	0.0584
17	351	498.6047	–147.605	0.0985
18	458	429.3871	28.61286	0.0535
19	987	614.5091	372.4909	0.1820
20	357	454.3592	–97.3592	0.0429
21	406	417.2942	–11.2942	0.0250
22	681	391.1612	289.8388	0.0493
23	368	492.4983	–124.498	0.2093
24	304	460.1672	–156.167	0.0604
25	394	415.2689	–21.2689	0.0913
26	562	486.68	75.31997	0.0580
27	495	423.7816	71.21836	0.0942
28	310	388.496	–78.496	0.1363
29	373	422.8679	–49.8679	0.0227
30	236	345.16	–109.16	0.1516
31	413	406.2904	6.709589	0.1565
32	625	543.4075	81.59252	0.1197
33	274	397.4102	–123.41	0.0526
34	543	558.9323	–15.9323	0.1372
35	179	297.105	–118.105	0.0794
36	375	361.7308	13.26922	0.0837
37	329	433.9038	–104.904	0.0659
38	297	430.0182	–133.018	0.0682
39	323	455.7566	–132.757	0.0800
40	469	404.899	64.101	0.0291
41	353	497.4495	–144.449	0.0837
42	380	491.0586	–111.059	0.0696
43	398	408.7628	–10.7628	0.0353
44	312	318.6083	–6.60827	0.0574
45	452	432.4409	19.55915	0.0731
46	699	362.4679	336.5321	0.0617
47	367	347.5704	19.42961	0.0801
48	432	380.8856	51.11438	0.0736
49	367	355.4863	11.51368	0.0922
50	401	381.559	19.44102	0.0432
51	414	481.2256	–67.2256	0.0375
52	481	428.1006	52.89939	0.0183
53	538	415.7548	122.2452	0.0271
54	330	359.279	–29.279	0.0356
55	250	438.5112	–188.511	0.0532
56	292	396.9591	–104.959	0.0582
57	517	411.7635	105.2365	0.0231
58	552	470.1005	81.89945	0.0275
59	387	361.7699	25.23009	0.0832
60	427	408.3022	18.69777	0.0631
61	454	497.6884	–43.6884	0.0887
62	512	441.1052	70.89483	0.0793
63	345	375.7731	–30.7731	0.1071
64	234	334.17	–100.17	0.0622
65	348	333.4539	14.54613	0.1051
66	348	458.6665	–110.666	0.1285
67	295	315.655	–20.655	0.1077
68	361	376.5859	–15.5859	0.0450
69	468	232.9942	235.0058	0.2319
70	404	393.7594	10.24059	0.1052
71	246	373.6202	–127.62	0.1022
72	340	403.9505	–63.9505	0.1144
73	401	413.2786	–12.2786	0.0619
74	327	316.5622	10.43785	0.1045

Observations 14, 19, 23 and 69 have high leverage values.

The influential observation is 23.

Explanation of Solution

Calculation

The data set ‘Noodles & Company Sales, Seating, and Demographic data’ contains n=74 observations. The response variable is annual sales per square foot, and there are k=5 predictor variables: Interior Seat Count, Patio Seat Count, Median HH Income, Median Age of Population, and % with Bachelor's Degree. The level of significance is 0.05.

Software procedure:

The step-by-step procedure to obtain the regression output using MegaStat software is:

• Choose MegaStat > Correlation/Regression > Regression Analysis.
• Select Input ranges and enter the variable range for ‘Seats-Inside, Seats-Patio, MedIncome, MedAge, BachDeg%’ as X, Independent variable(s).
• Enter the variable range for ‘Sales/SqFt’ as Y, Dependent variable.
• In Options > Residuals, choose Diagnostics and influential residuals.
• Click OK.

Output using MegaStat software is given below:

[Image: MegaStat regression output with residual diagnostics, all observations included]

Influential observation:

An influential observation is one whose removal from the data set has a large effect on the estimated parameters of the regression.

Candidate influential observations can be identified using the leverage values. An observation has high leverage if its leverage statistic is greater than $2(k+1)/n$, where k denotes the number of predictors and n denotes the number of observations. Remove each such observation from the data set and redo the regression analysis; if the regression statistics change significantly, the observation is considered influential.
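The check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a simple regression with one predictor (k=1), for which the leverage has the closed form h_i = 1/n + (x_i − x̄)²/Sxx, and the six data points are made up for the example; they are not taken from the Noodles & Company data set. The function names fit_simple and leverages are hypothetical helpers.

```python
# Illustrative data only (NOT the Noodles & Company data set).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 20.0]   # the last x value lies far from the others
y = [2.1, 3.9, 6.2, 7.8, 10.1, 15.0]


def fit_simple(xs, ys):
    """Return (intercept, slope) of the least-squares line y = b0 + b1*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((xi - x_bar) ** 2 for xi in xs)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def leverages(xs):
    """Leverage h_i = 1/n + (x_i - x_bar)^2 / Sxx (simple regression only)."""
    n = len(xs)
    x_bar = sum(xs) / n
    sxx = sum((xi - x_bar) ** 2 for xi in xs)
    return [1 / n + (xi - x_bar) ** 2 / sxx for xi in xs]


k = 1                                  # one predictor in this toy example
cutoff = 2 * (k + 1) / len(x)          # high-leverage cutoff 2(k+1)/n
h = leverages(x)
high = [i + 1 for i, hi in enumerate(h) if hi > cutoff]   # 1-based observation numbers

# Refit without the last observation to see how much the slope moves.
_, slope_all = fit_simple(x, y)
_, slope_drop = fit_simple(x[:-1], y[:-1])

print("high-leverage observations:", high)
print("slope with all points:", round(slope_all, 4))
print("slope without last point:", round(slope_drop, 4))
```

In this toy example the far-out point both exceeds the leverage cutoff and changes the slope substantially when it is dropped, which is the pattern the procedure looks for. With several predictors the leverages come from the diagonal of the hat matrix, which a statistical package reports directly.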

Substitute n=74 and k=5 in the formula:

$$\frac{2(k+1)}{n} = \frac{2(5+1)}{74} = \frac{12}{74} \approx 0.1622$$

The leverage statistics greater than 0.1622 are 0.274 (observation 14), 0.182 (observation 19), 0.209 (observation 23), and 0.232 (observation 69).

Observations 14, 19, 23 and 69 are therefore considered to have high leverage.
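The cutoff arithmetic and the comparison against the leverage column can be reproduced with a short Python sketch. The leverage values below are copied from the table above for the four flagged observations plus the two next-largest values (observations 30 and 31); the variable names are illustrative.

```python
n = 74                      # number of observations
k = 5                       # number of predictors
cutoff = 2 * (k + 1) / n    # 12/74

# Leverage values copied from the table (a subset: the six largest).
leverage = {14: 0.2737, 19: 0.1820, 23: 0.2093, 30: 0.1516, 31: 0.1565, 69: 0.2319}

# Keep the observations whose leverage exceeds the cutoff.
flagged = sorted(obs for obs, h in leverage.items() if h > cutoff)

print(round(cutoff, 4), flagged)
```

Observations 30 and 31 come close to the cutoff but do not exceed it, so only 14, 19, 23 and 69 are carried forward to the refitting step.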

Regression conclusion including all observations:

Let β1 be the parameter for the predictor seats-inside, β2 the parameter for seats-patio, β3 the parameter for median income, β4 the parameter for median age of population, and β5 the parameter for % with Bachelor's Degree.

The p-value for predictor seats-inside is 0.0733.

The p-value for predictor seats-patio is 0.2350.

The p-value for predictor MedIncome is 0.0589.

The p-value for predictor MedAge is 0.9972.

The p-value for predictor BachDeg% is 0.0015.

Null hypothesis:

H0:βj=0 ; j=1,2,3,4,5

The predictor variable j is not related to annual sales.

Alternative hypothesis:

H1:βj≠0

The predictor variable j is related to annual sales.

Rejection rules:

• If the p-value is less than the level of significance, the null hypothesis is rejected and the predictor is significant.
• If the p-value is greater than the level of significance, the null hypothesis is not rejected and the predictor is not significant.
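As a quick check, the rejection rules can be applied to the five p-values listed above with a small Python sketch. The p-values are those reported for the full model; the dictionary and variable names are illustrative.

```python
alpha = 0.05   # level of significance

# p-values reported for the regression with all 74 observations.
p_values = {
    "Seats-Inside": 0.0733,
    "Seats-Patio": 0.2350,
    "MedIncome": 0.0589,
    "MedAge": 0.9972,
    "BachDeg%": 0.0015,
}

# Apply the rule: reject H0 when the p-value is below alpha.
decisions = {
    name: ("reject H0" if p < alpha else "do not reject H0")
    for name, p in p_values.items()
}
significant = [name for name, p in p_values.items() if p < alpha]

for name, decision in decisions.items():
    print(f"{name}: p = {p_values[name]:.4f} -> {decision}")
print("significant predictors:", significant)
```

Only BachDeg% falls below 0.05; MedIncome at 0.0589 is close to the boundary, which is why its status can flip when a single observation is removed.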

Conclusion for seats-inside:

The p-value for predictor seats-inside is 0.0733.

The level of significance is 0.05.

The p-value is greater than the level of significance.

That is, p-value(=0.0733)>α(=0.05).

The null hypothesis is not rejected.

There is insufficient evidence that the predictor variable seats-inside is related to annual sales.

The predictor seats-inside is not significant.

Conclusion for seats-patio:

The p-value for predictor seats-patio is 0.2350.

The level of significance is 0.05.

The p-value is greater than the level of significance.

That is, p-value(=0.2350)>α(=0.05).

The null hypothesis is not rejected.

There is insufficient evidence that the predictor variable seats-patio is related to annual sales.

The predictor seats-patio is not significant.

Conclusion for median income:

The p-value for predictor median income is 0.0589.

The level of significance is 0.05.

The p-value is greater than the level of significance.

That is, p-value(=0.0589)>α(=0.05).

The null hypothesis is not rejected.

There is insufficient evidence that the predictor variable median income is related to annual sales.

The predictor median income is not significant.

Conclusion for median age:

The p-value for predictor median age of population is 0.9972.

The level of significance is 0.05.

The p-value is greater than the level of significance.

That is, p-value(=0.9972)>α(=0.05).

The null hypothesis is not rejected.

There is insufficient evidence that the predictor variable median age of population is related to annual sales.

The predictor median age of population is not significant.

Conclusion for ‘% with Bachelor's Degree’:

The p-value for predictor ‘% with Bachelor's Degree’ is 0.0015.

The level of significance is 0.05.

The p-value is less than the level of significance.

That is, p-value(=0.0015)<α(=0.05).

The null hypothesis is rejected.

The predictor variable ‘% with Bachelor's Degree’ is related to annual sales.

The predictor ‘% with Bachelor's Degree’ is significant.

The p-value for ‘% with Bachelor's Degree’ indicates predictor significance at α=0.05. All the other predictor variables are not significant.

Regression analysis with observation 14 removed:

Software procedure:

The step-by-step procedure to obtain the regression equation using MegaStat software is:

• Choose MegaStat > Correlation/Regression > Regression Analysis.
• Select Input ranges and enter the variable range for ‘Seats-Inside, Seats-Patio, MedIncome, MedAge, BachDeg%’ as X, Independent variable(s).
• Enter the variable range for ‘Sales/SqFt’ as Y, Dependent variable.
• Click OK.

Output using MegaStat software is given below:

[Image: MegaStat regression output with observation 14 removed]

With observation 14 removed, the predictor ‘BachDeg%’ (p-value 0.0015) is significant at α=0.05. None of the remaining predictors is significant. Hence removing observation 14 did not change the significance of the predictors.

Regression analysis with observation 19 removed:

Software procedure:

The step-by-step procedure to obtain the regression equation using MegaStat software is:

• Choose MegaStat > Correlation/Regression > Regression Analysis.
• Select Input ranges and enter the variable range for ‘Seats-Inside, Seats-Patio, MedIncome, MedAge, BachDeg%’ as X, Independent variable(s).
• Enter the variable range for ‘Sales/SqFt’ as Y, Dependent variable.
• Click OK.

Output using MegaStat software is given below:

[Image: MegaStat regression output with observation 19 removed]

With observation 19 removed, the predictor ‘BachDeg%’ (p-value 0.0016) is significant at α=0.05. None of the remaining predictors is significant. Hence removing observation 19 did not change the significance of the predictors.

Regression analysis with observation 23 removed:

Software procedure:

The step-by-step procedure to obtain the regression equation using MegaStat software is:

• Choose MegaStat > Correlation/Regression > Regression Analysis.
• Select Input ranges and enter the variable range for ‘Seats-Inside, Seats-Patio, MedIncome, MedAge, BachDeg%’ as X, Independent variable(s).
• Enter the variable range for ‘Sales/SqFt’ as Y, Dependent variable.
• Click OK.

Output using MegaStat software is given below:

[Image: MegaStat regression output with observation 23 removed]

With observation 23 removed, the predictors ‘MedIncome’ (p-value 0.0496) and ‘BachDeg%’ (p-value 0.0016) are significant at α=0.05, because each p-value is less than the level of significance. None of the remaining predictors is significant. Hence removing observation 23 changed the significance of the predictor ‘MedIncome’ (median income).

Regression analysis with observation 69 removed:

Software procedure:

The step-by-step procedure to obtain the regression equation using MegaStat software is:

• Choose MegaStat > Correlation/Regression > Regression Analysis.
• Select Input ranges and enter the variable range for ‘Seats-Inside, Seats-Patio, MedIncome, MedAge, BachDeg%’ as X, Independent variable(s).
• Enter the variable range for ‘Sales/SqFt’ as Y, Dependent variable.
• Click OK.

Output using MegaStat software is given below:

[Image: MegaStat regression output with observation 69 removed]

With observation 69 removed, the predictor ‘BachDeg%’ (p-value 0.0017) is significant at α=0.05. None of the remaining predictors is significant. Hence removing observation 69 did not change the significance of the predictors.

The significance of the predictors changes only when observation 23 is removed from the data set. Hence, the influential observation is 23.
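The comparison across the four refits can be summarized with a short Python sketch. It records, for each scenario, the set of predictors reported as significant at α=0.05 in the text above and lists the removals that change that set relative to the full model. The data structure and names are illustrative, and the criterion used here (a change in which predictors are significant) is the one this solution applies, not the only possible measure of influence.

```python
# Predictors reported as significant at alpha = 0.05 with all observations.
baseline = {"BachDeg%"}

# Predictors reported as significant after removing each high-leverage observation.
after_removal = {
    14: {"BachDeg%"},
    19: {"BachDeg%"},
    23: {"MedIncome", "BachDeg%"},   # MedIncome p-value moves from 0.0589 to 0.0496
    69: {"BachDeg%"},
}

# An observation is treated as influential here if removing it changes the set.
changed = sorted(obs for obs, sig in after_removal.items() if sig != baseline)

for obs, sig in after_removal.items():
    status = "changed" if sig != baseline else "unchanged"
    print(f"remove observation {obs}: significant = {sorted(sig)} ({status})")
print("influential by this criterion:", changed)
```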

