(a) To calculate: The sum of squared errors, SSE.

Answer 1

Question

Chapter 12.CR, Problem 9CR

To determine

(a)

To calculate:

The sum of squared errors, SSE.

Expert Solution

Answer to Problem 9CR

Solution:

The required SSE is 16.7246.

Explanation of Solution

Given Information:

The following data were collected on the thread count and the price (in dollars) of eight bed sheets.

Thread Count	150	200	225	250	275	300	350	400
Price (in Dollars)	18	21	25	28	30	31	35	45

The least-squares regression line, also called the line of best fit, is the line that minimizes the sum of the squared vertical distances between the data points and the line. It is given by

$\hat{y} = b_0 + b_1 x$,

where $b_1$ is the slope of the least-squares regression line for paired data from a sample,

and $b_0$ is the y-intercept of the regression line.

Formula used:

The equation of the least-squares regression line is given by,

$\hat{y} = b_0 + b_1 x$

where $b_1$ is the slope of the least-squares regression line, given as,

$b_1 = \frac{n\sum x_i y_i - (\sum x_i)(\sum y_i)}{n\sum x_i^2 - (\sum x_i)^2}$

and $b_0$ is the y-intercept, given as,

$b_0 = \frac{\sum y_i}{n} - b_1 \frac{\sum x_i}{n}$

where $n$ is the number of data pairs in the sample,

$x_i$ is the ith value of the explanatory variable,

and $y_i$ is the ith value of the response variable.

The sum of squared errors (SSE) for a regression line is calculated as,

$SSE = \sum (y_i - \hat{y}_i)^2$

where $y_i$ is the ith value of the response variable,

and $\hat{y}_i$ is the predicted value of $y_i$, using the least-squares regression model.

Calculation:

Thread Count $x_i$	Price (in Dollars) $y_i$	$x_i y_i$	$x_i^2$	$y_i^2$
150	18	2700	22500	324
200	21	4200	40000	441
225	25	5625	50625	625
250	28	7000	62500	784
275	30	8250	75625	900
300	31	9300	90000	961
350	35	12250	122500	1225
400	45	18000	160000	2025
$\sum x_i = 2150$	$\sum y_i = 233$	$\sum x_i y_i = 67325$	$\sum x_i^2 = 623750$	$\sum y_i^2 = 7285$

Let $x_i$ be the thread counts of the bed sheets,

and $y_i$ be the prices of the bed sheets.

The slope of the least-squares regression line is calculated as,

$b_1 = \frac{n\sum x_i y_i - (\sum x_i)(\sum y_i)}{n\sum x_i^2 - (\sum x_i)^2}$

where $\sum x_i = x_1 + x_2 + \dots + x_8$.

Substitute 150 for $x_1$, 200 for $x_2$, ..., 400 for $x_8$ in the above formula.

$\sum x_i = 150 + 200 + \dots + 350 + 400 = 2150$

Proceed in the same manner to calculate $\sum y_i$, $\sum x_i y_i$, $\sum x_i^2$ and $\sum y_i^2$; the table above lists the individual terms.

$\sum y_i = 18 + 21 + \dots + 35 + 45 = 233$

$\sum x_i y_i = 2700 + 4200 + \dots + 12250 + 18000 = 67325$

$\sum x_i^2 = 22500 + 40000 + \dots + 122500 + 160000 = 623750$

$\sum y_i^2 = 324 + 441 + 625 + \dots + 1225 + 2025 = 7285$
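The column sums can be checked with a few lines of Python. This is only a sketch; the variable names are chosen here for illustration and do not come from the textbook.

```python
# Thread counts (x) and prices in dollars (y) from the data table.
x = [150, 200, 225, 250, 275, 300, 350, 400]
y = [18, 21, 25, 28, 30, 31, 35, 45]

n = len(x)                                      # number of data pairs
sum_x = sum(x)                                  # sum of x_i
sum_y = sum(y)                                  # sum of y_i
sum_xy = sum(xi * yi for xi, yi in zip(x, y))   # sum of x_i * y_i
sum_x2 = sum(xi ** 2 for xi in x)               # sum of x_i squared
sum_y2 = sum(yi ** 2 for yi in y)               # sum of y_i squared

print(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2)
```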

The slope of the least-squares regression line is calculated as,

$b_1 = \frac{n\sum x_i y_i - (\sum x_i)(\sum y_i)}{n\sum x_i^2 - (\sum x_i)^2}$

Substitute 2150 for $\sum x_i$, 233 for $\sum y_i$, 67325 for $\sum x_i y_i$, 623750 for $\sum x_i^2$ and 8 for $n$ in the above formula.

$b_1 = \frac{(8 \times 67325) - (2150 \times 233)}{8(623750) - (2150)^2} = \frac{37650}{367500} = 0.1024$

The y-intercept of the regression line is calculated as,

$b_0 = \frac{\sum y_i}{n} - b_1 \frac{\sum x_i}{n}$

Substitute 2150 for $\sum x_i$, 233 for $\sum y_i$, 8 for $n$ and 0.1024 for $b_1$.

$b_0 = \frac{233}{8} - (0.1024)\frac{2150}{8} = 1.6050$

The equation of the least-squares regression line is given by,

$\hat{y} = b_0 + b_1 x$

Substitute 1.6050 for $b_0$ and 0.1024 for $b_1$ in the above formula.

$\hat{y} = 1.6050 + 0.1024x$.
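As a cross-check, the slope and intercept formulas can be evaluated without intermediate rounding. The helper name `least_squares_line` is hypothetical. Note that the hand calculation rounds the slope to 0.1024 before computing the intercept, which is why it reports 1.6050, while the unrounded arithmetic gives an intercept closer to 1.59.

```python
def least_squares_line(x, y):
    """Return (b0, b1) for the least-squares line y_hat = b0 + b1 * x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    # Slope and intercept from the computational formulas in the text.
    b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b0 = sum_y / n - b1 * sum_x / n
    return b0, b1


x = [150, 200, 225, 250, 275, 300, 350, 400]
y = [18, 21, 25, 28, 30, 31, 35, 45]

b0, b1 = least_squares_line(x, y)
# Intercept obtained when the slope is first rounded to four decimals.
b0_rounded_slope = sum(y) / len(y) - 0.1024 * sum(x) / len(x)

print(f"b1 = {b1:.6f}, b0 = {b0:.6f}, b0 with rounded slope = {b0_rounded_slope:.4f}")
```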

Thread count $x_i$	Price $y_i$	Predicted value $\hat{y}_i$	$y_i - \hat{y}_i$	$(y_i - \hat{y}_i)^2$
150	18	16.965	1.035	1.071225
200	21	22.085	-1.085	1.177225
225	25	24.645	0.355	0.126025
250	28	27.205	0.795	0.632025
275	30	29.765	0.235	0.055225
300	31	32.325	-1.325	1.755625
350	35	37.445	-2.445	5.978025
400	45	42.565	2.435	5.929225

The predicted values are calculated from the regression line,

$\hat{y} = 1.6050 + 0.1024x$

The predicted value $\hat{y}_1$ is calculated as,

$\hat{y}_1 = 1.6050 + 0.1024x_1$

Substitute 150 for $x_1$ in the above formula.

$\hat{y}_1 = 1.6050 + 0.1024(150) = 16.965$

Proceed in the same manner to calculate $\hat{y}_i$ for the rest of the data; the table above lists the calculated values.

The residual is calculated as $y_i - \hat{y}_i$.

Substitute 18 for $y_1$ and 16.965 for $\hat{y}_1$.

$y_1 - \hat{y}_1 = 18 - 16.965 = 1.035$

Square the residual.

$(y_1 - \hat{y}_1)^2 = (1.035)^2 = 1.0712$

Proceed in the same manner to calculate $(y_i - \hat{y}_i)^2$ for all $1 \le i \le n$; the table above lists the calculated values. Then the value of $\sum (y_i - \hat{y}_i)^2$ is calculated as,

$SSE = \sum (y_i - \hat{y}_i)^2 = 1.0712 + 1.1772 + 0.1260 + \dots + 5.9292 = 16.7246$

Conclusion:

Thus, the SSE is 16.7246.
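The residual table can be reproduced with a short sketch. The function name `sum_squared_errors` is an assumption made for illustration; the coefficients 1.6050 and 0.1024 are the rounded values used in the hand calculation.

```python
def sum_squared_errors(x, y, b0, b1):
    """Sum of squared residuals for the line y_hat = b0 + b1 * x."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


x = [150, 200, 225, 250, 275, 300, 350, 400]
y = [18, 21, 25, 28, 30, 31, 35, 45]

# Rounded coefficients from the hand calculation.
sse = sum_squared_errors(x, y, b0=1.6050, b1=0.1024)
print(f"SSE = {sse:.4f}")
```

Using unrounded coefficients changes the result only in the fourth decimal place, which is consistent with the Residual SS that a spreadsheet regression reports for these data.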

To determine

(b)

To calculate:

The standard error of estimate, Se.

Expert Solution

Answer to Problem 9CR

Solution:

The required standard error of estimate is 1.6696.

Explanation of Solution

Given Information:

The following data were collected on the thread count and the price (in dollars) of eight bed sheets.

Thread Count	150	200	225	250	275	300	350	400
Price (in Dollars)	18	21	25	28	30	31	35	45

Formula used:

The standard error of estimate, which measures how much the sample data points deviate from the regression line, is given by,

$S_e = \sqrt{\frac{\sum (y_i - \hat{y}_i)^2}{n-2}} = \sqrt{\frac{SSE}{n-2}}$

where $y_i$ is the ith value of the response variable,

$\hat{y}_i$ is the predicted value of $y_i$, using the least-squares regression model,

$n$ is the number of data pairs in the sample,

and SSE is the sum of squared errors.

Calculation:

The standard error of estimate is calculated as,

$S_e = \sqrt{\frac{\sum (y_i - \hat{y}_i)^2}{n-2}} = \sqrt{\frac{SSE}{n-2}}$

Substitute 16.7246 for SSE and 8 for $n$ in the above formula.

$S_e = \sqrt{\frac{16.7246}{8-2}} = 1.6696$

Conclusion:

Thus, the standard error of estimate is 1.6696.
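The same substitution can be written as a small function. The name `standard_error_of_estimate` is hypothetical; the inputs are the SSE and sample size from the solution.

```python
import math


def standard_error_of_estimate(sse, n):
    """S_e = sqrt(SSE / (n - 2)) for simple linear regression."""
    return math.sqrt(sse / (n - 2))


s_e = standard_error_of_estimate(sse=16.7246, n=8)
print(f"S_e = {s_e:.4f}")
```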

To determine

(c)

The 95% prediction interval for the price of 350-thread count sheets.

Expert Solution

Answer to Problem 9CR

Solution:

The required prediction interval is (32.8432, 42.0468).

Explanation of Solution

Given Information:

The following data were collected on the thread count and the price (in dollars) of eight bed sheets.

Thread Count	150	200	225	250	275	300	350	400
Price (in Dollars)	18	21	25	28	30	31	35	45

Formula used:

The margin of error of a prediction interval for an individual y-value is calculated as,

$E = t_{\alpha/2}\, S_e \sqrt{1 + \frac{1}{n} + \frac{n(x_0 - \bar{x})^2}{n\sum x_i^2 - (\sum x_i)^2}}$

with degrees of freedom $df = n - 2$,

where $x_i$ is the ith value of the explanatory variable,

$x_0$ is the fixed value of the explanatory variable,

$\bar{x}$ is the sample mean of the explanatory variable,

$S_e = \sqrt{SSE/(n-2)}$ is the standard error of estimate, where SSE is the sum of squared errors,

$n$ is the number of data pairs in the sample,

and $t_{\alpha/2}$ is the critical value of the t-distribution with $df = n - 2$ degrees of freedom.

Then the prediction interval for an individual y-value is,

$(\hat{y} - E,\ \hat{y} + E)$.

Calculation:

The confidence level is 0.95, so the level of significance is calculated as,

$\alpha = 1 - 0.95 = 0.05$

Then, with $df = 8 - 2 = 6$,

$t_{\alpha/2} = t_{0.025} = 2.447$

The mean thread count is calculated as,

$\bar{x} = \frac{\sum x_i}{n}$

Substitute 2150 for $\sum x_i$ and 8 for $n$ in the above formula.

$\bar{x} = \frac{2150}{8} = 268.75$

The margin of error of a prediction interval for an individual y-value is calculated as,

$E = t_{\alpha/2}\, S_e \sqrt{1 + \frac{1}{n} + \frac{n(x_0 - \bar{x})^2}{n\sum x_i^2 - (\sum x_i)^2}}$

Substitute 2.447 for $t_{\alpha/2}$, 1.6696 for $S_e$, 8 for $n$, 350 for $x_0$, 623750 for $\sum x_i^2$, 2150 for $\sum x_i$ and 268.75 for $\bar{x}$ in the above formula.

$E = 2.447 \times 1.6696 \times \sqrt{1 + \frac{1}{8} + \frac{8(350 - 268.75)^2}{8(623750) - (2150)^2}} = 2.447 \times 1.6696 \times \sqrt{1 + 0.125 + 0.143707} = 2.447 \times 1.6696 \times 1.12637 = 4.6018$

The value of $\hat{y}$ is calculated by substituting $x_0$ into the regression line equation.

The regression line is,

$\hat{y} = 1.6050 + 0.1024x$

Substitute 350 for $x$ in the above formula.

$\hat{y} = 1.6050 + 0.1024(350) = 37.4450$

The prediction interval is,

$(\hat{y} - E,\ \hat{y} + E) = (37.4450 - 4.6018,\ 37.4450 + 4.6018) = (32.8432,\ 42.0468)$

Conclusion:

The required prediction interval is (32.8432, 42.0468).
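The margin-of-error formula is easy to mis-evaluate by hand, so a sketch that evaluates every term under the square root (including the 1/n term) may be useful. The critical value 2.447 is hard-coded from a t-table (df = 6) as an assumption, and `prediction_margin` is a hypothetical helper name; the other inputs are the rounded values from the solution.

```python
import math

T_CRIT = 2.447   # t critical value for df = 6, alpha/2 = 0.025 (from a t-table)
S_E = 1.6696     # standard error of estimate from part (b)

x = [150, 200, 225, 250, 275, 300, 350, 400]
n = len(x)
sum_x = sum(x)
sum_x2 = sum(xi ** 2 for xi in x)
x_bar = sum_x / n


def prediction_margin(x0):
    """Margin of error E for predicting an individual y-value at x0."""
    spread = n * (x0 - x_bar) ** 2 / (n * sum_x2 - sum_x ** 2)
    return T_CRIT * S_E * math.sqrt(1 + 1 / n + spread)


x0 = 350
y_hat = 1.6050 + 0.1024 * x0          # point prediction from the fitted line
margin = prediction_margin(x0)
interval = (y_hat - margin, y_hat + margin)

print(f"y_hat = {y_hat:.4f}, E = {margin:.4f}")
print(f"interval = ({interval[0]:.4f}, {interval[1]:.4f})")
```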

To determine

(d)

The 95% confidence interval for the y-intercept of the regression line.

Expert Solution

Answer to Problem 9CR

Solution:

The required confidence interval is (−3.7304,6.9140).

Explanation of Solution

Given Information:

The following data were collected on the thread count and the price (in dollars) of eight bed sheets.

Thread Count	150	200	225	250	275	300	350	400
Price (in Dollars)	18	21	25	28	30	31	35	45

Formula Used:

The correlation coefficient measures the strength of the linear relationship between the response variable and the explanatory variable and is calculated as,

$r = \frac{n\sum x_i y_i - (\sum x_i)(\sum y_i)}{\sqrt{n\sum x_i^2 - (\sum x_i)^2}\,\sqrt{n\sum y_i^2 - (\sum y_i)^2}}$

The coefficient of determination, $r^2$, is the square of the correlation coefficient and measures the proportion of the variation in the response variable that is explained by the explanatory variable.

The standard error of estimate, $S_e$, measures how much the sample data points deviate from the regression line and is calculated as,

$S_e = \sqrt{\frac{\sum (y_i - \hat{y}_i)^2}{n-2}} = \sqrt{\frac{SSE}{n-2}}$

In ANOVA,

The grand mean is the weighted mean of the $k$ sample means, one from each of the $k$ populations, equivalent to the mean of all the sample data combined, given by

$\bar{\bar{x}} = \frac{\sum_{i=1}^{k} n_i \bar{x}_i}{\sum_{i=1}^{k} n_i}$

The sum of squares among treatments (SST) measures the variation between the sample means and the grand mean, given by,

$SST = \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{\bar{x}})^2$

The sum of squares for error (SSE) measures the variation in the sample data resulting from the variability within each sample,

$SSE = \sum_{j=1}^{n_1} (x_{1j} - \bar{x}_1)^2 + \sum_{j=1}^{n_2} (x_{2j} - \bar{x}_2)^2 + \dots + \sum_{j=1}^{n_k} (x_{kj} - \bar{x}_k)^2$

The total variation is the sum of the squared deviations from the grand mean for all of the data values in each sample, given by

$\text{Total Variation} = \sum_{j=1}^{n_1} (x_{1j} - \bar{\bar{x}})^2 + \sum_{j=1}^{n_2} (x_{2j} - \bar{\bar{x}})^2 + \dots + \sum_{j=1}^{n_k} (x_{kj} - \bar{\bar{x}})^2 = SST + SSE$

The mean square for treatments (MST) is found by dividing the sum of squares among treatments by its degrees of freedom, given by

$MST = \frac{SST}{DFT}$ with $DFT = k - 1$

The mean square for error (MSE) is found by dividing the sum of squares for error by its degrees of freedom, given by

$MSE = \frac{SSE}{DFE}$ with $DFE = n_T - k$

The test statistic for an ANOVA test is used when independent, simple random samples are taken from populations with variances that are unknown and assumed to be equal, where all of the $k$ population distributions are approximately normal, given by,

$F = \frac{MST}{MSE}$ with $df_1 = DFT = k - 1$ and $df_2 = DFE = n_T - k$

Calculation:

To generate the regression table in Excel, follow the given steps:

1. Under the Data tab, choose Data Analysis and then select Regression.

2. Select the Input Y Range and enter the range of the given $y_i$ data, then select the Input X Range and enter the range of the given $x_i$ data.

3. Choose a confidence level of 95% and click OK.

The following table will appear.

Regression Statistics
Multiple R	0.983094904
R Square	0.96647559
Adjusted R Square	0.960888189
Standard Error	1.66955532
Observations	8

ANOVA
	df	SS	MS	F	Significance F
Regression	1	482.1505102	482.1505102	172.9740696	1.19253E-05
Residual	6	16.7244898	2.787414966
Total	7	498.875

	Coefficients	Standard Error	t Stat	P-value	Lower 95%	Upper 95%
Intercept	1.591836735	2.17509095	0.73184835	0.491845191	-3.730419077	6.914092547
Slope	0.10244898	0.007789635	13.15196067	1.19253E-05	0.083388428	0.121509531

RESIDUAL OUTPUT
Observation	Predicted y	Residuals	Standard Residuals
1	16.95918367	1.040816327	0.673359012
2	22.08163265	-1.081632653	-0.699765248
3	24.64285714	0.357142857	0.231054563
4	27.20408163	0.795918367	0.514921598
5	29.76530612	0.234693878	0.151835856
6	32.32653061	-1.326530612	-0.858202663
7	37.44897959	-2.448979592	-1.584374147
8	42.57142857	2.428571429	1.571171029

The confidence interval for the y-intercept is constructed by adding the margin of error to, and subtracting it from, the point estimate; Microsoft Excel reports the resulting endpoints.

Referring to the Regression Statistics table, R Square, the coefficient of determination, is 0.9665, which implies that 96.65% of the variation in the response variable is explained by the explanatory variable.
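The Multiple R and R Square entries of the regression table can be checked directly from the correlation coefficient formula. This is a minimal sketch with illustrative variable names, not a reproduction of Excel's internal calculation.

```python
import math

x = [150, 200, 225, 250, 275, 300, 350, 400]
y = [18, 21, 25, 28, 30, 31, 35, 45]

n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi ** 2 for xi in x)
sum_y2 = sum(yi ** 2 for yi in y)

# Correlation coefficient from the computational formula.
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt(n * sum_x2 - sum_x ** 2) * math.sqrt(n * sum_y2 - sum_y ** 2)
r = numerator / denominator
r_squared = r ** 2   # coefficient of determination

print(f"r = {r:.6f}, r^2 = {r_squared:.6f}")
```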

Standard Error in the Regression Statistics table is the standard error of estimate, $S_e$, calculated by the formula given above.

The Lower 95% and Upper 95% columns give the endpoints of the confidence interval; the Intercept row gives the interval for the y-intercept.

The Intercept coefficient in the table above is $b_0$ of the regression line, 1.59, and the Slope coefficient, 0.1024, is $b_1$ of the regression line. (The intercept of 1.6050 found in part (a) differs slightly because it was computed from the slope rounded to 0.1024.)

So the regression line is,

$\hat{y} = 1.59 + 0.1024x$

The lower and upper endpoints of the 95% confidence interval for the y-intercept of the regression line, $\beta_0$, are (−3.7304, 6.9140).

Conclusion:

Thus, the 95% confidence interval for the y-intercept of the regression line is (−3.7304, 6.9140).
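The solution reads the interval from Excel. As a cross-check, the sketch below uses the standard formula for the standard error of the intercept, $SE(b_0) = S_e \sqrt{\sum x_i^2 / (n\,S_{xx})}$ with $S_{xx} = \sum x_i^2 - (\sum x_i)^2 / n$. That formula does not appear in the solution and is an assumption here, as is the hard-coded table value 2.447 for the critical t (df = 6), so the endpoints may differ from Excel's in the fourth decimal place.

```python
import math

T_CRIT = 2.447   # t critical value for df = 6, alpha/2 = 0.025 (from a t-table)

x = [150, 200, 225, 250, 275, 300, 350, 400]
y = [18, 21, 25, 28, 30, 31, 35, 45]

n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi ** 2 for xi in x)

s_xx = sum_x2 - sum_x ** 2 / n                    # corrected sum of squares of x
b1 = (sum_xy - sum_x * sum_y / n) / s_xx          # unrounded slope
b0 = sum_y / n - b1 * sum_x / n                   # unrounded intercept
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s_e = math.sqrt(sse / (n - 2))                    # standard error of estimate

se_b0 = s_e * math.sqrt(sum_x2 / (n * s_xx))      # standard error of the intercept
b0_lower = b0 - T_CRIT * se_b0
b0_upper = b0 + T_CRIT * se_b0

print(f"SE(b0) = {se_b0:.5f}, interval = ({b0_lower:.4f}, {b0_upper:.4f})")
```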

To determine

(e)

Construct a 95% confidence interval for the slope of the regression line.

Expert Solution

Answer to Problem 9CR

Solution:

The required confidence interval is (0.0833,0.1215).

Explanation of Solution

Given Information:

The following data were collected on the thread count and the price (in dollars) of eight bed sheets.

Thread Count	150	200	225	250	275	300	350	400
Price (in Dollars)	18	21	25	28	30	31	35	45

Formula Used:

The correlation coefficient measures the strength of the linear relationship between the response variable and the explanatory variable and is calculated as,

$r = \frac{n\sum x_i y_i - (\sum x_i)(\sum y_i)}{\sqrt{n\sum x_i^2 - (\sum x_i)^2}\,\sqrt{n\sum y_i^2 - (\sum y_i)^2}}$

The coefficient of determination, $r^2$, is the square of the correlation coefficient and measures the proportion of the variation in the response variable that is explained by the explanatory variable.

The standard error of estimate, $S_e$, measures how much the sample data points deviate from the regression line and is calculated as,

$S_e = \sqrt{\frac{\sum (y_i - \hat{y}_i)^2}{n-2}} = \sqrt{\frac{SSE}{n-2}}$

In ANOVA,

The grand mean is the weighted mean of the $k$ sample means, one from each of the $k$ populations, equivalent to the mean of all the sample data combined, given by

$\bar{\bar{x}} = \frac{\sum_{i=1}^{k} n_i \bar{x}_i}{\sum_{i=1}^{k} n_i}$

The sum of squares among treatments (SST) measures the variation between the sample means and the grand mean, given by,

$SST = \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{\bar{x}})^2$

The sum of squares for error (SSE) measures the variation in the sample data resulting from the variability within each sample,

$SSE = \sum_{j=1}^{n_1} (x_{1j} - \bar{x}_1)^2 + \sum_{j=1}^{n_2} (x_{2j} - \bar{x}_2)^2 + \dots + \sum_{j=1}^{n_k} (x_{kj} - \bar{x}_k)^2$

The total variation is the sum of the squared deviations from the grand mean for all of the data values in each sample, given by

$\text{Total Variation} = \sum_{j=1}^{n_1} (x_{1j} - \bar{\bar{x}})^2 + \sum_{j=1}^{n_2} (x_{2j} - \bar{\bar{x}})^2 + \dots + \sum_{j=1}^{n_k} (x_{kj} - \bar{\bar{x}})^2 = SST + SSE$

The mean square for treatments (MST) is found by dividing the sum of squares among treatments by its degrees of freedom, given by

$MST = \frac{SST}{DFT}$ with $DFT = k - 1$

The mean square for error (MSE) is found by dividing the sum of squares for error by its degrees of freedom, given by

$MSE = \frac{SSE}{DFE}$ with $DFE = n_T - k$

The test statistic for an ANOVA test is used when independent, simple random samples are taken from populations with variances that are unknown and assumed to be equal, where all of the $k$ population distributions are approximately normal, given by,

$F = \frac{MST}{MSE}$ with $df_1 = DFT = k - 1$ and $df_2 = DFE = n_T - k$

Calculation:

To generate the regression table in Excel, follow the given steps:

1. Under the Data tab, choose Data Analysis and then select Regression.

2. Select the Input Y Range and enter the range of the given $y_i$ data, then select the Input X Range and enter the range of the given $x_i$ data.

3. Choose a confidence level of 95% and click OK.

The following table will appear.

Regression Statistics
Multiple R	0.983094904
R Square	0.96647559
Adjusted R Square	0.960888189
Standard Error	1.66955532
Observations	8

ANOVA
	df	SS	MS	F	Significance F
Regression	1	482.1505102	482.1505102	172.9740696	1.19253E-05
Residual	6	16.7244898	2.787414966
Total	7	498.875

	Coefficients	Standard Error	t Stat	P-value	Lower 95%	Upper 95%
Intercept	1.591836735	2.17509095	0.731848356	0.491845191	-3.730419077	6.914092547
Slope	0.10244898	0.007789635	13.15196067	1.19253E-05	0.083388428	0.121509531

RESIDUAL OUTPUT
Observation	Predicted y	Residuals	Standard Residuals
1	16.95918367	1.040816327	0.673359012
2	22.08163265	-1.081632653	-0.699765248
3	24.64285714	0.357142857	0.231054563
4	27.20408163	0.795918367	0.514921598
5	29.76530612	0.234693878	0.151835856
6	32.32653061	-1.326530612	-0.858202663
7	37.44897959	-2.448979592	-1.584374147
8	42.57142857	2.428571429	1.571171029

The Lower 95% and Upper 95% columns give the endpoints of the confidence interval; the Slope row gives the interval for the slope.

The Intercept coefficient in the table above is $b_0$ of the regression line, 1.59, and the Slope coefficient, 0.1024, is $b_1$ of the regression line.

So the regression line is,

$\hat{y} = 1.59 + 0.1024x$

The lower and upper endpoints of the 95% confidence interval for the slope of the regression line, $\beta_1$, are (0.0833, 0.1215).

Conclusion:

Thus, the 95% confidence interval for the slope of the regression line is (0.0833, 0.1215).
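The slope interval can also be cross-checked by hand. The sketch below assumes the standard formula $SE(b_1) = S_e / \sqrt{S_{xx}}$ with $S_{xx} = \sum x_i^2 - (\sum x_i)^2 / n$, which the solution does not state, and hard-codes the table value 2.447 for the critical t (df = 6).

```python
import math

T_CRIT = 2.447   # t critical value for df = 6, alpha/2 = 0.025 (from a t-table)

x = [150, 200, 225, 250, 275, 300, 350, 400]
y = [18, 21, 25, 28, 30, 31, 35, 45]

n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi ** 2 for xi in x)

s_xx = sum_x2 - sum_x ** 2 / n                    # corrected sum of squares of x
b1 = (sum_xy - sum_x * sum_y / n) / s_xx          # unrounded slope
b0 = sum_y / n - b1 * sum_x / n                   # unrounded intercept
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s_e = math.sqrt(sse / (n - 2))                    # standard error of estimate

se_b1 = s_e / math.sqrt(s_xx)                     # standard error of the slope
b1_lower = b1 - T_CRIT * se_b1
b1_upper = b1 + T_CRIT * se_b1

print(f"SE(b1) = {se_b1:.6f}, interval = ({b1_lower:.4f}, {b1_upper:.4f})")
```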

Want to see more full solutions like this?

Subscribe now to access step-by-step solutions to millions of textbook problems written by subject matter experts!

Students have asked these similar questions

Problem 3. Pricing a multi-stock option the Margrabe formula The purpose of this problem is to price a swap option in a 2-stock model, similarly as what we did in the example in the lectures. We consider a two-dimensional Brownian motion given by W₁ = (W(¹), W(2)) on a probability space (Q, F,P). Two stock prices are modeled by the following equations: dX = dY₁ = X₁ (rdt+ rdt+0₁dW!) (²)), Y₁ (rdt+dW+0zdW!"), with Xo xo and Yo =yo. This corresponds to the multi-stock model studied in class, but with notation (X+, Y₁) instead of (S(1), S(2)). Given the model above, the measure P is already the risk-neutral measure (Both stocks have rate of return r). We write σ = 0₁+0%. We consider a swap option, which gives you the right, at time T, to exchange one share of X for one share of Y. That is, the option has payoff F=(Yr-XT). (a) We first assume that r = 0 (for questions (a)-(f)). Write an explicit expression for the process Xt. Reminder before proceeding to question (b): Girsanov's theorem…

Problem 1. Multi-stock model We consider a 2-stock model similar to the one studied in class. Namely, we consider = S(1) S(2) = S(¹) exp (σ1B(1) + (M1 - 0/1 ) S(²) exp (02B(2) + (H₂- M2 where (B(¹) ) +20 and (B(2) ) +≥o are two Brownian motions, with t≥0 Cov (B(¹), B(2)) = p min{t, s}. " The purpose of this problem is to prove that there indeed exists a 2-dimensional Brownian motion (W+)+20 (W(1), W(2))+20 such that = S(1) S(2) = = S(¹) exp (011W(¹) + (μ₁ - 01/1) t) 롱) S(²) exp (021W (1) + 022W(2) + (112 - 03/01/12) t). where σ11, 21, 22 are constants to be determined (as functions of σ1, σ2, p). Hint: The constants will follow the formulas developed in the lectures. (a) To show existence of (Ŵ+), first write the expression for both W. (¹) and W (2) functions of (B(1), B(²)). as (b) Using the formulas obtained in (a), show that the process (WA) is actually a 2- dimensional standard Brownian motion (i.e. show that each component is normal, with mean 0, variance t, and that their…

The scores of 8 students on the midterm exam and final exam were as follows. Student Midterm Final Anderson 98 89 Bailey 88 74 Cruz 87 97 DeSana 85 79 Erickson 85 94 Francis 83 71 Gray 74 98 Harris 70 91 Find the value of the (Spearman's) rank correlation coefficient test statistic that would be used to test the claim of no correlation between midterm score and final exam score. Round your answer to 3 places after the decimal point, if necessary. Test statistic: rs =

Answer 2

Question

Chapter 12.CR, Problem 9CR

To determine

(a)

To calculate:

The sum of squared errors, SSE.

Expert Solution

Answer to Problem 9CR

Solution:

The required SSE is 16.7246.

Explanation of Solution

Given Information:

The following data is collected on the number of years of post high-school education and the annual incomes of eight people ten years after graduation from high school.

Thread Count	150	200	225	250	275	300	350	400
Price (in Dollars)	18	21	25	28	30	31	35	45

The least Squares regression line is the line for which the average variation from the data is the smallest, also called the line of best fit, given by

y^=b0+b1x.

Where b1 is the slope of the least-squares regression line for paired data from a sample,

And b0 is the y-intercept for the regression line.

Formula used:

The equation of least-squares regression line is given by,

y^=b0+b1x

Where b1 is the slope of the least-squares regression line given as,

b1=n∑xiyi−(∑xi)(∑yi)n∑xi2−(∑xi)2

And b0 is y-intercept given as,

b0=∑yin−b1∑xin

Where n is the number of data pairs in the sample,

xi is the ith value of the explanatory variable,

And yi is the ith value of response variable.

The sum of squared errors (SSE) for a regression line is calculated as,

SSE=∑(yi−y^i)2

Where, yi is the ith value of response variable,

And y^i is the predicted value of yi, using the least-squares regression model.

Calculation:

Thread Count	Price(in Dollars)	xiyi	xi2	yi2
150	18	2700	22500	324
200	21	4200	40000	441
225	25	5625	50625	625
250	28	7000	62500	784
275	30	8250	75625	900
300	31	9300	90000	961
350	35	12250	122500	1225
400	45	18000	160000	2025
∑xi=2150	∑yi=233	∑xiyi=67325	∑xi2=623750	∑yi2=7285

Let xi be the thread counts of various bed sheets,

And yi be the price of the bed sheets.

The slope of the least-squares regression line is calculated as,

b1=n∑xiyi−(∑xi)(∑yi)n∑xi2−(∑xi)2

Where, ∑xi=x1+x2+....+x8.

Substitute 150 for x1, 200 for x2 ….., 400 for x8 in the above formula.

∑xi=150+200+.....+350+400=2150

Proceed in the same manner to calculate ∑yi,∑xiyi,∑xi2and∑yi2 for the rest of the data and refer table for the rest of the values calculated.

∑yi=18+21+.......+34+45=233

∑xiyi=2700+4200+.......+12250+18000=67325

∑xi2=22500+40000+.......+122500+160000=623750

∑yi2=324+441+625+.......+1225+2025=7285

The slope of the least-squares regression line is calculated as,

b1=n∑xiyi−(∑xi)(∑yi)n∑xi2−(∑xi)2

Substitute 2150 for ∑xi, 233 for ∑yi, 67325 for ∑xiyi, 623750 for ∑xi2 and 8 for n in the above formula.

b1=(8×67325)−(233×2150)8(623750)−(2150)2=0.1024

The y-intercept of regression line is calculated as,

b0=∑yin−b1∑xin

Substitute 2150 for ∑xi, 233 for ∑yi, 8 for n and 0.1024for b1.

b0=2338−(0.1024)21508=1.6050

The equation of least-squares regression line is given by,

y^=b0+b1x

Substitute 28.6514 for b0 and 0.1024 for b1 in the above formula.

y^=1.6050+0.1024x.

Number of years xi	Annual income yi	Predicted value y^i	yi−y^i	(yi−y^i)2
150	18	16.965	1.035	1.071225
200	21	22.085	-1.085	1.177225
225	25	24.645	0.355	0.126025
250	28	27.205	0.795	0.632025
275	30	29.765	0.235	0.055225
300	31	32.325	-1.325	1.755625
350	35	37.445	-2.445	5.978025
400	45	42.565	2.435	5.929225

The predicted values are calculated as,

y^=1.6050+0.1024x

The predicted value y1 is calculated as,

y^1=1.6050+0.1024x1

Substitute 150 for x1 in the above formula.

y^1=1.6050+0.1024(150)=16.965

Proceed in the same manner to calculate y^1 for the rest of the data and refer table for the rest of the values calculated.

The residual is calculated as, yi−y^i,

Substitute 18 for y1 and 16.965for y^1.

y1−y^1=18−16.965=1.035

Square both sides of the equation.

(y1−y^1)2=(1.035)2=1.0712

Proceed in the same manner to calculate (yi−y^i)2 for all the 1≤i≤n for the rest data and refer table for the rest of the (yi−y^i)2 values calculated. Then the value of ∑(yi−y^i)2 is calculated as,

SSE=∑(yi−y^ i)2=1.0712+1.177+0.1260+......+5.92922=16.7246

Conclusion:

Thus, the SSE is 16.7246

To determine

(b)

To calculate:

The standard error of estimate, Se.

Expert Solution

Answer to Problem 9CR

Solution:

The required standard error of estimate is 1.6696.

Explanation of Solution

Given Information:

The following data is collected on the number of years of post high-school education and the annual incomes of eight people ten years after graduation from high school.

Thread Count	150	200	225	250	275	300	350	400
Price (in Dollars)	18	21	25	28	30	31	35	45

Formula used:

The standard error of estimate, which is used to measure by how much the sample data points deviate from regression line is given by,

Se=∑(yi−y^ i)2n−2=SSEn−2

Where, yi is the ith value of response variable,

y^i is the predicted value of yi, using the least-squares regression model,

n is the number of data pairs in the sample,

And SSE is the sum of squared errors.

Calculation:

The standard error of estimate is calculated as,

Se=∑(yi−y^ i)2n−2=SSEn−2

Substitute 5868.153 for SSE and 8 for n in the above formula.

Se=16.72468−2=1.6696

Conclusion:

Thus, the standard error of estimate is 1.6696.

To determine

(c)

The 95% prediction interval for the price of 350-thread count sheets.

Expert Solution

Answer to Problem 9CR

Solution:

The required prediction interval is. (33.0776,41.8124).

Explanation of Solution

Given Information:

The following data is collected on the number of years of post high-school education and the annual incomes of eight people ten years after graduation from high school.

Thread Count	150	200	225	250	275	300	350	400
Price (in Dollars)	18	21	25	28	30	31	35	45

Formula used:

The margin of error of a prediction interval for an individual y-value is calculated as,

E=tα/2Se1+1n+n(x0−x¯)2n(∑xi2)−(∑xi)2

With degree of freedom df=n−2.

Where, xi of response variable,

x0 is the fixed value,

Se is the standard error of estimate,

n is the number of data pairs in the sample,

SSE is the sum of squared errors,

And t -distribution is applied with degree of freedom of df=n−2

Then the prediction interval for an individual y-value is,

(y^−E,y^+E).

Calculation:

It is given that the level of prediction is 0.95 then the level of significance is calculated as,

α=1−0.95=0.05

Then,

tα/2=2.447

The mean of the number of years of post high school education is calculated as,

x¯=∑xin

Substitute 2150 for ∑xi and 8 for n in the above formula of mean.

x¯=21508=268.75

The margin of error of a prediction interval for an individual y-value is calculated as,

E=tα/2Se1+1n+n(x0−x¯)2n(∑xi2)−(∑xi)2

Substitute 2.447 for tα/2, 1.6696 for Se, 8 for n, 350 for x0, 623750 for ∑xi2, 2150 for ∑xi and 268.75 for x¯ in the above formula.

E=2.447×1.6696×1+18+8(350−268.75)28(623750)−(2150)2=2.447×1.6696×1.0690=4.3674

The y^ is calculated by substituting the value of x0 in the regression line equation.

The regression line is,

y^=1.6050+0.1024x

Substitute 350 for x in the above formula.

y^=1.6050+0.1024(350)=37.4450

The prediction interval is,

(y^−E,y^+E)=(37.4450−4.3674,37.4450+4.3674)=(33.0776,41.8124)

Conclusion:

The required prediction interval is. (33.0776,41.8124)

To determine

(d)

The 95% confidence interval for the y-intercept of the regression line.

Expert Solution

Answer to Problem 9CR

Solution:

The required confidence interval is (−3.7304,6.9140).

Explanation of Solution

Given Information:

The following data is collected on the number of years of post high-school education and the annual incomes of eight people ten years after graduation from high school.

Thread Count	150	200	225	250	275	300	350	400
Price (in Dollars)	18	21	25	28	30	31	35	45

Formula Used:

The correlation coefficient measures the linear relationship between a response variable and explanatory variable calculated as,

r=n(∑xiyi)−(∑xi)(∑yi)n∑xi2−(∑xi)2n∑yi2−(∑yi)2

Coefficient of determination measures the proportion of variation in the response variable caused by explanatory variable which is simply the square of r, the correlation coefficient.

The standard error of estimate, Se which is used to calculate by how much the sample data points deviates from regression line which is calculated as,

Se=∑(yi−y^ i)2n−2=SSEn−2

In ANOVA,

Grand Mean is the weighted mean of the k sample means, one from each of the k populations, equivalent to the mean of all the sample data combined, given by

x¯¯=∑i=1knix¯i∑i=1kni

Sum of Squares among Treatments (SST) is the measures the variation between the sample means and the grand mean, given by,

SST =∑i=1kni(x¯i−x¯¯)2

Sum of Squares for Error (SSE) is the measures the variation in the sample data resulting from the variability within each sample,

SSE =∑j=1n1(x1j−x¯1)2+∑j=1n2(x2j−x¯2)2+...+∑j=1nk(xkj−x¯k)2

Total Variation, it is the sum of the squared deviations from the grand mean for all of the data values in each sample, given by

Total Variation=∑j=1n1(x1j−x¯ ¯)2+∑j=1n2(x2j−x¯ ¯)2+...+∑j=1nk(xkj−x¯ ¯)2=SST + SSE

Mean Square for Treatments (MST) found by dividing the sum of squares among treatments by its degrees of freedom, given by

MST=SSTDFTwith DFT=k−1

Mean Square for Error (MSE) found by dividing the sum of squares for error by its degrees of freedom, given by

MST=SSEDFEwith DFE=nT−k

Test Statistic for an ANOVA Test is used when independent, simple random samples are taken from populations with variances that are unknown and assumed to be equal, where all of the k population distributions are approximately normal, given by,

F=MSTMSEwith df1=DFT=k−1 and df2=DFE=nT−k

Calculation:

To generate the regression table in excel follow the given steps:

1. Under data tab, choose data analytics and then select regression.

2. Select the input Y range and enter the range of the given yi and select the input X range and enter the range of the given xi data.

3.Choose 95% confidence interval and click OK.

The following table will appear.

Regression Statistics
Multiple R	0.983094904
R Square	0.96647559
Adjusted R Square	0.960888189
Standard Error	1.66955532
Observations	8

ANOVA
	df	SS	MS	F	Significance F
Regression	1	482.1505102	482.1505102	172.9740696	1.19253E-05
Residual	6	16.7244898	2.787414966
Total	7	498.875

	Coefficients	Standard Error	t Stat	P-value	Lower 95%	Upper 95%
Intercept	1.591836735	2.17509095	0.73184835	0.491845191	-3.730419077	6.914092547
Slope	0.10244898	0.007789635	13.15196067	1.19253E-05	0.083388428	0.121509531

RESIDUAL OUTPUT
Observation	Predicted y	Residuals	Standard Residuals
1	16.95918367	1.040816327	0.673359012
2	22.08163265	-1.081632653	-0.699765248
3	24.64285714	0.357142857	0.231054563
4	27.20408163	0.795918367	0.514921598
5	29.76530612	0.234693878	0.151835856
6	32.32653061	-1.326530612	-0.858202663
7	37.44897959	-2.448979592	-1.584374147
8	42.57142857	2.428571429	1.571171029

The confidence interval of the y-intercept can be constructed by adding and subtracting the margin of error to the point estimate by using Microsoft excel.

Referring regression statistics, R2, coefficient of determination is 0.9995 which implies that 99.95% of the response variable is explained by the explanatory variable.

Standard error is the standard error of estimate, Se calculated by the formula mentioned in concept.

The lower 95% and the upper 95% gives the confidence interval of the y-intercept.

The intercept given in the row of the table above is the b0 of the regression line coming out to be 1.59 and the slope which is 0.1024 given in the table above is b1 of the regression line.

So the regression line is,

y^=1.59+0.1024x

The lower and the upper endpoints for a 95% confidence interval for the y-intercept of the regression line, β0 is. (−3.7304,6.9140)

Conclusion:

Thus, the 95% confidence interval for the y-intercept of the regression line is.

To determine

(e)

Construct a 95% confidence interval for the slope of the regression line.

Expert Solution

Answer to Problem 9CR

Solution:

The required confidence interval is (0.0833,0.1215).

Explanation of Solution

Given Information:

The following data is collected on the number of years of post high-school education and the annual incomes of eight people ten years after graduation from high school.

Thread Count	150	200	225	250	275	300	350	400
Price (in Dollars)	18	21	25	28	30	31	35	45

Formula Used:

The correlation coefficient measures the strength of the linear relationship between the response variable and the explanatory variable, and is calculated as,

$$r=\frac{n\sum x_iy_i-\left(\sum x_i\right)\left(\sum y_i\right)}{\sqrt{n\sum x_i^2-\left(\sum x_i\right)^2}\,\sqrt{n\sum y_i^2-\left(\sum y_i\right)^2}}$$

The coefficient of determination measures the proportion of variation in the response variable that is explained by the explanatory variable; it is simply the square of r, the correlation coefficient.

The standard error of estimate, Se, measures how much the sample data points deviate from the regression line, and is calculated as,

$$S_e=\sqrt{\frac{\sum\left(y_i-\hat{y}_i\right)^2}{n-2}}=\sqrt{\frac{SSE}{n-2}}$$

In ANOVA,

The grand mean is the weighted mean of the k sample means, one from each of the k populations, equivalent to the mean of all the sample data combined, given by

$$\bar{\bar{x}}=\frac{\sum_{i=1}^{k}n_i\bar{x}_i}{\sum_{i=1}^{k}n_i}$$

The Sum of Squares among Treatments (SST) measures the variation between the sample means and the grand mean, given by,

$$SST=\sum_{i=1}^{k}n_i\left(\bar{x}_i-\bar{\bar{x}}\right)^2$$

The Sum of Squares for Error (SSE) measures the variation in the sample data resulting from the variability within each sample, given by

$$SSE=\sum_{j=1}^{n_1}\left(x_{1j}-\bar{x}_1\right)^2+\sum_{j=1}^{n_2}\left(x_{2j}-\bar{x}_2\right)^2+\dots+\sum_{j=1}^{n_k}\left(x_{kj}-\bar{x}_k\right)^2$$

The Total Variation is the sum of the squared deviations from the grand mean for all of the data values in each sample, given by

$$\text{Total Variation}=\sum_{j=1}^{n_1}\left(x_{1j}-\bar{\bar{x}}\right)^2+\sum_{j=1}^{n_2}\left(x_{2j}-\bar{\bar{x}}\right)^2+\dots+\sum_{j=1}^{n_k}\left(x_{kj}-\bar{\bar{x}}\right)^2=SST+SSE$$

The Mean Square for Treatments (MST) is found by dividing the sum of squares among treatments by its degrees of freedom, given by

$$MST=\frac{SST}{DFT}\quad\text{with }DFT=k-1$$

The Mean Square for Error (MSE) is found by dividing the sum of squares for error by its degrees of freedom, given by

$$MSE=\frac{SSE}{DFE}\quad\text{with }DFE=n_T-k$$

The test statistic for an ANOVA test is used when independent, simple random samples are taken from populations with variances that are unknown and assumed to be equal, where all of the k population distributions are approximately normal, given by,

$$F=\frac{MST}{MSE}\quad\text{with }df_1=DFT=k-1\text{ and }df_2=DFE=n_T-k$$

Calculation:

To generate the regression table in Excel, follow these steps:

1. Under the Data tab, choose Data Analysis and then select Regression.

2. For Input Y Range, enter the range of the given yi data, and for Input X Range, enter the range of the given xi data.

3. Set the confidence level to 95% and click OK.

The following table will appear.

Regression Statistics
Multiple R	0.983094904
R Square	0.96647559
Adjusted R Square	0.960888189
Standard Error	1.66955532
Observations	8

ANOVA
	df	SS	MS	F	Significance F
Regression	1	482.1505102	482.1505102	172.9740696	1.19253E-05
Residual	6	16.7244898	2.787414966
Total	7	498.875

	Coefficients	Standard Error	t Stat	P-value	Lower 95%	Upper 95%
Intercept	1.591836735	2.17509095	0.731848356	0.491845191	-3.730419077	6.914092547
Slope	0.10244898	0.007789635	13.15196067	1.19253E-05	0.083388428	0.121509531

RESIDUAL OUTPUT
Observation	Predicted y	Residuals	Standard Residuals
1	16.95918367	1.040816327	0.673359012
2	22.08163265	-1.081632653	-0.699765248
3	24.64285714	0.357142857	0.231054563
4	27.20408163	0.795918367	0.514921598
5	29.76530612	0.234693878	0.151835856
6	32.32653061	-1.326530612	-0.858202663
7	37.44897959	-2.448979592	-1.584374147
8	42.57142857	2.428571429	1.571171029

The Lower 95% and Upper 95% columns give the endpoints of the confidence interval for the slope.

The Intercept row of the table above gives b0 of the regression line, approximately 1.59, and the Slope row gives b1, approximately 0.1024.

So the regression line is,

$$\hat{y}=1.59+0.1024x$$

The lower and upper endpoints of the 95% confidence interval for the slope of the regression line, β1, are (0.0833, 0.1215).

Conclusion:

Thus, the 95% confidence interval for the slope of the regression line is (0.0833,0.1215).
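
The Excel endpoints can be cross-checked by hand. The following is a minimal Python sketch under two assumptions that are not stated in the solution: the usual textbook formula for the standard error of the slope, Se divided by the square root of ∑(xi − x̄)², and the table value t = 2.447 for 6 degrees of freedom. The function name `slope_confidence_interval` is illustrative only.

```python
import math

def slope_confidence_interval(x, y, t_crit):
    """Return (b1, se_b1, lower, upper) for the slope of a simple linear regression.

    Assumes SE(b1) = S_e / sqrt(S_xx), where S_xx is the sum of squared
    deviations of x from its mean.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_e = math.sqrt(sse / (n - 2))   # standard error of estimate
    se_b1 = s_e / math.sqrt(s_xx)    # standard error of the slope
    margin = t_crit * se_b1
    return b1, se_b1, b1 - margin, b1 + margin

thread_count = [150, 200, 225, 250, 275, 300, 350, 400]
price = [18, 21, 25, 28, 30, 31, 35, 45]

# t_crit = 2.447 is the two-sided 95% table value for df = 8 - 2 = 6.
b1, se_b1, lower, upper = slope_confidence_interval(thread_count, price, 2.447)
print(f"b1 = {b1:.4f}, SE(b1) = {se_b1:.6f}")
print(f"95% CI for the slope: ({lower:.4f}, {upper:.4f})")
```

The computed standard error and endpoints should agree closely with the Standard Error, Lower 95% and Upper 95% entries in the Slope row of the table; small differences in the last digits come from using the rounded critical value.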

Answer 3

Question

Chapter 12.CR, Problem 9CR

To determine

(a)

To calculate:

The sum of squared errors, SSE.

Expert Solution

Answer to Problem 9CR

Solution:

The required SSE is 16.7246.

Explanation of Solution

Given Information:

The following data were collected on the thread count and the price (in dollars) of eight bed sheets.

Thread Count	150	200	225	250	275	300	350	400
Price (in Dollars)	18	21	25	28	30	31	35	45

The least-squares regression line, also called the line of best fit, is the line for which the sum of the squared errors between the observed and predicted values is the smallest. It is given by

$$\hat{y}=b_0+b_1x$$

where b1 is the slope of the least-squares regression line for paired data from a sample,

and b0 is the y-intercept of the regression line.

Formula used:

The equation of the least-squares regression line is given by,

$$\hat{y}=b_0+b_1x$$

where b1 is the slope of the least-squares regression line, given as,

$$b_1=\frac{n\sum x_iy_i-\left(\sum x_i\right)\left(\sum y_i\right)}{n\sum x_i^2-\left(\sum x_i\right)^2}$$

and b0 is the y-intercept, given as,

$$b_0=\frac{\sum y_i}{n}-b_1\frac{\sum x_i}{n}$$

where n is the number of data pairs in the sample,

xi is the ith value of the explanatory variable,

and yi is the ith value of the response variable.

The sum of squared errors (SSE) for a regression line is calculated as,

$$SSE=\sum\left(y_i-\hat{y}_i\right)^2$$

where yi is the ith value of the response variable,

and ŷi is the predicted value of yi, using the least-squares regression model.

Calculation:

Thread Count	Price(in Dollars)	xiyi	xi2	yi2
150	18	2700	22500	324
200	21	4200	40000	441
225	25	5625	50625	625
250	28	7000	62500	784
275	30	8250	75625	900
300	31	9300	90000	961
350	35	12250	122500	1225
400	45	18000	160000	2025
∑xi=2150	∑yi=233	∑xiyi=67325	∑xi2=623750	∑yi2=7285

Let xi be the thread counts of various bed sheets,

And yi be the price of the bed sheets.

The slope of the least-squares regression line is calculated as,

$$b_1=\frac{n\sum x_iy_i-\left(\sum x_i\right)\left(\sum y_i\right)}{n\sum x_i^2-\left(\sum x_i\right)^2}$$

where ∑xi = x1 + x2 + ... + x8.

Substitute 150 for x1, 200 for x2, ..., 400 for x8 in the above sum.

$$\sum x_i=150+200+\dots+350+400=2150$$

Proceed in the same manner to calculate ∑yi, ∑xiyi, ∑xi² and ∑yi², and refer to the table for the individual values.

$$\sum y_i=18+21+\dots+35+45=233$$

$$\sum x_iy_i=2700+4200+\dots+12250+18000=67325$$

$$\sum x_i^2=22500+40000+\dots+122500+160000=623750$$

$$\sum y_i^2=324+441+625+\dots+1225+2025=7285$$

Substitute 2150 for ∑xi, 233 for ∑yi, 67325 for ∑xiyi, 623750 for ∑xi² and 8 for n in the slope formula.

$$b_1=\frac{(8\times 67325)-(233\times 2150)}{8(623750)-(2150)^2}=\frac{37650}{367500}\approx 0.1024$$

The y-intercept of the regression line is calculated as,

$$b_0=\frac{\sum y_i}{n}-b_1\frac{\sum x_i}{n}$$

Substitute 2150 for ∑xi, 233 for ∑yi, 8 for n and 0.1024 for b1.

$$b_0=\frac{233}{8}-(0.1024)\frac{2150}{8}=1.6050$$

The equation of the least-squares regression line is given by,

$$\hat{y}=b_0+b_1x$$

Substitute 1.6050 for b0 and 0.1024 for b1 in the above formula.

$$\hat{y}=1.6050+0.1024x$$

Thread count xi	Price (in dollars) yi	Predicted value y^i	yi−y^i	(yi−y^i)2
150	18	16.965	1.035	1.071225
200	21	22.085	-1.085	1.177225
225	25	24.645	0.355	0.126025
250	28	27.205	0.795	0.632025
275	30	29.765	0.235	0.055225
300	31	32.325	-1.325	1.755625
350	35	37.445	-2.445	5.978025
400	45	42.565	2.435	5.929225

The predicted values are calculated from the regression line,

$$\hat{y}=1.6050+0.1024x$$

The predicted value ŷ1 is calculated as,

$$\hat{y}_1=1.6050+0.1024x_1$$

Substitute 150 for x1 in the above formula.

$$\hat{y}_1=1.6050+0.1024(150)=16.965$$

Proceed in the same manner to calculate ŷi for the rest of the data, and refer to the table for the remaining values.

The residual is calculated as yi−ŷi.

Substitute 18 for y1 and 16.965 for ŷ1.

$$y_1-\hat{y}_1=18-16.965=1.035$$

Square the residual.

$$\left(y_1-\hat{y}_1\right)^2=(1.035)^2=1.0712$$

Proceed in the same manner to calculate (yi−ŷi)² for all 1≤i≤n, and refer to the table for the remaining values. Then the value of ∑(yi−ŷi)² is calculated as,

$$SSE=\sum\left(y_i-\hat{y}_i\right)^2=1.0712+1.1772+0.1260+\dots+5.9292=16.7246$$

Conclusion:

Thus, the SSE is 16.7246.
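
The hand calculation can be reproduced with a short script. This is a minimal Python sketch of the summation formulas used above; the function names `least_squares_fit` and `sum_squared_errors` are illustrative and not part of the original solution. It evaluates the SSE twice, once with unrounded coefficients and once with the rounded values b0 = 1.6050 and b1 = 0.1024 used in the table, which shows how much of the result depends on rounding.

```python
# Thread count (x) and price in dollars (y) from the given data.
x = [150, 200, 225, 250, 275, 300, 350, 400]
y = [18, 21, 25, 28, 30, 31, 35, 45]

def least_squares_fit(x, y):
    """Return (b0, b1) using the summation formulas for slope and intercept."""
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b0 = sum_y / n - b1 * sum_x / n
    return b0, b1

def sum_squared_errors(x, y, b0, b1):
    """Return the sum of squared residuals for the line y_hat = b0 + b1 * x."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

b0, b1 = least_squares_fit(x, y)
sse_exact = sum_squared_errors(x, y, b0, b1)
# Same calculation with the coefficients rounded as in the hand solution.
sse_rounded = sum_squared_errors(x, y, 1.6050, 0.1024)

print(f"b1 = {b1:.4f}, b0 = {b0:.4f}")
print(f"SSE with unrounded coefficients = {sse_exact:.4f}")
print(f"SSE with rounded coefficients   = {sse_rounded:.4f}")
```

Because the slope is rounded to 0.1024 before the intercept is computed, the hand value b0 = 1.6050 differs slightly from the unrounded intercept, and the two SSE values differ in the fourth decimal place.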

To determine

(b)

To calculate:

The standard error of estimate, Se.

Expert Solution

Answer to Problem 9CR

Solution:

The required standard error of estimate is 1.6696.

Explanation of Solution

Given Information:

The following data were collected on the thread count and the price (in dollars) of eight bed sheets.

Thread Count	150	200	225	250	275	300	350	400
Price (in Dollars)	18	21	25	28	30	31	35	45

Formula used:

The standard error of estimate, which measures how much the sample data points deviate from the regression line, is given by,

$$S_e=\sqrt{\frac{\sum\left(y_i-\hat{y}_i\right)^2}{n-2}}=\sqrt{\frac{SSE}{n-2}}$$

where yi is the ith value of the response variable,

ŷi is the predicted value of yi, using the least-squares regression model,

n is the number of data pairs in the sample,

and SSE is the sum of squared errors.

Calculation:

The standard error of estimate is calculated as,

$$S_e=\sqrt{\frac{SSE}{n-2}}$$

Substitute 16.7246 for SSE and 8 for n in the above formula.

$$S_e=\sqrt{\frac{16.7246}{8-2}}=1.6696$$

Conclusion:

Thus, the standard error of estimate is 1.6696.
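
As a quick check of the arithmetic, the formula can be evaluated directly. This is a minimal Python sketch; the helper name `standard_error_of_estimate` is illustrative, and the inputs are the SSE from part (a) and n = 8.

```python
import math

def standard_error_of_estimate(sse, n):
    """Return S_e = sqrt(SSE / (n - 2)) for a simple linear regression."""
    if n <= 2:
        raise ValueError("at least three data pairs are required")
    return math.sqrt(sse / (n - 2))

s_e = standard_error_of_estimate(16.7246, 8)
print(f"S_e = {s_e:.4f}")
```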

To determine

(c)

The 95% prediction interval for the price of 350-thread count sheets.

Expert Solution

Answer to Problem 9CR

Solution:

The required prediction interval is (32.8432, 42.0468).

Explanation of Solution

Given Information:

The following data were collected on the thread count and the price (in dollars) of eight bed sheets.

Thread Count	150	200	225	250	275	300	350	400
Price (in Dollars)	18	21	25	28	30	31	35	45

Formula used:

The margin of error of a prediction interval for an individual y-value is calculated as,

$$E=t_{\alpha/2}\,S_e\sqrt{1+\frac{1}{n}+\frac{n\left(x_0-\bar{x}\right)^2}{n\left(\sum x_i^2\right)-\left(\sum x_i\right)^2}}$$

where xi is the ith value of the explanatory variable,

x0 is the fixed value of the explanatory variable at which the prediction is made,

x̄ is the mean of the xi values,

Se is the standard error of estimate,

n is the number of data pairs in the sample,

and tα/2 is the critical value of the t-distribution with df=n−2 degrees of freedom.

Then the prediction interval for an individual y-value is,

$$\left(\hat{y}-E,\ \hat{y}+E\right)$$

Calculation:

The confidence level is 0.95, so the level of significance is calculated as,

$$\alpha=1-0.95=0.05$$

Then, with df=8−2=6,

$$t_{\alpha/2}=t_{0.025}=2.447$$

The mean thread count is calculated as,

$$\bar{x}=\frac{\sum x_i}{n}$$

Substitute 2150 for ∑xi and 8 for n in the above formula.

$$\bar{x}=\frac{2150}{8}=268.75$$

Substitute 2.447 for tα/2, 1.6696 for Se, 8 for n, 350 for x0, 623750 for ∑xi², 2150 for ∑xi and 268.75 for x̄ in the formula for the margin of error.

$$E=2.447\times 1.6696\times\sqrt{1+\frac{1}{8}+\frac{8(350-268.75)^2}{8(623750)-(2150)^2}}$$

The expression under the square root is 1+0.125+52812.5/367500≈1.268707, so

$$E\approx 2.447\times 1.6696\times 1.12637\approx 4.6018$$

The value of ŷ is calculated by substituting x0 into the regression line equation.

The regression line is,

$$\hat{y}=1.6050+0.1024x$$

Substitute 350 for x in the above formula.

$$\hat{y}=1.6050+0.1024(350)=37.4450$$

The prediction interval is,

$$\left(\hat{y}-E,\ \hat{y}+E\right)=(37.4450-4.6018,\ 37.4450+4.6018)=(32.8432,\ 42.0468)$$

Conclusion:

The required prediction interval is (32.8432, 42.0468).
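
The margin-of-error formula has several nested terms, so it is worth evaluating it step by step. The following is a minimal Python sketch that applies the formula exactly as stated in this part, using the rounded inputs substituted above; the function name `prediction_interval` and its parameter names are illustrative only.

```python
import math

def prediction_interval(x0, b0, b1, s_e, t_crit, n, sum_x, sum_x2):
    """Return (lower, upper, margin) for an individual y-value at x0.

    margin = t * S_e * sqrt(1 + 1/n + n*(x0 - x_bar)^2 / (n*sum_x2 - sum_x^2))
    """
    x_bar = sum_x / n
    under_root = 1 + 1 / n + n * (x0 - x_bar) ** 2 / (n * sum_x2 - sum_x ** 2)
    margin = t_crit * s_e * math.sqrt(under_root)
    y_hat = b0 + b1 * x0
    return y_hat - margin, y_hat + margin, margin

lower, upper, margin = prediction_interval(
    x0=350, b0=1.6050, b1=0.1024, s_e=1.6696,
    t_crit=2.447,              # table value for df = 8 - 2 = 6
    n=8, sum_x=2150, sum_x2=623750,
)
print(f"E = {margin:.4f}")
print(f"95% prediction interval: ({lower:.4f}, {upper:.4f})")
```

Printing the intermediate quantity under the square root is a simple way to catch a dropped term such as 1/n.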

To determine

(d)

The 95% confidence interval for the y-intercept of the regression line.

Expert Solution

Answer to Problem 9CR

Solution:

The required confidence interval is (−3.7304,6.9140).

Explanation of Solution

Given Information:

The following data were collected on the thread count and the price (in dollars) of eight bed sheets.

Thread Count	150	200	225	250	275	300	350	400
Price (in Dollars)	18	21	25	28	30	31	35	45

Formula Used:

The correlation coefficient measures the strength of the linear relationship between the response variable and the explanatory variable, and is calculated as,

$$r=\frac{n\sum x_iy_i-\left(\sum x_i\right)\left(\sum y_i\right)}{\sqrt{n\sum x_i^2-\left(\sum x_i\right)^2}\,\sqrt{n\sum y_i^2-\left(\sum y_i\right)^2}}$$

The coefficient of determination measures the proportion of variation in the response variable that is explained by the explanatory variable; it is simply the square of r, the correlation coefficient.

The standard error of estimate, Se, measures how much the sample data points deviate from the regression line, and is calculated as,

$$S_e=\sqrt{\frac{\sum\left(y_i-\hat{y}_i\right)^2}{n-2}}=\sqrt{\frac{SSE}{n-2}}$$

In ANOVA,

The grand mean is the weighted mean of the k sample means, one from each of the k populations, equivalent to the mean of all the sample data combined, given by

$$\bar{\bar{x}}=\frac{\sum_{i=1}^{k}n_i\bar{x}_i}{\sum_{i=1}^{k}n_i}$$

The Sum of Squares among Treatments (SST) measures the variation between the sample means and the grand mean, given by,

$$SST=\sum_{i=1}^{k}n_i\left(\bar{x}_i-\bar{\bar{x}}\right)^2$$

The Sum of Squares for Error (SSE) measures the variation in the sample data resulting from the variability within each sample, given by

$$SSE=\sum_{j=1}^{n_1}\left(x_{1j}-\bar{x}_1\right)^2+\sum_{j=1}^{n_2}\left(x_{2j}-\bar{x}_2\right)^2+\dots+\sum_{j=1}^{n_k}\left(x_{kj}-\bar{x}_k\right)^2$$

The Total Variation is the sum of the squared deviations from the grand mean for all of the data values in each sample, given by

$$\text{Total Variation}=\sum_{j=1}^{n_1}\left(x_{1j}-\bar{\bar{x}}\right)^2+\sum_{j=1}^{n_2}\left(x_{2j}-\bar{\bar{x}}\right)^2+\dots+\sum_{j=1}^{n_k}\left(x_{kj}-\bar{\bar{x}}\right)^2=SST+SSE$$

The Mean Square for Treatments (MST) is found by dividing the sum of squares among treatments by its degrees of freedom, given by

$$MST=\frac{SST}{DFT}\quad\text{with }DFT=k-1$$

The Mean Square for Error (MSE) is found by dividing the sum of squares for error by its degrees of freedom, given by

$$MSE=\frac{SSE}{DFE}\quad\text{with }DFE=n_T-k$$

The test statistic for an ANOVA test is used when independent, simple random samples are taken from populations with variances that are unknown and assumed to be equal, where all of the k population distributions are approximately normal, given by,

$$F=\frac{MST}{MSE}\quad\text{with }df_1=DFT=k-1\text{ and }df_2=DFE=n_T-k$$

Calculation:

To generate the regression table in Excel, follow these steps:

1. Under the Data tab, choose Data Analysis and then select Regression.

2. For Input Y Range, enter the range of the given yi data, and for Input X Range, enter the range of the given xi data.

3. Set the confidence level to 95% and click OK.

The following table will appear.

Regression Statistics
Multiple R	0.983094904
R Square	0.96647559
Adjusted R Square	0.960888189
Standard Error	1.66955532
Observations	8

ANOVA
	df	SS	MS	F	Significance F
Regression	1	482.1505102	482.1505102	172.9740696	1.19253E-05
Residual	6	16.7244898	2.787414966
Total	7	498.875

	Coefficients	Standard Error	t Stat	P-value	Lower 95%	Upper 95%
Intercept	1.591836735	2.17509095	0.73184835	0.491845191	-3.730419077	6.914092547
Slope	0.10244898	0.007789635	13.15196067	1.19253E-05	0.083388428	0.121509531

RESIDUAL OUTPUT
Observation	Predicted y	Residuals	Standard Residuals
1	16.95918367	1.040816327	0.673359012
2	22.08163265	-1.081632653	-0.699765248
3	24.64285714	0.357142857	0.231054563
4	27.20408163	0.795918367	0.514921598
5	29.76530612	0.234693878	0.151835856
6	32.32653061	-1.326530612	-0.858202663
7	37.44897959	-2.448979592	-1.584374147
8	42.57142857	2.428571429	1.571171029

The confidence interval for the y-intercept is constructed by adding the margin of error to, and subtracting it from, the point estimate; Microsoft Excel reports the resulting endpoints directly.

In the regression statistics, R², the coefficient of determination, is 0.9665, which implies that 96.65% of the variation in the response variable is explained by the explanatory variable.

Standard Error is the standard error of estimate, Se, calculated by the formula given above.

The Lower 95% and Upper 95% columns give the endpoints of the confidence interval for the y-intercept.

The Intercept row of the table above gives b0 of the regression line, approximately 1.59, and the Slope row gives b1, approximately 0.1024.

So the regression line is,

$$\hat{y}=1.59+0.1024x$$

The lower and upper endpoints of the 95% confidence interval for the y-intercept of the regression line, β0, are (−3.7304, 6.9140).

Conclusion:

Thus, the 95% confidence interval for the y-intercept of the regression line is (−3.7304, 6.9140).
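
The interval reported by Excel can also be rebuilt from the summary values. The following is a minimal Python sketch, assuming the usual textbook formula for the standard error of the intercept, Se·sqrt(1/n + x̄²/Sxx) with Sxx = ∑xi² − (∑xi)²/n, which is not stated in the solution, and the table value t = 2.447 for 6 degrees of freedom. The variable names are illustrative.

```python
import math

# Summary values taken from part (a) and from the Excel output.
n = 8
sum_x = 2150
sum_x2 = 623750
b0 = 1.591836735   # Intercept coefficient from the Excel table
s_e = 1.66955532   # Standard Error from the Regression Statistics block
t_crit = 2.447     # two-sided 95% table value for df = n - 2 = 6

x_bar = sum_x / n
s_xx = sum_x2 - sum_x ** 2 / n                        # sum of squared deviations of x
se_b0 = s_e * math.sqrt(1 / n + x_bar ** 2 / s_xx)    # assumed formula for SE(b0)
margin = t_crit * se_b0
lower, upper = b0 - margin, b0 + margin

print(f"SE(b0) = {se_b0:.6f}")
print(f"95% CI for the y-intercept: ({lower:.4f}, {upper:.4f})")
```

The computed standard error should match the Standard Error entry in the Intercept row, and the endpoints should agree with Lower 95% and Upper 95% apart from small differences caused by the rounded critical value.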

To determine

(e)

Construct a 95% confidence interval for the slope of the regression line.

Expert Solution

Answer to Problem 9CR

Solution:

The required confidence interval is (0.0833,0.1215).

Explanation of Solution

Given Information:

The following data were collected on the thread count and the price (in dollars) of eight bed sheets.

Thread Count	150	200	225	250	275	300	350	400
Price (in Dollars)	18	21	25	28	30	31	35	45

Formula Used:

The correlation coefficient measures the strength of the linear relationship between the response variable and the explanatory variable, and is calculated as,

$$r=\frac{n\sum x_iy_i-\left(\sum x_i\right)\left(\sum y_i\right)}{\sqrt{n\sum x_i^2-\left(\sum x_i\right)^2}\,\sqrt{n\sum y_i^2-\left(\sum y_i\right)^2}}$$

The coefficient of determination measures the proportion of variation in the response variable that is explained by the explanatory variable; it is simply the square of r, the correlation coefficient.

The standard error of estimate, Se, measures how much the sample data points deviate from the regression line, and is calculated as,

$$S_e=\sqrt{\frac{\sum\left(y_i-\hat{y}_i\right)^2}{n-2}}=\sqrt{\frac{SSE}{n-2}}$$

In ANOVA,

The grand mean is the weighted mean of the k sample means, one from each of the k populations, equivalent to the mean of all the sample data combined, given by

$$\bar{\bar{x}}=\frac{\sum_{i=1}^{k}n_i\bar{x}_i}{\sum_{i=1}^{k}n_i}$$

The Sum of Squares among Treatments (SST) measures the variation between the sample means and the grand mean, given by,

$$SST=\sum_{i=1}^{k}n_i\left(\bar{x}_i-\bar{\bar{x}}\right)^2$$

The Sum of Squares for Error (SSE) measures the variation in the sample data resulting from the variability within each sample, given by

$$SSE=\sum_{j=1}^{n_1}\left(x_{1j}-\bar{x}_1\right)^2+\sum_{j=1}^{n_2}\left(x_{2j}-\bar{x}_2\right)^2+\dots+\sum_{j=1}^{n_k}\left(x_{kj}-\bar{x}_k\right)^2$$

The Total Variation is the sum of the squared deviations from the grand mean for all of the data values in each sample, given by

$$\text{Total Variation}=\sum_{j=1}^{n_1}\left(x_{1j}-\bar{\bar{x}}\right)^2+\sum_{j=1}^{n_2}\left(x_{2j}-\bar{\bar{x}}\right)^2+\dots+\sum_{j=1}^{n_k}\left(x_{kj}-\bar{\bar{x}}\right)^2=SST+SSE$$

The Mean Square for Treatments (MST) is found by dividing the sum of squares among treatments by its degrees of freedom, given by

$$MST=\frac{SST}{DFT}\quad\text{with }DFT=k-1$$

The Mean Square for Error (MSE) is found by dividing the sum of squares for error by its degrees of freedom, given by

$$MSE=\frac{SSE}{DFE}\quad\text{with }DFE=n_T-k$$

The test statistic for an ANOVA test is used when independent, simple random samples are taken from populations with variances that are unknown and assumed to be equal, where all of the k population distributions are approximately normal, given by,

$$F=\frac{MST}{MSE}\quad\text{with }df_1=DFT=k-1\text{ and }df_2=DFE=n_T-k$$

Calculation:

To generate the regression table in Excel, follow these steps:

1. Under the Data tab, choose Data Analysis and then select Regression.

2. For Input Y Range, enter the range of the given yi data, and for Input X Range, enter the range of the given xi data.

3. Set the confidence level to 95% and click OK.

The following table will appear.

Regression Statistics
Multiple R	0.983094904
R Square	0.96647559
Adjusted R Square	0.960888189
Standard Error	1.66955532
Observations	8

ANOVA
	df	SS	MS	F	Significance F
Regression	1	482.1505102	482.1505102	172.9740696	1.19253E-05
Residual	6	16.7244898	2.787414966
Total	7	498.875

	Coefficients	Standard Error	t Stat	P-value	Lower 95%	Upper 95%
Intercept	1.591836735	2.17509095	0.731848356	0.491845191	-3.730419077	6.914092547
Slope	0.10244898	0.007789635	13.15196067	1.19253E-05	0.083388428	0.121509531

RESIDUAL OUTPUT
Observation	Predicted y	Residuals	Standard Residuals
1	16.95918367	1.040816327	0.673359012
2	22.08163265	-1.081632653	-0.699765248
3	24.64285714	0.357142857	0.231054563
4	27.20408163	0.795918367	0.514921598
5	29.76530612	0.234693878	0.151835856
6	32.32653061	-1.326530612	-0.858202663
7	37.44897959	-2.448979592	-1.584374147
8	42.57142857	2.428571429	1.571171029

The Lower 95% and Upper 95% columns give the endpoints of the confidence interval for the slope.

The Intercept row of the table above gives b0 of the regression line, approximately 1.59, and the Slope row gives b1, approximately 0.1024.

So the regression line is,

$$\hat{y}=1.59+0.1024x$$

The lower and upper endpoints of the 95% confidence interval for the slope of the regression line, β1, are (0.0833, 0.1215).

Conclusion:

Thus, the 95% confidence interval for the slope of the regression line is (0.0833,0.1215).
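
The correlation coefficient, the coefficient of determination and the F statistic in the Excel output can be recomputed from the column sums in part (a). The following is a minimal Python sketch; it assumes the standard regression identities SS(Regression) = r²·SS(Total) and F = MS(Regression)/MS(Residual) with 1 and n − 2 degrees of freedom, which the solution does not spell out. The variable names are illustrative.

```python
import math

# Column sums from the calculation table in part (a).
n = 8
sum_x, sum_y = 2150, 233
sum_xy, sum_x2, sum_y2 = 67325, 623750, 7285

# Correlation coefficient from the summation formula.
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt(n * sum_x2 - sum_x ** 2) * math.sqrt(n * sum_y2 - sum_y ** 2)
r = numerator / denominator
r_squared = r ** 2

# Sums of squares for the regression ANOVA table (assumed identities).
ss_total = sum_y2 - sum_y ** 2 / n
ss_regression = r_squared * ss_total
ss_residual = ss_total - ss_regression

# F statistic with 1 and n - 2 degrees of freedom.
f_stat = (ss_regression / 1) / (ss_residual / (n - 2))

print(f"r = {r:.6f}, R^2 = {r_squared:.6f}")
print(f"SS total = {ss_total:.3f}, SS residual = {ss_residual:.4f}")
print(f"F = {f_stat:.3f}")
```

These values can be compared with the Multiple R, R Square, SS and F entries of the table; the residual sum of squares is the same SSE found in part (a), up to rounding.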

Answer 4

Question

Chapter 12.CR, Problem 9CR

To determine

(e)

Construct a 95% confidence interval for the slope of the regression line.

Expert Solution

Answer to Problem 9CR

Solution:

The required confidence interval is (0.0833,0.1215).

Explanation of Solution

Given Information:

The following data were collected on the thread count and the price (in dollars) of eight bed sheets.

Thread Count	150	200	225	250	275	300	350	400
Price (in Dollars)	18	21	25	28	30	31	35	45

Formula Used:

The correlation coefficient measures the strength of the linear relationship between the response variable and the explanatory variable, and is calculated as,

$$r=\frac{n\sum x_iy_i-\left(\sum x_i\right)\left(\sum y_i\right)}{\sqrt{n\sum x_i^2-\left(\sum x_i\right)^2}\,\sqrt{n\sum y_i^2-\left(\sum y_i\right)^2}}$$

The coefficient of determination measures the proportion of variation in the response variable that is explained by the explanatory variable; it is simply the square of r, the correlation coefficient.

The standard error of estimate, Se, measures how much the sample data points deviate from the regression line, and is calculated as,

$$S_e=\sqrt{\frac{\sum\left(y_i-\hat{y}_i\right)^2}{n-2}}=\sqrt{\frac{SSE}{n-2}}$$

In ANOVA,

The grand mean is the weighted mean of the k sample means, one from each of the k populations, equivalent to the mean of all the sample data combined, given by

$$\bar{\bar{x}}=\frac{\sum_{i=1}^{k}n_i\bar{x}_i}{\sum_{i=1}^{k}n_i}$$

The Sum of Squares among Treatments (SST) measures the variation between the sample means and the grand mean, given by,

$$SST=\sum_{i=1}^{k}n_i\left(\bar{x}_i-\bar{\bar{x}}\right)^2$$

The Sum of Squares for Error (SSE) measures the variation in the sample data resulting from the variability within each sample, given by

$$SSE=\sum_{j=1}^{n_1}\left(x_{1j}-\bar{x}_1\right)^2+\sum_{j=1}^{n_2}\left(x_{2j}-\bar{x}_2\right)^2+\dots+\sum_{j=1}^{n_k}\left(x_{kj}-\bar{x}_k\right)^2$$

The Total Variation is the sum of the squared deviations from the grand mean for all of the data values in each sample, given by

$$\text{Total Variation}=\sum_{j=1}^{n_1}\left(x_{1j}-\bar{\bar{x}}\right)^2+\sum_{j=1}^{n_2}\left(x_{2j}-\bar{\bar{x}}\right)^2+\dots+\sum_{j=1}^{n_k}\left(x_{kj}-\bar{\bar{x}}\right)^2=SST+SSE$$

The Mean Square for Treatments (MST) is found by dividing the sum of squares among treatments by its degrees of freedom, given by

$$MST=\frac{SST}{DFT}\quad\text{with }DFT=k-1$$

The Mean Square for Error (MSE) is found by dividing the sum of squares for error by its degrees of freedom, given by

$$MSE=\frac{SSE}{DFE}\quad\text{with }DFE=n_T-k$$

The test statistic for an ANOVA test is used when independent, simple random samples are taken from populations with variances that are unknown and assumed to be equal, where all of the k population distributions are approximately normal, given by,

$$F=\frac{MST}{MSE}\quad\text{with }df_1=DFT=k-1\text{ and }df_2=DFE=n_T-k$$

Calculation:

To generate the regression table in Excel, follow the given steps:

1. Under the Data tab, choose Data Analysis and then select Regression.

2. Select the Input Y Range and enter the range of the given $y_i$ values, then select the Input X Range and enter the range of the given $x_i$ values.

3. Choose the 95% confidence level and click OK.

The following table will appear.

Regression Statistics
Multiple R	0.983094904
R Square	0.96647559
Adjusted R Square	0.960888189
Standard Error	1.66955532
Observations	8

ANOVA
	df	SS	MS	F	Significance F
Regression	1	482.1505102	482.1505102	172.9740696	1.19253E-05
Residual	6	16.7244898	2.787414966
Total	7	498.875

	Coefficients	Standard Error	t Stat	P-value	Lower 95%	Upper 95%
Intercept	1.591836735	2.17509095	0.731848356	0.491845191	-3.730419077	6.914092547
Slope	0.10244898	0.007789635	13.15196067	1.19253E-05	0.083388428	0.121509531

RESIDUAL OUTPUT
Observation	Predicted y	Residuals	Standard Residuals
1	16.95918367	1.040816327	0.673359012
2	22.08163265	-1.081632653	-0.699765248
3	24.64285714	0.357142857	0.231054563
4	27.20408163	0.795918367	0.514921598
5	29.76530612	0.234693878	0.151835856
6	32.32653061	-1.326530612	-0.858202663
7	37.44897959	-2.448979592	-1.584374147
8	42.57142857	2.428571429	1.571171029

The Lower 95% and Upper 95% columns of the Slope row give the confidence interval for the slope.

The Intercept row of the table above gives $b_0 = 1.59$ and the Slope row gives $b_1 = 0.1024$.

So the regression line is,

$\hat{y} = 1.59 + 0.1024x$

The lower and upper endpoints of the 95% confidence interval for the slope of the regression line, $\beta_1$, are (0.0833, 0.1215).

Conclusion:

Thus, the 95% confidence interval for the slope of the regression line is (0.0833, 0.1215).
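The slope interval can be checked without Excel. This is a minimal sketch, assuming the standard formula for the standard error of the slope, $S_e\sqrt{n / (n\sum x_i^2 - (\sum x_i)^2)}$, and a table value of 2.447 for $t_{0.025}$ with 6 degrees of freedom; the names used are illustrative.

```python
import math

# Data from the problem: thread counts (x) and prices in dollars (y).
x = [150, 200, 225, 250, 275, 300, 350, 400]
y = [18, 21, 25, 28, 30, 31, 35, 45]
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)
denom = n * sum_x2 - sum_x ** 2           # n*sum(x^2) - (sum x)^2

# Least-squares coefficients at full precision.
b1 = (n * sum_xy - sum_x * sum_y) / denom
b0 = sum_y / n - b1 * sum_x / n

# Standard error of estimate, then the standard error of the slope.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s_e = math.sqrt(sse / (n - 2))
se_b1 = s_e * math.sqrt(n / denom)

T_CRIT = 2.447                             # assumed table value, df = n - 2 = 6
lower = b1 - T_CRIT * se_b1
upper = b1 + T_CRIT * se_b1
print(f"b1 = {b1:.4f}, SE(b1) = {se_b1:.6f}")
print(f"95% CI for the slope: ({lower:.4f}, {upper:.4f})")
```

The interval excludes zero, which is consistent with the small P-value reported in the Slope row of the table.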


Answer 5

Chapter 12.CR, Problem 9CR

To determine

(a)

To calculate:

The sum of squared errors, SSE.

Expert Solution

Answer to Problem 9CR

Solution:

The required SSE is 16.7246.

Explanation of Solution

Given Information:

The following data were collected on the thread counts and prices (in dollars) of eight bed sheets.

Thread Count	150	200	225	250	275	300	350	400
Price (in Dollars)	18	21	25	28	30	31	35	45

The least-squares regression line, also called the line of best fit, is the line for which the sum of squared errors is the smallest. It is given by

$\hat{y} = b_0 + b_1 x$

where $b_1$ is the slope of the least-squares regression line for paired data from a sample,

and $b_0$ is the y-intercept of the regression line.

Formula used:

The slope of the least-squares regression line is given as,

$b_1 = \dfrac{n\sum x_i y_i - (\sum x_i)(\sum y_i)}{n\sum x_i^2 - (\sum x_i)^2}$

and the y-intercept is given as,

$b_0 = \dfrac{\sum y_i}{n} - b_1 \dfrac{\sum x_i}{n}$

where $n$ is the number of data pairs in the sample,

$x_i$ is the ith value of the explanatory variable,

and $y_i$ is the ith value of the response variable.

The sum of squared errors (SSE) for a regression line is calculated as,

$SSE = \sum (y_i - \hat{y}_i)^2$

where $y_i$ is the ith value of the response variable,

and $\hat{y}_i$ is the predicted value of $y_i$ using the least-squares regression model.

Calculation:

Thread Count $x_i$	Price (in Dollars) $y_i$	$x_i y_i$	$x_i^2$	$y_i^2$
150	18	2700	22500	324
200	21	4200	40000	441
225	25	5625	50625	625
250	28	7000	62500	784
275	30	8250	75625	900
300	31	9300	90000	961
350	35	12250	122500	1225
400	45	18000	160000	2025
$\sum x_i = 2150$	$\sum y_i = 233$	$\sum x_i y_i = 67325$	$\sum x_i^2 = 623750$	$\sum y_i^2 = 7285$

Let $x_i$ be the thread counts of the bed sheets,

and $y_i$ be the prices of the bed sheets.

The slope of the least-squares regression line is calculated as,

$b_1 = \dfrac{n\sum x_i y_i - (\sum x_i)(\sum y_i)}{n\sum x_i^2 - (\sum x_i)^2}$

where $\sum x_i = x_1 + x_2 + \dots + x_8$.

Substitute 150 for $x_1$, 200 for $x_2$, …, 400 for $x_8$ in the above sum.

$\sum x_i = 150 + 200 + \dots + 350 + 400 = 2150$

Proceed in the same manner to calculate $\sum y_i$, $\sum x_i y_i$, $\sum x_i^2$ and $\sum y_i^2$; the table above lists the individual terms.

$\sum y_i = 18 + 21 + \dots + 35 + 45 = 233$

$\sum x_i y_i = 2700 + 4200 + \dots + 12250 + 18000 = 67325$

$\sum x_i^2 = 22500 + 40000 + \dots + 122500 + 160000 = 623750$

$\sum y_i^2 = 324 + 441 + 625 + \dots + 1225 + 2025 = 7285$
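The column totals are easy to get wrong by hand, so a short check can help. The sketch below simply recomputes the five sums from the given data; the variable names are illustrative.

```python
# Data from the problem: thread counts (x) and prices in dollars (y).
x = [150, 200, 225, 250, 275, 300, 350, 400]
y = [18, 21, 25, 28, 30, 31, 35, 45]

sum_x = sum(x)                                  # sum of x_i
sum_y = sum(y)                                  # sum of y_i
sum_xy = sum(a * b for a, b in zip(x, y))       # sum of x_i * y_i
sum_x2 = sum(a * a for a in x)                  # sum of x_i squared
sum_y2 = sum(b * b for b in y)                  # sum of y_i squared

print(sum_x, sum_y, sum_xy, sum_x2, sum_y2)
# prints: 2150 233 67325 623750 7285
```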

Substitute 2150 for $\sum x_i$, 233 for $\sum y_i$, 67325 for $\sum x_i y_i$, 623750 for $\sum x_i^2$ and 8 for $n$ in the slope formula.

$b_1 = \dfrac{(8 \times 67325) - (2150 \times 233)}{8(623750) - (2150)^2} = \dfrac{37650}{367500} = 0.1024$

The y-intercept of the regression line is calculated as,

$b_0 = \dfrac{\sum y_i}{n} - b_1 \dfrac{\sum x_i}{n}$

Substitute 2150 for $\sum x_i$, 233 for $\sum y_i$, 8 for $n$ and 0.1024 for $b_1$.

$b_0 = \dfrac{233}{8} - (0.1024)\dfrac{2150}{8} = 1.6050$

The equation of the least-squares regression line is given by,

$\hat{y} = b_0 + b_1 x$

Substitute 1.6050 for $b_0$ and 0.1024 for $b_1$ in the above formula.

$\hat{y} = 1.6050 + 0.1024x$
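The slope and intercept can be recomputed from the column totals. The following sketch follows the hand calculation, rounding the slope to four decimals before computing the intercept, and also shows the full-precision intercept for comparison; the names are illustrative.

```python
# Column totals taken from the calculation table.
n = 8
sum_x, sum_y = 2150, 233
sum_xy, sum_x2 = 67325, 623750

# Slope from the least-squares formula.
b1_full = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
b1 = round(b1_full, 4)                     # 0.1024, as used in the hand calculation

# Intercept using the rounded slope (hand calculation) and the full-precision slope.
b0_rounded_slope = sum_y / n - b1 * sum_x / n
b0_full = sum_y / n - b1_full * sum_x / n

print(f"b1 = {b1_full:.6f} (rounded: {b1})")
print(f"b0 with rounded slope = {b0_rounded_slope:.4f}")
print(f"b0 at full precision  = {b0_full:.4f}")
```

The two intercepts differ in the second decimal place (1.6050 versus about 1.5918) only because of the rounding of the slope; this is why spreadsheet output reports 1.59 while the hand calculation uses 1.6050.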

Thread count xi	Price yi	Predicted value y^i	yi−y^i	(yi−y^i)2
150	18	16.965	1.035	1.071225
200	21	22.085	-1.085	1.177225
225	25	24.645	0.355	0.126025
250	28	27.205	0.795	0.632025
275	30	29.765	0.235	0.055225
300	31	32.325	-1.325	1.755625
350	35	37.445	-2.445	5.978025
400	45	42.565	2.435	5.929225

The predicted values are calculated from,

$\hat{y} = 1.6050 + 0.1024x$

The predicted value $\hat{y}_1$ is calculated as,

$\hat{y}_1 = 1.6050 + 0.1024x_1$

Substitute 150 for $x_1$ in the above formula.

$\hat{y}_1 = 1.6050 + 0.1024(150) = 16.965$

Proceed in the same manner to calculate $\hat{y}_i$ for the rest of the data; the table above lists the values.

The residual is calculated as $y_i - \hat{y}_i$.

Substitute 18 for $y_1$ and 16.965 for $\hat{y}_1$.

$y_1 - \hat{y}_1 = 18 - 16.965 = 1.035$

Square both sides of the equation.

$(y_1 - \hat{y}_1)^2 = (1.035)^2 = 1.0712$

Proceed in the same manner to calculate $(y_i - \hat{y}_i)^2$ for all $1 \le i \le n$; the table above lists the values. Then the value of $\sum (y_i - \hat{y}_i)^2$ is calculated as,

$SSE = \sum (y_i - \hat{y}_i)^2 = 1.0712 + 1.1772 + 0.1260 + \dots + 5.9292 = 16.7246$

Conclusion:

Thus, the SSE is 16.7246.
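The residual table and its total can be reproduced in a few lines. This is a minimal sketch using the rounded coefficients from the hand calculation ($b_0 = 1.6050$, $b_1 = 0.1024$); the names are illustrative.

```python
# Data from the problem: thread counts (x) and prices in dollars (y).
x = [150, 200, 225, 250, 275, 300, 350, 400]
y = [18, 21, 25, 28, 30, 31, 35, 45]

# Rounded coefficients from the hand calculation.
b0, b1 = 1.6050, 0.1024

predicted = [b0 + b1 * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]
squared = [r * r for r in residuals]
sse = sum(squared)

for xi, yi, pi, ri, si in zip(x, y, predicted, residuals, squared):
    print(f"{xi}\t{yi}\t{pi:.3f}\t{ri:.3f}\t{si:.6f}")
print(f"SSE = {sse:.4f}")
```

With full-precision coefficients the total is slightly smaller (about 16.7245), which matches the Residual SS that regression software reports for this data.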

To determine

(b)

To calculate:

The standard error of estimate, Se.

Expert Solution

Answer to Problem 9CR

Solution:

The required standard error of estimate is 1.6696.

Explanation of Solution

Given Information:

The following data were collected on the thread counts and prices (in dollars) of eight bed sheets.

Thread Count	150	200	225	250	275	300	350	400
Price (in Dollars)	18	21	25	28	30	31	35	45

Formula used:

The standard error of estimate, which measures how much the sample data points deviate from the regression line, is given by,

$S_e = \sqrt{\dfrac{\sum (y_i - \hat{y}_i)^2}{n-2}} = \sqrt{\dfrac{SSE}{n-2}}$

Where, yi is the ith value of response variable,

y^i is the predicted value of yi, using the least-squares regression model,

n is the number of data pairs in the sample,

And SSE is the sum of squared errors.

Calculation:

The standard error of estimate is calculated as,

$S_e = \sqrt{\dfrac{\sum (y_i - \hat{y}_i)^2}{n-2}} = \sqrt{\dfrac{SSE}{n-2}}$

Substitute 16.7246 for SSE and 8 for $n$ in the above formula.

$S_e = \sqrt{\dfrac{16.7246}{8-2}} = 1.6696$

Conclusion:

Thus, the standard error of estimate is 1.6696.
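As a quick check, the standard error of estimate follows directly from the SSE found in part (a). A minimal sketch, with illustrative names:

```python
import math

SSE = 16.7246      # sum of squared errors from part (a)
n = 8              # number of data pairs

# Standard error of estimate: square root of SSE over its degrees of freedom.
s_e = math.sqrt(SSE / (n - 2))
print(round(s_e, 4))
# prints: 1.6696
```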

To determine

(c)

The 95% prediction interval for the price of 350-thread count sheets.

Expert Solution

Answer to Problem 9CR

Solution:

The required prediction interval is (32.8431, 42.0469).

Explanation of Solution

Given Information:

The following data were collected on the thread counts and prices (in dollars) of eight bed sheets.

Thread Count	150	200	225	250	275	300	350	400
Price (in Dollars)	18	21	25	28	30	31	35	45

Formula used:

The margin of error of a prediction interval for an individual y-value is calculated as,

$E = t_{\alpha/2}\, S_e \sqrt{1 + \dfrac{1}{n} + \dfrac{n(x_0 - \bar{x})^2}{n(\sum x_i^2) - (\sum x_i)^2}}$

with degrees of freedom $df = n - 2$,

where $x_i$ is the ith value of the explanatory variable,

$x_0$ is the fixed value of the explanatory variable at which the prediction is made,

$\bar{x}$ is the sample mean of the explanatory variable,

$S_e$ is the standard error of estimate,

$n$ is the number of data pairs in the sample,

and $t_{\alpha/2}$ is the critical value of the t-distribution with $df = n - 2$ degrees of freedom.

Then the prediction interval for an individual y-value is,

$(\hat{y} - E,\ \hat{y} + E)$

Calculation:

The level of confidence is 0.95, so the level of significance is calculated as,

$\alpha = 1 - 0.95 = 0.05$

Then, with $df = 8 - 2 = 6$,

$t_{\alpha/2} = t_{0.025} = 2.447$

The mean thread count is calculated as,

$\bar{x} = \dfrac{\sum x_i}{n}$

Substitute 2150 for $\sum x_i$ and 8 for $n$ in the above formula.

$\bar{x} = \dfrac{2150}{8} = 268.75$

The margin of error of a prediction interval for an individual y-value is calculated as,

$E = t_{\alpha/2}\, S_e \sqrt{1 + \dfrac{1}{n} + \dfrac{n(x_0 - \bar{x})^2}{n(\sum x_i^2) - (\sum x_i)^2}}$

Substitute 2.447 for $t_{\alpha/2}$, 1.6696 for $S_e$, 8 for $n$, 350 for $x_0$, 623750 for $\sum x_i^2$, 2150 for $\sum x_i$ and 268.75 for $\bar{x}$ in the above formula.

$E = 2.447 \times 1.6696 \times \sqrt{1 + \dfrac{1}{8} + \dfrac{8(350 - 268.75)^2}{8(623750) - (2150)^2}} = 2.447 \times 1.6696 \times 1.1264 = 4.6019$

The value of $\hat{y}$ is calculated by substituting $x_0$ into the regression line equation.

The regression line is,

$\hat{y} = 1.6050 + 0.1024x$

Substitute 350 for $x$ in the above formula.

$\hat{y} = 1.6050 + 0.1024(350) = 37.4450$

The prediction interval is,

$(\hat{y} - E,\ \hat{y} + E) = (37.4450 - 4.6019,\ 37.4450 + 4.6019) = (32.8431,\ 42.0469)$

Conclusion:

The required prediction interval is (32.8431, 42.0469).
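The margin-of-error formula has three terms under the square root, and it is easy to drop one when evaluating it by hand. The following sketch evaluates the formula term by term, using the rounded inputs from the calculation above ($t_{\alpha/2} = 2.447$, $S_e = 1.6696$, $b_0 = 1.6050$, $b_1 = 0.1024$); the variable names are illustrative.

```python
import math

# Inputs taken from the calculation above.
n = 8
sum_x, sum_x2 = 2150, 623750
t_crit = 2.447          # t_{0.025} with df = n - 2 = 6
s_e = 1.6696            # standard error of estimate
b0, b1 = 1.6050, 0.1024
x0 = 350                # thread count at which the price is predicted

x_bar = sum_x / n

# The three terms under the square root in the margin-of-error formula.
term_one = 1.0
term_inv_n = 1 / n
term_distance = n * (x0 - x_bar) ** 2 / (n * sum_x2 - sum_x ** 2)
root = math.sqrt(term_one + term_inv_n + term_distance)

margin = t_crit * s_e * root
y_hat = b0 + b1 * x0
lower, upper = y_hat - margin, y_hat + margin

print(f"terms under the root: {term_one}, {term_inv_n}, {term_distance:.6f}")
print(f"E = {margin:.4f}, y_hat = {y_hat:.4f}")
print(f"prediction interval: ({lower:.4f}, {upper:.4f})")
```

Printing the individual terms makes it straightforward to compare each one against the hand calculation.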

To determine

(d)

The 95% confidence interval for the y-intercept of the regression line.

Expert Solution

Answer to Problem 9CR

Solution:

The required confidence interval is (−3.7304,6.9140).

Explanation of Solution

Given Information:

The following data were collected on the thread counts and prices (in dollars) of eight bed sheets.

Thread Count	150	200	225	250	275	300	350	400
Price (in Dollars)	18	21	25	28	30	31	35	45

Formula Used:

The correlation coefficient measures the strength of the linear relationship between the response variable and the explanatory variable, and is calculated as,

$r = \dfrac{n\sum x_i y_i - (\sum x_i)(\sum y_i)}{\sqrt{n\sum x_i^2 - (\sum x_i)^2}\,\sqrt{n\sum y_i^2 - (\sum y_i)^2}}$

The coefficient of determination measures the proportion of variation in the response variable explained by the explanatory variable; it is simply the square of the correlation coefficient, $r^2$.

The standard error of estimate, $S_e$, measures how much the sample data points deviate from the regression line, and is calculated as,

$S_e = \sqrt{\dfrac{\sum (y_i - \hat{y}_i)^2}{n-2}} = \sqrt{\dfrac{SSE}{n-2}}$

In ANOVA,

The grand mean is the weighted mean of the k sample means, one from each of the k populations, equivalent to the mean of all the sample data combined, given by

$\bar{\bar{x}} = \dfrac{\sum_{i=1}^{k} n_i \bar{x}_i}{\sum_{i=1}^{k} n_i}$

The Sum of Squares among Treatments (SST) measures the variation between the sample means and the grand mean, given by,

$SST = \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{\bar{x}})^2$

The Sum of Squares for Error (SSE) measures the variation in the sample data resulting from the variability within each sample,

$SSE = \sum_{j=1}^{n_1} (x_{1j} - \bar{x}_1)^2 + \sum_{j=1}^{n_2} (x_{2j} - \bar{x}_2)^2 + \dots + \sum_{j=1}^{n_k} (x_{kj} - \bar{x}_k)^2$

The total variation is the sum of the squared deviations from the grand mean for all of the data values in each sample, given by

$\text{Total Variation} = \sum_{j=1}^{n_1} (x_{1j} - \bar{\bar{x}})^2 + \sum_{j=1}^{n_2} (x_{2j} - \bar{\bar{x}})^2 + \dots + \sum_{j=1}^{n_k} (x_{kj} - \bar{\bar{x}})^2 = SST + SSE$

The Mean Square for Treatments (MST) is found by dividing the sum of squares among treatments by its degrees of freedom, given by

$MST = \dfrac{SST}{DFT}$ with $DFT = k - 1$

The Mean Square for Error (MSE) is found by dividing the sum of squares for error by its degrees of freedom, given by

$MSE = \dfrac{SSE}{DFE}$ with $DFE = n_T - k$

The test statistic for an ANOVA test is used when independent, simple random samples are taken from populations with variances that are unknown and assumed to be equal, where all of the k population distributions are approximately normal, given by,

$F = \dfrac{MST}{MSE}$ with $df_1 = DFT = k - 1$ and $df_2 = DFE = n_T - k$

Calculation:

To generate the regression table in Excel, follow the given steps:

1. Under the Data tab, choose Data Analysis and then select Regression.

2. Select the Input Y Range and enter the range of the given $y_i$ values, then select the Input X Range and enter the range of the given $x_i$ values.

3. Choose the 95% confidence level and click OK.

The following table will appear.

Regression Statistics
Multiple R	0.983094904
R Square	0.96647559
Adjusted R Square	0.960888189
Standard Error	1.66955532
Observations	8

ANOVA
	df	SS	MS	F	Significance F
Regression	1	482.1505102	482.1505102	172.9740696	1.19253E-05
Residual	6	16.7244898	2.787414966
Total	7	498.875

	Coefficients	Standard Error	t Stat	P-value	Lower 95%	Upper 95%
Intercept	1.591836735	2.17509095	0.73184835	0.491845191	-3.730419077	6.914092547
Slope	0.10244898	0.007789635	13.15196067	1.19253E-05	0.083388428	0.121509531

RESIDUAL OUTPUT
Observation	Predicted y	Residuals	Standard Residuals
1	16.95918367	1.040816327	0.673359012
2	22.08163265	-1.081632653	-0.699765248
3	24.64285714	0.357142857	0.231054563
4	27.20408163	0.795918367	0.514921598
5	29.76530612	0.234693878	0.151835856
6	32.32653061	-1.326530612	-0.858202663
7	37.44897959	-2.448979592	-1.584374147
8	42.57142857	2.428571429	1.571171029

The confidence interval for the y-intercept is the point estimate plus and minus the margin of error; Excel reports its endpoints directly.

According to the regression statistics, the coefficient of determination is $R^2 = 0.9665$, which implies that about 96.65% of the variation in the response variable is explained by the explanatory variable.

The Standard Error entry in the regression statistics is the standard error of estimate, $S_e$, calculated by the formula given above.

The Lower 95% and Upper 95% columns of the Intercept row give the confidence interval for the y-intercept.

The Intercept row of the table above gives $b_0 = 1.59$ and the Slope row gives $b_1 = 0.1024$.

So the regression line is,

$\hat{y} = 1.59 + 0.1024x$

The lower and upper endpoints of the 95% confidence interval for the y-intercept of the regression line, $\beta_0$, are (−3.7304, 6.9140).

Conclusion:

Thus, the 95% confidence interval for the y-intercept of the regression line is (−3.7304, 6.9140).
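The ANOVA rows of the regression table are linked by simple arithmetic, which gives a way to sanity-check the R Square and F entries. This is a minimal sketch, assuming the Residual SS reported in the table (16.7244898) and the regression decomposition Total SS = Regression SS + Residual SS; the names are illustrative.

```python
# Prices in dollars (response variable) from the problem.
y = [18, 21, 25, 28, 30, 31, 35, 45]
n = len(y)

ss_residual = 16.7244898                       # Residual SS from the table

# Total SS: squared deviations of y from its mean, via the shortcut formula.
ss_total = sum(v * v for v in y) - sum(y) ** 2 / n
ss_regression = ss_total - ss_residual

r_squared = ss_regression / ss_total
ms_regression = ss_regression / 1              # one explanatory variable
ms_residual = ss_residual / (n - 2)
f_stat = ms_regression / ms_residual

print(f"Total SS = {ss_total:.3f}, Regression SS = {ss_regression:.4f}")
print(f"R^2 = {r_squared:.4f}, F = {f_stat:.2f}")
```

The resulting $R^2$ of about 0.9665 is the value shown in the R Square row of the table.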

To determine

(e)

Construct a 95% confidence interval for the slope of the regression line.

Expert Solution

Answer to Problem 9CR

Solution:

The required confidence interval is (0.0833,0.1215).

Explanation of Solution

Given Information:

The following data were collected on the thread counts and prices (in dollars) of eight bed sheets.

Thread Count	150	200	225	250	275	300	350	400
Price (in Dollars)	18	21	25	28	30	31	35	45

Formula Used:

The correlation coefficient measures the strength of the linear relationship between the response variable and the explanatory variable, and is calculated as,

$r = \dfrac{n\sum x_i y_i - (\sum x_i)(\sum y_i)}{\sqrt{n\sum x_i^2 - (\sum x_i)^2}\,\sqrt{n\sum y_i^2 - (\sum y_i)^2}}$

The coefficient of determination measures the proportion of variation in the response variable explained by the explanatory variable; it is simply the square of the correlation coefficient, $r^2$.

The standard error of estimate, $S_e$, measures how much the sample data points deviate from the regression line, and is calculated as,

$S_e = \sqrt{\dfrac{\sum (y_i - \hat{y}_i)^2}{n-2}} = \sqrt{\dfrac{SSE}{n-2}}$

In ANOVA,

The grand mean is the weighted mean of the k sample means, one from each of the k populations, equivalent to the mean of all the sample data combined, given by

$\bar{\bar{x}} = \dfrac{\sum_{i=1}^{k} n_i \bar{x}_i}{\sum_{i=1}^{k} n_i}$

The Sum of Squares among Treatments (SST) measures the variation between the sample means and the grand mean, given by,

$SST = \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{\bar{x}})^2$

The Sum of Squares for Error (SSE) measures the variation in the sample data resulting from the variability within each sample,

$SSE = \sum_{j=1}^{n_1} (x_{1j} - \bar{x}_1)^2 + \sum_{j=1}^{n_2} (x_{2j} - \bar{x}_2)^2 + \dots + \sum_{j=1}^{n_k} (x_{kj} - \bar{x}_k)^2$

The total variation is the sum of the squared deviations from the grand mean for all of the data values in each sample, given by

$\text{Total Variation} = \sum_{j=1}^{n_1} (x_{1j} - \bar{\bar{x}})^2 + \sum_{j=1}^{n_2} (x_{2j} - \bar{\bar{x}})^2 + \dots + \sum_{j=1}^{n_k} (x_{kj} - \bar{\bar{x}})^2 = SST + SSE$

The Mean Square for Treatments (MST) is found by dividing the sum of squares among treatments by its degrees of freedom, given by

$MST = \dfrac{SST}{DFT}$ with $DFT = k - 1$

The Mean Square for Error (MSE) is found by dividing the sum of squares for error by its degrees of freedom, given by

$MSE = \dfrac{SSE}{DFE}$ with $DFE = n_T - k$

The test statistic for an ANOVA test is used when independent, simple random samples are taken from populations with variances that are unknown and assumed to be equal, where all of the k population distributions are approximately normal, given by,

$F = \dfrac{MST}{MSE}$ with $df_1 = DFT = k - 1$ and $df_2 = DFE = n_T - k$

Calculation:

To generate the regression table in Excel, follow the given steps:

1. Under the Data tab, choose Data Analysis and then select Regression.

2. Select the Input Y Range and enter the range of the given $y_i$ values, then select the Input X Range and enter the range of the given $x_i$ values.

3. Choose the 95% confidence level and click OK.

The following table will appear.

Regression Statistics
Multiple R	0.983094904
R Square	0.96647559
Adjusted R Square	0.960888189
Standard Error	1.66955532
Observations	8

ANOVA
	df	SS	MS	F	Significance F
Regression	1	482.1505102	482.1505102	172.9740696	1.19253E-05
Residual	6	16.7244898	2.787414966
Total	7	498.875

	Coefficients	Standard Error	t Stat	P-value	Lower 95%	Upper 95%
Intercept	1.591836735	2.17509095	0.731848356	0.491845191	-3.730419077	6.914092547
Slope	0.10244898	0.007789635	13.15196067	1.19253E-05	0.083388428	0.121509531

RESIDUAL OUTPUT
Observation	Predicted y	Residuals	Standard Residuals
1	16.95918367	1.040816327	0.673359012
2	22.08163265	-1.081632653	-0.699765248
3	24.64285714	0.357142857	0.231054563
4	27.20408163	0.795918367	0.514921598
5	29.76530612	0.234693878	0.151835856
6	32.32653061	-1.326530612	-0.858202663
7	37.44897959	-2.448979592	-1.584374147
8	42.57142857	2.428571429	1.571171029

The Lower 95% and Upper 95% columns of the Slope row give the confidence interval for the slope.

The Intercept row of the table above gives $b_0 = 1.59$ and the Slope row gives $b_1 = 0.1024$.

So the regression line is,

$\hat{y} = 1.59 + 0.1024x$

The lower and upper endpoints of the 95% confidence interval for the slope of the regression line, $\beta_1$, are (0.0833, 0.1215).

Conclusion:

Thus, the 95% confidence interval for the slope of the regression line is (0.0833, 0.1215).
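The t Stat column of the coefficient table is each coefficient divided by its standard error, and in simple linear regression the square of the slope's t statistic equals the F statistic in the ANOVA table. The sketch below checks both relationships using only values copied from the table; the names are illustrative.

```python
# Values copied from the coefficient and ANOVA tables.
b0, se_b0 = 1.591836735, 2.17509095
b1, se_b1 = 0.10244898, 0.007789635
f_table = 172.9740696

# t statistic = coefficient / standard error.
t_intercept = b0 / se_b0
t_slope = b1 / se_b1

print(f"t (intercept) = {t_intercept:.4f}")
print(f"t (slope)     = {t_slope:.4f}")
print(f"t (slope)^2   = {t_slope ** 2:.2f}  vs  F = {f_table:.2f}")
```

A small t statistic for the intercept is consistent with its confidence interval containing zero, while the large t statistic for the slope is consistent with an interval that lies entirely above zero.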

Answer 27

To determine

(d)

The 95% confidence interval for the y-intercept of the regression line.

Expert Solution

Answer to Problem 9CR

Solution:

The required confidence interval is (−3.7304,6.9140).

Explanation of Solution

Given Information:

The following data give the thread counts and prices (in dollars) of eight bed sheets.

Thread Count	150	200	225	250	275	300	350	400
Price (in Dollars)	18	21	25	28	30	31	35	45

Formula Used:

The correlation coefficient measures the strength and direction of the linear relationship between a response variable and an explanatory variable, calculated as,

$$r = \frac{n\sum x_i y_i - \left(\sum x_i\right)\left(\sum y_i\right)}{\sqrt{n\sum x_i^2 - \left(\sum x_i\right)^2}\,\sqrt{n\sum y_i^2 - \left(\sum y_i\right)^2}}$$

The coefficient of determination measures the proportion of variation in the response variable that is explained by the explanatory variable; it is simply the square of r, the correlation coefficient.
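As a rough check of the formula for r, the following sketch evaluates it on the thread count and price data. The function name `correlation` is only illustrative.

```python
import math

x = [150, 200, 225, 250, 275, 300, 350, 400]   # thread counts
y = [18, 21, 25, 28, 30, 31, 35, 45]           # prices in dollars


def correlation(x, y):
    """Correlation coefficient r from the raw sums."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator


r = correlation(x, y)
r_squared = r ** 2   # coefficient of determination

print(f"r = {r:.6f}, r^2 = {r_squared:.6f}")
```

The printed values should agree with the "Multiple R" and "R Square" entries of the regression output to the digits shown.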

The standard error of estimate, $S_e$, measures how much the sample data points deviate from the regression line and is calculated as,

$$S_e = \sqrt{\frac{\sum (y_i - \hat{y}_i)^2}{n - 2}} = \sqrt{\frac{SSE}{n - 2}}$$
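A minimal sketch of this calculation on the given data follows. It fits the least-squares line with the slope and intercept formulas from part (a); the helper name `least_squares` is an assumption made for illustration.

```python
import math

x = [150, 200, 225, 250, 275, 300, 350, 400]   # thread counts
y = [18, 21, 25, 28, 30, 31, 35, 45]           # prices in dollars


def least_squares(x, y):
    """Return (b0, b1) for the least-squares line y_hat = b0 + b1 * x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b0 = sum_y / n - b1 * sum_x / n
    return b0, b1


b0, b1 = least_squares(x, y)

# SSE is the sum of squared differences between observed and predicted y.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s_e = math.sqrt(sse / (len(x) - 2))

print(f"SSE = {sse:.4f}, S_e = {s_e:.4f}")
```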

In ANOVA,

Grand Mean is the weighted mean of the k sample means, one from each of the k populations, equivalent to the mean of all the sample data combined, given by

$$\bar{\bar{x}} = \frac{\sum_{i=1}^{k} n_i \bar{x}_i}{\sum_{i=1}^{k} n_i}$$

Sum of Squares among Treatments (SST) measures the variation between the sample means and the grand mean, given by,

$$SST = \sum_{i=1}^{k} n_i \left(\bar{x}_i - \bar{\bar{x}}\right)^2$$

Sum of Squares for Error (SSE) measures the variation in the sample data resulting from the variability within each sample,

$$SSE = \sum_{j=1}^{n_1} \left(x_{1j} - \bar{x}_1\right)^2 + \sum_{j=1}^{n_2} \left(x_{2j} - \bar{x}_2\right)^2 + \dots + \sum_{j=1}^{n_k} \left(x_{kj} - \bar{x}_k\right)^2$$

Total Variation is the sum of the squared deviations from the grand mean for all of the data values in each sample, given by

$$\text{Total Variation} = \sum_{j=1}^{n_1} \left(x_{1j} - \bar{\bar{x}}\right)^2 + \sum_{j=1}^{n_2} \left(x_{2j} - \bar{\bar{x}}\right)^2 + \dots + \sum_{j=1}^{n_k} \left(x_{kj} - \bar{\bar{x}}\right)^2 = SST + SSE$$

Mean Square for Treatments (MST) is found by dividing the sum of squares among treatments by its degrees of freedom, given by

$$MST = \frac{SST}{DFT} \quad \text{with } DFT = k - 1$$

Mean Square for Error (MSE) is found by dividing the sum of squares for error by its degrees of freedom, given by

$$MSE = \frac{SSE}{DFE} \quad \text{with } DFE = n_T - k$$

where $n_T$ is the total number of data values in all k samples.

Test Statistic for an ANOVA Test is used when independent, simple random samples are taken from populations with variances that are unknown and assumed to be equal, where all of the k population distributions are approximately normal, given by,

$$F = \frac{MST}{MSE} \quad \text{with } df_1 = DFT = k - 1 \text{ and } df_2 = DFE = n_T - k$$
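The ANOVA quantities above can be sketched in a few lines. The three groups below are hypothetical toy numbers chosen only so the arithmetic is easy to follow; they are not part of this problem, and the function name `one_way_anova` is likewise illustrative.

```python
def one_way_anova(groups):
    """Compute the one-way ANOVA quantities for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]

    # Grand mean: weighted mean of the k sample means.
    grand_mean = sum(len(g) * m for g, m in zip(groups, means)) / n_total

    # SST: variation between the sample means and the grand mean.
    sst = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))

    # SSE: variation within each sample around its own mean.
    sse = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)

    mst = sst / (k - 1)           # DFT = k - 1
    mse = sse / (n_total - k)     # DFE = n_T - k
    return {"grand_mean": grand_mean, "SST": sst, "SSE": sse,
            "MST": mst, "MSE": mse, "F": mst / mse}


# Hypothetical data, for illustration only.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
result = one_way_anova(groups)

# Total variation computed directly, to compare with SST + SSE.
all_values = [v for g in groups for v in g]
total_variation = sum((v - result["grand_mean"]) ** 2 for v in all_values)

print(result)
print("total variation:", total_variation)
```

For these toy groups SST is 26 and SSE is 6, so the directly computed total variation of 32 matches SST + SSE.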

Calculation:

To generate the regression table in Excel, follow the given steps:

1. Under the Data tab, choose Data Analysis and then select Regression.

2. Select the Input Y Range and enter the range of the given yi data, then select the Input X Range and enter the range of the given xi data.

3. Set the confidence level to 95% and click OK.

The following table will appear.

Regression Statistics
Multiple R	0.983094904
R Square	0.96647559
Adjusted R Square	0.960888189
Standard Error	1.66955532
Observations	8

ANOVA
	df	SS	MS	F	Significance F
Regression	1	482.1505102	482.1505102	172.9740696	1.19253E-05
Residual	6	16.7244898	2.787414966
Total	7	498.875

	Coefficients	Standard Error	t Stat	P-value	Lower 95%	Upper 95%
Intercept	1.591836735	2.17509095	0.73184835	0.491845191	-3.730419077	6.914092547
Slope	0.10244898	0.007789635	13.15196067	1.19253E-05	0.083388428	0.121509531

RESIDUAL OUTPUT
Observation	Predicted y	Residuals	Standard Residuals
1	16.95918367	1.040816327	0.673359012
2	22.08163265	-1.081632653	-0.699765248
3	24.64285714	0.357142857	0.231054563
4	27.20408163	0.795918367	0.514921598
5	29.76530612	0.234693878	0.151835856
6	32.32653061	-1.326530612	-0.858202663
7	37.44897959	-2.448979592	-1.584374147
8	42.57142857	2.428571429	1.571171029
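For readers without Excel, parts of this output can be reproduced with plain Python. This is a sketch under the assumption that the table comes from an ordinary least-squares fit of price on thread count; the variable names are illustrative and do not correspond to Excel fields.

```python
x = [150, 200, 225, 250, 275, 300, 350, 400]   # thread counts
y = [18, 21, 25, 28, 30, 31, 35, 45]           # prices in dollars

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares slope and intercept from centered sums.
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# "Predicted y" and "Residuals" columns of the residual output.
predicted = [b0 + b1 * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]

# Sums of squares and F statistic from the ANOVA block.
ss_residual = sum(e ** 2 for e in residuals)
ss_total = sum((yi - y_bar) ** 2 for yi in y)
ss_regression = ss_total - ss_residual
f_stat = (ss_regression / 1) / (ss_residual / (n - 2))

for i, (p, e) in enumerate(zip(predicted, residuals), start=1):
    print(f"{i}\t{p:.8f}\t{e:.9f}")
print(f"SS regression = {ss_regression:.7f}, SS residual = {ss_residual:.7f}")
print(f"F = {f_stat:.7f}")
```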

The confidence interval for the y-intercept is constructed by adding the margin of error to, and subtracting it from, the point estimate; Microsoft Excel reports the resulting endpoints directly.

Referring to the regression statistics, R², the coefficient of determination, is 0.9665, which implies that 96.65% of the variation in the response variable is explained by the explanatory variable.

Standard Error in the regression statistics is the standard error of estimate, Se, calculated by the formula given above.

The Lower 95% and Upper 95% entries in the Intercept row give the confidence interval for the y-intercept.

The Intercept coefficient in the table above is the b0 of the regression line, which comes out to be 1.59, and the Slope coefficient, 0.1024, is the b1 of the regression line.

So the regression line is,

$$\hat{y} = 1.59 + 0.1024x$$

The lower and upper endpoints of the 95% confidence interval for the y-intercept of the regression line, β0, are (−3.7304, 6.9140).

Conclusion:

Thus, the 95% confidence interval for the y-intercept of the regression line is (−3.7304, 6.9140).
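The interval endpoints can also be approximated by hand. The sketch below uses the standard textbook formula for the standard error of the intercept, $S_e\sqrt{1/n + \bar{x}^2/\sum(x_i - \bar{x})^2}$, which is not stated in the solution itself and is assumed here. The critical value 2.447 is the rounded t-table value for df = 6, so the endpoints differ from Excel's in the fourth decimal place.

```python
import math

x = [150, 200, 225, 250, 275, 300, 350, 400]   # thread counts
y = [18, 21, 25, 28, 30, 31, 35, 45]           # prices in dollars
T_CRIT = 2.447   # t_{0.025} for df = 6, rounded t-table value (assumed)

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s_e = math.sqrt(sse / (n - 2))

# Standard error of the intercept (standard formula, assumed here).
se_b0 = s_e * math.sqrt(1 / n + x_bar ** 2 / sxx)

lower = b0 - T_CRIT * se_b0
upper = b0 + T_CRIT * se_b0
print(f"SE(b0) = {se_b0:.6f}")
print(f"95% CI for the intercept = ({lower:.4f}, {upper:.4f})")
```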

Answer 34

To determine

(e)

Construct a 95% confidence interval for the slope of the regression line.

Expert Solution

Answer to Problem 9CR

Solution:

The required confidence interval is (0.0833,0.1215).

Explanation of Solution

Given Information:

The following data give the thread counts and prices (in dollars) of eight bed sheets.

Thread Count	150	200	225	250	275	300	350	400
Price (in Dollars)	18	21	25	28	30	31	35	45

Formula Used:

The correlation coefficient measures the strength and direction of the linear relationship between a response variable and an explanatory variable, calculated as,

$$r = \frac{n\sum x_i y_i - \left(\sum x_i\right)\left(\sum y_i\right)}{\sqrt{n\sum x_i^2 - \left(\sum x_i\right)^2}\,\sqrt{n\sum y_i^2 - \left(\sum y_i\right)^2}}$$

The coefficient of determination measures the proportion of variation in the response variable that is explained by the explanatory variable; it is simply the square of r, the correlation coefficient.

The standard error of estimate, $S_e$, measures how much the sample data points deviate from the regression line and is calculated as,

$$S_e = \sqrt{\frac{\sum (y_i - \hat{y}_i)^2}{n - 2}} = \sqrt{\frac{SSE}{n - 2}}$$

In ANOVA,

Grand Mean is the weighted mean of the k sample means, one from each of the k populations, equivalent to the mean of all the sample data combined, given by

$$\bar{\bar{x}} = \frac{\sum_{i=1}^{k} n_i \bar{x}_i}{\sum_{i=1}^{k} n_i}$$

Sum of Squares among Treatments (SST) measures the variation between the sample means and the grand mean, given by,

$$SST = \sum_{i=1}^{k} n_i \left(\bar{x}_i - \bar{\bar{x}}\right)^2$$

Sum of Squares for Error (SSE) measures the variation in the sample data resulting from the variability within each sample,

$$SSE = \sum_{j=1}^{n_1} \left(x_{1j} - \bar{x}_1\right)^2 + \sum_{j=1}^{n_2} \left(x_{2j} - \bar{x}_2\right)^2 + \dots + \sum_{j=1}^{n_k} \left(x_{kj} - \bar{x}_k\right)^2$$

Total Variation is the sum of the squared deviations from the grand mean for all of the data values in each sample, given by

$$\text{Total Variation} = \sum_{j=1}^{n_1} \left(x_{1j} - \bar{\bar{x}}\right)^2 + \sum_{j=1}^{n_2} \left(x_{2j} - \bar{\bar{x}}\right)^2 + \dots + \sum_{j=1}^{n_k} \left(x_{kj} - \bar{\bar{x}}\right)^2 = SST + SSE$$

Mean Square for Treatments (MST) is found by dividing the sum of squares among treatments by its degrees of freedom, given by

$$MST = \frac{SST}{DFT} \quad \text{with } DFT = k - 1$$

Mean Square for Error (MSE) is found by dividing the sum of squares for error by its degrees of freedom, given by

$$MSE = \frac{SSE}{DFE} \quad \text{with } DFE = n_T - k$$

where $n_T$ is the total number of data values in all k samples.

Test Statistic for an ANOVA Test is used when independent, simple random samples are taken from populations with variances that are unknown and assumed to be equal, where all of the k population distributions are approximately normal, given by,

$$F = \frac{MST}{MSE} \quad \text{with } df_1 = DFT = k - 1 \text{ and } df_2 = DFE = n_T - k$$

Calculation:

To generate the regression table in Excel, follow the given steps:

1. Under the Data tab, choose Data Analysis and then select Regression.

2. Select the Input Y Range and enter the range of the given yi data, then select the Input X Range and enter the range of the given xi data.

3. Set the confidence level to 95% and click OK.

The following table will appear.

Regression Statistics
Multiple R	0.983094904
R Square	0.96647559
Adjusted R Square	0.960888189
Standard Error	1.66955532
Observations	8

ANOVA
	df	SS	MS	F	Significance F
Regression	1	482.1505102	482.1505102	172.9740696	1.19253E-05
Residual	6	16.7244898	2.787414966
Total	7	498.875

	Coefficients	Standard Error	t Stat	P-value	Lower 95%	Upper 95%
Intercept	1.591836735	2.17509095	0.731848356	0.491845191	-3.730419077	6.914092547
Slope	0.10244898	0.007789635	13.15196067	1.19253E-05	0.083388428	0.121509531

RESIDUAL OUTPUT
Observation	Predicted y	Residuals	Standard Residuals
1	16.95918367	1.040816327	0.673359012
2	22.08163265	-1.081632653	-0.699765248
3	24.64285714	0.357142857	0.231054563
4	27.20408163	0.795918367	0.514921598
5	29.76530612	0.234693878	0.151835856
6	32.32653061	-1.326530612	-0.858202663
7	37.44897959	-2.448979592	-1.584374147
8	42.57142857	2.428571429	1.571171029

The Lower 95% and Upper 95% entries in the Slope row give the confidence interval for the slope.

The Intercept coefficient in the table above is the b0 of the regression line, which comes out to be 1.59, and the Slope coefficient, 0.1024, is the b1 of the regression line.

So the regression line is,

$$\hat{y} = 1.59 + 0.1024x$$

The lower and upper endpoints of the 95% confidence interval for the slope of the regression line, β1, are (0.0833, 0.1215).

Conclusion:

Thus, the 95% confidence interval for the slope of the regression line is (0.0833, 0.1215).
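The slope interval can be approximated without Excel. The sketch below assumes the standard formula for the standard error of the slope, $S_e / \sqrt{\sum(x_i - \bar{x})^2}$, which the solution does not state explicitly, and uses the rounded t-table value 2.447 for df = 6.

```python
import math

x = [150, 200, 225, 250, 275, 300, 350, 400]   # thread counts
y = [18, 21, 25, 28, 30, 31, 35, 45]           # prices in dollars
T_CRIT = 2.447   # t_{0.025} for df = 6, rounded t-table value (assumed)

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s_e = math.sqrt(sse / (n - 2))

# Standard error of the slope (standard formula, assumed here).
se_b1 = s_e / math.sqrt(sxx)

lower = b1 - T_CRIT * se_b1
upper = b1 + T_CRIT * se_b1
print(f"SE(b1) = {se_b1:.9f}")
print(f"95% CI for the slope = ({lower:.4f}, {upper:.4f})")
```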
