Refer back to the data in Exercise 4, in which y = ammonium concentration (mg/L) and x = transpiration (ml/h). Summary quantities include n = 13, Σxi = 303.7, Σyi = 52.8, Sxx = 1585.230769, Sxy = −341.959231, and Syy = 77.270769.
- a. Obtain the equation of the estimated regression line and use it to calculate a point prediction of ammonium concentration for a future observation made when transpiration is 25 ml/h.
- b. What happens if the estimated regression line is used to calculate a point estimate of true average concentration when transpiration is 45 ml/h? Why does it not make sense to calculate this point estimate?
- c. Calculate and interpret s.
- d. Do you think the simple linear regression model does a good job of explaining observed variation in concentration? Explain.
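The quantities asked for in parts a to d follow directly from the summary statistics quoted in the exercise. The following is a minimal sketch, assuming the usual least-squares formulas $b_1 = S_{xy}/S_{xx}$, $b_0 = \bar{y} - b_1\bar{x}$, $SSE = S_{yy} - b_1 S_{xy}$ and $s = \sqrt{SSE/(n-2)}$; the variable and function names are illustrative and do not come from the textbook.

```python
import math

# Summary quantities quoted in the exercise statement.
n = 13
sum_x, sum_y = 303.7, 52.8
Sxx, Sxy, Syy = 1585.230769, -341.959231, 77.270769

# Least-squares slope and intercept.
b1 = Sxy / Sxx
b0 = sum_y / n - b1 * sum_x / n


def predict(x):
    """Point prediction of ammonium concentration (mg/L) at transpiration x (ml/h)."""
    return b0 + b1 * x


# Sums of squares, residual standard deviation and coefficient of determination.
SST = Syy
SSR = b1 * Sxy
SSE = SST - SSR
s = math.sqrt(SSE / (n - 2))
r2 = 1 - SSE / SST

print(f"y-hat = {b0:.4f} + ({b1:.4f}) x")
print(f"prediction at x = 25: {predict(25):.3f}")
print(f"prediction at x = 45: {predict(45):.3f}")
print(f"s = {s:.3f}, r^2 = {r2:.3f}")
```

Under these formulas the fitted value at x = 45 comes out negative, which is the difficulty part b asks about, since a concentration cannot be below zero.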
a.
Find the interval estimate for the slope of the population regression.
Answer to Problem 35E
The 95% confidence interval for the slope of the population regression is (0.632, 2.440).
Explanation of Solution
Given info:
The summary statistics of the data correspond to the variables motion sickness dose (x) and % reported nausea (y).
Calculation:
Linear regression model:
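In a linear equation $y = b_0 + b_1 x$, the constant $b_0$ is the y-intercept and the coefficient $b_1$ is the slope.
A linear regression model is given as $\hat{y} = b_0 + b_1 x$, where $\hat{y}$ is the predicted value of the response variable and $x$ is the predictor variable.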
Y-intercept:
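In a linear equation $y = b_0 + b_1 x$, the constant $b_0$ is the y-intercept.
The general formula to obtain the y-intercept is $b_0 = \bar{y} - b_1 \bar{x}$.
Slope:
In a linear equation $y = b_0 + b_1 x$, the coefficient $b_1$ is the slope.
The general formula to obtain the slope is $b_1 = S_{xy} / S_{xx}$, where $S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y})$ and $S_{xx} = \sum (x_i - \bar{x})^2$.
The slope coefficient of the simple linear regression is obtained by substituting the summary statistics into this formula.
Thus, the point estimate of the slope is $b_1 = 1.536$.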
Total sum of squares (SST):
The total variation in the observed values of the response variable is defined as the total sum of squares. The formula for the total sum of squares is $SST = \sum (y_i - \bar{y})^2 = \sum y_i^2 - \frac{(\sum y_i)^2}{n}$.
The total sum of square is obtained as ,
Therefore, the total sum of squares is
Regression sum of squares (SSR):
The variation in the observed values of the response variable explained by the regression is defined as the regression sum of squares. The formula for the regression sum of squares is $SSR = \sum (\hat{y}_i - \bar{y})^2$.
The regression sum of squares is obtained as,
Error sum of squares (SSE):
The variation in the observed values of the response variable which is not explained by the regression is defined as the error sum of squares. The formula for the error sum of squares is $SSE = \sum (y_i - \hat{y}_i)^2$.
The general formula to obtain the error sum of squares is $SSE = SST - SSR$.
The error sum of squares is obtained as,
Therefore, the error sum of squares is
Estimate of error standard deviation:
The general formula for the estimate of error standard deviation is $s = \sqrt{\dfrac{SSE}{n - 2}}$.
The estimate of error standard deviation is obtained as,
Thus, the estimate of error standard deviation is
Error sum of squares (SSE):
The variation in the observed values of the response variable that is not explained by the regression is defined as the error sum of squares. The formula for the error sum of squares is $SSE = \sum (y_i - \hat{y}_i)^2$.
Estimate of error standard deviation of slope coefficient:
The general formula for the estimate of error standard deviation of slope coefficient is $s_{b_1} = \dfrac{s}{\sqrt{S_{xx}}}$.
The defining formula for $S_{xx}$ is $S_{xx} = \sum (x_i - \bar{x})^2$.
The estimate of error standard deviation of slope coefficient is obtained by substituting $s$ and $S_{xx}$ into this formula.
Thus, the estimate of error standard deviation of slope coefficient is $s_{b_1} = 0.424$.
Confidence interval:
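The general formula for the confidence interval for the slope of the regression line is $b_1 \pm t_{\alpha/2,\,n-2} \cdot s_{b_1}$.
Where, $b_1$ is the point estimate of the slope, $s_{b_1}$ is the estimate of error standard deviation of the slope coefficient, and $t_{\alpha/2,\,n-2}$ is the critical value of the t-distribution with $n - 2$ degrees of freedom.
Since the level of confidence is not specified, the conventional confidence level of 95% can be used.
Critical value:
For 95% confidence level, $1 - \alpha = 0.95$, so $\alpha = 0.05$ and $\alpha/2 = 0.025$.
Degrees of freedom:
The sample size is $n = 17$.
The degrees of freedom is $n - 2 = 17 - 2 = 15$.
From Table A.5 of the t-distribution in Appendix A, the critical value corresponding to the right tail area 0.025 and 15 degrees of freedom is 2.131.
Thus, the critical value is $t_{0.025,\,15} = 2.131$.
The 95% confidence interval is $1.536 \pm 2.131(0.424) = 1.536 \pm 0.904 = (0.632, 2.440)$.
Thus, the 95% confidence interval for the slope of the population regression is (0.632, 2.440).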
Interpretation:
We can be 95% confident that the expected change in % reported nausea associated with a 1 unit increase in motion sickness dose lies between 0.632 and 2.440.
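Because the interval has the form $b_1 \pm t \cdot s_{b_1}$, the point estimate and its standard error can be recovered from the reported endpoints and the tabulated critical value. The sketch below does this for the interval (0.632, 2.440) and the critical value 2.131; the function name is hypothetical, and the recovered values are only as precise as the rounded endpoints.

```python
def recover_slope_and_se(lower, upper, t_crit):
    """Recover b1 and s_b1 from a two-sided t interval b1 +/- t_crit * s_b1."""
    b1 = (lower + upper) / 2                    # the interval is centred on b1
    se_b1 = (upper - lower) / (2 * t_crit)      # half-width divided by t_crit
    return b1, se_b1


b1, se_b1 = recover_slope_and_se(0.632, 2.440, 2.131)
print(f"b1 = {b1:.3f}, s_b1 = {se_b1:.3f}")
```

This is a consistency check on the reported numbers, not a replacement for computing $s_{b_1} = s/\sqrt{S_{xx}}$ from the data.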
b.
Test whether there is enough evidence to conclude that the predictor variable motion sickness dose is useful for predicting the value of the response variable % reported nausea.
Answer to Problem 35E
There is sufficient evidence to conclude that the predictor variable motion sickness dose is useful for predicting the value of the response variable % reported nausea.
Explanation of Solution
Calculation:
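From part (a), the slope coefficient of the regression line is $b_1 = 1.536$.
The test hypotheses are given below:
Null hypothesis: $H_0: \beta_1 = 0$
That is, there is no useful relationship between the variables motion sickness dose and % reported nausea.
Alternative hypothesis: $H_a: \beta_1 \neq 0$
That is, there is a useful relationship between the variables motion sickness dose and % reported nausea.
T-test statistic:
The test statistic is $t = \dfrac{b_1 - \beta_{1}}{s_{b_1}}$, which reduces to $t = b_1 / s_{b_1}$ under the null hypothesis.
Degrees of freedom:
The sample size is $n = 17$.
The degrees of freedom is $n - 2 = 17 - 2 = 15$.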
Thus, the degree of freedom is 15.
Level of significance:
Here, the level of significance is not given.
So, the conventional level of significance $\alpha = 0.05$ can be used.
For the level of significance $\alpha = 0.05$, the right tail area is $\alpha/2 = 0.025$.
From Table A.5 of the t-distribution in Appendix A, the critical value corresponding to the right tail area 0.025 and 15 degrees of freedom is 2.131.
Thus, the critical value is $t_{0.025,\,15} = 2.131$.
From part (a), the estimate of error standard deviation of slope coefficient is $s_{b_1} = 0.424$.
Test statistic under null hypothesis:
Under the null hypothesis, the test statistic is obtained as follows:
$t = \dfrac{b_1 - 0}{s_{b_1}} = \dfrac{1.536}{0.424} = 3.6226$
Thus, the test statistic is 3.6226.
Decision criteria for the classical approach:
If $|t| \geq t_{\alpha/2,\,n-2}$, then reject the null hypothesis $H_0$.
Conclusion:
Here, the test statistic is 3.6226 and the critical value is 2.131.
The t statistic is greater than the critical value.
That is, $3.6226 > 2.131$.
Based on the decision rule, the null hypothesis is rejected.
Hence, there is a linear relationship between the predictor variable motion sickness dose and the response variable % reported nausea.
Therefore, there is sufficient evidence to conclude that the predictor variable motion sickness dose is useful for predicting the value of the response variable % reported nausea.
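The classical decision rule used in part (b) can be sketched in a few lines. The inputs $b_1 = 1.536$ and $s_{b_1} = 0.424$ are assumptions here: they are the values that reproduce the reported interval (0.632, 2.440) and the reported statistic 3.6226. The critical value 2.131 is the table value quoted in the solution, and the function name is illustrative.

```python
def slope_t_test(b1, se_b1, t_crit, beta_null=0.0):
    """Two-sided t test of H0: beta1 = beta_null using a tabulated critical value."""
    t_stat = (b1 - beta_null) / se_b1
    reject = abs(t_stat) >= t_crit
    return t_stat, reject


# Assumed inputs: slope and standard error consistent with the reported results.
t_stat, reject = slope_t_test(b1=1.536, se_b1=0.424, t_crit=2.131)
print(f"t = {t_stat:.4f}, reject H0: {reject}")
```

The same conclusion follows from part (a): a 95% interval for the slope that excludes zero corresponds to rejecting $H_0$ at the 0.05 level.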
c.
Check whether it is plausible to estimate the expected % reported nausea when the motion sickness dose is 5.0 using the obtained regression line.
Answer to Problem 35E
No, it is not plausible to estimate the expected % reported nausea when the motion sickness dose is 5.0 using the obtained regression line.
Explanation of Solution
Calculation:
Linear regression model:
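A linear regression model is given as $\hat{y} = b_0 + b_1 x$, where $\hat{y}$ is the predicted value of the response variable and $x$ is the predictor variable.
Y-intercept:
In a linear equation $y = b_0 + b_1 x$, the constant $b_0$ is the y-intercept.
The general formula to obtain the y-intercept is $b_0 = \bar{y} - b_1 \bar{x}$.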
The y-intercept of the regression model is obtained as follows:
Thus, the y-intercept of the regression model is
From part (a), the slope coefficient of the regression line is $b_1 = 1.536$.
Therefore, the regression equation of the variables motion sickness dose
Predicted value of % reported nausea when the motion sickness dose is 5.0:
The predicted value of % reported nausea when the motion sickness dose is 5.0 is obtained as follows:
Thus, the predicted value of % reported nausea for 5.0 motion sickness dose is –7.947.
Here, the predicted % reported nausea is negative, which is not possible in reality.
Thus, the predicted value is not meaningful.
Moreover, it is given that the values of the variable motion sickness dose range from 6.0 to 17.6.
The value 5.0 is outside the range of the variable motion sickness dose. That is, no observation was made at a dose of 5.0, so the prediction is an extrapolation.
Hence, the regression line may not give a good estimate of the expected % reported nausea when the motion sickness dose is 5.0.
Therefore, it is not plausible to estimate the expected % reported nausea when the motion sickness dose is 5.0 using the obtained regression line.
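The two objections raised in part (c), a dose outside the observed range and a negative predicted percentage, can be expressed as a simple check. This is a minimal sketch: the dose range 6.0 to 17.6 and the prediction −7.947 are taken from the solution, while the 0 to 100 bounds on the response are an assumption based on the response being a percentage. The function name is hypothetical.

```python
def check_prediction(x, y_hat, x_min=6.0, x_max=17.6, y_min=0.0, y_max=100.0):
    """Return a list of reasons why a fitted value should not be trusted."""
    problems = []
    if not (x_min <= x <= x_max):
        problems.append("extrapolation")   # x lies outside the observed dose range
    if not (y_min <= y_hat <= y_max):
        problems.append("infeasible")      # a percentage must lie in [0, 100]
    return problems


issues = check_prediction(x=5.0, y_hat=-7.947)
print(issues)
```

An empty list means neither objection applies; it does not by itself guarantee that the linear model fits well inside the observed range.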
d.
Find the interval estimate for the slope of the population regression after eliminating the observation
Comment whether the observation
Answer to Problem 35E
The 95% confidence interval for the slope of the population regression after eliminating the observation is (0.3719, 2.7301).
Yes, the observation
Explanation of Solution
Calculation:
Linear regression model:
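In a linear equation $y = b_0 + b_1 x$, the constant $b_0$ is the y-intercept and the coefficient $b_1$ is the slope.
A linear regression model is given as $\hat{y} = b_0 + b_1 x$, where $\hat{y}$ is the predicted value of the response variable and $x$ is the predictor variable.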
Here, the observation
That is, the value 6.0 has to be removed from the variable motion sickness dose
The results of the summary statistics after eliminating the observation
Sample size: $n = 16$
Sum of the variable:
Sum of squares of the variable:
Y-intercept:
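In a linear equation $y = b_0 + b_1 x$, the constant $b_0$ is the y-intercept.
The general formula to obtain the y-intercept is $b_0 = \bar{y} - b_1 \bar{x}$.
Slope:
In a linear equation $y = b_0 + b_1 x$, the coefficient $b_1$ is the slope.
The general formula to obtain the slope is $b_1 = S_{xy} / S_{xx}$, where $S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y})$ and $S_{xx} = \sum (x_i - \bar{x})^2$.
The slope coefficient of the simple linear regression is obtained by substituting the revised summary statistics into this formula.
Thus, the point estimate of the slope is $b_1 \approx 1.551$.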
Total sum of squares (SST):
The total variation in the observed values of the response variable is defined as the total sum of squares. The formula for the total sum of squares is $SST = \sum (y_i - \bar{y})^2 = \sum y_i^2 - \frac{(\sum y_i)^2}{n}$.
The total sum of square is obtained as ,
Therefore, the total sum of squares is
Regression sum of squares (SSR):
The variation in the observed values of the response variable explained by the regression is defined as the regression sum of squares. The formula for the regression sum of squares is $SSR = \sum (\hat{y}_i - \bar{y})^2$.
The regression sum of squares is obtained as,
Error sum of squares (SSE):
The variation in the observed values of the response variable which is not explained by the regression is defined as the error sum of squares. The formula for the error sum of squares is $SSE = \sum (y_i - \hat{y}_i)^2$.
The general formula to obtain the error sum of squares is $SSE = SST - SSR$.
The error sum of squares is obtained as,
Therefore, the error sum of squares is
Estimate of error standard deviation:
The general formula for the estimate of error standard deviation is $s = \sqrt{\dfrac{SSE}{n - 2}}$.
The estimate of error standard deviation is obtained as,
Thus, the estimate of error standard deviation is
Error sum of squares (SSE):
The variation in the observed values of the response variable that is not explained by the regression is defined as the error sum of squares. The formula for the error sum of squares is $SSE = \sum (y_i - \hat{y}_i)^2$.
Estimate of error standard deviation of slope coefficient:
The general formula for the estimate of error standard deviation of slope coefficient is $s_{b_1} = \dfrac{s}{\sqrt{S_{xx}}}$.
The defining formula for $S_{xx}$ is $S_{xx} = \sum (x_i - \bar{x})^2$.
The estimate of error standard deviation of slope coefficient is obtained by substituting $s$ and $S_{xx}$ into this formula.
Thus, the estimate of error standard deviation of slope coefficient is $s_{b_1} \approx 0.5497$.
Confidence interval:
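The general formula for the confidence interval for the slope of the regression line is $b_1 \pm t_{\alpha/2,\,n-2} \cdot s_{b_1}$.
Where, $b_1$ is the point estimate of the slope, $s_{b_1}$ is the estimate of error standard deviation of the slope coefficient, and $t_{\alpha/2,\,n-2}$ is the critical value of the t-distribution with $n - 2$ degrees of freedom.
Since the level of confidence is not specified, the conventional confidence level of 95% can be used.
Critical value:
For 95% confidence level, $1 - \alpha = 0.95$, so $\alpha = 0.05$ and $\alpha/2 = 0.025$.
Degrees of freedom:
The sample size is $n = 16$.
The degrees of freedom is $n - 2 = 16 - 2 = 14$.
From Table A.5 of the t-distribution in Appendix A, the critical value corresponding to the right tail area 0.025 and 14 degrees of freedom is 2.145.
Thus, the critical value is $t_{0.025,\,14} = 2.145$.
The 95% confidence interval is $1.551 \pm 2.145(0.5497) = 1.551 \pm 1.1791 = (0.3719, 2.7301)$.
Thus, the 95% confidence interval for the slope of the population regression is (0.3719, 2.7301).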
Interpretation:
We can be 95% confident that the expected change in % reported nausea associated with a 1 unit increase in motion sickness dose lies between 0.3719 and 2.7301.
Comparison:
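The 95% confidence interval for the slope of the population regression with the observation included is (0.632, 2.440).
The 95% confidence interval for the slope of the population regression after eliminating the observation is (0.3719, 2.7301).
Here, by observing both the intervals it is clear that the interval becomes wider after the observation is eliminated (width 2.3582 compared with 1.808), although both intervals contain only positive values.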
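The comparison of the two intervals can be made concrete by computing each interval's midpoint (the slope estimate), its width, and whether it excludes zero. The sketch below uses only the endpoints reported in parts (a) and (d); the helper name is illustrative.

```python
def describe(interval):
    """Summarise a confidence interval given as a (lower, upper) pair."""
    lower, upper = interval
    return {
        "midpoint": (lower + upper) / 2,            # point estimate of the slope
        "width": upper - lower,                     # precision of the estimate
        "excludes_zero": lower > 0 or upper < 0,    # slope significant at this level
    }


with_obs = describe((0.632, 2.440))         # all 17 observations
without_obs = describe((0.3719, 2.7301))    # after eliminating one observation

for label, summary in (("with", with_obs), ("without", without_obs)):
    print(label, {k: round(v, 4) if isinstance(v, float) else v for k, v in summary.items()})
```

A noticeably different width or midpoint after removing a single point is one informal indication of how strongly that point affects the fitted slope.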