Prac SLR2

Practice Simple Linear Regression 2 Page 1 of 11 © 2022 Radha Bose Florida State University Department of Statistics

1. The scatterplot below shows the weights (in lbs) and heights (in inches) of some members of FSU’s Dept of Statistics. Which of the graphs below it is the correct residual plot?

[Figure: scatterplot of weights and heights, followed by four candidate residual plots labeled A, B, C, D]
2. Four different models fitted to the same data set gave rise to the four residual plots shown below. Which residual plot corresponds to the model that is most appropriate for the data? (Admittedly, none of the four appear to be appropriate, but if you had to choose one of the four, which should you choose?)

3. Data compiled from the National Aeronautics and Space Administration (NASA), http://solarsystem.nasa.gov/index.cfm. A scatterplot and residual plot are given below. The subjects are these planets, listed here in order of equatorial circumference: Mercury, Mars, Venus, Earth, Neptune, Uranus, Saturn, Jupiter.

DAY = sidereal rotation period (time for one full turn about the axis), in hours
EQUATOR = equatorial circumference, in km

(a) Which planet has the smallest positive residual? ___________________________________
(b) Which planet has the largest (in magnitude) negative residual? _______________________
(c) Use the residual plot to help you sketch in the least squares regression line on the scatterplot.
(d) The slope (or the slant) of the regression line suggests that we should expect larger planets to spin faster / slower about their axes.
(e) If R² = 13.77%, what is the linear correlation coefficient of the data? (Note: none of the DAY or EQUATOR values are actually zero; they are all positive numbers.)

[Figure: scatterplot of DAY against EQUATOR, and residual plot of DAY residuals against EQUATOR]
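Parts (a) to (c) of question 3 rely on reading residuals off a plot. As a reminder of what a residual is, here is a minimal Python sketch. It uses made-up points and a made-up line (not the planet data), and the function names are hypothetical helpers introduced only for illustration.

```python
def residuals(xs, ys, intercept, slope):
    """Residual = observed response minus predicted response."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


def smallest_positive_residual(labels, res):
    """Label of the point that sits above the line but closest to it."""
    positives = [(r, label) for label, r in zip(labels, res) if r > 0]
    return min(positives)[1] if positives else None


def largest_negative_residual(labels, res):
    """Label of the point that sits farthest below the line."""
    negatives = [(r, label) for label, r in zip(labels, res) if r < 0]
    return min(negatives)[1] if negatives else None


# Made-up data and a made-up line (NOT the planet data).
labels = ["A", "B", "C", "D", "E"]
xs = [1, 2, 3, 4, 5]
ys = [4.0, 4.0, 7.5, 12.0, 8.5]
res = residuals(xs, ys, intercept=1.0, slope=2.0)

print(res)
print(smallest_positive_residual(labels, res))
print(largest_negative_residual(labels, res))
```

A positive residual means the point lies above the line, and a negative residual means it lies below, which is why the residual plot shows where the regression line must pass relative to each point on the scatterplot.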
4. (Data from STA 2171-0001 SuC22, collected on May 9th, 2022.) The graphs below and the Excel output on the next page are for the heights (in inches) and femur lengths (in inches) of some FSU students. The questions are on the page after the Excel output.

[Figure: scatterplot of Height (inches) against Femur Length (inches), and residual plot of Residuals against Femur]
(a) What are the following? Subjects, Response with units, Predictor with units, Slope with units, Y-intercept with units, Regression Equation, Corr(Y, Ŷ), R², r_XY
(b) How many students was data collected from?
(c) For how many students was the predicted response greater than the real response? Each dot that you can see on the graph represents exactly one subject.
(d) Interpret the standard error in words, in context.
(e) What is the estimated standard deviation of all heights that go with a femur length of 19.5 inches?
(f) The output provides a 95% confidence interval for the slope. Identify the interval and explain what it tells us.
(g) The output provides a 95% confidence interval for the y-intercept. Identify the interval and explain what it tells us.
(h) Report an F-test for a linear relationship. Write out the full procedure taught in this class and use a 5% significance level. If we were making an error here, which type would it be? Write out in words, in context, what making such an error would mean. Explain the power of the test in words, in context. Very briefly, explain the p-value in words, in context.
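The quantities asked for above (slope, y-intercept, R², standard error, F-statistic) are read from regression output. As a rough sketch of where such numbers come from, the following Python computes them from scratch with the usual least squares formulas. The data are made up (not the class data), and the function name is a hypothetical helper.

```python
import math


def fit_simple_linear_regression(xs, ys):
    """Least squares fit of y on x, returning the usual summary numbers."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    syy = sum((y - y_bar) ** 2 for y in ys)  # total sum of squares

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Residual (error) sum of squares and regression sum of squares.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ssr = syy - sse
    mse = sse / (n - 2)

    return {
        "n": n,
        "slope": slope,
        "intercept": intercept,
        "r_squared": ssr / syy,
        "standard_error": math.sqrt(mse),  # estimated sigma about the line
        "f_statistic": ssr / mse,          # compared with F(1, n - 2)
        "df": (1, n - 2),
    }


# Made-up data (NOT the class data).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
fit = fit_simple_linear_regression(xs, ys)
for name, value in fit.items():
    print(name, value)
```

The standard error here is a single number for the whole fit, which is why the same value answers a question about the spread of responses at any one predictor value.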
5. (Data gathered by Matthew St.Amant and Jeffrey Davis for their project in my SU07 STA 2122 class.) The graphs below and the Excel output on the next page are for the regression analysis of carpet prices and gas prices recorded at 13 irregular intervals in the year 2006. The questions are on the page after the Excel output.

[Figure: scatterplot of Carpet Price ($/sq.yd) against Gas Price ($/gal), and residual plot of Residuals against Gas Price ($/gal)]
(a) The subjects are time intervals. What are the following? Response with units, Predictor with units, Slope with units, Y-intercept with units, Regression Equation, Corr(Y, Ŷ), R², r_XY
(b) How many times were carpet and gas prices recorded?
(c) For how many subjects was the predicted response less than the real response? Each dot that you can see on the graph represents exactly one subject.
(d) Interpret the standard error in words, in context.
(e) What is the estimated standard deviation of all carpet prices when the gas price was $2.50?
(f) The output provides a 95% confidence interval for the slope. Identify the interval and explain what it tells us.
(g) The output provides a 95% confidence interval for the y-intercept. Identify the interval and explain what it tells us.
(h) Report an F-test for a linear relationship. Write out the full procedure taught in this class and use a 5% significance level. If we were making an error here, which type would it be? Write out in words, in context, what making such an error would mean. Explain the power of the test in words, in context. Very briefly, explain the p-value in words, in context.
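Parts (f) and (g) ask for confidence intervals that regression output typically reports as estimate ± (t critical value) × (standard error of the estimate). A small Python sketch of that arithmetic follows. The estimate and standard error are made-up numbers (not from the Excel output), the critical value is the t-table value for 3 degrees of freedom, and both function names are hypothetical.

```python
def confidence_interval(estimate, std_error, t_critical):
    """Two-sided interval: estimate plus or minus t_critical * std_error."""
    margin = t_critical * std_error
    return (estimate - margin, estimate + margin)


def contains_zero(interval):
    """True when zero is a plausible value for the coefficient."""
    lower, upper = interval
    return lower <= 0 <= upper


# Made-up coefficient summary (NOT taken from the Excel output).
slope_estimate = 0.6
slope_std_error = 0.25
t_critical = 3.182  # t-table value: 97.5th percentile, 3 degrees of freedom

interval = confidence_interval(slope_estimate, slope_std_error, t_critical)
print(interval)
print(contains_zero(interval))
```

In simple linear regression the F-statistic equals the square of the slope's t-statistic, so a 95% interval for the slope that contains zero goes together with an F-test that does not reject at the 5% level.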
SOLUTIONS

1. Residual Plot A

2. Residual Plot D corresponds to the model that is most appropriate for the data. A has a pattern (right-arrow kind of shape), B has an outlier, and C has a funnel shape.

3. (a) Saturn
(b) Mars
(c) Insert a line that passes just above the first data point from the left and just under the second data point from the right, because that’s how the x-axis runs on the residual plot.

[Figure: scatterplot of DAY against EQUATOR]

(d) The slope of the line is negative, so larger EQUATOR (size) goes with smaller DAY (time). Larger planets therefore appear to take a shorter time, which means larger planets spin faster.
(e) In the simple linear case, |r| = sqrt(R²), so |r| = sqrt(0.1377) = 0.371. Since the slope of the line is negative, r must be negative, so r = -0.371.

4. (a) Subjects: FSU students
Response: Height (inches)
Predictor: Femur length (inches)
Slope with units: 1.551 inches/inch
Y-intercept with units: 37.481 inches
Regression Equation: ŷ = 37.481 + 1.551x, or predicted height = 37.481 + 1.551 × femur length
Corr(Y, Ŷ) = 0.788, R² = 62.1%, r_XY = 0.788
(b) n = 18 students
(c) 10, because there are 10 points below the regression line on the scatterplot (or below the x-axis on the residual plot)
(d) It is the estimated standard deviation of all the heights that go with a single femur length.
(e) 3.083 inches (the standard error!)
(f) (0.908, 2.194) inches/inch
The average height is expected to go up by between 0.908 and 2.194 inches for every inch increase in femur length.
(g) (25.178, 49.783) inches
The average height is predicted to be between 25.178 and 49.783 inches when the femur length is zero inches.
(h) Step 1. H0: There is no linear relationship between height and femur length.
Step 2. Ha: There is a linear relationship between height and femur length.
Step 3. It is assumed that for each femur length, the heights are independent and Normally distributed with mean µ_Y|X and standard deviation σ, where µ_Y|X depends linearly on femur length and σ is the same for all femur lengths.
Step 4. F-statistic = 26.2, F(1, 16) distribution, P-value = 0.0001
Step 5. We reject the null hypothesis because the p-value is less than the 5% significance level. We therefore conclude that there is a linear relationship between height and femur length.
Type I Error: We would be accepting that there was a linear relationship between height and femur length, when there wasn’t.
Power of the test: The probability of accepting that there is a linear relationship between height and femur length, when that is indeed the case.
P-value: The P-value here is the highest probability of getting an F-statistic of 26.2 or more, if there is no linear relationship between height and femur length.

5. (a) Response: Carpet Price (dollars per square yard)
Predictor: Gas Price (dollars per gallon)
Slope with units: -0.050 dollars per square yard PER dollars per gallon
Y-intercept with units: 1.665 dollars per square yard
Regression Equation: ŷ = 1.665 - 0.050x, or predicted carpet price = 1.665 - 0.050 × gas price
Corr(Y, Ŷ) = 0.033, R² = 0.1%, r_XY = -0.033
(b) n = 13 times
(c) 5, because there are 5 points above the regression line on the scatterplot (or above the x-axis on the residual plot)
(d) It is the estimated standard deviation of all the carpet prices that go with a single gas price.
(e) 0.419 dollars per square yard (the standard error!)
(f) (-1.044, 0.945) dollars per square yard PER dollars per gallon
The average carpet price is expected to go down by as much as 1.044 dollars per square yard, or go up by as much as 0.945 dollars per square yard, for every dollar per gallon increase in gas price.
(g) (-0.971, 4.302) dollars per square yard
The average carpet price is predicted to be between -0.971 and 4.302 dollars per square yard when gas price is zero dollars per gallon.
(h) Step 1. H0: Carpet price does not depend linearly on gas price.
Step 2. Ha: Carpet price depends linearly on gas price.
Step 3. It is assumed that for each gas price, the carpet prices are independent and Normally distributed with mean µ_Y|X and standard deviation σ, where µ_Y|X depends linearly on gas price and σ is the same for all gas prices.
Step 4. F-statistic = 0.012, F(1, 11) distribution, P-value = 0.915
Step 5. We cannot reject the null hypothesis because the p-value is greater than the 5% significance level. We therefore cannot conclude that carpet price depends linearly on gas price.
Type II Error: We would not be accepting that carpet price depends linearly on gas price, when it actually did.
Power of the test: The probability of accepting that carpet price depends linearly on gas price, when that is indeed the case.
P-value: The P-value here is the highest probability of getting an F-statistic of 0.012 or more, if carpet price did not depend linearly on gas price.
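The reported solution values can be cross-checked against each other. The Python sketch below uses two standard identities for simple linear regression: r takes the sign of the slope with |r| = sqrt(R²), and F = r²(n - 2)/(1 - r²). The function names are hypothetical, and the slope passed for question 3 is a stand-in value, since only its sign is known from the solution.

```python
import math


def signed_r(r_squared, slope):
    """Correlation coefficient: sqrt(R^2), carrying the sign of the slope."""
    return math.copysign(math.sqrt(r_squared), slope)


def f_from_r(r, n):
    """F-statistic of simple linear regression from r and sample size n."""
    r2 = r * r
    return r2 * (n - 2) / (1 - r2)


# Question 3(e): R^2 = 13.77% with a negative slope.
# The slope value -1.0 is a stand-in; only its sign matters here.
r_planets = signed_r(0.1377, slope=-1.0)

# Question 4: r = 0.788, n = 18; reported F-statistic is 26.2.
f_heights = f_from_r(0.788, 18)

# Question 5: r = -0.033, n = 13; reported F-statistic is 0.012.
f_carpet = f_from_r(-0.033, 13)

print(round(r_planets, 3))
print(round(f_heights, 1))
print(round(f_carpet, 3))
```

Small differences from the reported F-statistics are expected, because the correlations in the solutions are already rounded to three decimal places.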