Unit 2 Test Review

School

Rutgers University

Course

115

Subject

Statistics

Date

Jan 9, 2024

AP Statistics                Name: _____________________________
Unit 2 Test Review           Period: ___________

Part I - Multiple Choice Practice

______ 1) The equation of the least squares regression line for a set of points in a scatterplot is given by $\hat{y} = 2.2 + 0.81x$. The point (5, 7) is one point on this scatterplot. Which of the following is the residual for the point (5, 7)?
(A) 0.71   (B) 0.75   (C) 4.05   (D) 6.25   (E) 7.87

______ 2) The correlation between the heights of fathers and the heights of their (fully grown) sons is r = 0.52. This value was based on both variables being measured in inches. If fathers' heights were measured in feet (one foot equals 12 inches), and sons' heights were measured in furlongs (one furlong equals 7920 inches), the correlation between heights of fathers and heights of sons would be
(A) much smaller than 0.52
(B) slightly smaller than 0.52
(C) unchanged: equal to 0.52
(D) slightly larger than 0.52
(E) much larger than 0.52

______ 3) Use the following MINITAB output to predict the value of y when x equals 2.

Predictor   Coef      StDev      T       P
Constant    0.6667    0.5783     1.15    0.282
X           4.9333    0.09320    52.93   0.00

(A) 5.60   (B) 9.87   (C) 10.53   (D) 7.60   (E) 1.33

______ 4) The following 5 data points were used to model the relationship between x and y.

X   -5    0    8    4    6
Y   10    3   -1    4    0

What is the equation of the least-squares regression line?
(A) y = 4.32 - 0.99x
(B) y = 5.23 - 0.99x
(C) y = 5.23 - 0.78x
(D) y = 5.23 + 0.78x
(E) y = 4.32 + 0.99x

______ 5) The “least-squares” method of determining a regression equation of the form $\hat{y} = ax + b$, where the y-values are represented on the vertical axis and the x-values are represented on the horizontal axis:
(A) minimizes the sum of the squares of the horizontal distances between the regression line and the data points.
(B) minimizes the absolute value of the correlation coefficient of the data set.
(C) minimizes the coefficient of determination of the regression model.
(D) minimizes the sum of the squares of the vertical distances between the regression line and the data points.
(E) creates an equation whose predictions match the actual data.

______ 6) All but one of the following statements contain an error. Which statement could be correct?
(A) There is a correlation of 0.54 between the position a football player plays and his weight.
(B) We found a correlation of r = -0.63 between gender and political party preference.
(C) The correlation between the distance traveled by a hiker and the time spent hiking is r = 0.9 meters per second.
(D) We found a high correlation between the height and age of children: r = 1.12.
(E) The correlation between mid-August soil moisture and the per-acre yield of tomatoes is r = 0.53.

______ 7) A linear model was constructed for a set of bivariate data using least squares regression techniques. Given the residual plot shown, what conclusion should be drawn?
(A) A linear model is not a good fit for the data.
(B) A linear model was a good fit for the data.
(C) The correlation between the original variables is close to 1.
(D) The data was drawn from a population that was not normally distributed.
(E) The study was poorly designed.

______ 8) There is an approximate linear relationship between the height of females and their age (from 5 to 18 years) described by predicted height = 50.3 + 6.01(age), where height is measured in centimeters and age in years. Which of the following is not correct?
(A) The estimated slope is 6.01, which implies that female children between the ages of 5 and 18 increase in height by about 6 cm for each year they grow older.
(B) The estimated height of a female child who is 10 years old is about 110 cm.
(C) The estimated intercept is 50.3 cm. We can conclude from this that the typical height of female children at birth is 50.3 cm.
(D) The average height of female children when they are 5 years old is about 50% of the average height when they are 18 years old.
(E) My niece is about 8 years old and is about 115 cm tall. She is taller than average for girls her age.

______ 9) The regression line for a set of data is $\hat{y} = 2x + b$. This line passes through the point (3, 4). If $\bar{x}$ and $\bar{y}$ are the sample means of the x and y values respectively, then $\bar{y}$ =
(A) $\bar{x}$   (B) $\bar{x} - 3$   (C) $\bar{x} - 4$   (D) $2\bar{x} - 2$   (E) $2\bar{x} - 10$

______ 10) For the model $\ln \hat{y} = 1.03 + 3.2x$, predict y when x = 2. Round to two decimal places.
(A) 0.87   (B) 1685.81   (C) 2.01   (D) 7.43   (E) $\hat{y}$ is undefined for x = 2

Part II - Free Response Practice

11) The busiest season for Walmart is the Christmas holiday, and weekends see a tremendous number of customers. Last year, Walmart conducted a study of how long its customers had to wait in checkout lanes. On Saturdays and Sundays of its holiday season, it opened a different number of checkout lanes for customers between 1 PM and 4 PM, its busiest times. The measurement was the average wait time for a customer to go through the lane and complete the transaction. A different number of lanes was opened each day. The data is below.

a) What is the explanatory variable? _____________________________________________
   What is the response variable? _______________________________________________
b) Using your answer to part a, make a scatterplot on your calculator and draw it below. Make sure to label your axes!
c) There is a clear outlier on your scatterplot. Circle it.
d) Give a reason that would justify eliminating the outlier.
e) Generate the least squares regression line that describes the data (with the outlier eliminated). Use your calculator and round to two decimal places. ________________________
f) Give the meaning of the slope in the LSRL.
g) Describe the data's form, direction, and strength.
h) Find the value of the correlation coefficient and interpret it.
i) Find the value of the coefficient of determination and interpret it.
j) What is the predicted average waiting time if Walmart opens 9 lanes?
k) What is the predicted average waiting time if Walmart opens 7 lanes?
l) What is the residual when Walmart opens 7 lanes?

12) On the same days as in the previous problem, Walmart needs to hire extra staff to man the checkout lanes, so it has to pay out more money for the workers. The chart below describes how much it paid out during those hours on these specific days. (The outlier is eliminated in the data - do not use it!)
a) What is the explanatory variable? _______________________________
   What is the response variable? _________________________________
b) Make a scatterplot on your calculator and draw it below.
c) Generate the least squares regression line that describes the data and draw it on the graph above. (2 decimal places)
d) Give the meaning of the slope in the LSRL.
e) Describe the data in terms of form, direction, and strength.
f) Find the value of the correlation coefficient and interpret it.
g) Find the value of the coefficient of determination and interpret it.
h) What is the predicted extra salary if Walmart opens 9 lanes?
i) What is the predicted extra salary if Walmart opens 7 lanes?
j) What is the residual for Walmart opening 7 lanes?
k) If Walmart will pay no more than $500 in extra salary, find the maximum number of lanes it can open. (Remember to round to the nearest lane that would make the statement true.)
l) If Walmart will pay no more than $250 in extra salary, find the maximum number of lanes it can open. (Remember to round to the nearest lane that would make the statement true.)
m) What is the predicted extra salary if Walmart opens no lanes?
n) What is the predicted extra salary if Walmart opens 50 lanes?
p) Why do the last two problems and answers make little sense for this problem?
q) A Walmart manager needs to decide how many lanes to open. Based on the data and the information on the previous page, explain the dilemma he has in making a decision.

13) A study was performed to determine the effect of temperature on a pond's algae level. Temperature was measured in degrees F, and algae was measured in parts per million. Consider the computer output below.

Predictor   Coef       Stdev     t-ratio   P
Constant    42.8677    5.75      77.4      0.00
Temp        0.4762     0.5911    11.72     0.00

a) Write the equation of the least squares regression line. Identify the variables you use.
b) Interpret the slope of the least-squares regression line.
c) In one of the observations, a temperature of 80°F produced an algae level of 75.82 parts per million. Calculate the residual.

14) Does washing your hands lead to fewer colds? A study was done to help answer this question. People chosen to be a part of the study were asked to keep track of the number of colds/flu they had over a period of one year. At the end of the year, they were given a questionnaire, and one of the questions asked them to estimate the number of times they wash their hands a day.
This did not include showers or baths. The results are in the table below.

# of times per day washing hands   0      1      2      3      4      5      6      7      8      9      10
Average number of colds            5.52   5.71   5.54   5.13   5.54   4.93   4.03   3.72   2.18   2.12   1.5

a) Calculate the following summary statistics using your calculator:
   $\bar{x}$ = _______   $\bar{y}$ = ________   $s_x$ = __________   $s_y$ = ____________   $r$ = _________
b) Find the slope of the regression line using the summary statistics above. SHOW YOUR WORK.
c) Find the y-intercept of the regression line using the summary statistics above (and the slope you calculated in part b). SHOW YOUR WORK.
d) What is the residual for someone who washes his hands 5 times?
e) Create a residual plot on your calculator and sketch the result below (does not have to be exact). What does the residual plot tell you about the appropriateness of a linear model for this data?
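
For checking calculator work on problem 14, the summary-statistics route to the regression line can be sketched in Python. This is a minimal sketch, not part of the worksheet: it assumes the standard relationships slope = r * s_y / s_x and intercept = y-bar - slope * x-bar, uses sample standard deviations (n - 1 in the denominator), and the names `summary_stats` and `lsrl_from_stats` are made up for illustration. The data is copied from the hand-washing table in problem 14.

```python
from statistics import mean, stdev

# Data from the problem 14 table.
washes = list(range(11))  # times per day washing hands: 0 through 10
colds = [5.52, 5.71, 5.54, 5.13, 5.54, 4.93, 4.03, 3.72, 2.18, 2.12, 1.5]


def summary_stats(x, y):
    """Return x-bar, y-bar, s_x, s_y, and r (sample statistics)."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    s_x, s_y = stdev(x), stdev(y)
    # r is the sum of products of z-scores, divided by n - 1.
    r = sum(((xi - x_bar) / s_x) * ((yi - y_bar) / s_y)
            for xi, yi in zip(x, y)) / (n - 1)
    return x_bar, y_bar, s_x, s_y, r


def lsrl_from_stats(x_bar, y_bar, s_x, s_y, r):
    """Slope and intercept of the LSRL from summary statistics only."""
    slope = r * s_y / s_x
    intercept = y_bar - slope * x_bar  # the line passes through (x-bar, y-bar)
    return slope, intercept


x_bar, y_bar, s_x, s_y, r = summary_stats(washes, colds)
slope, intercept = lsrl_from_stats(x_bar, y_bar, s_x, s_y, r)

# Residual = observed - predicted, here for 5 washes per day (part d).
residual_at_5 = colds[5] - (intercept + slope * 5)

# Residuals for every point, which is what a residual plot displays (part e).
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(washes, colds)]

print(f"x-bar={x_bar}, y-bar={y_bar:.4f}, s_x={s_x:.4f}, s_y={s_y:.4f}, r={r:.4f}")
print(f"slope={slope:.4f}, intercept={intercept:.4f}")
print(f"residual at x=5: {residual_at_5:.4f}")
```

Plotting `residuals` against `washes` gives the residual plot asked for in part e; a curved pattern in that plot would suggest a linear model is not the best fit.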