
Imports and exports: The following table presents the U.S. imports and exports (in billions of dollars) for each of 29 months.
- Compute the least-squares regression line for predicting exports (y) from imports (x).
- Compute the coefficient of determination.
- The months with the two lowest exports are January and February 2011. Remove these points and compute the least-squares regression line. Is the result noticeably different?
- Compute the coefficient of determination for the data set with January and February 2011 removed.
- Two economists decide to study the relationship between imports and exports. One uses data from January 2011 through May 2013 and the other uses data from March 2011 through May 2013. For which data set will the proportion of variance explained by the least-squares regression line be greater?
(a)
The least squares regression line for the given data set.
Answer to Problem 26E
The least-squares regression line is $\hat{y} \approx -19.239 + 0.8868x$.
Explanation of Solution
Given information:
The following table presents the U.S. imports and exports (in billions of dollars) for each of 29 months:
Concepts Used:
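The equation of the least-squares regression line:
$$\hat{y} = b_0 + b_1 x$$
Where
$$b_1 = r\,\frac{s_y}{s_x}, \qquad b_0 = \bar{y} - b_1\bar{x}$$
The correlation coefficient of a data set is given by:
$$r = \frac{1}{n-1}\sum\left(\frac{x-\bar{x}}{s_x}\right)\left(\frac{y-\bar{y}}{s_y}\right)$$
Where $\bar{x}$ and $\bar{y}$ are the sample means and $n$ is the number of data points.
The standard deviations are given by:
$$s_x = \sqrt{\frac{\sum (x-\bar{x})^2}{n-1}}, \qquad s_y = \sqrt{\frac{\sum (y-\bar{y})^2}{n-1}}$$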
Calculation:
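The mean of $x$ is $\bar{x} = \dfrac{\sum x}{n} = \dfrac{6557.4}{29} \approx 226.11724$
The mean of $y$ is $\bar{y} = \dfrac{\sum y}{n} = \dfrac{5257.2}{29} \approx 181.28276$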
The data can be represented in tabular form as:
x | y | $x-\bar{x}$ | $(x-\bar{x})^2$ | $y-\bar{y}$ | $(y-\bar{y})^2$ | $(x-\bar{x})(y-\bar{y})$ |
215.9 | 168.1 | -10.21724 | 104.39202 | -13.18276 | 173.78512 | 134.69143 |
211.8 | 166.6 | -14.31724 | 204.98340 | -14.68276 | 215.58340 | 210.21660 |
217.7 | 174.3 | -8.41724 | 70.84995 | -6.98276 | 48.75892 | 58.77556 |
218.1 | 175.9 | -8.01724 | 64.27616 | -5.38276 | 28.97409 | 43.15488 |
223.6 | 176.2 | -2.51724 | 6.33650 | -5.08276 | 25.83444 | 12.79453 |
224.2 | 173.2 | -1.91724 | 3.67581 | -8.08276 | 65.33099 | 15.49660 |
224.9 | 179.5 | -1.21724 | 1.48168 | -1.78276 | 3.17823 | 2.17005 |
224.6 | 179.9 | -1.51724 | 2.30202 | -1.38276 | 1.91202 | 2.09798 |
225.7 | 181.2 | -0.41724 | 0.17409 | -0.08276 | 0.00685 | 0.03453 |
226.6 | 180.5 | 0.48276 | 0.23306 | -0.78276 | 0.61271 | -0.37788 |
226.1 | 178.3 | -0.01724 | 0.00030 | -2.98276 | 8.89685 | 0.05143 |
230.5 | 179.1 | 4.38276 | 19.20857 | -2.18276 | 4.76444 | -9.56650 |
230.9 | 179.5 | 4.78276 | 22.87478 | -1.78276 | 3.17823 | -8.52650 |
225.8 | 182.1 | -0.31724 | 0.10064 | 0.81724 | 0.66788 | -0.25926 |
234.3 | 186.5 | 8.18276 | 66.95754 | 5.21724 | 27.21961 | 42.69143 |
230.9 | 184.3 | 4.78276 | 22.87478 | 3.01724 | 9.10375 | 14.43074 |
230.5 | 184.2 | 4.38276 | 19.20857 | 2.91724 | 8.51030 | 12.78556 |
227.6 | 185.2 | 1.48276 | 2.19857 | 3.91724 | 15.34478 | 5.80832 |
226.8 | 183.4 | 0.68276 | 0.46616 | 2.11724 | 4.48271 | 1.44556 |
226.1 | 182.1 | -0.01724 | 0.00030 | 0.81724 | 0.66788 | -0.01409 |
228.4 | 186.8 | 2.28276 | 5.21099 | 5.51724 | 30.43995 | 12.59453 |
225.3 | 182.7 | -0.81724 | 0.66788 | 1.41724 | 2.00857 | -1.15823 |
231.6 | 185.2 | 5.48276 | 30.06064 | 3.91724 | 15.34478 | 21.47729 |
227.0 | 188.7 | 0.88276 | 0.77926 | 7.41724 | 55.01547 | 6.54763 |
229.4 | 186.7 | 3.28276 | 10.77650 | 5.41724 | 29.34650 | 17.78350 |
231.0 | 187.1 | 4.88276 | 23.84133 | 5.81724 | 33.84030 | 28.40419 |
222.3 | 185.2 | -3.81724 | 14.57133 | 3.91724 | 15.34478 | -14.95306 |
227.7 | 187.6 | 1.58276 | 2.50512 | 6.31724 | 39.90754 | 9.99867 |
232.1 | 187.1 | 5.98276 | 35.79340 | 5.81724 | 33.84030 | 34.80315 |
Σ = 6557.4 | Σ = 5257.2 | | Σ = 736.80135 | | Σ = 901.90139 | Σ = 653.39864 |
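Hence, with $n = 29$, the standard deviation of $x$ is given by:
$$s_x = \sqrt{\frac{\sum (x-\bar{x})^2}{n-1}} = \sqrt{\frac{736.80135}{28}} \approx 5.1297$$
And,
$$s_y = \sqrt{\frac{\sum (y-\bar{y})^2}{n-1}} = \sqrt{\frac{901.90139}{28}} \approx 5.6755$$
Consider,
$$\sum (x-\bar{x})(y-\bar{y}) = 653.39864$$
Putting the values in the formula,
$$r = \frac{\sum (x-\bar{x})(y-\bar{y})}{(n-1)\,s_x s_y} = \frac{653.39864}{28 \times 5.1297 \times 5.6755} \approx 0.8015$$
Putting the values to obtain $b_1$,
$$b_1 = r\,\frac{s_y}{s_x} = \frac{\sum (x-\bar{x})(y-\bar{y})}{\sum (x-\bar{x})^2} = \frac{653.39864}{736.80135} \approx 0.8868$$
Putting the values to obtain $b_0$ (carrying the unrounded slope),
$$b_0 = \bar{y} - b_1\bar{x} = 181.28276 - (0.886804)(226.11724) \approx -19.239$$
Hence, the least-squares regression line is given by:
$$\hat{y} = b_0 + b_1 x$$
Therefore, the least-squares regression line for the given data set is $\hat{y} \approx -19.239 + 0.8868x$.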
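The hand calculation in part (a) can be cross-checked numerically. The following is a minimal sketch in plain Python, assuming the 29 (x, y) pairs are entered in table order; the function name `least_squares` and the variable names are illustrative choices, not part of the original solution.

```python
from math import sqrt

# U.S. imports (x) and exports (y) in billions of dollars, in table order.
imports = [215.9, 211.8, 217.7, 218.1, 223.6, 224.2, 224.9, 224.6, 225.7, 226.6,
           226.1, 230.5, 230.9, 225.8, 234.3, 230.9, 230.5, 227.6, 226.8, 226.1,
           228.4, 225.3, 231.6, 227.0, 229.4, 231.0, 222.3, 227.7, 232.1]
exports = [168.1, 166.6, 174.3, 175.9, 176.2, 173.2, 179.5, 179.9, 181.2, 180.5,
           178.3, 179.1, 179.5, 182.1, 186.5, 184.3, 184.2, 185.2, 183.4, 182.1,
           186.8, 182.7, 185.2, 188.7, 186.7, 187.1, 185.2, 187.6, 187.1]


def least_squares(x, y):
    """Return (b0, b1, r) using b1 = r * s_y / s_x and b0 = y_bar - b1 * x_bar."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of squared deviations and of cross products (the table's column totals).
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Sample standard deviations (divisor n - 1).
    s_x = sqrt(s_xx / (n - 1))
    s_y = sqrt(s_yy / (n - 1))
    r = s_xy / ((n - 1) * s_x * s_y)
    b1 = r * s_y / s_x
    b0 = y_bar - b1 * x_bar
    return b0, b1, r


b0, b1, r = least_squares(imports, exports)
print(f"y-hat = {b0:.4f} + {b1:.4f} x, r = {r:.4f}")
```

Because the code carries full floating-point precision, its intercept can differ in the last digit or two from a hand calculation that rounds the slope first.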
(b)
The coefficient of determination.
Answer to Problem 26E
The coefficient of determination is $r^2 \approx 0.6425$.
Explanation of Solution
Given information:
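Same as part (a).
Calculation:
From part (a), the correlation coefficient is $r \approx 0.80154$.
The coefficient of determination is given by:
$$r^2$$
Where $r$ is the correlation coefficient.
Putting the values to obtain the coefficient of determination,
$$r^2 = (0.80154)^2 \approx 0.6425$$
Therefore, the coefficient of determination is $r^2 \approx 0.6425$; that is, about 64% of the variation in exports is explained by the least-squares regression line.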
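The phrase "proportion of variance explained" can be made concrete by computing the coefficient of determination a second way, as one minus the ratio of residual to total variation. The sketch below, in plain Python with illustrative names (`r_squared`, `explained`, `r_sq`), checks that this agrees with squaring the correlation coefficient; it uses the slope in the form Sxy/Sxx, which is algebraically the same as r·s_y/s_x.

```python
# U.S. imports (x) and exports (y) in billions of dollars, in table order.
imports = [215.9, 211.8, 217.7, 218.1, 223.6, 224.2, 224.9, 224.6, 225.7, 226.6,
           226.1, 230.5, 230.9, 225.8, 234.3, 230.9, 230.5, 227.6, 226.8, 226.1,
           228.4, 225.3, 231.6, 227.0, 229.4, 231.0, 222.3, 227.7, 232.1]
exports = [168.1, 166.6, 174.3, 175.9, 176.2, 173.2, 179.5, 179.9, 181.2, 180.5,
           178.3, 179.1, 179.5, 182.1, 186.5, 184.3, 184.2, 185.2, 183.4, 182.1,
           186.8, 182.7, 185.2, 188.7, 186.7, 187.1, 185.2, 187.6, 187.1]


def r_squared(x, y):
    """Return (1 - SSE/SST, r^2) so the two definitions can be compared."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx          # same value as r * s_y / s_x
    b0 = y_bar - b1 * x_bar
    # SSE: variation left over around the fitted line.
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    # SST: total variation of y around its mean.
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - sse / sst, s_xy ** 2 / (s_xx * sst)


explained, r_sq = r_squared(imports, exports)
print(f"1 - SSE/SST = {explained:.4f}, r^2 = {r_sq:.4f}")
```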
(c)
The least-squares regression line for the data set with the outlier points excluded, and whether the result is noticeably different.
Answer to Problem 26E
The result is noticeably different.
Explanation of Solution
Given information:
Same as part (a).
The months with the two lowest exports are January and February 2011.
Concepts used:
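The equation of the least-squares regression line:
$$\hat{y} = b_0 + b_1 x$$
Where
$$b_1 = r\,\frac{s_y}{s_x}, \qquad b_0 = \bar{y} - b_1\bar{x}$$
The correlation coefficient of a data set is given by:
$$r = \frac{1}{n-1}\sum\left(\frac{x-\bar{x}}{s_x}\right)\left(\frac{y-\bar{y}}{s_y}\right)$$
Where $\bar{x}$ and $\bar{y}$ are the sample means and $n$ is the number of data points.
The standard deviations are given by:
$$s_x = \sqrt{\frac{\sum (x-\bar{x})^2}{n-1}}, \qquad s_y = \sqrt{\frac{\sum (y-\bar{y})^2}{n-1}}$$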
Calculation:
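The months with the two lowest exports are January and February 2011.
Excluding these outliers leaves $n = 27$ months.
The mean of $x$ is $\bar{x} = \dfrac{\sum x}{n} = \dfrac{6129.7}{27} \approx 227.02593$
The mean of $y$ is $\bar{y} = \dfrac{\sum y}{n} = \dfrac{4922.5}{27} \approx 182.31481$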
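The data can be represented in tabular form as follows. The deviations listed are the ones from part (a), measured from the 29-month means $c_x = 226.11724$ and $c_y = 181.28276$, not from the means of the 27 remaining months:
x | y | $x-c_x$ | $(x-c_x)^2$ | $y-c_y$ | $(y-c_y)^2$ | $(x-c_x)(y-c_y)$ |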
217.7 | 174.3 | -8.41724 | 70.84995 | -6.98276 | 48.75892 | 58.77556 |
218.1 | 175.9 | -8.01724 | 64.27616 | -5.38276 | 28.97409 | 43.15488 |
223.6 | 176.2 | -2.51724 | 6.33650 | -5.08276 | 25.83444 | 12.79453 |
224.2 | 173.2 | -1.91724 | 3.67581 | -8.08276 | 65.33099 | 15.49660 |
224.9 | 179.5 | -1.21724 | 1.48168 | -1.78276 | 3.17823 | 2.17005 |
224.6 | 179.9 | -1.51724 | 2.30202 | -1.38276 | 1.91202 | 2.09798 |
225.7 | 181.2 | -0.41724 | 0.17409 | -0.08276 | 0.00685 | 0.03453 |
226.6 | 180.5 | 0.48276 | 0.23306 | -0.78276 | 0.61271 | -0.37788 |
226.1 | 178.3 | -0.01724 | 0.00030 | -2.98276 | 8.89685 | 0.05143 |
230.5 | 179.1 | 4.38276 | 19.20857 | -2.18276 | 4.76444 | -9.56650 |
230.9 | 179.5 | 4.78276 | 22.87478 | -1.78276 | 3.17823 | -8.52650 |
225.8 | 182.1 | -0.31724 | 0.10064 | 0.81724 | 0.66788 | -0.25926 |
234.3 | 186.5 | 8.18276 | 66.95754 | 5.21724 | 27.21961 | 42.69143 |
230.9 | 184.3 | 4.78276 | 22.87478 | 3.01724 | 9.10375 | 14.43074 |
230.5 | 184.2 | 4.38276 | 19.20857 | 2.91724 | 8.51030 | 12.78556 |
227.6 | 185.2 | 1.48276 | 2.19857 | 3.91724 | 15.34478 | 5.80832 |
226.8 | 183.4 | 0.68276 | 0.46616 | 2.11724 | 4.48271 | 1.44556 |
226.1 | 182.1 | -0.01724 | 0.00030 | 0.81724 | 0.66788 | -0.01409 |
228.4 | 186.8 | 2.28276 | 5.21099 | 5.51724 | 30.43995 | 12.59453 |
225.3 | 182.7 | -0.81724 | 0.66788 | 1.41724 | 2.00857 | -1.15823 |
231.6 | 185.2 | 5.48276 | 30.06064 | 3.91724 | 15.34478 | 21.47729 |
227.0 | 188.7 | 0.88276 | 0.77926 | 7.41724 | 55.01547 | 6.54763 |
229.4 | 186.7 | 3.28276 | 10.77650 | 5.41724 | 29.34650 | 17.78350 |
231.0 | 187.1 | 4.88276 | 23.84133 | 5.81724 | 33.84030 | 28.40419 |
222.3 | 185.2 | -3.81724 | 14.57133 | 3.91724 | 15.34478 | -14.95306 |
227.7 | 187.6 | 1.58276 | 2.50512 | 6.31724 | 39.90754 | 9.99867 |
232.1 | 187.1 | 5.98276 | 35.79340 | 5.81724 | 33.84030 | 34.80315 |
Σ = 6129.7 | Σ = 4922.5 | | Σ = 427.42593 | | Σ = 512.53287 | Σ = 308.49061 |
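The deviations tabulated for the 27 remaining months coincide with those of part (a), so they appear to be taken about the 29-month means rather than the 27-month means. As a worked sketch, the column totals (427.42593, 512.53287 and 308.49061) can be converted with the standard shift identity. The symbols $c_x = 226.11724$, $c_y = 181.28276$ (the 29-month means) and $d_x$, $d_y$ are introduced here only for illustration, with $\bar{x} = 227.02593$, $\bar{y} = 182.31481$ and $n = 27$:

$$
\begin{aligned}
d_x &= \bar{x} - c_x = 0.90869, \qquad d_y = \bar{y} - c_y = 1.03205 \\
\sum (x-\bar{x})^2 &= \sum (x-c_x)^2 - n\,d_x^2 = 427.42593 - 27(0.90869)^2 \approx 405.132 \\
\sum (y-\bar{y})^2 &= \sum (y-c_y)^2 - n\,d_y^2 = 512.53287 - 27(1.03205)^2 \approx 483.774 \\
\sum (x-\bar{x})(y-\bar{y}) &= \sum (x-c_x)(y-c_y) - n\,d_x d_y = 308.49061 - 27(0.90869)(1.03205) \approx 283.170
\end{aligned}
$$

These corrected sums are the ones that belong in the standard-deviation and correlation formulas for the reduced data set.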
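Hence, taking deviations about the 27-month means ($n = 27$), the standard deviation of $x$ is given by:
$$s_x = \sqrt{\frac{\sum (x-\bar{x})^2}{n-1}} = \sqrt{\frac{405.132}{26}} \approx 3.9474$$
And,
$$s_y = \sqrt{\frac{\sum (y-\bar{y})^2}{n-1}} = \sqrt{\frac{483.774}{26}} \approx 4.3135$$
Consider,
$$\sum (x-\bar{x})(y-\bar{y}) \approx 283.170$$
Putting the values in the formula,
$$r = \frac{\sum (x-\bar{x})(y-\bar{y})}{(n-1)\,s_x s_y} = \frac{283.170}{26 \times 3.9474 \times 4.3135} \approx 0.6396$$
Putting the values to obtain $b_1$,
$$b_1 = r\,\frac{s_y}{s_x} = \frac{\sum (x-\bar{x})(y-\bar{y})}{\sum (x-\bar{x})^2} = \frac{283.170}{405.132} \approx 0.6990$$
Putting the values to obtain $b_0$ (carrying the unrounded slope),
$$b_0 = \bar{y} - b_1\bar{x} = 182.31481 - (0.698957)(227.02593) \approx 23.633$$
Hence, the least-squares regression line is given by:
$$\hat{y} = b_0 + b_1 x$$
Therefore, the least-squares regression line for the given data set with the outliers removed is $\hat{y} \approx 23.633 + 0.6990x$.
For the full 29-month data set the slope is about 0.887 and the intercept about -19.2, so removing two points lowers the slope by roughly a fifth and changes the sign of the intercept. Hence the result is noticeably different.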
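The effect of removing the two months can be checked by refitting the line on both versions of the data. This is a minimal sketch in plain Python; `fit_line` is an illustrative helper, and the code identifies January and February 2011 as the two observations with the lowest exports, as the exercise states.

```python
# U.S. imports (x) and exports (y) in billions of dollars, in table order.
imports = [215.9, 211.8, 217.7, 218.1, 223.6, 224.2, 224.9, 224.6, 225.7, 226.6,
           226.1, 230.5, 230.9, 225.8, 234.3, 230.9, 230.5, 227.6, 226.8, 226.1,
           228.4, 225.3, 231.6, 227.0, 229.4, 231.0, 222.3, 227.7, 232.1]
exports = [168.1, 166.6, 174.3, 175.9, 176.2, 173.2, 179.5, 179.9, 181.2, 180.5,
           178.3, 179.1, 179.5, 182.1, 186.5, 184.3, 184.2, 185.2, 183.4, 182.1,
           186.8, 182.7, 185.2, 188.7, 186.7, 187.1, 185.2, 187.6, 187.1]


def fit_line(x, y):
    """Return (b0, b1) of the least-squares line, with deviations about the means of x and y."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1


# Indices of the two lowest export values (January and February 2011 per the exercise).
lowest_two = sorted(range(len(exports)), key=lambda i: exports[i])[:2]
keep = [i for i in range(len(exports)) if i not in lowest_two]
x_trim = [imports[i] for i in keep]
y_trim = [exports[i] for i in keep]

b0_full, b1_full = fit_line(imports, exports)
b0_trim, b1_trim = fit_line(x_trim, y_trim)
print(f"all 29 months: y-hat = {b0_full:.3f} + {b1_full:.4f} x")
print(f"27 months:     y-hat = {b0_trim:.3f} + {b1_trim:.4f} x")
```

Note that `fit_line` recomputes the means on whatever subset it receives, which is the step that is easy to miss when reusing a deviation table from the full data set.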
(d)
The coefficient of determination for the data set with the outlier removed.
Answer to Problem 26E
The coefficient of determination with January and February 2011 removed is $r^2 \approx 0.4091$.
Explanation of Solution
Given information:
Same as part (a).
The months with the two lowest exports are January and February 2011.
Calculation:
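From part (c), the correlation coefficient for the 27 remaining months is $r \approx 0.63963$.
The coefficient of determination is given by:
$$r^2$$
Where $r$ is the correlation coefficient.
Plugging in the values to obtain the coefficient of determination,
$$r^2 = (0.63963)^2 \approx 0.4091$$
Therefore, the coefficient of determination with the outliers removed is $r^2 \approx 0.4091$.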
(e)
To determine:
For which data set the proportion of variance explained by the least-squares regression line is greater.
Answer to Problem 26E
The proportion of variance explained by the least-squares regression line is greater for the data from January 2011 through May 2013.
Explanation of Solution
Given information:
Same as part (a).
Two economists decide to study the relationship between imports and exports. One uses data from January 2011 through May 2013 and the other uses data from March 2011 through May 2013.
Calculation:
From previous parts of this exercise,
The coefficient of determination for January 2011 through May 2013 is $r^2 \approx 0.6425$.
The coefficient of determination without the outliers (March 2011 through May 2013) is $r^2 \approx 0.4091$.
Here the coefficient of determination decreased when the outliers were removed.
Hence, the proportion of variance explained is less without the outliers.
Therefore, the proportion of variance explained by the least-squares regression line is greater for the data from January 2011 through May 2013.
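The comparison between the two economists' data sets can be reproduced directly. The sketch below assumes the table rows are in chronological order starting with January 2011, so that slicing off the first two rows gives the March 2011 through May 2013 window; the function name `coefficient_of_determination` is an illustrative choice.

```python
# U.S. imports (x) and exports (y) in billions of dollars,
# assumed to be in chronological order from January 2011 to May 2013.
imports = [215.9, 211.8, 217.7, 218.1, 223.6, 224.2, 224.9, 224.6, 225.7, 226.6,
           226.1, 230.5, 230.9, 225.8, 234.3, 230.9, 230.5, 227.6, 226.8, 226.1,
           228.4, 225.3, 231.6, 227.0, 229.4, 231.0, 222.3, 227.7, 232.1]
exports = [168.1, 166.6, 174.3, 175.9, 176.2, 173.2, 179.5, 179.9, 181.2, 180.5,
           178.3, 179.1, 179.5, 182.1, 186.5, 184.3, 184.2, 185.2, 183.4, 182.1,
           186.8, 182.7, 185.2, 188.7, 186.7, 187.1, 185.2, 187.6, 187.1]


def coefficient_of_determination(x, y):
    """Return r^2 = Sxy^2 / (Sxx * Syy), the squared correlation coefficient."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return s_xy ** 2 / (s_xx * s_yy)


r2_jan_start = coefficient_of_determination(imports, exports)          # Jan 2011 - May 2013
r2_mar_start = coefficient_of_determination(imports[2:], exports[2:])  # Mar 2011 - May 2013
print(f"January 2011 start: r^2 = {r2_jan_start:.4f}")
print(f"March 2011 start:   r^2 = {r2_mar_start:.4f}")
```

The two removed months lie at the low end of both variables and close to the overall trend, which is one way to see why including them raises the share of variation in exports that the line accounts for.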
Chapter 4 Solutions
Elementary Statistics (Text Only)





