Commute times: Every morning, Tania leaves for work a few minutes after 7:00 A.M. For eight days, she keeps track of the time she leaves (the number of minutes after 7:00) and the number of minutes it takes her to get to work. Following are the results.
- Construct a scatterplot of the length of commute (y) versus the time leaving (x).
- Compute the least-squares regression line for predicting the length of commute from the time leaving.
- Compute the coefficient of determination.
- Which point is an outlier?
- Remove the outlier and compute the least-squares regression line for predicting the length of commute from the time leaving.
- Is the outlier influential? Explain.
- Compute the coefficient of determination for the data set with the outlier removed. Is the relationship stronger, weaker, or about equally strong without the outlier?
a.
To Graph: a scatter plot of the length of commute versus the time leaving
Explanation of Solution
Given information: Tania leaves her home every day a few minutes after 7:00 A.M.
Graph: The scatter plot shows the number of minutes after 7:00 A.M. (x) against the length of the commute in minutes (y).
Interpretation: Each row of data in the table contributes an ordered pair of the form (number of minutes after 7:00 A.M., length of commute).
We use a scatter plot for this example because it is useful for understanding the relationship between two variables recorded as ordered pairs. The points tend to cluster around a straight line. Therefore, we conclude that the variable on the x-axis and the variable on the y-axis have a linear relationship.
Now consider the three points
Therefore, we conclude that the variable on the
In a positive linear relationship, large values of one variable are associated with large values of the other, while in a negative linear relationship, large values of one variable are associated with small values of the other. In this case it is difficult to say which applies, because large values of one variable are associated with both small and large values of the other, and the same is true of small values.
It is therefore useful to measure how strong the linear relationship is, which we can do by calculating the correlation coefficient.
b.
To Calculate: the least-squares regression line
Answer to Problem 12RE
When two variables have a linear relationship, the points on a scatter plot tend to cluster around a straight line called the least-squares regression line. It is simplified to be
Explanation of Solution
Given information: Tania leaves her home every day a few minutes after 7:00 A.M.
Formulas Used:
Sample mean:
Sample variance:
Correlation Coefficient:
The least-squares regression line:
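The four formulas named above can be sketched in Python. This is a minimal sketch that assumes the standard textbook definitions: a sample variance with an n - 1 divisor, a correlation coefficient built from standardized values, a slope b1 = r * s_y / s_x, and an intercept b0 = mean(y) - b1 * mean(x). The function names and the data are hypothetical; they are not Tania's recorded values.

```python
import math


def sample_mean(values):
    """Sample mean: the sum of the values divided by their count."""
    return sum(values) / len(values)


def sample_variance(values):
    """Sample variance: squared deviations from the mean, divided by n - 1."""
    m = sample_mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def correlation(xs, ys):
    """Correlation coefficient r: average product of z-scores (n - 1 divisor)."""
    n = len(xs)
    mx, my = sample_mean(xs), sample_mean(ys)
    sx = math.sqrt(sample_variance(xs))
    sy = math.sqrt(sample_variance(ys))
    return sum(((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)) / (n - 1)


def least_squares_line(xs, ys):
    """Return (b0, b1) for the line y = b0 + b1 * x."""
    r = correlation(xs, ys)
    sx = math.sqrt(sample_variance(xs))
    sy = math.sqrt(sample_variance(ys))
    b1 = r * sy / sx
    b0 = sample_mean(ys) - b1 * sample_mean(xs)
    return b0, b1


# Hypothetical illustration data (NOT the values from the problem's table).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = least_squares_line(x, y)
print(f"r = {correlation(x, y):.4f}")
print(f"y = {b0:.4f} + {b1:.4f} x")
```

Computing the slope from r and the two standard deviations mirrors the order of the calculation described in this part: means and variances first, then the correlation coefficient, then the line.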
Calculation: Use the table below for the calculation.
The sample means and the sample variances can be calculated as shown.
Now, one can use these to calculate the correlation coefficient as shown.
Finally, these can be used to calculate the least-squares regression line as shown.
Where
c.
To Calculate: the coefficient of determination
Answer to Problem 12RE
The coefficient of determination is
Explanation of Solution
Given information: Tania leaves her home every day a few minutes after 7:00 A.M.
Formulas Used:
Sample mean:
Sample variance:
Correlation Coefficient:
Calculation: the correlation coefficient can be calculated as shown by using the formula.
The correlation coefficient
indicates a positive linear association. Here the value of the correlation coefficient is close to zero.
Therefore, one can conclude that the positive linear relationship is weak.
Also,
To calculate the coefficient of determination we need to square the correlation coefficient.
Therefore, the coefficient of determination is
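The squaring step can be sketched as follows. This is a minimal sketch on hypothetical data (not the values from the problem's table); it computes r with the sum-of-products form, which is algebraically equivalent to the z-score form, and then squares it. The function names are assumptions made for illustration.

```python
import math


def correlation(xs, ys):
    """Correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def coefficient_of_determination(xs, ys):
    """r squared: the proportion of variation in y explained by the line."""
    return correlation(xs, ys) ** 2


# Hypothetical illustration data (NOT the values from the problem's table).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = correlation(x, y)
r_squared = coefficient_of_determination(x, y)
print(f"r = {r:.4f}, r^2 = {r_squared:.4f}")
```

Because r lies between -1 and 1, squaring it always gives a value between 0 and 1, and a correlation close to zero gives a coefficient of determination that is closer still to zero.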
d.
To Find: the outlier point
Explanation of Solution
Given information: Tania leaves her home every day a few minutes after 7:00 A.M.
Graph: The scatter plot shows the number of minutes after 7:00 A.M. (x) against the length of the commute in minutes (y).
Interpretation: An outlier is a value that is considerably larger or considerably smaller than most of the values in a data set. It may result from an error in the sampling process.
So in the given data set, an outlier point can be detected in the ordered pair
Because it is much larger than the other ordered pairs.
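The solution above identifies the outlier by inspection. One common way to make "considerably larger or smaller" concrete is the 1.5 x IQR rule, which is a swapped-in technique here and not necessarily the method used above. The sketch below applies it to the commute lengths only; the ordered pairs are hypothetical and the function name is an assumption.

```python
import statistics


def iqr_outliers(pairs):
    """Return the (x, y) pairs whose y value lies outside the 1.5 * IQR fences."""
    ys = [y for _, y in pairs]
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(ys, n=4)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [(x, y) for x, y in pairs if y < lower_fence or y > upper_fence]


# Hypothetical (minutes after 7:00, commute length) pairs, NOT the problem's data.
pairs = [(2, 20), (5, 22), (8, 21), (10, 23), (12, 22), (15, 24), (18, 21), (20, 45)]

outliers = iqr_outliers(pairs)
print(outliers)
```

Different textbooks compute quartiles slightly differently, so the fences can shift a little from one convention to another; a point as far out as the last one in this example is flagged under any of them.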
e.
To Calculate: the least-squares regression line without the outlier point.
Answer to Problem 12RE
The least-squares regression line without the outlier point is
Explanation of Solution
Given information: Tania leaves her home every day a few minutes after 7:00 A.M.
Formulas Used:
Sample mean:
Sample variance:
Correlation Coefficient:
The least-squares regression line:
Calculation: The sample means and the sample variances can be calculated as shown without the outlier point.
Now, one can use these to calculate the correlation coefficient without the outlier as shown.
Finally, one can use these to calculate the least-squares regression line as shown.
Where
f.
To Show: whether the outlier is influential
Answer to Problem 12RE
No, the outlier is not very influential.
Explanation of Solution
Given information: Tania leaves her home every day a few minutes after 7:00 A.M.
The least-squares regression line with the outlier is
The least-squares regression line without the outlier is
The two least-squares regression lines are very close to each other, with no large difference between them.
Therefore, we can conclude that the outlier is not very influential. Here, the outlier is not the result of an error; it is simply a data value recorded along with the others in the sampling process.
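The comparison described above, fitting the line once with every point and once with the suspected outlier removed, can be sketched as follows. The data are hypothetical and were chosen so that the removed point shifts the line visibly, which is the opposite of the outcome reported for Tania's data; the function name and the choice of the last point as the outlier are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Least-squares fit; returns (b0, b1) for y = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical data (NOT the problem's table); the last point is the outlier.
x = [1, 2, 3, 4, 5, 6]
y = [2, 4, 6, 8, 10, 30]
outlier_index = 5

# Drop the outlier by position and refit.
x_wo = [v for i, v in enumerate(x) if i != outlier_index]
y_wo = [v for i, v in enumerate(y) if i != outlier_index]

b0_with, b1_with = fit_line(x, y)
b0_wo, b1_wo = fit_line(x_wo, y_wo)

print(f"with outlier:    y = {b0_with:.3f} + {b1_with:.3f} x")
print(f"without outlier: y = {b0_wo:.3f} + {b1_wo:.3f} x")
print(f"change in slope: {b1_with - b1_wo:.3f}")
```

When the two printed lines are nearly identical, the point is not influential; when the slope or intercept changes substantially, as in this hypothetical data, it is.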
g.
To Find: the coefficient of determination without the outlier and discuss its strength.
Answer to Problem 12RE
Coefficient of determination is
Explanation of Solution
Given information: Tania leaves her home every day a few minutes after 7:00 A.M.
Formulas Used:
Sample mean:
Sample variance:
Correlation Coefficient:
Calculation: The correlation coefficient without the outlier can be calculated as shown.
The correlation coefficient
indicates a positive linear association. It is
The value is extremely close to zero because of the factor of ten to the power of minus thirty-one.
Therefore, one can conclude that the positive linear relationship is very weak without the outlier, and weaker than it was with the outlier included.
Also,
One can calculate the coefficient of determination by squaring the correlation coefficient.
Therefore, the coefficient of determination is
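To answer the "stronger, weaker, or about equally strong" question, the two coefficients of determination can be compared directly. The sketch below does this on hypothetical data, not the problem's table; the 0.05 tolerance for "about equally strong" is an arbitrary assumption, as are the function names.

```python
def r_squared(xs, ys):
    """Coefficient of determination: Sxy^2 / (Sxx * Syy)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)


def compare_strength(r2_with, r2_without, tolerance=0.05):
    """Describe the relationship without the outlier relative to with it."""
    if abs(r2_without - r2_with) <= tolerance:
        return "about equally strong"
    return "stronger" if r2_without > r2_with else "weaker"


# Hypothetical data (NOT the problem's table); the last point is the outlier.
x = [1, 2, 3, 4, 5, 6]
y = [2, 4, 6, 8, 10, 30]

r2_with = r_squared(x, y)
r2_without = r_squared(x[:-1], y[:-1])
verdict = compare_strength(r2_with, r2_without)

print(f"r^2 with outlier:    {r2_with:.4f}")
print(f"r^2 without outlier: {r2_without:.4f}")
print(f"Without the outlier the relationship is {verdict}.")
```

In this hypothetical data the relationship becomes stronger once the outlier is removed; in the solution above the correlation drops to nearly zero instead, which corresponds to the "weaker" branch.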
Chapter 4 Solutions
Elementary Statistics (3rd International Edition), ISBN: 9781260092561