Chapter 9 Central Limit Theorem
In this chapter, we will learn about the second of the two most important theorems in Statistics:
1. Law of Large Numbers
2. Central Limit Theorem
Recall that the Law of Large Numbers says that if you repeat a probability experiment many times, the relative frequency of an event will be approximately equal to the probability of the event.
As we will see below, the Central Limit Theorem tells us that the normal distribution can be used to study any population, even those that are not normally distributed!
9.3 Central Limit Theorem for Means
Sampling Distribution for a Normal Population
Suppose that we would like to estimate the mean height of MATH 1680 students.
We could simply pick one student at random and measure their height, but that is unlikely to give us a good estimate of the mean height of all students. Instead, we could randomly select 4 students:
59 66 67 67
What is the mean height of these 4 students? Does it underestimate or overestimate the population mean?
$\bar{x} = \frac{59 + 66 + 67 + 67}{4} = 64.75$
Underestimate
Since our first estimate was too low, let’s generate a new estimate by randomly selecting 4 different students:
64 65 69 71
What is the mean of the new sample? Does it underestimate or overestimate the population mean?
$\bar{x} = \frac{64 + 65 + 69 + 71}{4} = 67.25$
Overestimate
If we repeat this procedure many times, we will eventually generate every possible sample of size 4, each of which yields its own estimate of the population mean. The distribution of these estimates is called the sampling distribution of the sample mean.
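The idea of repeating the sampling procedure many times can be sketched with a short simulation. This is only an illustration: it assumes the heights follow a normal distribution with the mean (65.89) and standard deviation (4.56) quoted in these notes, and the function name `simulate_sample_means`, the number of repetitions, and the seed are arbitrary choices.

```python
import random
import statistics

# Assumed population parameters (taken from the notes): heights ~ Normal(65.89, 4.56).
MU, SIGMA = 65.89, 4.56

def simulate_sample_means(n, reps=20_000, seed=1):
    """Return `reps` sample means, each computed from a random sample of size n."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(MU, SIGMA) for _ in range(n))
        for _ in range(reps)
    ]

means = simulate_sample_means(n=4)
print(f"center of the estimates: {statistics.fmean(means):.2f}")
print(f"spread of the estimates: {statistics.stdev(means):.2f}")
```

A histogram of `means` would show the sampling distribution of the sample mean for samples of size 4.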
Although a few of our estimates are highly inaccurate, most of them are close to the correct value of $\mu = 65.89$ inches. How would you describe the shape of the sampling distribution?
Normal
Where is the sampling distribution centered? How spread out is it?
$\mu_{\bar{x}} = 65.85$ and $\sigma_{\bar{x}} = 2.30$
According to the Empirical Rule, 68% of our estimates will fall between what two values?
(63.55, 68.15)
It is important to distinguish between the population and the sampling distribution:
$\mu$ = population mean
$\sigma$ = population standard deviation
$\mu_{\bar{x}}$ = average estimate
$\sigma_{\bar{x}}$ = standard error of the estimates
Note that Knewton Alta calls $\mu_{\bar{x}}$ the “mean of the sampling distribution” and $\sigma_{\bar{x}}$ the “standard deviation of the sampling distribution.”
What could we do to obtain more accurate estimates?
Collect more data!
Suppose that we randomly selected 16 students and measured their heights:
60 60 62 63 64 64 66 66 67 67 68 68 70 71 74 75
What is the mean height of these students? Is this an overestimate or an underestimate?
$\bar{x} = 66.56$
Overestimate
On average, estimates based on samples of size 16 will be more accurate than estimates based on samples of size 4, while estimates based on samples of size 64 will be more accurate still.
The following table shows how the average estimate and the standard error change as the sample size increases.
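| $n$ | $\mu_{\bar{x}}$ | $\sigma_{\bar{x}}$ |
|---|---|---|
| 1 | 65.89 | 4.56 |
| 4 | 65.85 | 2.30 |
| 16 | 65.91 | 1.14 |
| 64 | 65.91 | 0.57 |
| ⋮ | ⋮ | ⋮ |
| $n$ | 65.89 | $4.56/\sqrt{n}$ |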
What happens to the average estimate as the sample size increases?
It stays the same (approximately).
What happens to the standard error as the sample size increases?
When the sample size is multiplied by a factor of 4, the standard error is divided by a factor of 2.
Suppose we tried to estimate the mean height of MATH 1680 students based on a sample of size $n$. What would you expect the average estimate to be? The standard error?
$\mu_{\bar{x}} = 65.89$ and $\sigma_{\bar{x}} = \frac{4.56}{\sqrt{n}}$
Given that $\mu = 65.89$ and $\sigma = 4.56$, can you guess the general formulas for the average estimate and standard error?
$\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
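The standard error formula is easy to evaluate directly. The sketch below, with a helper named `standard_error` chosen here for illustration, evaluates $\sigma/\sqrt{n}$ for the height population at the sample sizes used in the table.

```python
import math

def standard_error(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

sigma = 4.56  # population standard deviation of heights, from the notes
for n in (1, 4, 16, 64):
    print(f"n = {n:2d}  standard error = {standard_error(sigma, n):.2f}")
```

For $n = 4$ the formula gives 2.28, slightly different from the 2.30 in the table, which presumably reflects simulated samples rather than the exact formula. Each fourfold increase in $n$ halves the standard error.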
How would you describe the shape of the population distribution? The sampling distributions?
The population distribution and the sampling distributions are normal.
Click the Play button next to “n = 1” to demonstrate how the sampling distribution changes when we collect more data.
Sampling Distribution for a Non-Normal Population
Now we will investigate what happens when the population from which we are sampling is not normally distributed. For example, ages of MATH 1680 students are skewed right with a mean of 20.54 years and a standard deviation of 4.48 years.
Compare the shape of the population distribution to the shapes of the sampling distributions for samples of size $n$ = 4, 16, and 64.
The following table shows how the average estimate and the standard error change as the sample size increases.
| $n$ | $\mu_{\bar{x}}$ | $\sigma_{\bar{x}}$ |
|---|---|---|
| 1 | 20.54 | 4.48 |
| 4 | 20.49 | 2.20 |
| 16 | 20.56 | 1.14 |
| 64 | 20.52 | 0.56 |
Do you observe the same pattern as before?
Yes!
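The same pattern can be checked by simulation for a right-skewed population. The actual age distribution is not given in these notes, so the sketch below substitutes a hypothetical stand-in: a shifted exponential distribution, chosen only because it is skewed right and can be given mean 20.54 and standard deviation 4.48. The helper names, repetition count, and seed are likewise illustrative.

```python
import random
import statistics

MU, SIGMA = 20.54, 4.48  # age mean and standard deviation, from the notes

def draw_age(rng):
    # Hypothetical right-skewed population: a shifted exponential, used only
    # because it has mean MU and standard deviation SIGMA.
    return (MU - SIGMA) + rng.expovariate(1 / SIGMA)

def skewness(values):
    """Sample skewness: the average cubed z-score (0 for a symmetric shape)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

def summarize(n, reps=20_000, seed=2):
    """Simulate `reps` sample means of size n; return (center, spread, skewness)."""
    rng = random.Random(seed)
    means = [statistics.fmean(draw_age(rng) for _ in range(n)) for _ in range(reps)]
    return statistics.fmean(means), statistics.stdev(means), skewness(means)

results = {n: summarize(n) for n in (4, 16, 64)}
for n, (center, spread, skew) in results.items():
    print(f"n = {n:2d}  center = {center:.2f}  spread = {spread:.2f}  skewness = {skew:.2f}")
```

The center stays near 20.54, the spread tracks $4.48/\sqrt{n}$, and the skewness shrinks toward 0 as $n$ grows.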
Central Limit Theorem for Means
When drawing samples of size $n$ from a population with mean $\mu$ and standard deviation $\sigma$, the sampling distribution of $\bar{x}$ will have the same overall shape as the population distribution, with $\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$.
If the population is normally distributed, the sampling distribution will be normally distributed.
If the population distribution is skewed, the sampling distribution will be skewed in the same direction. However, the sampling distribution becomes less skewed as the sample size increases.
For any population, we can assume that the sampling distribution is approximately normal when $n \geq 30$.
Example.
The shoe sizes of MATH 1680 students are normally distributed with a mean of 8.75 and a standard deviation of 2.07. Suppose that 10 students are randomly selected. Identify the expected mean, standard error, and shape of the sampling distribution.
$\mu = 8.75$ and $\sigma = 2.07$
$\mu_{\bar{x}} = 8.75$ and $\sigma_{\bar{x}} = \frac{2.07}{\sqrt{10}} = 0.65$
Normal
Example.
The commute times of MATH 1680 students who don’t live on campus are skewed right with a mean of 24.08 minutes and a standard deviation of 19.71 minutes. Suppose that 18 students are randomly selected. Identify the expected mean, standard error, and shape of the sampling distribution.
$\mu = 24.08$ and $\sigma = 19.71$
$\mu_{\bar{x}} = 24.08$ and $\sigma_{\bar{x}} = \frac{19.71}{\sqrt{18}} = 4.65$
Skewed right
Describe the expected mean, standard deviation, and shape of the sampling distribution when the sample size is 48.
$\mu_{\bar{x}} = 24.08$ and $\sigma_{\bar{x}} = \frac{19.71}{\sqrt{48}} = 2.84$
Approximately normal
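The three answers above follow one recipe: keep the mean, divide the standard deviation by $\sqrt{n}$, and pick the shape from the population shape and the $n \geq 30$ rule of thumb. A minimal sketch of that recipe, with a hypothetical helper `describe_sampling_distribution` and shape labels written as plain strings:

```python
import math

def describe_sampling_distribution(mu, sigma, n, population_shape):
    """Return (expected mean, standard error, shape) for samples of size n."""
    standard_error = sigma / math.sqrt(n)
    if population_shape == "normal":
        shape = "normal"                # normal population: normal for any n
    elif n >= 30:
        shape = "approximately normal"  # rule of thumb from the notes
    else:
        shape = population_shape        # small n: keeps the population's skew
    return mu, round(standard_error, 2), shape

print(describe_sampling_distribution(8.75, 2.07, 10, "normal"))
print(describe_sampling_distribution(24.08, 19.71, 18, "skewed right"))
print(describe_sampling_distribution(24.08, 19.71, 48, "skewed right"))
```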
Using the Central Limit Theorem to Find Probabilities
Example.
The heights of MATH 1680 students are normally distributed with a mean of 65.89 inches and a standard deviation of 4.56 inches. Suppose that four students are randomly selected. What is the probability that their average height will be between 60 and 72 inches?
[Figures not recovered: two worked solutions, labeled “Wrong answer!” and “Right answer!”]
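One way to compute this probability is sketched below in Python, using `statistics.NormalDist` from the standard library. It assumes the sampling distribution is normal with mean 65.89 and standard error $4.56/\sqrt{4}$, as the Central Limit Theorem for Means states for a normal population. For contrast, it also computes the probability that a single student's height falls in the same range.

```python
import math
from statistics import NormalDist

mu, sigma, n = 65.89, 4.56, 4

# Sampling distribution of the mean of 4 heights: Normal(mu, sigma / sqrt(n)).
sampling_dist = NormalDist(mu, sigma / math.sqrt(n))
p_mean = sampling_dist.cdf(72) - sampling_dist.cdf(60)

# For comparison only: one student's height, Normal(mu, sigma).
population = NormalDist(mu, sigma)
p_single = population.cdf(72) - population.cdf(60)

print(f"P(60 < sample mean < 72)   = {p_mean:.4f}")
print(f"P(60 < single height < 72) = {p_single:.4f}")
```

The probability for the average is larger because averages vary less than individual heights.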
Example.
The commute times of MATH 1680 students who don’t live on campus are skewed right with a mean of 24.08 minutes and a standard deviation of 19.71 minutes. Suppose that 18 students are randomly selected. What is the probability that their average commute time will be greater than 25 minutes?
We can’t use the normal distribution because the population distribution is skewed right and the sample size is less than 30. Thus, we need to collect more data!
What is the probability that the average commute time of 48 randomly selected students will be less than 25 minutes?
$\mu_{\bar{x}} = 24.08$ and $\sigma_{\bar{x}} = \frac{19.71}{\sqrt{48}} = 2.84$
$z = \frac{25 - 24.08}{19.71/\sqrt{48}} = \frac{0.92}{2.84} = 0.32$
$P(\bar{x} < 25) = P(Z < 0.32) = 0.6255$
Now solve the problem using Microsoft Excel:
1. Open the file named “MATH 1680 Chapter 9 Data”
2. Click on the CLT Mean sheet
3. In cell A2, enter the number 24.08
4. In cell B2, enter the number 19.71
5. In cell C2, enter the number 48
6. In cell E2, enter the formula =A2
7. In cell F2, enter the formula =B2/SQRT(C2)
8. In cell I2, enter the number 25
9. In cell L2, enter the formula =(I2 - E2)/F2
10. In cell N2, enter the formula =NORM.S.DIST(L2, TRUE) or the formula =NORM.DIST(I2, E2, F2, TRUE)
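For readers without Excel, the same calculation can be sketched in Python. Here `NormalDist().cdf(z)` plays the role of NORM.S.DIST and `NormalDist(mean, se).cdf(x)` the role of NORM.DIST; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 24.08, 19.71, 48
x = 25

se = sigma / math.sqrt(n)            # standard error (cell F2 in the steps above)
z = (x - mu) / se                    # z-score (cell L2)

p_from_z = NormalDist().cdf(z)       # standard normal, like NORM.S.DIST(z, TRUE)
p_direct = NormalDist(mu, se).cdf(x) # like NORM.DIST(x, mean, se, TRUE)

print(f"z = {z:.4f}")
print(f"P(sample mean < 25) = {p_direct:.4f}")
```

Because the z-score is not rounded to 0.32 first, the result is about 0.627 rather than the table value 0.6255.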
Example.
The cookie machine at Chips Ahoy adds a random number of chips to each cookie. The number of chips is a random variable with mean 28.5 and standard deviation 5.3. Find the probability that, in a bag of 50 cookies, the average number of chips per cookie is at least 30.
Think-Pair-Share
$\mu = 28.5$ and $\sigma = 5.3$
$\mu_{\bar{x}} = 28.5$ and $\sigma_{\bar{x}} = \frac{5.3}{\sqrt{50}} = 0.75$
Sampling distribution is approximately normal
$z = \frac{30 - 28.5}{5.3/\sqrt{50}} = \frac{1.5}{0.75} = 2.00$
$P(\bar{x} \geq 30) = P(Z \geq 2.00) = 1 - 0.9772 = 0.0228$
Now solve the problem using Microsoft Excel:
1. In cell A3, enter the number 28.5
2. In cell B3, enter the number 5.3
3. In cell C3, enter the number 50
4. In cell E3, enter the formula =A3
5. In cell F3, enter the formula =B3/SQRT(C3)
6. In cell H3, enter the number 30
7. In cell K3, enter the formula =(H3 - E3)/F3
8. In cell P3, enter the formula =1 - NORM.S.DIST(K3, TRUE) or the formula =1 - NORM.DIST(H3, E3, F3, TRUE)
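An upper-tail probability such as this one is one minus the cumulative probability. A short Python sketch of that complement step, assuming (as above) that the sampling distribution is approximately normal because $n = 50 \geq 30$:

```python
import math
from statistics import NormalDist

mu, sigma, n = 28.5, 5.3, 50

# Approximate sampling distribution of the mean chips per cookie in a bag of 50.
sampling_dist = NormalDist(mu, sigma / math.sqrt(n))

# "At least 30" is the upper tail: 1 minus the area to the left of 30.
p_at_least_30 = 1 - sampling_dist.cdf(30)
print(f"P(sample mean >= 30) = {p_at_least_30:.4f}")
```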
Using the Central Limit Theorem to Find a Mean Given a Probability
We can also use the Central Limit Theorem to find thresholds such that a certain percentage of our estimates will be above or below the threshold.
Example.
The heights of MATH 1680 students are normally distributed with a mean of 65.89 inches and a standard deviation of 4.56 inches. Suppose that four students are randomly selected, and their average height is calculated. If this process is repeated many times, what height will separate the lowest 20% of the estimates from the highest 80% of the estimates?
Now solve the problem using Microsoft Excel:
1. In cell A5, enter the number 65.89
2. In cell B5, enter the number 4.56
3. In cell C5, enter the number 4
4. In cell E5, enter the formula =A5
5. In cell F5, enter the formula =B5/SQRT(C5)
6. In cell N5, enter the number 0.2
7. In cell P5, enter the number 0.8
8. In cell L5, enter the formula =NORM.S.INV(N5)
9. In cell I5, enter the formula =E5 + F5*L5 or the formula =NORM.INV(N5, E5, F5)
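The same threshold can be sketched in Python, where `NormalDist.inv_cdf` plays the role of NORM.S.INV and NORM.INV. The variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 65.89, 4.56, 4
se = sigma / math.sqrt(n)                          # standard error, 2.28

z = NormalDist().inv_cdf(0.20)                     # like NORM.S.INV(0.2)
threshold_from_z = mu + z * se                     # like =E5 + F5*L5
threshold_direct = NormalDist(mu, se).inv_cdf(0.20)  # like NORM.INV(0.2, mean, se)

print(f"z = {z:.4f}")
print(f"20th percentile of the sample means = {threshold_direct:.2f} inches")
```

The z-score is negative because the threshold for the lowest 20% lies below the mean.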
Example.
The commute times of MATH 1680 students who don’t live on campus are skewed right with a mean of 24.08 minutes and a standard deviation of 19.71 minutes. Suppose that 48 students are randomly selected, and their average commute time is calculated. What time will separate the lowest 90% from the highest 10% of such estimates?
Draw a normal curve
$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
$1.28 = \frac{\bar{x} - 24.08}{19.71/\sqrt{48}}$
$1.28 \cdot 2.84 = \bar{x} - 24.08$
$\bar{x} = 24.08 + 3.64 = 27.72$
Now solve the problem using Microsoft Excel:
1. In cell A6, enter the number 24.08
2. In cell B6, enter the number 19.71
3. In cell C6, enter the number 48
4. In cell E6, enter the formula =A6
5. In cell F6, enter the formula =B6/SQRT(C6)
6. In cell N6, enter the number 0.9
7. In cell P6, enter the number 0.1
8. In cell K6, enter the formula =NORM.S.INV(N6)
9. In cell H6, enter the formula =E6 + F6*K6 or the formula =NORM.INV(N6, E6, F6)
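A quick way to sanity-check a threshold like this is a round trip: find the cutoff with the inverse CDF, then confirm that the CDF at that cutoff returns the original probability. A minimal Python sketch, assuming the sampling distribution is approximately normal since $n = 48 \geq 30$:

```python
import math
from statistics import NormalDist

mu, sigma, n = 24.08, 19.71, 48
sampling_dist = NormalDist(mu, sigma / math.sqrt(n))

cutoff = sampling_dist.inv_cdf(0.90)       # time below which 90% of estimates fall
check = sampling_dist.cdf(cutoff)          # round trip: should come back to 0.90

print(f"cutoff = {cutoff:.2f} minutes, area to the left = {check:.2f}")
```

The cutoff comes out near 27.73 minutes; the hand calculation gives 27.72 because it rounds $z$ and the standard error along the way.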
Find the Sample Size that Corresponds to a Given Standard Error
If we want our estimates to have a certain level of accuracy, we can substitute the desired value for the standard error into the left side of the formula
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
then solve for $n$.
Example.
In the previous example, we found that a typical estimate of the mean commute time of MATH 1680 students who don’t live on campus will be off by no more than $\sigma_{\bar{x}} = \frac{19.71}{\sqrt{48}} = 2.84$ minutes.
Suppose that we wish to estimate the mean commute time so that a typical estimate will be off by no more than two minutes. How many MATH 1680 students who don’t live on campus should we survey?
$2.00 = \frac{19.71}{\sqrt{n}}$
$\sqrt{n} = \frac{19.71}{2.00}$
$n = \left(\frac{19.71}{2.00}\right)^2 = 97.12$, which rounds up to 98 students
Example.
The cookie machine at Chips Ahoy adds a random number of chips to each cookie. The number of chips is a random variable with mean 28.5 and standard deviation 5.3. Researchers wish to estimate the mean number of chips per cookie so that the standard error is no more than 0.5 chips per cookie. How many cookies should they test?
$0.5 = \frac{5.3}{\sqrt{n}}$
$\sqrt{n} = \frac{5.3}{0.5}$
$n = \left(\frac{5.3}{0.5}\right)^2 = 112.36$, which rounds up to 113 cookies
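Solving $\sigma_{\bar{x}} = \sigma/\sqrt{n}$ for $n$ and rounding up can be wrapped in a small function. The name `required_sample_size` is chosen here for illustration. Rounding up (rather than to the nearest integer) guarantees the standard error does not exceed the target.

```python
import math

def required_sample_size(sigma, target_se):
    """Smallest n for which sigma / sqrt(n) is at most target_se."""
    return math.ceil((sigma / target_se) ** 2)

print(required_sample_size(19.71, 2.00))  # commute times example
print(required_sample_size(5.3, 0.5))     # chocolate chip example
```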
Now solve the problem using Microsoft Excel:
1. Copy the cell range A3:F3 to the cell range A8:F8
2. Enter different sample sizes (100, 110, 120, …) into cell C8 to observe the effect on the standard error in cell F8
3. In cell F8, enter the desired standard error, 0.5
4. In cell C8, enter the formula =(B8/F8)^2
Round the resulting sample size in cell C8 up to the nearest integer