09 Jan 26 sampling distribution of p hat

School: University of Georgia
Course: 6315
Subject: Statistics
Date: Apr 3, 2024
Type: pdf
Pages: 14
Uploaded by DrZebra4092
Example: Group Project (Watch Video Before Class)

A small graduate-level course has 12 students. Some are master’s students (M) and some are doctoral students (D), as indicated in parentheses.

Amrit (M)   Geffrey (M)   Ivan (D)    Binita (M)
Xinyu (M)   Avik (D)      Hari (M)    Mayra (M)
Carlos (M)  Emile (D)     April (M)   Taylor (M)

a) We can think about this group of students as a population. What proportion of students are master’s level students? What is the appropriate symbol?

Answer: p = 9/12 = 0.75

b) The students had to work in pairs to complete a group project. Suppose that one pair of students that worked together was Geffrey and Ivan. What proportion of students in this pair are master’s students? What is the appropriate symbol?

Answer: p̂ = 1/2 = 0.5
c) We can list all possible samples of size n = 2. For each sample, we can determine the proportion of master’s students in the sample (shown in parentheses).

Amrit & Geffrey (1), Amrit & Ivan (0.5), Amrit & Binita (1), Amrit & Xinyu (1), Amrit & Avik (0.5), Amrit & Hari (1), Amrit & Mayra (1), Amrit & Carlos (1), Amrit & Emile (0.5), Amrit & April (1), Amrit & Taylor (1)
Geffrey & Ivan (0.5), Geffrey & Binita (1), Geffrey & Xinyu (1), Geffrey & Avik (0.5), Geffrey & Hari (1), Geffrey & Mayra (1), Geffrey & Carlos (1), Geffrey & Emile (0.5), Geffrey & April (1), Geffrey & Taylor (1)
Ivan & Binita (0.5), Ivan & Xinyu (0.5), Ivan & Avik (0), Ivan & Hari (0.5), Ivan & Mayra (0.5), Ivan & Carlos (0.5), Ivan & Emile (0), Ivan & April (0.5), Ivan & Taylor (0.5)
Binita & Xinyu (1), Binita & Avik (0.5), Binita & Hari (1), Binita & Mayra (1), Binita & Carlos (1), Binita & Emile (0.5), Binita & April (1), Binita & Taylor (1)
Xinyu & Avik (0.5), Xinyu & Hari (1), Xinyu & Mayra (1), Xinyu & Carlos (1), Xinyu & Emile (0.5), Xinyu & April (1), Xinyu & Taylor (1)
Avik & Hari (0.5), Avik & Mayra (0.5), Avik & Carlos (0.5), Avik & Emile (0), Avik & April (0.5), Avik & Taylor (0.5)
Hari & Mayra (1), Hari & Carlos (1), Hari & Emile (0.5), Hari & April (1), Hari & Taylor (1)
Mayra & Carlos (1), Mayra & Emile (0.5), Mayra & April (1), Mayra & Taylor (1)
Carlos & Emile (0.5), Carlos & April (1), Carlos & Taylor (1)
Emile & April (0.5), Emile & Taylor (0.5)
April & Taylor (1)

d) Create a probability distribution for the proportion of master’s students in samples of n = 2.

p̂       0       0.5      1
P(p̂)    3/66    27/66    36/66
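The listing in parts c) and d) can also be checked by computer. The following is a minimal Python sketch, not part of the original notes; the dictionary `students` and the helper `enumerate_p_hats` are illustrative names chosen here. It enumerates every sample of size 2 and tallies how often each sample proportion occurs.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Degree level of each of the 12 students (M = master's, D = doctoral).
students = {
    "Amrit": "M", "Geffrey": "M", "Ivan": "D", "Binita": "M",
    "Xinyu": "M", "Avik": "D", "Hari": "M", "Mayra": "M",
    "Carlos": "M", "Emile": "D", "April": "M", "Taylor": "M",
}

def enumerate_p_hats(population, n=2):
    """Count how often each sample proportion occurs over all samples of size n."""
    counts = Counter()
    for sample in combinations(population, n):
        masters = sum(population[name] == "M" for name in sample)
        counts[Fraction(masters, n)] += 1
    return counts

counts = enumerate_p_hats(students)
total = sum(counts.values())  # number of possible samples
for p_hat in sorted(counts):
    print(f"p_hat = {float(p_hat):.1f}: {counts[p_hat]}/{total}")

# Mean of the sampling distribution, weighting each p_hat by its count.
mean_p_hat = sum(p_hat * c for p_hat, c in counts.items()) / total
print("mean of p_hat =", float(mean_p_hat))
```

The tallies match the table in part d), and the mean of the 66 sample proportions equals the population proportion p = 0.75.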
The probability distribution of a statistic is referred to as the sampling distribution of that statistic.

Instead of listing out all possible samples, we can use a simulation. A simulation is when we use a computer to pretend to draw random samples from some population of values over and over. A simulation can help us understand how sample proportions (p̂) vary due to random sampling. Even though the sample proportions vary from sample to sample, they do so in a pattern that we can model and understand.
Example: Uninsured (Watch Video)

According to a website, 11.1% of Americans did not have health insurance in 2020. A random sample of 35 Americans is taken, and 7 of them do not have health insurance.

a) Is the variable categorical or quantitative?

Answer: Categorical; the variable is whether or not someone has health insurance.

b) Sketch a graph of the distribution of the population. Use the appropriate symbol and value.

Answer: Bar graph of "Have health insurance?": yes 88.9%, no 11.1%; p = 0.111.

c) Sketch a graph of the distribution of the sample. Use the appropriate symbol and value.

Answer: Bar graph of "Have health insurance?": yes 80%, no 20%; p̂ = 7/35 = 0.2.
d) Besides the sample proportion that was observed, p̂ = 0.2, we wonder what other sample proportions could have been observed. Generate 10,000 samples of size n = 35. For each sample, determine the sample proportion of Americans without health insurance.

Are the 10,000 values generated parameters or statistics?

Answer: Statistics; each p̂ is computed from a sample.

Describe the sampling distribution of p̂.

Answer: The sampling distribution is the collection of p̂ values we could have observed. Simulation results, assuming p = 0.111:
Shape: unimodal, skewed right
Mean: 0.1109
Standard Deviation: 0.0539
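The notes do not say which tool generated the 10,000 samples. As one possible sketch, the simulation can be written in Python with the standard library; the function name `simulate_p_hats` and the fixed seed are assumptions made here for illustration. Each simulated person is uninsured with probability p = 0.111.

```python
import random
import statistics

def simulate_p_hats(p, n, reps=10_000, seed=1):
    """Draw `reps` random samples of size n and return the sample proportions."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    p_hats = []
    for _ in range(reps):
        # Count the "successes" (here: uninsured people) in one sample.
        successes = sum(rng.random() < p for _ in range(n))
        p_hats.append(successes / n)
    return p_hats

p_hats = simulate_p_hats(p=0.111, n=35)
print("mean of p_hat:", round(statistics.mean(p_hats), 4))
print("SD of p_hat:  ", round(statistics.stdev(p_hats), 4))
```

The exact values depend on the random draws, so they will differ slightly from the 0.1109 and 0.0539 recorded above, but the mean should land close to p = 0.111.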
Example: Hurricane Evacuation

It is believed that approximately 48% of households in Florida have no plans for escaping an approaching hurricane. In a random sample of 35 households, 13 have no plans for escaping an approaching hurricane.

a) Is the variable categorical or quantitative?

Answer: Categorical; the variable is whether or not a household has a hurricane escape plan.

b) Sketch a graph of the distribution of the population. Use the appropriate symbol and value.

Answer: Bar graph of "Have a plan?": yes 52%, no 48%; p = 0.48.

c) Sketch a graph of the distribution of the sample. Use the appropriate symbol and value.

Answer: Bar graph of "Have a plan?": yes 62.9%, no 37.1%; p̂ = 13/35 = 0.371.
d) Generate 10,000 samples of size n = 35. For each sample, determine the proportion of households that do not have plans to escape an approaching hurricane. Describe the sampling distribution of p̂.

Answer:
Shape: unimodal and symmetric
Mean: 0.4806
Standard Deviation: 0.0845

Distribution of Sample vs. Sampling Distribution
* The sampling distribution of p̂ is the collection of p̂ values that we might observe.
General Properties of the Sampling Distribution of p̂

Let p̂ be the proportion of successes in a random sample of size n from a population whose proportion of successes is p.

The mean value of p̂ is p.

The standard deviation of p̂ is sqrt[p(1-p)/n]. This formula is exact if the population is infinite and is approximately correct if the population is finite and no more than 10% of the population is included in the sample.

When n is large and p is not too near 0 or 1, the sampling distribution of p̂ is approximately normal. The farther the value of p is from 0.5, the larger n must be for a normal approximation to the sampling distribution to be accurate. A rule of thumb is that if np and n(1-p) are BOTH at least 15, then a normal distribution provides a reasonable approximation to the sampling distribution of p̂.

From the Hurricane Example: n = 35, p = 0.48, so np = 16.8 and n(1-p) = 18.2. Since both are at least 15, the sampling distribution of p̂ is nearly normal.
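The properties above reduce to a few lines of arithmetic. Here is a minimal Python sketch of them, applied to the hurricane example; the function name `p_hat_summary` and its `threshold` parameter are illustrative choices, not something defined in the notes.

```python
import math

def p_hat_summary(n, p, threshold=15):
    """Return (mean, SD, normal_ok) for the sampling distribution of p_hat.

    normal_ok applies the rule of thumb from the notes: both n*p and
    n*(1 - p) must be at least `threshold`.
    """
    mean = p
    sd = math.sqrt(p * (1 - p) / n)
    normal_ok = n * p >= threshold and n * (1 - p) >= threshold
    return mean, sd, normal_ok

# Hurricane example: n = 35, p = 0.48.
mean, sd, normal_ok = p_hat_summary(n=35, p=0.48)
print(f"mean = {mean}, SD = {sd:.4f}, normal approximation reasonable: {normal_ok}")
```

The same function can be called with other values of n and p to check whether a normal model is reasonable before using it.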
For each combination of n and p, determine the mean and standard deviation of the sampling distribution of p̂. Also, determine if the distribution could be modeled with a normal distribution.

a) n = 150, p = 0.95

Answer: mean = 0.95; SD = sqrt[p(1-p)/n] = sqrt[0.95(0.05)/150] = 0.0178. np = 142.5 and n(1-p) = 7.5. Since n(1-p) is less than 15, the sampling distribution of p̂ is probably not nearly normal.

b) n = 50, p = 0.42

Answer: mean = 0.42; SD = sqrt[0.42(0.58)/50] = 0.0698. np = 21 and n(1-p) = 29. Since both values are at least 15, the sampling distribution of p̂ is approximately normal.

c) n = 180, p = 0.09

Answer: mean = 0.09; SD = sqrt[0.09(0.91)/180] = 0.021. np = 16.2 and n(1-p) = 163.8. Since both values are at least 15, the sampling distribution of p̂ is approximately normal.
Example: Quality Control

A manufacturing firm purchases components for its products from suppliers. Good practice calls for suppliers to manage their production processes to ensure good quality. If a random sample contains too many components that don’t conform to specifications, the firm will not accept the shipment. A quality engineer at the firm chooses a random sample of 200 components from a shipment of 10,000 components. Suppose that 8% of the components in the shipment are nonconforming.

a) What is the variable? Is the variable categorical or quantitative?

Answer: The variable is whether or not a component is nonconforming; it is categorical.

b) Check the conditions for modeling the proportion of nonconforming components with a binomial distribution.

Answer: Two categories (conforming or nonconforming); randomization (the sample is random); sampling less than 10% of the population (200 of 10,000 is 2%).

c) Determine the mean and standard deviation of the sampling distribution for the proportion of nonconforming components in a sample of 200.

Answer: Mean = 0.08; SD = sqrt[(0.08)(0.92)/200] = 0.019.

d) Is it reasonable to model the sampling distribution with a normal distribution?

Answer: np = 16 and n(1-p) = 184. Since both values are at least 15, the sampling distribution of p̂ is approximately normal.
e) Suppose the firm will not accept the shipment if they find convincing evidence based on the random sample that the proportion of nonconforming components in the shipment of 10,000 is higher than 0.08. Based on one sample of 200 components, 20 are nonconforming, so p̂ = 20/200 = 0.1. Find P(p̂ ≥ 0.1).

Use the binomial distribution.

Answer: P(20 or more successes) = 0.1789 (using a binomial distribution calculator).

Use the normal approximation.

Answer: With a mean of 0.08 and an SD of 0.019 (found above), the area to the right of 0.1 is P(p̂ ≥ 0.1) = 0.1463 (using a distribution calculator). The value 0.1463 is an approximation to the binomial probability 0.1789.
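The notes obtain both probabilities from online calculators. As a rough cross-check, the sketch below computes the same two quantities with Python's standard library. The helper `binom_upper_tail` is a name introduced here, and the SD is rounded to three decimals (0.019) only to mirror the value used in the notes.

```python
import math
from statistics import NormalDist

n, p = 200, 0.08

def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summing the exact pmf."""
    return sum(
        math.comb(n, x) * p**x * (1 - p) ** (n - x)
        for x in range(k, n + 1)
    )

# Exact binomial probability of 20 or more nonconforming components.
exact = binom_upper_tail(20, n, p)

# Normal approximation for p_hat, with the SD rounded to 0.019 as in the notes.
sd = round(math.sqrt(p * (1 - p) / n), 3)
approx = 1 - NormalDist(mu=p, sigma=sd).cdf(0.10)

print(f"binomial: {exact:.4f}")
print(f"normal:   {approx:.4f}")
```

Using the unrounded SD instead of 0.019 shifts the normal approximation slightly, so small differences from the calculator values are expected.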
Normal Approximation with Continuity Correction

Graphically, the binomial probability adds the areas of the gray rectangles in the plot. The normal approximation finds the area under the normal curve that is to the right of 20. If we zoom in on the left limit, we might come up with a better approximation. Each rectangle corresponds to an integer. Where does the rectangle for “20” start? Use this correction to approximate the probability. Then, compare it to the binomial probability.

Answer: The rectangle for 20 starts at 19.5. With the correction, P(p̂ ≥ 0.1) is approximated by P(p̂ > 19.5/200) = P(p̂ > 0.0975) = 0.1785, which is close to the binomial probability of 0.1789.
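To see the effect of the correction numerically, the following Python sketch compares the normal tail area with and without the half-unit shift. It assumes the mean 0.08 and rounded SD 0.019 from the quality control example; the variable names are illustrative.

```python
from statistics import NormalDist

n = 200
mean, sd = 0.08, 0.019  # mean and (rounded) SD of p_hat from the example
model = NormalDist(mu=mean, sigma=sd)

# Without correction: area to the right of 20/200 = 0.1.
plain = 1 - model.cdf(20 / n)

# With continuity correction: the rectangle for 20 starts at 19.5,
# so take the area to the right of 19.5/200 = 0.0975.
corrected = 1 - model.cdf(19.5 / n)

print(f"without correction: {plain:.4f}")
print(f"with correction:    {corrected:.4f}")
```

Shifting the cutoff from 20 to 19.5 includes the left half of the rectangle for 20, which is why the corrected area is larger and closer to the binomial result.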
Example: Roller Coasters

About 68% of roller coasters are constructed of steel. A random sample of 10 roller coasters is taken. As a result, 4 were constructed of steel and 6 were constructed of wood.

a) Sketch a graph of the distribution of the population. Use the correct symbol and value.

Answer: Bar graph: Steel 68%, Wood 32%; p = 0.68.

b) Sketch a graph of the distribution of the sample. Use the correct symbol and value.

Answer: Bar graph: Steel 40%, Wood 60%; p̂ = 0.4.

c) Use the normal approximation to estimate the chance that in a sample of 10 roller coasters, less than 4 will be constructed of steel.

Answer: Mean of p̂ = 0.68; SD of p̂ = sqrt[0.68(0.32)/10] = 0.1475. Using the normal distribution calculator, the area to the left of 0.4 is 0.0288. However, np = 6.8 and n(1-p) = 3.2 are not both at least 15; thus, the probability of 0.0288 is not valid. Use the binomial distribution instead.
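The answer to part c) recommends the binomial distribution but does not carry out the calculation. A minimal Python sketch of that calculation follows, assuming X, the number of steel coasters in the sample, is Binomial(10, 0.68); `binom_cdf` is a helper name introduced here. The normal approximation is included for comparison.

```python
import math
from statistics import NormalDist

n, p = 10, 0.68

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summing the exact pmf."""
    return sum(
        math.comb(n, x) * p**x * (1 - p) ** (n - x)
        for x in range(0, k + 1)
    )

# "Less than 4" steel coasters means X <= 3.
exact = binom_cdf(3, n, p)

# Normal approximation from the notes: area to the left of p_hat = 0.4.
sd = math.sqrt(p * (1 - p) / n)
approx = NormalDist(mu=p, sigma=sd).cdf(0.4)

print(f"binomial P(X < 4):    {exact:.4f}")
print(f"normal approximation: {approx:.4f}")
```

The two results differ noticeably, which is consistent with the warning that the normal model should not be used when np and n(1-p) are below 15.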