STAT1250_SGTA7

School

Macquarie University

Course

1250

Subject

Statistics

Date

Apr 3, 2024

© Copyright Macquarie University
Stat1250 Tutorial 7: Means: Distribution and CI

SGTA 7: Means: sampling distribution, probabilities and confidence intervals

You are expected to read this material and think about the problems before coming to class.

In this SGTA we will:
  • Consider the distribution of sample means.
  • Calculate probabilities for sample means.
  • Calculate confidence intervals for population means.

In lectures we learned about the Normal distribution. If a population has a Normal distribution, we can calculate probabilities for the random variable, y, using the Excel command NORM.DIST(y, μ, σ, TRUE), which returns the probability of a value less than or equal to y. In this SGTA, we will extend the use of the Normal distribution to calculate probabilities for the sample mean, ȳ, and confidence intervals for the population mean, μ.

Central Limit Theorem

The Central Limit Theorem (CLT) is very important in statistics. The CLT states that the means of repeated random samples have a distribution which is approximately Normal. The CLT allows us to use the Normal distribution to calculate probabilities for the mean of a sample, ȳ, regardless of the distribution of the original population.

For the CLT to apply, we need a large enough sample size. We will use 25 as the minimum sample size required. The larger the sample size, the closer the distribution of the sample means will be to a Normal distribution. If the original population has a Normal distribution, then sample means from this population will have a Normal distribution, regardless of the sample size.

Sampling Distribution of the Mean

We can take many samples of the same size and calculate the sample mean for each, giving us many values of ȳ. The distribution of the sample means (values of ȳ) from all possible samples of the same size from a population is known as the sampling distribution of the mean. The sampling distribution of the mean is centred at the same mean, μ, as the original population. The standard deviation of the sample mean is known as the standard error.

The standard error of the sample mean is calculated as:

\[ \sigma_{\bar{y}} = \frac{\sigma}{\sqrt{n}} \]

where σ is the standard deviation of the original population and n is the sample size. We can see from this formula that as the sample size increases, the standard error of the mean becomes smaller.

Using these values for the mean and standard error of the sampling distribution of the mean, we can calculate probabilities for sample means using Excel: NORM.DIST(ȳ, μ, σ/√n, TRUE).

Confidence Intervals

If we can apply the CLT or have other evidence of a Normal distribution, we can use the sample mean to calculate a confidence interval, a range of plausible values for the population mean, μ. The 95% confidence interval for μ is:

\[ \bar{y} \pm 1.96 \times \frac{\sigma}{\sqrt{n}} \]

when σ is known, and

\[ \bar{y} \pm t_{\text{crit}} \times \frac{s}{\sqrt{n}} \]

when σ is not known, where s is the sample standard deviation and t_crit is the critical value of the t distribution with n − 1 degrees of freedom.
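As a rough illustration of the formulas above, the following Python sketch computes a standard error, a probability for a sample mean, and a 95% confidence interval when σ is known. It relies on `statistics.NormalDist` from the standard library, whose `cdf` method plays the same role as Excel's NORM.DIST(..., TRUE). The function names and the numbers in the usage lines (μ = 100, σ = 20, n = 25, ȳ = 96) are made up for illustration and do not come from the tutorial.

```python
from math import sqrt
from statistics import NormalDist


def standard_error(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)


def prob_mean_below(ybar, mu, sigma, n):
    """P(sample mean <= ybar), like NORM.DIST(ybar, mu, sigma/sqrt(n), TRUE)."""
    return NormalDist(mu, standard_error(sigma, n)).cdf(ybar)


def z_interval_95(ybar, sigma, n):
    """95% confidence interval for mu when sigma is known."""
    margin = 1.96 * standard_error(sigma, n)
    return ybar - margin, ybar + margin


# Illustrative (hypothetical) values, not taken from the tutorial.
print(standard_error(20, 25))
print(prob_mean_below(96, 100, 20, 25))
print(z_interval_95(96, 20, 25))
```

Keeping the standard error in its own function mirrors the way the worksheet asks for it as a separate step before any probability or interval is calculated.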
The Sampling Distribution of the Mean

We are going to use an example of student incomes to demonstrate the sampling distribution of the mean. The data is in Student_Income.xlsx in the SGTA section of iLearn. The target population is the weekly income of students in 2017. The mean weekly income of STAT150 students in 2017 (population mean), μ, was $198 and the population standard deviation, σ, was $34. A histogram of the weekly income of a large number of students from this population is shown below to illustrate the shape of the weekly income distribution.

a) What shape is the distribution of student weekly income?
Right skewed

b) Can we use the Normal distribution to calculate probabilities for the weekly income of an individual student?
No

c) We have taken 20 random samples from this population, each of sample size 30, and have calculated the mean of each sample. What shape is the histogram of sample means? What can you say about the range of values in this histogram compared to the histogram of individual student weekly income? Can we use the Normal distribution to calculate probabilities for the mean weekly income of a sample of 30 students?

[Figure: Histogram of Sample Means. x-axis: Sample Means (180 to 210); y-axis: Frequency (0 to 6).]
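The sampling experiment in part (c) can be imitated in code. The sketch below is only a stand-in: the real data in Student_Income.xlsx are not reproduced here, so it assumes a gamma distribution matched to the stated population mean ($198) and standard deviation ($34) as a right-skewed population. That choice of distribution, the seed, and the function name `draw_sample_means` are assumptions made for illustration.

```python
import random
from statistics import mean, stdev

# Population parameters quoted in the exercise.
MU, SIGMA = 198.0, 34.0

# ASSUMPTION: a gamma distribution stands in for the right-skewed income
# population. Shape and scale are chosen so that the gamma mean is MU and
# the gamma standard deviation is SIGMA.
SHAPE = (MU / SIGMA) ** 2
SCALE = SIGMA ** 2 / MU


def draw_sample_means(num_samples, sample_size, seed=1):
    """Draw num_samples random samples and return the mean of each one."""
    rng = random.Random(seed)
    return [
        mean(rng.gammavariate(SHAPE, SCALE) for _ in range(sample_size))
        for _ in range(num_samples)
    ]


# 20 samples of size 30, as in part (c).
sample_means = draw_sample_means(num_samples=20, sample_size=30)
print("smallest sample mean:", round(min(sample_means), 1))
print("largest sample mean: ", round(max(sample_means), 1))
print("sd of sample means:  ", round(stdev(sample_means), 1))
print("sigma / sqrt(n):     ", round(SIGMA / 30 ** 0.5, 1))
```

Increasing `num_samples` should bring the standard deviation of the simulated sample means closer to σ/√n, while the individual incomes keep their much wider spread.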
Calculating Probabilities for Sample Means

The United Nations has set out 17 Goals for Sustainable Development. Goal 3 relates to good health and well-being.
https://www.un.org/sustainabledevelopment/health/

Avocados offer health benefits including healthy fats, antioxidants, fibre and folate. Australia produced 66,000 tonnes of avocados in 2016/17 with a gross value of production of $398 million. The Shepard variety of avocado is grown on the Atherton Tablelands in Queensland. Shepard avocados in a particular production year are known to have an average weight of 210g with a standard deviation of 15g. Avocados may be sold in trays of 25 fruit, in which the individual fruit will vary in weight. The avocado producer would like to estimate the probability that the average weight of avocados in a tray of 25 is less than 200g.

Research question: What is the probability that the mean weight of avocados in a sample (tray) of 25 is less than 200g?

a) Write down the values that you need to answer this Research question:
n = 25, μ = 210g, σ = 15g, ȳ = 200g

b) Why can we use the Normal distribution to calculate this probability?

c) To use the Normal distribution, we need to calculate the standard error:

\[ \sigma_{\bar{y}} = \frac{\sigma}{\sqrt{n}} = \frac{15}{\sqrt{25}} = 3 \]

d) Draw a diagram of the Normal distribution that you would use to answer the Research question. Mark the mean and shade the area of interest.

e) Use NORM.DIST in Excel to find the probability. You can use the table below.

Excel                          Probability
NORM.DIST(200,210,25,TRUE)     0.3446
NORM.DIST(200,210,15,TRUE)     0.2525
NORM.DIST(210,200,25,TRUE)     0.6554
NORM.DIST(200,210,3,TRUE)      0.0004

f) Interpret this result:
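The entries in the Excel table can be checked outside Excel. The sketch below assumes Python's `statistics.NormalDist`, whose `cdf` method gives the same cumulative probability as NORM.DIST(x, mean, sd, TRUE). The helper name `norm_dist` is hypothetical, and the avocado values are the ones given in the exercise.

```python
from math import sqrt
from statistics import NormalDist


def norm_dist(x, mean, sd):
    """Cumulative Normal probability, like Excel's NORM.DIST(x, mean, sd, TRUE)."""
    return NormalDist(mean, sd).cdf(x)


# Values given in the exercise.
n, mu, sigma, ybar = 25, 210, 15, 200

# Standard error of the mean weight of a tray of 25 avocados.
se = sigma / sqrt(n)
print("standard error:", se)

# The four calls listed in the table, in the same order.
for x, mean, sd in [(200, 210, 25), (200, 210, 15), (210, 200, 25), (200, 210, 3)]:
    print(f"NORM.DIST({x},{mean},{sd},TRUE) = {norm_dist(x, mean, sd):.4f}")

# The call that uses the standard error as the spread of the sample mean.
print("P(mean weight < 200g):", round(norm_dist(ybar, mu, se), 4))
```

Only the call that passes the standard error as its third argument describes the distribution of the tray mean. The others use the sample size or the individual-fruit standard deviation in its place, or swap the mean and the value of interest.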
95% Confidence Intervals for the Population Mean μ

The United Nations has set out 17 Goals for Sustainable Development. Goal 3 relates to good health and well-being.
https://www.un.org/sustainabledevelopment/health/

The Australian Government Department of Health has physical activity guidelines which recommend adults: “Accumulate 150 to 300 minutes (2 ½ to 5 hours) of moderate intensity physical activity or 75 to 150 minutes (1 ¼ to 2 ½ hours) of vigorous intensity physical activity, or an equivalent combination of both moderate and vigorous activities, each week”.
http://www.health.gov.au/internet/main/publishing.nsf/content/F01F92328EDADA5BCA257BF0001E720D/$File/brochure%20PA%20Guidelines_A5_18-64yrs.pdf

Research question: What is the average amount of physical activity per week by first year students at an Australian university?

A study of 40 first year university students at an Australian university recorded the amount of physical activity per week. The histogram and descriptive statistics for the data are:

a) Use the histogram and descriptive statistics to observe and comment on the data:

b) Calculate a 95% confidence interval for the average amount of physical activity per week by first year university students. You can use the Excel output from the table below.

Excel                    Value
T.INV(0.975,40)          2.0211
T.INV(0.95,39)           1.6849
T.INV(0.975,39)          2.0227
NORM.INV(0.975,0,1)      1.9600

c) Interpret this confidence interval.
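The calculation in part (b) follows the "σ not known" formula, ȳ ± t_crit × s/√n. A minimal sketch is given below, assuming the critical value T.INV(0.975, 39) = 2.0227 from the table, which matches n − 1 = 39 degrees of freedom for 40 students. The study's descriptive statistics are not reproduced in this text, so the sample mean and standard deviation used in the usage lines are invented placeholders, and the function name `t_interval` is hypothetical.

```python
from math import sqrt

# Critical value from the tutorial's table: T.INV(0.975, 39) = 2.0227,
# i.e. n - 1 = 39 degrees of freedom for a sample of n = 40.
T_CRIT_DF39 = 2.0227


def t_interval(ybar, s, n, t_crit):
    """Confidence interval for mu when sigma is unknown: ybar +/- t_crit * s / sqrt(n)."""
    margin = t_crit * s / sqrt(n)
    return ybar - margin, ybar + margin


# HYPOTHETICAL summary statistics, NOT the study's values: replace ybar and s
# with the sample mean and sample standard deviation from the descriptive
# statistics supplied with the exercise.
lower, upper = t_interval(ybar=200.0, s=80.0, n=40, t_crit=T_CRIT_DF39)
print(f"95% CI: ({lower:.1f}, {upper:.1f}) minutes per week")
```

Passing the critical value in as an argument keeps the sketch within the standard library. With SciPy available, `scipy.stats.t.ppf(0.975, 39)` would be one way to compute it instead of reading it from the table.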