Section #04.2 shared lab

School

Pennsylvania State University

Course

200

Subject

Statistics

Date

Apr 3, 2024

Type

docx

Pages

6

Uploaded by BarristerFlag13728

LAB 4.2
Statistics 200: Lab Activity for Section 4.2
Measuring Evidence with P-values

Learning objectives:
  • Recognize that a randomization distribution shows what is likely to happen by random chance if the null hypothesis is true
  • Use technology to create a randomization distribution
  • Interpret a p-value as the proportion of samples that would give a statistic as extreme as the observed sample, if the null hypothesis is true
  • Distinguish between one-tailed and two-tailed tests in finding p-values
  • Find a p-value from a randomization distribution

Activity 1: Create a Randomization Distribution

This activity has you participate in creating a randomization distribution, so you can see that it shows a distribution of sample statistics generated under the assumption that the null hypothesis is true.

Every year in Punxsutawney, Pennsylvania, a famous groundhog, Phil, makes a prediction about the end of winter. If he comes out of his burrow and sees his shadow, he predicts six more weeks of winter. If he does not see his shadow, his prediction is early spring. In the ten years from 2011 to 2020, Phil made the following predictions:

Year  Prediction     February temperature  Prediction accuracy
2020  End of winter  Above normal          Correct
2019  End of winter  Below normal          Incorrect
2018  More winter    Above normal          Incorrect
2017  More winter    Above normal          Incorrect
2016  End of winter  Above normal          Correct
2015  More winter    Below normal          Correct
2014  More winter    Below normal          Correct
2013  End of winter  Above normal          Correct
2012  More winter    Below normal          Correct
2011  End of winter  Below normal          Incorrect

Are his predictions better than a random 50-50 chance?

1. What are the correct null and alternative hypotheses? Hint: what is p if his predictions are random?
   Answer: H0: p = 0.5 vs Ha: p > 0.5. The question asks whether his predictions are better than chance, so the alternative is one-sided.

2. What is p-hat when considering this example (round your answer to 4 decimal places, 0.xxxx)?
   Answer: p-hat = 6/10 = 0.6000. The table lists 6 correct predictions in 10 years.

3. To create a randomization distribution, we must determine what the distribution of p-hat is if his prediction is random. We will use virtual coins, which have a true 50% chance of being heads. Go to justflipacoin.com

2/19/20 © - Pennsylvania State University
How many times will you need to flip this penny to create one sample statistic for the randomization distribution?
   Answer: 10 times

4. Now flip the penny that many times. Pretend that getting a heads with the coin is equivalent to Phil making a correct prediction. What was your p-hat?
   Answer: 0.6

5. Did your sample make as many correct predictions as Phil?
   Answer: Yes

6. Now we will go big and have StatKey create many more statistics for our randomization distribution. Verify that the null is set to the correct proportion and edit the data as necessary. Now generate at least 5000 samples.
   a. Where is the randomization distribution centered?
   b. Find the p-value. In StatKey, click on the correct tail (right or left), then click on the box along the x-axis. Enter our original sample statistic (from part 2), correct to 3 decimal places, 0.xxx. What was the p-value?
   c. Interpret the p-value in context: If Phil is choosing randomly, the chance that he would correctly predict at least 6 out of 10 years is ________.
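The coin-flipping procedure in questions 3 to 6 can also be sketched in code. The following is a minimal illustration, not StatKey's implementation: it assumes each randomization sample is 10 fair coin flips and counts how often a sample has at least 6 heads, matching the "at least 6 out of 10" wording in question 6c. The function name `simulated_right_tail_p` and the seed are hypothetical choices made for this example.

```python
import math
import random


def simulated_right_tail_p(n_flips, observed_heads, null_p=0.5,
                           n_samples=5000, seed=0):
    """Proportion of simulated samples with at least `observed_heads` heads.

    Each sample is `n_flips` flips of a coin whose chance of heads is
    `null_p`, i.e. the world in which the null hypothesis is true.
    """
    rng = random.Random(seed)
    as_extreme = 0
    for _ in range(n_samples):
        heads = sum(rng.random() < null_p for _ in range(n_flips))
        if heads >= observed_heads:
            as_extreme += 1
    return as_extreme / n_samples


# Phil's record from the table: 6 correct predictions in 10 years.
sim_p = simulated_right_tail_p(n_flips=10, observed_heads=6)

# Exact binomial tail for comparison: P(X >= 6) with n = 10, p = 0.5.
exact_p = sum(math.comb(10, k) for k in range(6, 11)) / 2**10

print(f"simulated p-value: {sim_p:.3f}")
print(f"exact p-value:     {exact_p:.3f}")
```

Counting heads as whole numbers, rather than comparing p-hat values such as 0.6, avoids floating-point equality problems at the cutoff. The simulated value varies slightly with the seed and the number of samples, but it should sit close to the exact tail probability 386/1024.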
7. In 2021 Phil correctly predicted that there would be more winter, bringing his record to 7 correct of 11. The randomization distribution for this scenario is below:

   Using the new data and randomization distribution, what is our sample p-hat and the approximate p-value for testing the same hypothesis we wrote in question 1? (Choose the correct answer from below)
   a. p-hat = 0.636, p-value = 0.276
   b. p-hat = 0.636, p-value = 0.724
   c. p-hat = 1, p-value = 0.007
   d. p-hat = 1, p-value = 0.039

Activity 2: Where is The Middle?

For the settings below, determine a) where the middle of the randomization distribution will be and b) whether the hypothesis test is right-tailed, left-tailed, or two-tailed. Finally consider c) how to find the p-value.

1. To test H0: μ = 45 vs Ha: μ > 45 using sample data with x-bar = 53.7:
   a. Where will the randomization distribution be centered?
      Answer: At 45
   b. Is this a left-tail test, a right-tail test, or a two-tail test?
      Answer: Right-tailed
   c. How can we find the p-value once we have the randomization distribution?
      Answer: Set the x-axis value to the sample statistic (53.7) and choose a right-tailed test; the p-value is the proportion of randomization statistics at or above 53.7.
      Example answer: Find the proportion of randomization statistics that are to the left of the sample statistic of 2. (Use this as a guide when answering questions 1.c, 2.c and 3.c.)

2. To test H0: p1 = p2 vs Ha: p1 ≠ p2 using sample data with p-hat1 - p-hat2 = 0.35:
   a. Where will the randomization distribution be centered?
      Answer: At 0
   b. Is this a left-tail test, a right-tail test, or a two-tail test?
      Answer: Two-tailed
   c. How can we find the p-value once we have the randomization distribution?
      Answer: In StatKey, enter the sample statistic 0.35, find the proportion of randomization statistics in the tail beyond it, and double that proportion.

3. To test H0: ρ = 0 vs Ha: ρ < 0 (rho) using sample data with r = -0.13:
   a. Where will the randomization distribution be centered?
      Answer: At 0, the null value of ρ
   b. Is this a left-tail test, a right-tail test, or a two-tail test?
      Answer: Left-tailed, because Ha is ρ < 0
   c. How can we find the p-value once we have the randomization distribution?
      Answer: Find the proportion of randomization statistics at or below the sample statistic r = -0.13 (in StatKey, choose the left tail and enter -0.13 on the x-axis). No z-score or t-score is needed; the p-value is read directly from the randomization distribution.

4. Here is a randomization distribution and p-value calculation based on a sample statistic of -1.07. Select the hypothesis set that could correspond to this randomization distribution and p-value calculation:
   a. H0: p1 - p2 = 0 vs Ha: p1 - p2 ≠ 0
   b. H0: μ = 0.24 vs Ha: μ < 0.24
   c. H0: p = 0.5 vs Ha: p < 0.5
   d. H0: ρ = 0.5 vs Ha: ρ ≠ 0.5
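The tail rules practiced in Activity 2 can be summarized in a short sketch. It assumes the randomization statistics are already available as a list of numbers; the helper name `p_value_from_randomization` and the toy statistics are hypothetical, chosen only to make the arithmetic easy to check by hand.

```python
def p_value_from_randomization(stats, observed, tail):
    """Return the p-value for `observed` given randomization statistics.

    tail: "left"  -> proportion of statistics <= observed
          "right" -> proportion of statistics >= observed
          "two"   -> double the smaller tail proportion (capped at 1)
    """
    n = len(stats)
    left = sum(s <= observed for s in stats) / n
    right = sum(s >= observed for s in stats) / n
    if tail == "left":
        return left
    if tail == "right":
        return right
    if tail == "two":
        return min(1.0, 2 * min(left, right))
    raise ValueError("tail must be 'left', 'right', or 'two'")


# Toy randomization statistics centered near 0 (made up for illustration).
toy_stats = [-0.3, -0.2, -0.1, 0.0, 0.0, 0.1, 0.2, 0.3, 0.1, -0.1]

left_p = p_value_from_randomization(toy_stats, -0.2, "left")
right_p = p_value_from_randomization(toy_stats, -0.2, "right")
two_p = p_value_from_randomization(toy_stats, -0.2, "two")
print(left_p, right_p, two_p)
```

The direction of the alternative hypothesis decides which tail is counted, and the two-tailed value is twice the smaller one-tailed proportion.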
Activity 3: Use StatKey to Create a Randomization Distribution and Find a P-value

In a 2014 study to compare on-time arrivals of airlines, 1000 Delta flights and 1000 United flights were randomly selected from the month of December in the US. For each flight, the difference between the actual and scheduled arrival time was recorded (so a negative time means the flight was early). We wish to see whether this data provides evidence that Delta has a better arrival record than United (or, more precisely, that the mean difference of times of Delta is significantly lower than the mean of United). Group 1: Delta times and Group 2: United times. Use the dataset provided in StatKey called 'Arrival Time –2e (Delta vs. United, 2014)'.

Understanding The Research:

1. What type of study was this based on the summary above?
2. What are the cases?
3. What type of and how many variables are in the research question?
4. What is the parameter of interest?

Analysis:

1. State the null and alternative hypotheses for this test. Define any parameters used.
   Answer: H0: μ1 = μ2 vs Ha: μ1 < μ2, where μ1 is the mean arrival-time difference (actual minus scheduled) for Delta flights and μ2 is the same mean for United flights. The alternative is one-sided because the question asks whether Delta's mean is lower than United's.

2. (The data on Arrival Time is one of the available datasets in StatKey, under Test for a Difference in Means.) Use StatKey to create a randomization distribution for this test using at least 5,000 samples. Use the randomization distribution to indicate whether each of the following possible differences in means is very likely to occur just by random chance, relatively unlikely to occur but might occur occasionally, or very unlikely to ever occur just by random chance:

   Difference in means   -7    1     -4    -0.5   6
   Likelihood            no    yes   no    yes    no

   Note: Only consider the H0 when answering this question.

3. What is the observed difference in means from the Original Sample? Give notation and the value of the sample statistic.
   Answer: x-bar1 - x-bar2 = -2.82

4. Where does the sample statistic lie in the randomization distribution? Is it likely or unlikely to occur just by random chance?
   Answer: It lies in the left tail rather than near the center. Only about 5.5% of randomization statistics fall at or below it, so it is relatively unlikely, though it might occur occasionally, just by random chance.

5. Use the sample statistic to find the p-value. Is it large or small?
   Answer: The left-tail proportion is 0.055. Because the research question is one-sided (Delta lower than United), the p-value is 0.055; doubling it to 0.11 would apply only to a two-tailed alternative.

6. Complete the interpretation for the p-value: If Delta and United flights are (equally, not equally) on time, then the chance that we see a sample statistic of ________________ or any statistic (larger, smaller) is _____________________.

7. Use your randomization distribution from part 3 to match the sample statistics (i.e. difference in means) below to the corresponding p-values. You can answer without doing any calculations.

   Sample statistics:              -1, 1.5, 3.25, 4.4
   p-values to match (unordered):  0.968, 0.285, 0.995, 0.804

   Hint: Think about both hypotheses when answering this question.
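A randomization test for a difference in means can be sketched as follows. This is an illustration under stated assumptions, not a description of StatKey's internals: it assumes the randomization samples come from reshuffling the pooled times between the two groups, and it uses a small set of made-up arrival times rather than the actual 'Arrival Time' dataset. The function name `reshuffled_differences` is hypothetical.

```python
import random
from statistics import mean

# Hypothetical arrival-time differences in minutes (actual minus scheduled).
# These values are made up for illustration; they are NOT the StatKey data.
delta = [-12, -5, 0, 3, -8, 10, -15, 2]
united = [4, -3, 12, 7, -1, 20, 5, -6]


def reshuffled_differences(group1, group2, n_samples=5000, seed=0):
    """Differences in means after randomly reassigning values to groups.

    Reshuffling the pooled values mimics a world in which the null
    hypothesis (no difference between the groups) is true.
    """
    rng = random.Random(seed)
    pooled = list(group1) + list(group2)
    n1 = len(group1)
    diffs = []
    for _ in range(n_samples):
        rng.shuffle(pooled)
        diffs.append(mean(pooled[:n1]) - mean(pooled[n1:]))
    return diffs


observed = mean(delta) - mean(united)
rand_stats = reshuffled_differences(delta, united)

# Left-tail p-value, matching a one-sided alternative mu1 < mu2.
p_left = sum(s <= observed for s in rand_stats) / len(rand_stats)

print(f"observed difference in means: {observed}")
print(f"left-tail p-value: {p_left:.3f}")
```

Because the alternative is one-sided, only statistics at or below the observed difference count toward the p-value; a two-tailed version would double the smaller tail proportion instead.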