Lab_Wk 07_2023

docx

School

University of Wollongong

Course

251

Subject

Statistics

Date

Jan 9, 2024

Uploaded by PresidentMusicCaterpillar33

STAT251 Fundamentals of Biostatistics
LABORATORY NOTES, Week 7
Hypothesis Tests

Aim: The aim of this lab is to perform hypothesis tests relating to a single sample.

Note: Starred (*) exercises do not require Jamovi.

1. P-Values without Raw Data

In some situations, only numerical summaries (e.g. $\bar{x}$, $s$, $n$) are available rather than the original raw data set. Test statistics required for hypothesis tests can easily be calculated from summary data, but the corresponding p-values usually require a computer. These calculations can be done in Jamovi, Excel, and other statistical packages.

Log book question:

1. Calculate the p-values (shaded areas) for the following hypothesis tests. In each case, state whether the null hypothesis would be rejected using a significance level $\alpha = 0.05$. (Reject $H_0$ if the p-value $p \le \alpha$.)

a. $H_0: \mu = 60$, $H_a: \mu \ne 60$, $t = -0.9$, df = 19
b. $H_0: p = 0.25$, $H_a: p < 0.25$, $z = -1.8$

Hint: For each question,
  • Decide whether you need a one-tailed or two-tailed test based on $H_a$.
  • On the provided diagrams, shade the area that corresponds to your p-value. Remember that the p-value is the probability of observing a test statistic (t or z score) at least as extreme as the observed value, in the direction of $H_a$.
  • Using the distrACTION module, an online calculator or statistical tables, calculate the p-value according to your $H_a$. In this case, the $x_1$ that Jamovi asks for is the observed t or z score. (Point to ponder: if the shaded area is two-tailed, you can also specify an $x_2$ value and Jamovi will give the area between $x_1$ and $x_2$; how would you use this to get the two-tailed p-value?)
  • Compare the probability $p$ you calculated to the significance level $\alpha$ and decide whether or not to reject $H_0$.
The following table may help you. Here $X$ denotes the test statistic (t or z) and $x_1$ its observed value.

Alternative hypothesis ($H_a$)    p-value
$H_a: \mu \ne \mu_0$              $2 \times \min(P(X \le x_1), P(X \ge x_1))$
$H_a: \mu > \mu_0$                $P(X \ge x_1)$
$H_a: \mu < \mu_0$                $P(X \le x_1)$

2.* Hypothesis Test for Single Proportion (Large Sample)

(Based on Agresti & Franklin, 8.66) In an experiment on chlorophyll inheritance in maize (corn), of the 1103 seedlings of self-fertilized green plants, 854 seedlings were green and 249 were yellow. Theory predicts that the ratio of green to yellow is 3 to 1. Using a 5% significance level, test the hypothesis that 3 to 1 is the true ratio, following the steps below. (Hint: let $p$ and $\hat{p}$ denote the true and sample proportion of green seedlings, respectively.)

Log book question:

2. Perform a hypothesis test that the true ratio of green to yellow seedlings is 3 to 1:

a. Hypotheses: Is there any particular reason to use a one-sided alternative? Write down $H_0$ and $H_a$.
b. Assumptions: Are these data categorical or continuous/numerical? Is the sample size sufficiently large to use a Normal approximation? Assume that the 1103 seedlings are randomly sampled.
c. Test Statistic: Evaluate the test statistic $z = \dfrac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0)/n}}$ by calculator. (Hint: remember that $p_0$ comes from your $H_0$.)
d. P-value: Use a web calculator or the distrACTION module in Jamovi to find the p-value for this z-score.
e. Decision: What is the value of $\alpha$? Is the p-value such that you'd reject the null hypothesis, $H_0$?
f. Conclusions: State your conclusions in the context of this particular application.
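The p-value rules in the table and the test statistic in question 2c can be sketched in Python as follows. This is only an illustration alongside the Jamovi workflow, not part of the lab procedure: the function names `z_statistic` and `p_value` are hypothetical, and the standard normal distribution comes from Python's standard-library `statistics.NormalDist`. The counts are the ones quoted in the exercise; the `alternative` argument should match whichever $H_a$ you wrote down.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(x, n, p0):
    """Large-sample z statistic for a single proportion (hypothetical helper)."""
    p_hat = x / n
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)


def p_value(z, alternative):
    """p-value for a z score; alternative is 'two-sided', 'greater' or 'less'."""
    lower = NormalDist().cdf(z)  # P(Z <= z)
    upper = 1 - lower            # P(Z >= z)
    if alternative == "two-sided":
        return 2 * min(lower, upper)
    if alternative == "greater":
        return upper
    if alternative == "less":
        return lower
    raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")


# Maize example: 854 green seedlings out of 1103, null value p0 = 0.75 (3 to 1).
z = z_statistic(854, 1103, 0.75)
p = p_value(z, "two-sided")
print(f"z = {z:.3f}, two-sided p-value = {p:.4f}")
```

The same `p_value` function covers question 1b by passing the given z score with `alternative="less"`.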
3. Paired t-Test

(Based on Rosner, p 290). It is claimed that reported total calorie intake differs depending on whether it is estimated using the diet record (DR) or the food frequency questionnaire (FFQ). Data from 173 nurses were obtained.

In Jamovi: Download valid.sav from Moodle. Double click the empty column next to the variables CALOR_DR and CALOR_FF, then select “New computed variable” when prompted. Name the new variable “DR-FF difference”, and type “CALOR_DR - CALOR_FF” into the formula box to compute this variable.

Test whether total calorie intake measured by the DR and FFQ methods differs on average, using a significance level of 0.01. This test can be performed in two ways using Jamovi.

Method 1: One-Sample T-Test on Differences

Jamovi: Click on Analyses → T-Tests → One Sample T-Test. Move the variable DR-FF difference across to the Dependent Variables list. Under Additional statistics, select Mean difference and change the Confidence interval from the default of 95% to 99%. Also select the Descriptives option.
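The idea behind Method 1 (form the differences, then run a one-sample t-test on them) can be sketched in Python as below. This is a minimal sketch under stated assumptions: the function name `paired_t_statistic` is hypothetical, and the five pairs of calorie values are made up for illustration and are NOT taken from valid.sav.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(first, second, mu0=0.0):
    """t statistic, df, mean difference and SE for a one-sample t-test on differences."""
    diffs = [a - b for a, b in zip(first, second)]  # same as the computed variable
    n = len(diffs)
    d_bar = mean(diffs)          # sample mean of the differences
    s_d = stdev(diffs)           # sample standard deviation (n - 1 denominator)
    se = s_d / sqrt(n)           # standard error of the mean difference
    return (d_bar - mu0) / se, n - 1, d_bar, se


# Hypothetical calorie values for five people (NOT taken from valid.sav).
calor_dr = [2100.0, 1850.0, 2400.0, 1990.0, 2250.0]
calor_ff = [1900.0, 1800.0, 2050.0, 2000.0, 1950.0]

t, df, d_bar, se = paired_t_statistic(calor_dr, calor_ff)
print(f"mean difference = {d_bar:.1f}, SE = {se:.2f}, t = {t:.3f}, df = {df}")
```

Because the statistic depends on the two columns only through their differences, a paired t-test on the original columns and a one-sample t-test on the difference column give the same t and df.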
One Sample T-Test

                                                                  99% Confidence Interval
                                 Statistic   df         p         Mean difference   Lower      Upper
DR-FF difference   Student's t   6.8658      172.0000   < .0001   248.1428          154.0036   342.2819

Method 2: Paired T-Test

Alternatively, in Jamovi, click on Analyses → T-Tests → Paired Samples T-Test and transfer CALOR_DR and CALOR_FF into the Paired Variables box. Under Additional statistics, select Mean difference and change the Confidence interval from the default of 95% to 99%. Also select the Descriptives option.
Paired Samples T-Test

                                                                                      99% Confidence Interval
                                    Statistic   df         p         Mean difference   SE difference   Lower      Upper
CALOR_DR   CALOR_FF   Student's t   6.8658      172.0000   < .0001   248.1428          36.1418         154.0036   342.2819

Log book question:

3. Use the Jamovi output (of either test) to perform the hypothesis test:

a. Hypotheses: Let $\mu_d$ denote the population mean difference between total calories measured by the DR and FFQ methods. Write down $H_0$ and $H_a$. Note that $\mu_d = 0$ corresponds to no difference between the DR and FFQ methods. Would you consider a one-sided or two-sided alternative?
b. Assumptions: Check whether the difference data are normally distributed using Analyses → Exploration → Descriptives. Remember to click on the plots box and tick the histogram, QQ plot and normality test options.
c. Test Statistic: Evaluate the test statistic $t = \dfrac{\bar{x}_d - \mu_0}{s_d/\sqrt{n}}$ by calculator, using the sample mean and sample standard deviation of the differences from the Jamovi descriptives output.
d. Decision: What is the value of $\alpha$? Is the p-value such that you'd reject the null hypothesis, $H_0$?
e. Conclusions: State your conclusions in the context of this particular application.

4. One-sample t-Test

The one-sample t-test is also useful if you wish to compare your data to a value which is not zero. For example, if we wanted to compare the mean caloric intake using the diet record to the recommended intake for an Australian adult (2080 calories), we would do the following for a hypothesis test at an $\alpha$ level of, say, 0.01.

In Jamovi: Click on Analyses → T-Tests → One Sample T-Test. Move CALOR_DR across to the Dependent Variables list. Under Hypothesis, change the Test Value from 0 to 2080. Under Additional statistics, select Mean difference and change the Confidence interval from the default of 95% to 99%. Also select Descriptives.
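For readers who want to see how a t statistic with a non-zero test value turns into one-tailed and two-tailed p-values, here is a rough standard-library sketch. Everything in it is an assumption for illustration: the helper names (`t_pdf`, `t_upper_tail`, `one_sample_t`) are hypothetical, the tail area is approximated by Simpson's rule rather than by the method Jamovi uses, and the summary values passed to `one_sample_t` are made up, NOT the CALOR_DR results. If SciPy is installed, `scipy.stats.t.sf(t, df)` gives the upper-tail area directly.

```python
from math import exp, lgamma, log, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_const - (df + 1) / 2 * log(1 + x * x / df))


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T >= t) using Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3         # P(0 <= T <= |t|)
    tail = max(0.0, 0.5 - area)  # P(T >= |t|), clamped against rounding error
    return tail if t >= 0 else 1 - tail


def one_sample_t(x_bar, s, n, mu0):
    """t statistic, df and p-values for each form of the alternative hypothesis."""
    t = (x_bar - mu0) / (s / sqrt(n))
    upper = t_upper_tail(t, n - 1)
    return {
        "t": t,
        "df": n - 1,
        "two-sided": 2 * min(upper, 1 - upper),
        "greater": upper,
        "less": 1 - upper,
    }


# Hypothetical summary values (NOT the CALOR_DR results): mean 2150, sd 450, n = 30.
result = one_sample_t(x_bar=2150.0, s=450.0, n=30, mu0=2080.0)
print(result)

# The paired output reported earlier (t = 6.8658, df = 172) has p < .0001.
print(2 * t_upper_tail(6.8658, 172) < 0.0001)
```

When the observed t lies in the direction of a one-sided $H_a$, the one-tailed p-value is half the two-tailed p-value, which is visible by comparing the `"greater"` and `"two-sided"` entries.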
Since the p-value is very small (smaller than our cutoff $\alpha$), we conclude that there is sufficient evidence to believe that the population mean caloric intake is different from the recommended intake of 2080.

Log book questions:
4. Suppose that we wanted to know if there was evidence that the population mean caloric intake using the diet record was higher than the recommended intake of 2080.

a. What would be the hypotheses?
b. Based on the Jamovi output, what would be the one-tailed p-value?
c. Write the conclusion in the context of the problem.
d. Test whether the population mean alcohol intake recorded using the diet record is significantly different from the recommended upper consumption level of 20 g per day. Is it higher or lower? Perform the steps of the hypothesis test with a 1% level of significance.

5.* Optional: Binomial Test for Proportion (Small Sample)

The following example illustrates a binomial test for a proportion.

Example: A physiotherapist with experience in concussion treatment claims that at most 50% of patients with a mild concussion recover within 7 days without any specific treatment except rest. This matters for health care: if the claim is true, waiting 7 days before giving treatment can have an impact on the system and on the patients who need extra treatment. As a small pilot she collects data on 20 concussed patients; after 7 days, 12 of them are free of symptoms. Would you conclude that her claim can be believed? And is it then always worth waiting 7 days before starting any treatment?

As the sample size is small, we can't use a normal approximation, so we use the binomial distribution in the distrACTION module in Jamovi with $n = 20$ and $p = 0.5$. The p-value is $P(X \ge 12)$. From the distrACTION module, $P(X \ge 12) = 0.252$. Since this p-value > 0.05, we would not reject the null hypothesis that $p \le 0.5$: the data do not provide sufficient evidence against the claim that at most 50% of patients recover. As for waiting 7 days, one could argue that at least 50% DON'T get better in 7 days, so whether or not to wait should depend on the symptoms during that time.

Optional log book question:

5. A new therapy is available for treating multiple sclerosis. The treatment has been successful in 9 out of 12 patients who have tried the therapy. A treatment is considered successful if the probability of success is greater than 0.5. Based on these data, is there sufficient evidence that it is? Write down $H_0$ and $H_a$. As the sample size is small, evaluate the p-value using the Binomial distribution in Jamovi or a web calculator.
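The exact binomial tail probability used in the concussion example can be reproduced with a short Python sketch. The function name `binomial_upper_tail` is hypothetical; `math.comb` is the standard-library binomial coefficient.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Concussion example from the text: 12 of 20 patients recovered, null value p = 0.5.
p_val = binomial_upper_tail(12, 20, 0.5)
print(f"P(X >= 12) = {p_val:.3f}")  # prints P(X >= 12) = 0.252
```

The same function can be called with the counts from the optional log book question once $H_0$ and $H_a$ have been written down.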
Appendix: If time permits, this section can be attempted; otherwise do it as homework.

A Confidence Intervals for Proportions (Jamovi)

Open the MathScienceTest data file, available from Moodle. The columns Q1 to Q14 contain the responses of individual students to 14 questions on a test, where 1 means correct and 0 means incorrect. We will focus on Question 6 (variable Q6). Ignore all the others.

Record the sample count ($x$) of correct responses to Q6, the sample size ($n$) and the sample proportion ($x/n$) in the table below. Jamovi is only needed to get the summaries: use Analyses → Exploration → Descriptives, then select the Frequency Table box to obtain $x$ and $n$.

Repeat the procedure for the first 10 students in the dataset. To select the first 10 students, create a new computed variable named rownum by clicking on the Data tab and selecting Add → Insert computed variable. Type “ROW()” in the formula box to get the row number for each row. Now click on the Filter button. A new window will open, allowing us to specify what we want the filter to do. Type the following formula into the filter’s formula box to filter only rows 1 to 10:
                     $x$     $n$     $\hat{p} = x/n$
all students
first 10 students

Log book questions:

6.
a. Use the table above, your calculator, and Jamovi/web calculator or printed tables to find a 90% confidence interval for the population proportion of students who answer question 6 correctly, based on the entire sample.
b. Use the table above, your calculator, and Jamovi/web calculator or printed tables to find a 95% confidence interval for the population proportion of students who answer question 6 correctly, based on the entire sample.
B Confidence Intervals for Proportions: Methods for small samples

For large sample size $n$, the formula $\hat{p} \pm z_{1-\alpha/2} \sqrt{\hat{p}(1 - \hat{p})/n}$ provides a good approximation to the confidence interval for a population proportion, and is easy to evaluate by calculator. Note that $\hat{p} = x/n$ is the sample proportion and $z_{1-\alpha/2}$ is the z-score which cuts off an area of $1 - \alpha/2$ “below” the standard normal curve (for 95% confidence, $1 - \alpha/2 = 0.975$).

Log book question:

7. Use your calculator to apply the large sample confidence interval method to the proportion answering question 6 correctly among the first 10 students, using 95% confidence. What is clearly wrong with the lower limit?

One simple way of improving the accuracy of a hand-calculated confidence interval for a proportion based on a small sample size is to add 2 to $x$ and add 4 to $n$, then use the standard large sample formula.

Log book question:

8. Re-evaluate the “large sample” 95% confidence interval based on the first 10 students, after adding 2 to $x$ and 4 to $n$. Show that the original estimate $x/n$ lies within the new interval, but is no longer the midpoint.
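The large-sample formula and the "add 2 to $x$, add 4 to $n$" adjustment can be sketched in Python as follows. The function names `wald_interval` and `plus_four_interval` are hypothetical, and the counts are made up for illustration; they are NOT taken from the MathScienceTest file, so substitute your own $x$ and $n$.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(x, n, confidence=0.95):
    """Large-sample interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # e.g. area 0.975 below z
    p_hat = x / n
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def plus_four_interval(x, n, confidence=0.95):
    """Same formula after adding 2 to x and 4 to n."""
    return wald_interval(x + 2, n + 4, confidence)


# Hypothetical counts (NOT from MathScienceTest): 1 correct answer among 10 students.
x, n = 1, 10
print("large-sample:", wald_interval(x, n))
print("plus-four:   ", plus_four_interval(x, n))
```

The adjusted interval is centred on $(x + 2)/(n + 4)$ rather than on $x/n$, which is why the original estimate is no longer its midpoint.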