asg4

School

University of Alberta

Course

1511

Subject

Statistics

Date

Apr 3, 2024

LAB 4 ASSIGNMENT
Due Date: Friday April 5 at 9:59 PM

INFERENCES FOR NUMERICAL DATA

IMPORTANT:
1) In this lab, you will need to use statistical functions in R (or R Commander) to generate the outputs.
2) For all graphs and charts, please label the axes and ensure proper titles are used.
3) For all tables, please ensure the correct variable name(s) are used.
4) Each group is expected to create a Google document for the lab report, where students will type their answers (in full sentences) and paste the R output (where necessary) for each lab question.
5) Completed assignments will be saved as a PDF file, submitted, graded, and returned on eClass.
6) Each lab group MUST upload and submit only ONE lab report, so students MUST work together to complete the lab assignment.
7) Please see the Lab Submission Info tab through the Lab Information link in the Labs section on eClass.

In this lab assignment, you will learn how to examine data produced in a simple statistical experiment. In particular, you will examine the experiment design and apply graphical, numerical, and inferential tools available in R (or R Commander) to compare two distributions produced by the study. Before you start working on the assignment, you should review the course material about designing experiments and comparing two population means.

Caffeine Dependence Experiment

Caffeine is the world's most widely consumed mood-altering substance. In North America, about 90% of adults consume caffeine daily. Coffee is the leading dietary source of caffeine among adults in Canada, while soft drinks represent the largest source of caffeine for children. People who consume large amounts of caffeine each day may experience physical withdrawal symptoms if they stop taking in their usual amounts of the substance. In this lab assignment, you will follow an experiment on caffeine dependency conducted on a group of volunteers by some medical researchers.

The researchers recruited 70 volunteers who drink at least 3 cups of coffee each day and are in good health. Of these 70 volunteers, 50 were diagnosed as caffeine-dependent based on some general substance dependence criteria. Of the 50 subjects who were diagnosed as caffeine-dependent, 30 agreed to participate in a study to evaluate their caffeine dependency. Before the experiment was conducted, daily caffeine intake measurements for each subject were obtained from the participants' food diaries.

The experiment was conducted over two 2-day periods that occurred exactly one week apart. During one of the 2-day periods, the subjects were given a set of capsules containing the amount of caffeine normally ingested by the subject in one day. During the other study period, the subjects were given placebos. The order in which each subject received the two types of capsules was randomized. At the end of each 2-day study period, subjects were evaluated in three areas: depression symptoms, fatigue, and concentration. The experimenters were blinded to whether the subject was receiving the caffeine pills or the caffeine-free pills.

The proper data are available in the Data link located in the Lab 4 tab display in the Labs section on eClass. The data are not to be included with your submission.
The following is a description of the variables in the data file:

Variable Name     Description of Variable
Subject           Subject number (a whole number from 1 to 40)
Depr_Caf          Depression score during caffeine period
Depr_NC           Depression score during no-caffeine period
Fatig_Caf         Fatigue score during caffeine period
Fatig_NC          Fatigue score during no-caffeine period
Focus_Caf         Concentration/focus score during caffeine period
Focus_NC          Concentration/focus score during no-caffeine period
Smoker            Smoker status (Y, if smoked cigarettes daily; N, otherwise)
Caffeine          Daily intake of caffeine (in mg)
Change_In_Depr    Change in depression score (Change = Depr_NC – Depr_Caf)

Use the data provided to answer the following questions:

1. First you will examine the experiment design.
(a) Can we treat the 30 subjects who agreed to participate in the study as a random sample from the population of all caffeine-dependent individuals? Explain why or why not. Can you generalize the results of the study to the population of all caffeine-dependent individuals? Explain briefly.
(b) Why were the subjects blinded to receiving the caffeine pills or the caffeine-free pills? Why were the experimenters interviewing the subjects blinded?
(c) Why was the order in which the two series of capsules were taken randomized?
(d) Why were the two study periods held one week apart instead of using two consecutive 2-day periods?

2. Now you will use inferential tools in R (or R Commander) to compare the levels of depression for the no-caffeine and caffeine periods.
(a) Using α = 0.01, do the data give sufficient evidence that being deprived of caffeine raises depression scores? Use an appropriate test to answer the question. State the null and alternative hypotheses in terms of parameters. Paste the output in your report. Report the value of the appropriate test statistic, the distribution of the test statistic under the null hypothesis, and the p-value of the test to answer the question. State your conclusion.
(b) Also obtain a 98% two-sided confidence interval for the mean difference in the depression scores between no-caffeine and caffeine periods. (Hint: Although a one-sided confidence interval can be obtained in R, this type of confidence interval is not discussed in STAT 151 classes. Thus, students must use a two-sided confidence interval to answer this question.) Paste the output in your report. Interpret the confidence interval. Use the confidence interval to answer the question in part (a). Is the confidence interval's result consistent with the outcome of the test in part (a)? Explain briefly.
(c) What assumptions must be satisfied to justify the procedures you used in (a) and (b)? Are the assumptions met in this case? Obtain the appropriate plot to verify if the underlying population is normally distributed and paste it into your report. Based on this plot, is it reasonable to assume that the underlying population is normally distributed? Explain. What is the chief threat to the validity of the results obtained in parts (a) and (b)?
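
The R calls behind the kind of procedures Question 2 describes can be sketched as follows. This is only an illustration under stated assumptions: the data frame name `caf` is hypothetical, the numbers are synthetic stand-ins generated on the spot (not the lab data), and a paired t procedure is assumed because each subject is measured in both periods. Only the column names `Depr_NC` and `Depr_Caf` come from the variable list above.

```r
# Synthetic stand-in data: these values are made up for illustration and are
# NOT the lab data. The data frame name `caf` is a hypothetical choice.
set.seed(1)
n <- 30
caf <- data.frame(
  Depr_Caf = round(rnorm(n, mean = 5, sd = 2)),
  Depr_NC  = round(rnorm(n, mean = 7, sd = 3))
)

# One-sided paired t test of H0: mu_d = 0 against Ha: mu_d > 0,
# where d = Depr_NC - Depr_Caf (x minus y in the call below).
test_2a <- t.test(caf$Depr_NC, caf$Depr_Caf, paired = TRUE,
                  alternative = "greater")

# Two-sided 98% confidence interval for mu_d. The default alternative is
# "two.sided", so only conf.level needs to change.
ci_2b <- t.test(caf$Depr_NC, caf$Depr_Caf, paired = TRUE,
                conf.level = 0.98)$conf.int

# Normal Q-Q plot of the paired differences, with labelled axes and a title.
d <- caf$Depr_NC - caf$Depr_Caf
qqnorm(d, main = "Normal Q-Q Plot of Change in Depression Score",
       xlab = "Theoretical quantiles", ylab = "Depr_NC - Depr_Caf")
qqline(d)

print(test_2a)
print(ci_2b)
```

With real data, the printed test output reports the t statistic, its degrees of freedom (n - 1 for paired data), and the p-value. Note that the one-sided call and the two-sided interval are separate calls, because `alternative = "greater"` would otherwise produce a one-sided interval.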
3. Now you will use inferential tools in R (or R Commander) to compare the changes in depression levels (Depr_NC minus Depr_Caf) for smokers and non-smokers.
(a) Using α = 0.10, use an appropriate test to see whether the change in the depression scores is different for non-smokers and smokers. Explain the choice of your test. State the null and alternative hypotheses in terms of parameters. Paste the output in your report. Report the value of the appropriate test statistic, the distribution of the test statistic under the null hypothesis, and the p-value of the test to answer the question. State your conclusion.
(b) Obtain a 90% confidence interval for the difference in the mean depression scores for non-smokers and smokers. Interpret the 90% confidence interval. Use the confidence interval to answer the question in part (a). Is the confidence interval consistent with the test in part (a)? Explain briefly.
(c) What assumptions must be satisfied to justify the procedures you used in (a) and (b)? You do not need to verify the assumptions.

4. You have studied the change in depression scores exhibited by caffeine-dependent individuals when they are deprived of caffeine. Now you will use inferential tools in R (or R Commander) to explore changes in fatigue and focus.
(a) Using α = 0.05, do the data give sufficient evidence that being deprived of caffeine raises fatigue scores? Use an appropriate test to answer the question. State the null and alternative hypotheses in terms of parameters. Paste the output in your report. Report the value of the appropriate test statistic, the distribution of the test statistic under the null hypothesis, and the p-value of the test to answer the question. State your conclusion.
(b) Using α = 0.05, do the data give sufficient evidence that being deprived of caffeine lowers focus scores? Use an appropriate test to answer the question. State the null and alternative hypotheses in terms of parameters. Paste the output in your report. Report the value of the appropriate test statistic, the distribution of the test statistic under the null hypothesis, and the p-value of the test to answer the question. State your conclusion.

5. Summarize briefly your findings in Questions 1-4 in the form of a brief report. In particular, indicate which of the three withdrawal symptoms (more depressive mood, more fatigue, less focus) seems to be the most intense. Refer to the test results in your summary. Can these results be generalized to the whole population of caffeine-dependent individuals?
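
For the two-group comparison in Question 3 and the directional tests in Question 4, the corresponding R calls might look like the sketch below. The same caveats apply: `caf` is a hypothetical data frame name, the values are synthetic and not the lab data, and the group sizes (18 non-smokers, 12 smokers) are invented for the example. The handout does not say whether a pooled or unequal-variance two-sample test is expected; the sketch uses R's default (Welch) and notes the alternative in a comment.

```r
# Synthetic stand-in data: values and group sizes are made up for
# illustration and are NOT the lab data. `caf` is a hypothetical name.
set.seed(2)
n <- 30
caf <- data.frame(
  Smoker         = factor(rep(c("N", "Y"), times = c(18, 12))),
  Change_In_Depr = round(rnorm(n, mean = 2, sd = 3)),
  Focus_Caf      = round(rnorm(n, mean = 6, sd = 2)),
  Focus_NC       = round(rnorm(n, mean = 5, sd = 2))
)

# Question 3 style: two independent groups, two-sided alternative.
# t.test() defaults to the Welch (unequal-variance) test; adding
# var.equal = TRUE would give the pooled-variance version instead.
# The difference is taken in factor-level order: group N minus group Y.
test_3 <- t.test(Change_In_Depr ~ Smoker, data = caf, conf.level = 0.90)

# Question 4(b) style: paired data, where "lowers focus" corresponds to
# Ha: mu_d < 0 with d = Focus_NC - Focus_Caf.
test_4b <- t.test(caf$Focus_NC, caf$Focus_Caf, paired = TRUE,
                  alternative = "less")

print(test_3)
print(test_4b)
```

A single two-sided call at `conf.level = 0.90` yields both the test and the 90% interval for Question 3, whereas a "raises fatigue" test as in Question 4(a) would use `alternative = "greater"` on the fatigue columns in the same paired form.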

LAB 4 ASSIGNMENT: MARKING SCHEMA

Proper Title Page (Using Lab Assignment Template on eClass): 5 marks
Appearance: 5 marks (1 bonus point for each question submitted properly on eClass)

Note: Lab assignments must be typed and submitted on eClass. A handwritten assignment is not acceptable and will receive a mark of zero for the whole assignment.

Question 1 (12)
(a) Random sample or not with explanation: 2 marks
    Generalization to population: 2 marks
(b) Reason for subjects: 2 marks
    Reason for experimenters: 2 marks
(c) Order of administration of pills: 2 marks
(d) Timing of the two study periods: 2 marks

Question 2 (39)
(a) Choice of test: 2 marks
    R output: 3 marks
    Hypotheses: 2 marks
    The value of the test statistic: 2 marks
    The distribution of the test statistic under the null hypothesis: 2 marks
    The p-value: 2 marks
    Conclusion: 2 marks
(b) R output: 3 marks
    The 98% confidence interval: 2 marks
    Interpretation of the confidence interval: 2 marks
    Using the confidence interval to answer question in part (a) with justification: 4 marks
    Consistency of interval with the test: 2 marks
(c) Specifying assumptions (in general): 2 marks
    Checking assumptions: 2 marks
    Plot to verify normality assumption: 3 marks
    Normality assumption for the data: 2 marks
    Chief threat to validity: 2 marks

Question 3 (25)
(a) Choice of test: 2 marks
    R output: 3 marks
    Hypotheses: 2 marks
    The value of the test statistic: 2 marks
    The distribution of the test statistic under the null hypothesis: 2 marks
    The p-value: 2 marks
    Conclusions: 2 marks
(b) The 90% confidence interval: 2 marks
    Interpretation of the confidence interval: 2 marks
    Using the confidence interval to answer question in part (a) with justification: 2 marks
    Consistency of interval with the test: 2 marks
(c) Assumptions of the appropriate test: 2 marks

Question 4 (30)
(a) Choice of test: 2 marks
    R output: 3 marks
    Hypotheses: 2 marks
    The value of the test statistic: 2 marks
    The distribution of the test statistic under the null hypothesis: 2 marks
    The p-value: 2 marks
    Conclusions: 2 marks
(b) Choice of test: 2 marks
    R output: 3 marks
    Hypotheses: 2 marks
    The value of the test statistic: 2 marks
    The distribution of the test statistic under the null hypothesis: 2 marks
    The p-value: 2 marks
    Conclusions: 2 marks

Question 5 (4)
Brief summary (using your answers to the questions): 3 marks (one mark for depression, one for fatigue and one for focus)
Generalization to the population: 2 marks

TOTAL = 120