Homework6_SP2024
School: University of Texas
Course: 2241
Subject: Biology
Date: Apr 3, 2024
Type: docx
Pages: 9

PH 1690: Introduction to Biostatistics for Public Health
Assignment 6

Part A

1. Auto exhaust and lead exposure. Researchers interested in lead exposure due to car exhaust sampled the blood of 52 randomly selected police officers subjected to constant inhalation of automobile exhaust fumes while working traffic enforcement in a primarily urban environment. The blood samples of these officers had an average lead concentration of 124.43 μg/l and an SD of 37.74 μg/l; a previous study of individuals from a nearby suburb, with no history of exposure, found an average blood level concentration of 35 μg/l. Researchers are interested in determining if the police officers in the urban environment have been exposed to a different concentration of lead than police officers in the suburbs.

n = 52, mean = 124.43, SD = 37.74

(a) Define the parameter of interest in the context of the question.
μ, the true mean blood lead concentration (μg/l) of all police officers subjected to constant inhalation of automobile exhaust fumes while working traffic enforcement in a primarily urban environment. The 52 sampled officers provide the estimate of μ.

(b) Using words and symbols, define the hypotheses that would be appropriate for testing the research question.
H0: μ = 35 (the mean blood lead concentration of the urban officers equals the suburban level of 35 μg/l)
HA: μ ≠ 35 (the mean blood lead concentration of the urban officers differs from 35 μg/l)

(c) What test would be appropriate to answer this research question?
A one-sample t-test, followed by the p-value.

(d) Explicitly state and check all conditions necessary for inference on these data.
Independence: yes, the officers were randomly selected.
Sample size / normality: yes, n = 52 is larger than 30, so the sampling distribution of the mean can be treated as nearly normal.

(e) Regardless of your answers in (d), calculate the test statistic.
t = (124.43 − 35) / (37.74/√52) = 17.0877

(f) Find the p-value and interpret the p-value in the context of the research question. Use the t-distribution-based approach and show all the steps manually for the full credit. (You may use Stata to obtain the p-value. Copy and paste your Stata output.)
The alternative hypothesis is two-sided, so the p-value is the two-tailed area of the t-distribution with df = 52 − 1 = 51.
-> di 2*ttail(51, 17.0877)
P-value = 9.413e-23
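The test statistic in (e) and the t-based p-value in (f) can be cross-checked outside Stata. The following is a minimal Python sketch using only the standard library, not the Stata workflow the assignment asks for. The helper names (`reg_inc_beta`, `two_sided_p`) are hypothetical, and the tail area is computed from the regularized incomplete beta function with a standard continued-fraction evaluation. It reports the two-sided tail area, which matches a "different from 35" alternative; halve it for a one-sided test.

```python
import math

def _betacf(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log1p(-x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b

def two_sided_p(t, df):
    """Two-sided p-value P(|T| >= |t|) for Student's t with df degrees of freedom."""
    return reg_inc_beta(df / 2.0, 0.5, df / (df + t * t))

# Summary statistics from the problem statement.
n, xbar, sd, mu0 = 52, 124.43, 37.74, 35.0
t_stat = (xbar - mu0) / (sd / math.sqrt(n))
p_value = two_sided_p(t_stat, n - 1)
print(f"t = {t_stat:.4f}, df = {n - 1}, two-sided p = {p_value:.3e}")
```

Working with the incomplete beta function rather than `1 - cdf` avoids the cancellation that makes very small tail areas round to zero.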

(g) What is your conclusion of the test? State the conclusion in the context of the research question.
Since the p-value << 0.05 = α, we reject the null hypothesis at the 0.05 significance level. There is strong evidence that the mean blood lead concentration of the urban police officers differs from 35 μg/l.

2. Paired or not, Part II. In each of the following scenarios, determine if the data are paired and explain your choice.

(a) We would like to know if Intel’s stock and Southwest Airlines’ stock have similar rates of return. To find out, we take a random sample of 50 days, and record Intel’s and Southwest’s stock on those same days.
Yes, the data are paired: both stocks are recorded on the same 50 randomly sampled days, so each Intel observation has a matching Southwest observation.

(b) We randomly sample 50 items from Target stores and note the price for each. Then we visit Walmart and collect the price for each of those same 50 items.
Yes, the data are paired: prices are collected for the exact same 50 items at both stores, so each Target price is matched with the Walmart price of the same item.

(c) A school board would like to determine whether there is a difference in average SAT scores for students at one high school versus another high school.
No, the data are not paired: the two populations of interest are not related, so students at one school have no natural match at the other.

3. High school and beyond, Part I. The National Center of Education Statistics conducted a survey of high school seniors, collecting test data on reading, writing, and several other subjects. Here we examine a simple random sample of 25 students from this survey. Side-by-side box plots of reading and writing scores as well as a histogram of the differences in scores are shown below:

(a) Based on the graphs, does there appear to be a clear difference in the average reading and writing scores?
There does appear to be a clear difference in the average reading and writing scores based on the graphs shown above.

(b) Are the reading and writing scores of each student independent?
No. The reading and writing scores of each student are not independent: both scores come from the same student, so they are paired.

(c) Using words and symbols, define the hypotheses for the following research question: Is there an evident difference in the average scores of students in the reading and writing exam?

(d) What test would be the most appropriate to answer the research question stated above?
A paired t-test would be best to answer the research question stated above.

(e) State and check the conditions required to complete this test.
Small-sample paired data: the subjects have been randomly selected, and the differences between the reading and writing scores come from a nearly normal distribution.

(f) The average observed difference in scores is x̄(read − write) = −0.545 and the standard deviation of the differences is 8.887 points. Calculate the test statistic to answer the research question in (c). (For the full credit, you need to show your work.)

(g) Find the p-value using the t-distribution-based method and interpret it in the context of the question. (You may use Stata to obtain the p-value. Copy and paste your Stata output.)

(h) What is your conclusion of the test (using 0.05 significance level)? State your conclusion in the context of the research question.
Since the p-value is greater than 0.05, we fail to reject the null hypothesis that there is no difference between the average reading and writing scores of students.

(i) What type of error might we have made? Explain what the error means in the context of the application.
We may have made a Type 2 error: failing to reject the null hypothesis of no difference when there actually is a difference between the average reading and writing scores of students.

(j) Based on the results of this hypothesis test, would you expect a confidence interval for the average difference between the reading and writing scores to include 0? Explain your reasoning.
Yes. The test failed to reject the null value of 0 at the 0.05 level, so the matching 95% confidence interval should contain 0. The standard error of the mean difference is 8.887/√25 = 1.7774, so even adding and subtracting a single standard error from −0.545 gives a range that includes 0.

4. High school and beyond, Part II. We considered the differences between the reading and writing scores of a random sample of 25 students who took the High School and Beyond Survey in Exercise. The mean and standard deviation of the differences are x̄(read − write) = −0.545 and 8.887 points.

SD = 8.887, x̄ = −0.545, n = 25, confidence level = 0.95 (z = 1.96)

(a) Calculate a 95% confidence interval (t-distribution approach) for the average difference between the reading and writing scores of all students. (You can use the t-table or Stata only for finding t*. Show all work.)
df = n − 1 = 25 − 1 = 24
t* = ±2.06
SE = s(diff)/√n = 8.887/√25 = 1.7774
x̄(diff) ± t* × SE:
−0.545 + (2.06 × 1.7774) = 3.116
−0.545 − (2.06 × 1.7774) = −4.206
95% CI: (−4.206, 3.116)

(b) Interpret this interval in context.
We are 95% confident that the average difference between the reading and writing scores of all students is between −4.206 and 3.116 points.

(c) Does the confidence interval provide convincing evidence that there is a real difference in the average scores? Explain.
No. Since the confidence interval contains the null value, 0, we do not have convincing evidence of a difference between the average scores.

_____________________________________________________________________________

Part B

1. High school and beyond, Part III.
We considered the differences between the reading and writing scores of a random sample of 25 students who took the High School and Beyond Survey in Exercise. The mean and standard deviation of the differences are x̄(read − write) = −0.545 and 8.887 points. We are interested in whether there is an evident difference in the average scores of students in the reading and writing exam. Assume that all the necessary assumptions are satisfied here.

(a) Do these data provide convincing evidence of a difference between the average scores on the two exams? Conduct a statistical test with significance level 0.01 and support your conclusion using the p-value. Write your conclusion in the context of the research question. (Hint: use ttesti)
Since the p-value = .75914785 > α = .01, we fail to reject H0 at the .01 significance level. That is, we do not have enough evidence to conclude that there is a difference in the average scores of students in the reading and writing exam.

(b) Find the 99% confidence interval for the average difference between the reading and writing scores of all students. Interpret this interval. Does the confidence interval support the conclusion you made in (a)? (Hint: use cii means)
We are 99% confident that the true mean difference (δ) of the scores falls between −5.51628 and 4.42628. Yes, the interval contains 0, which supports the conclusion in (a).

2. Infectious Disease. Download palpable.csv from the Canvas course page. The degree of clinical agreement among physicians on the presence or absence of generalized lymphadenopathy was assessed in 32 randomly selected participants from a prospective study of male sexual contacts of men with acquired immunodeficiency syndrome (AIDS) or an AIDS-related condition (ARC). The total number of palpable lymph nodes was assessed by each of two physicians (say, Dr. A and Dr. B). The results can be found in the downloaded dataset (palpable.csv). Suppose that we would like to determine whether there is a systematic difference between the assessments of Doctor A and Doctor B.
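The summary-statistic calculations for the High school and beyond data in Problem 1 above (and the 95% interval in Part A, Problem 4) can be reproduced by hand or in a few lines of code. The Python sketch below is a rough stand-in for what `ttesti` and `cii means` report, not their output. The function name `t_interval` is hypothetical, and the critical values are taken from a t-table for df = 24 (2.064 for 95%, 2.797 for 99%) rather than computed.

```python
import math

def t_interval(mean_diff, sd_diff, n, t_star):
    """Confidence interval mean +/- t* x SE for a mean of paired differences."""
    se = sd_diff / math.sqrt(n)
    margin = t_star * se
    return mean_diff - margin, mean_diff + margin

# Summary statistics from the problem statement.
n, mean_diff, sd_diff = 25, -0.545, 8.887

# Assumed table values of t* for df = n - 1 = 24.
t_star = {95: 2.064, 99: 2.797}

se = sd_diff / math.sqrt(n)
t_stat = mean_diff / se          # test statistic for H0: mean difference = 0

ci95 = t_interval(mean_diff, sd_diff, n, t_star[95])
ci99 = t_interval(mean_diff, sd_diff, n, t_star[99])

# Critical-value form of the two-sided test at alpha = 0.01.
reject_at_01 = abs(t_stat) > t_star[99]

print(f"SE = {se:.4f}, t = {t_stat:.4f}")
print(f"95% CI: ({ci95[0]:.3f}, {ci95[1]:.3f})")
print(f"99% CI: ({ci99[0]:.3f}, {ci99[1]:.3f})")
print("reject H0 at 0.01:", reject_at_01)
```

Comparing |t| with t* gives the same decision as comparing the two-sided p-value with α, which is why an interval that contains 0 goes together with a failure to reject H0.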

n = 32 (random sample)

(a) Define the parameter of interest and construct the statistical hypotheses for the test (using symbols).
Parameter: δ, the true mean difference (Dr. A − Dr. B) in the number of palpable lymph nodes assessed in participants from a prospective study of male sexual contacts of men with acquired immunodeficiency syndrome (AIDS) or an AIDS-related condition (ARC).
H0: δ = 0; HA: δ ≠ 0

(b) Based on the provided information, what statistical test would be the most appropriate? Explain.
CLT-based test. The sample is large enough and the observations are independent, although the data show a non-normal distribution.

(c) Generate a variable diff by taking the difference between the assessments of the two physicians (dra – drb).
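As an illustration of step (c) outside Stata, the difference variable can be built row by row. This is a minimal Python sketch under stated assumptions: the column headers are assumed to be `dra` and `drb` (the variable names used in this assignment), the helper `add_diff` is hypothetical, and the three data rows are made up for demonstration and are not values from palpable.csv.

```python
import csv
import io

def add_diff(rows):
    """Return copies of the rows with diff = dra - drb added to each one."""
    out = []
    for row in rows:
        new_row = dict(row)
        new_row["diff"] = float(row["dra"]) - float(row["drb"])
        out.append(new_row)
    return out

# Made-up rows standing in for the real file (assumption, not course data).
sample_csv = "dra,drb\n4,1\n17,9\n3,2\n"

rows = add_diff(csv.DictReader(io.StringIO(sample_csv)))
diffs = [r["diff"] for r in rows]
print(diffs)
```

To run it on the real data, replace `io.StringIO(sample_csv)` with an open file handle, for example `open("palpable.csv", newline="")`.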

(d) Check the conditions for hypothesis testing:

(i) Independence. Are the observations independent? Explain.
Yes, the 32 participants were randomly selected from a prospective study of male sexual contacts.

(ii) Normality. Check the conditions for the variable diff by generating the following graphs and comment on the distribution:
a. Histogram: the histogram shows an approximately normal distribution with no obvious outliers.
b. Boxplot: the boxplot shows an approximately normal distribution with a single outlier at 10.
c. QQ plot: the points lie close to the line, which is consistent with a normal distribution.
d. Shapiro-Wilk test: there is no evidence against normality, since Prob>z is close to 1.

(e) Based on your findings from part (d), is the t-test appropriate? Comment.
Yes, since the data are independent, the sample size is above 30, and the differences are approximately normally distributed.

(f) Perform the t-test using the two different methods in Stata and compare the result outputs.
(i) Using the original variables “dra” and “drb”
(ii) Using the difference variable “diff”

(g) Based on the results in (f), report the p-value for the test and interpret the p-value in the context of the research question.
Since the p-value = .0001 << .05 = α, we reject H0. Therefore, we have strong statistical evidence of a systematic difference between the assessments of Doctor A and Doctor B of the number of palpable lymph nodes in male sexual contacts of men with acquired immunodeficiency syndrome (AIDS) or an AIDS-related condition (ARC).

(h) State your conclusion in the context of the research question. What does this imply?
We are 95% confident that the true mean difference (Dr. A − Dr. B) in the number of palpable lymph nodes assessed in male sexual contacts of men with AIDS or ARC falls between 1.730243 and 3.769757. Because the whole interval lies above 0, this implies that Doctor A tends to report more palpable lymph nodes than Doctor B.
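The two approaches in (f) should give the same test statistic, because a paired t-test on two variables is a one-sample t-test on their differences. The Python sketch below illustrates that identity. The function names are hypothetical and the eight pairs of values are made up for demonstration; they are not the palpable.csv data.

```python
import math
from statistics import mean, stdev

def one_sample_t(values, mu0=0.0):
    """t statistic for H0: mean = mu0, computed from a single list of values."""
    n = len(values)
    return (mean(values) - mu0) / (stdev(values) / math.sqrt(n))

def paired_t(x, y):
    """Paired t statistic computed from the two original variables.

    Uses Var(x - y) = Var(x) + Var(y) - 2 Cov(x, y).
    """
    n = len(x)
    mx, my = mean(x), mean(y)
    var_x = sum((a - mx) ** 2 for a in x) / (n - 1)
    var_y = sum((b - my) ** 2 for b in y) / (n - 1)
    cov_xy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    se = math.sqrt((var_x + var_y - 2.0 * cov_xy) / n)
    return (mx - my) / se

# Made-up assessments for two physicians (assumption, not course data).
dra = [4, 17, 3, 11, 12, 5, 5, 6]
drb = [1, 9, 2, 13, 9, 2, 6, 3]
diff = [a - b for a, b in zip(dra, drb)]

t_from_pairs = paired_t(dra, drb)   # method (i): original variables
t_from_diff = one_sample_t(diff)    # method (ii): difference variable
print(f"paired: {t_from_pairs:.6f}, one-sample on diff: {t_from_diff:.6f}")
```

Both methods also share df = n − 1, so their p-values and confidence intervals agree as well.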