Bryson_Biostat_HW3
School: Abraham Baldwin Agricultural College
Course: 3000
Subject: Health Science
Date: Feb 20, 2024

Biostatistics: Homework 3
Hannah Bryson
Fri Feb 9 13:48:47 2024 EST
Due Date: 2/12/2024

Instructions: For all hypothesis test problems, write H0 and H1 in words starting with ##. After the appropriate hypothesis test code, type the p-value and write the decision and conclusion.

Question 1

Part (a) As a researcher for the EPA, you have been asked to determine if the air quality in the United States has changed over the past 2 years. You select a random sample of 10 metropolitan areas and find the number of days each year that the areas failed to meet acceptable air quality standards. The data are shown. Use α = 0.05 to test whether the mean number of days of unacceptable air quality has increased.

rm(list = ls())
Y1 = c(18, 25, 9, 22, 138, 29, 1, 19, 17, 31)
Y2 = c(24, 52, 13, 21, 152, 23, 6, 31, 34, 20)
#H0: mu1 = mu2 (the mean number of days for Y1 equals the mean for Y2)
#H1: mu1 > mu2 (the mean number of days for Y1 is greater than the mean for Y2)
t.test(Y1, Y2, mu = 0, paired = FALSE, alternative = "greater", conf.level = 0.95)
##
##  Welch Two Sample t-test
##
## data:  Y1 and Y2
## t = -0.37069, df = 17.873, p-value = 0.6424
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  -38.05407       Inf
## sample estimates:
## mean of x mean of y
##      30.9      37.6
#Conclusion: p-value = 0.6424 > alpha = 0.05. We do not reject H0; there is not enough evidence that the mean number of days of unacceptable air quality for Y1 exceeds the mean for Y2.

Part (b) A veterinary nutritionist developed a diet for overweight dogs. The total volume of food consumed remains the same, but one-half of the dog food is replaced with a low-calorie
filler such as canned green beans. Six overweight dogs were randomly selected from her practice and were put on this program. Their initial weights were recorded, and they were weighed again after 4 weeks. At the 0.05 level of significance, can it be concluded that the dogs lost weight by at least 2 lb?

B = c(42, 53, 48, 65, 40, 52)
A = c(39, 45, 40, 58, 42, 47)
#H0: mu_d = 2 (the mean weight loss, before minus after, is 2 lb)
#H1: mu_d < 2 (the mean weight loss is less than 2 lb)
t.test(B, A, mu = 2, paired = TRUE, alternative = "less", conf.level = 0.95)
##
##  Paired t-test
##
## data:  B and A
## t = 1.794, df = 5, p-value = 0.9336
## alternative hypothesis: true mean difference is less than 2
## 95 percent confidence interval:
##      -Inf 8.015863
## sample estimates:
## mean difference
##        4.833333
#Conclusion: p-value = 0.9336 > alpha = 0.05. We do not reject H0; there is not enough evidence that the average weight lost is less than 2 lb.

Part (c) Randomly selected students in a statistics class were asked to report the number of hours they slept on weeknights and on weekends. Construct a 95% confidence interval for the difference between the mean sleep times over weekdays and weekends. Is there sufficient evidence that there is a difference in the mean number of hours slept?

WKEND = c(8, 5.5, 7.5, 8, 7, 6, 6, 8)
WDAYS = c(4, 7, 10.5, 12, 11, 9, 6, 9)
#H0: mu1 - mu2 = 0 (the mean hours slept on weekends and on weekdays are equal)
#H1: mu1 - mu2 =/= 0 (the mean hours slept on weekends and on weekdays differ)
t.test(WKEND, WDAYS, mu = 0, paired = FALSE, alternative = "two.sided", conf.level = 0.95)
##
##  Welch Two Sample t-test
##
## data:  WKEND and WDAYS
## t = -1.5194, df = 8.9884, p-value = 0.163
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -3.8892185  0.7642185
## sample estimates:
## mean of x mean of y
##    7.0000    8.5625
#Conclusion: p-value = 0.163 > alpha = 0.05, and the 95% confidence interval (-3.889, 0.764) contains 0. We do not reject H0; there is not enough evidence to support the claim that the average number of hours slept on weekends is significantly different from the hours slept on weekdays.

Question 2

Part (a) The following data is taken from the 'Duke Cardiac Catheterization Coronary Artery Disease Diagnostic Dataset' from https://hbiostat.org/data/ . You can click the link and see the details. The data consists of 3504 patients and 6 variables. The patients were referred to Duke University Medical Center for chest pain. The following 6 variables are used:

sex: 0 = male, 1 = female (categorical)
age: years (numerical)
cad.dur: duration of symptoms of coronary artery disease in days (numerical)
choleste: cholesterol in mg% (numerical)
sigdz: significant coronary disease by cardiac cath (0 = no, 1 = yes; categorical)
tvdlm: three vessel or left main disease by cardiac cath (0 = no, 1 = yes; categorical)

Do the following steps: Read the data in R and remove the NA values. Remember, before removing NA values, you should first replace any blank values by NA.

A = read.csv('acath.csv', header = TRUE)
head(A)
##   sex age cad.dur choleste sigdz tvdlm
## 1   0  73     132      268     1     1
## 2   0  68      85      120     1     1
## 3   0  54      45       NA     1     0
## 4   1  58      86      245     0     0
## 5   1  56       7      269     0     0
## 6   0  64       0       NA     1     0
is.na(A) <- A == ""
names(which(colSums(is.na(A)) > 0))
## [1] "choleste" "tvdlm"
B = na.omit(A)
names(which(colSums(is.na(B)) > 0))
## character(0)
dim(B)
## [1] 2258    6

Part (b) For α = 0.05, test if the average cholesterol of males and females are equal. First check if the variances are equal and then use the appropriate t-test.

Mal = subset(B, sex == '0')
head(Mal)
##    sex age cad.dur choleste sigdz tvdlm
## 1    0  73     132      268     1     1
## 2    0  68      85      120     1     1
## 8    0  41      15      247     1     0
## 12   0  35      44      257     0     0
## 14   0  58       7      168     1     0
## 15   0  81       2      246     1     1
Fem = subset(B, sex == '1')
head(Fem)
##    sex age cad.dur choleste sigdz tvdlm
## 4    1  58      86      245     0     0
## 5    1  56       7      269     0     0
## 21   1  52      30      240     0     0
## 24   1  57      30      261     0     0
## 32   1  59       3      200     1     0
## 34   1  58       1      246     1     1
#H0: sigma1^2 = sigma2^2
#H1: sigma1^2 =/= sigma2^2
var.test(Mal$choleste, Fem$choleste, alternative = "two.sided", conf.level = 0.95)
##
##  F test to compare two variances
##
## data:  Mal$choleste and Fem$choleste
## F = 0.69048, num df = 1568, denom df = 688, p-value = 4.626e-09
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
##  0.6072584 0.7826084
## sample estimates:
## ratio of variances
##          0.6904775
#Conclusion: p-value = 4.626e-09 < alpha = 0.05. We reject H0; the cholesterol variances of males and females are significantly different.
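The variance check above decides which form of t.test() applies: the pooled test when equal variances are not rejected, and the Welch test otherwise. As a minimal sketch of that decision, assuming a hypothetical helper named choose_t_test and made-up vectors (neither is part of the original homework or of acath.csv), the two steps could be chained like this:

```r
# Hypothetical helper, not from the original homework: run the F test for
# equal variances, then pick the matching two-sample t-test.
choose_t_test <- function(x, y, alpha = 0.05) {
  f <- var.test(x, y, alternative = "two.sided")
  # Failing to reject H0 (equal variances) selects the pooled test;
  # rejecting it selects the Welch test.
  equal_var <- f$p.value >= alpha
  t.test(x, y, mu = 0, paired = FALSE, var.equal = equal_var,
         alternative = "two.sided", conf.level = 1 - alpha)
}

# Made-up illustration data with very different spreads
x <- c(10, 11, 9, 10, 12, 8, 10, 11, 9, 10)
y <- c(0, 20, -10, 30, 5, 15, -5, 25, 10, 10)

unequal <- choose_t_test(x, y)      # variances differ, so Welch is chosen
equal   <- choose_t_test(x, x + 1)  # identical variances, so pooled is chosen
unequal$method
equal$method
```

With the Question 2 data, the equivalent call would be choose_t_test(Mal$choleste, Fem$choleste), assuming Mal and Fem are defined as above.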
#H0: mu1 - mu2 = 0
#H1: mu1 - mu2 =/= 0
t.test(Mal$choleste, Fem$choleste, mu = 0, paired = FALSE, alternative = "two.sided", conf.level = 0.95)
##
##  Welch Two Sample t-test
##
## data:  Mal$choleste and Fem$choleste
## t = -3.9774, df = 1123.2, p-value = 7.414e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -14.70173  -4.98842
## sample estimates:
## mean of x mean of y
##  226.9242  236.7692
#Conclusion: p-value = 7.414e-05 < alpha = 0.05. We reject H0; the average cholesterol levels of males and females are significantly different.

Part (c) For α = 0.01, test if the average age of patients who have Significant Coronary Disease by Cardiac Cath (sigdz) is equal to that of the patients who do not have 'sigdz'. First check if the variances are equal and then use the appropriate t-test.

HS = subset(B, sigdz == '1')
head(HS)
##    sex age cad.dur choleste sigdz tvdlm
## 1    0  73     132      268     1     1
## 2    0  68      85      120     1     1
## 8    0  41      15      247     1     0
## 14   0  58       7      168     1     0
## 15   0  81       2      246     1     1
## 16   0  58      79      221     1     1
DNH = subset(B, sigdz == '0')
head(DNH)
##    sex age cad.dur choleste sigdz tvdlm
## 4    1  58      86      245     0     0
## 5    1  56       7      269     0     0
## 12   0  35      44      257     0     0
## 21   1  52      30      240     0     0
## 24   1  57      30      261     0     0
## 36   1  53     120      250     0     0
#H0: sigma1^2 = sigma2^2
#H1: sigma1^2 =/= sigma2^2
var.test(HS$age, DNH$age, alternative = "two.sided", conf.level = 0.99)
##
##  F test to compare two variances
##
## data:  HS$age and DNH$age
## F = 0.87823, num df = 1489, denom df = 767, p-value = 0.037
## alternative hypothesis: true ratio of variances is not equal to 1
## 99 percent confidence interval:
##  0.745496 1.030841
## sample estimates:
## ratio of variances
##           0.878233
#Conclusion: p-value = 0.037 > alpha = 0.01, and the 99% confidence interval (0.745, 1.031) contains 1. We do not reject H0; at the 0.01 level there is not enough evidence that the age variances of patients who have sigdz and who do not have sigdz differ.
#The Welch test used below does not assume equal variances, so it remains valid here.
#H0: mu1 - mu2 = 0
#H1: mu1 - mu2 =/= 0
t.test(HS$age, DNH$age, mu = 0, paired = FALSE, alternative = "two.sided", conf.level = 0.99)
##
##  Welch Two Sample t-test
##
## data:  HS$age and DNH$age
## t = 10.347, df = 1464, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 99 percent confidence interval:
##  3.238185 5.388636
## sample estimates:
## mean of x mean of y
##  52.29128  47.97786
#Conclusion: p-value < 2.2e-16 < alpha = 0.01. We reject H0; the average ages of patients who have sigdz and who do not have sigdz are significantly different.

Question 3

National statistics show that 23% of men smoke and 18.5% of women smoke. A random sample of 180 men indicated that 50 were smokers, and a random sample of 150 women surveyed indicated that 39 smoked. Test the claim that the percentages of male and female smokers are different, α = 0.02. Construct a 98% confidence interval for the true difference in proportions of male and female smokers. Comment on your interval: does it support the claim that there is a difference?

n1 = 180
X1 = 50
n2 = 150
X2 = 39
alpha = .02
#H0: p1 - p2 = 0
#H1: p1 - p2 =/= 0
prop.test(x = c(X1, X2), n = c(n1, n2), alternative = "two.sided", conf.level = 0.98, correct = FALSE)
##
##  2-sample test for equality of proportions without continuity correction
##
## data:  c(X1, X2) out of c(n1, n2)
## X-squared = 0.13129, df = 1, p-value = 0.7171
## alternative hypothesis: two.sided
## 98 percent confidence interval:
##  -0.0961232  0.1316788
## sample estimates:
##    prop 1    prop 2
## 0.2777778 0.2600000
#Conclusion: p-value = 0.7171 > alpha = 0.02, and the 98% confidence interval (-0.0961232, 0.1316788) contains the value 0, so we do not reject H0. The proportions of male smokers and female smokers are not significantly different.
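
The interval and test statistic reported by prop.test() above can be reproduced by hand, which shows where each number comes from. The sketch below uses the usual large-sample formulas: an unpooled standard error for the confidence interval and a pooled standard error for the test under H0: p1 = p2. The variable names other than n1, X1, n2, and X2 are illustrative and not from the original homework.

```r
# Counts copied from Question 3
n1 <- 180; X1 <- 50   # men sampled, men who smoke
n2 <- 150; X2 <- 39   # women sampled, women who smoke
conf_level <- 0.98

p1 <- X1 / n1
p2 <- X2 / n2
diff_hat <- p1 - p2

# Confidence interval: unpooled standard error and a normal critical value
se_unpooled <- sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
z_crit <- qnorm(1 - (1 - conf_level) / 2)
ci <- diff_hat + c(-1, 1) * z_crit * se_unpooled

# Test statistic: pooled standard error under H0: p1 = p2
p_pool <- (X1 + X2) / (n1 + n2)
se_pooled <- sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z_stat <- diff_hat / se_pooled
p_value <- 2 * pnorm(-abs(z_stat))

ci          # should agree with the 98 percent interval from prop.test()
z_stat^2    # should agree with X-squared from prop.test()
p_value     # should agree with the p-value from prop.test()
```

Squaring the z statistic gives the X-squared value because, without the continuity correction, the two-sided two-proportion z test and the 1-df chi-squared test are the same test.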