Stat311 Homework 7

School

University of Washington

Course

EDDD 8

Subject

Statistics

Date

May 31, 2024

Stat311 Homework 7
Tina Song
2022-11-30

Read in the ice cream, birthweight, and cholesterol data sets.

    # Packages used by the code in this report
    library(dplyr)    # %>%, filter(), summarize(), mutate(), group_by()
    library(infer)    # rep_sample_n()
    library(ggplot2)  # ggplot()

    IC.df <- read.csv("IceCream.csv", header=TRUE, as.is=TRUE)
    IC.df$Sex <- as.factor(IC.df$Sex)
    IC.df$Flavor <- as.factor(IC.df$Flavor)
    #
    BW.df <- read.csv("BirthWeight.csv", header=TRUE, as.is=TRUE)
    BW.df$Smoker <- as.factor(BW.df$Smoker)
    BW.df$BirthWt <- as.factor(BW.df$BirthWt)
    #
    C.df <- read.csv("Cholesterol.csv", header=TRUE, as.is=TRUE)
    C.df$Cereal <- as.factor(C.df$Cereal)

Problem 1

Part 1a) “more than”
The given statement is a statement about the alternative hypothesis.
H0: p = 0.25 vs. Ha: p > 0.25

Part 1b) “most”
The given statement is a statement about the alternative hypothesis.
H0: p = 0.5 vs. Ha: p > 0.5

Part 1c) “equal to”
The given statement is a statement about the null hypothesis.
H0: mu = 121 vs. Ha: mu ≠ 121

Part 1d) “no more than”
The given statement is a statement about the null hypothesis.
H0: p ≤ 0.02 vs. Ha: p > 0.02
Part 1e) “at least”
The given statement is a statement about the null hypothesis.
H0: mu ≥ 0.8535 vs. Ha: mu < 0.8535

Part 1f) “better than”
The given statement is a statement about the alternative hypothesis.
p1: the success rate with surgery
p2: the success rate with splinting
H0: p1 = p2 vs. Ha: p1 > p2

Part 1g) “greater”
The given statement is a statement about the alternative hypothesis.
mu1: the mean age of unsuccessful job applicants
mu2: the mean age of successful job applicants
H0: mu1 = mu2 vs. Ha: mu1 > mu2

Problem 2

mu: the students' population mean video score
mu0: the students' observed sample mean puzzle score = 52.405
“different than” (alternative hypothesis)
Define the statistical hypotheses as:
H0: mu = 52.405 vs. Ha: mu ≠ 52.405
Use a 5% significance level: alpha = 0.05
t test statistic: -0.7927453
critical value: 1.971957 (since this is a two-tailed test, t-crit is ±1.97)
by-hand p-value: 0.4288703
R p-value: 0.4289
Since |t| = 0.79 is less than 1.97 and the p-value (0.43) is greater than 0.05, we fail to reject the null hypothesis. There is insufficient evidence that the students' population mean video score is different from the observed sample mean puzzle score (p = 0.43).

    IC.df %>% summarize(mean.Puzzle = mean(Puzzle, na.rm=TRUE),
                        SD.Puzzle = sd(Puzzle, na.rm=TRUE))

    ##   mean.Puzzle SD.Puzzle
    ## 1      52.405  10.73579

    IC.df %>% summarize(mean.Video = mean(Video, na.rm=TRUE),
                        SD.Video = sd(Video, na.rm=TRUE))

    ##   mean.Video SD.Video
    ## 1      51.85 9.900891

    # One-sample t statistic from the summary statistics (n = 200)
    (t <- (51.85-52.405) / (9.900891/sqrt(200)))

    ## [1] -0.7927453

    (df <- 200 - 1)

    ## [1] 199

    # Two-tailed critical value at alpha = 0.05
    (tcrit <- qt(0.975, df))

    ## [1] 1.971957

    # Two-tailed p-value
    (pvalue <- 2 * pt(-0.7927453, df))

    ## [1] 0.4288703

    # t.test() has no alpha argument; the 5% level is set through conf.level
    t.test(IC.df$Video, mu = 52.405, conf.level = 0.95, alternative = "two.sided")
    ##
    ##  One Sample t-test
    ##
    ## data:  IC.df$Video
    ## t = -0.79275, df = 199, p-value = 0.4289
    ## alternative hypothesis: true mean is not equal to 52.405
    ## 95 percent confidence interval:
    ##  50.46944 53.23056
    ## sample estimates:
    ## mean of x
    ##     51.85

Problem 3

Part 3a)
mu1: the population mean puzzle score for students that prefer vanilla ice cream
mu2: the population mean puzzle score for students that prefer chocolate ice cream
Hypotheses: H0: mu1 = mu2 vs. Ha: mu1 ≠ mu2
alpha = 0.05
p-value = 0.014
Since p < 0.05, we reject the null hypothesis. There is sufficient evidence that students with a preference for vanilla ice cream have a population mean puzzle score that is different from the population mean puzzle score for students that prefer chocolate ice cream (p = 0.014).

    IC.V <- filter(IC.df, Flavor == "1")
    IC.C <- filter(IC.df, Flavor == "2")
    t.test(IC.V$Puzzle, IC.C$Puzzle, alternative = "two.sided")
    ##
    ##  Welch Two Sample t-test
    ##
    ## data:  IC.V$Puzzle and IC.C$Puzzle
    ## t = 2.5026, df = 85.294, p-value = 0.01423
    ## alternative hypothesis: true difference in means is not equal to 0
    ## 95 percent confidence interval:
    ##  0.9687439 8.4561161
    ## sample estimates:
    ## mean of x mean of y
    ##  52.03158  47.31915

Part 3b)
Assuming the same significance level: alpha = 0.05

The code in this part returns count = 1987, and 1987/1000 = 1.987. That value cannot be a p-value, because a probability cannot exceed 1. The observed difference diff_orig is negative (-4.71), so the two conditions diff_orig <= diff_perm and diff_perm <= -diff_orig each count the replicates that are not beyond the observed difference in one tail, and almost every replicate is counted twice. The replicates that fall beyond ±4.71 therefore number 2(1000) - 1987 = 13 (assuming no permuted difference equals ±4.71 exactly), which gives a two-sided permutation p-value of about 13/1000 = 0.013.

The permutation p-value (about 0.013) is close to the p-value (0.014) in part (a). Since 0.013 < 0.05, I make the same conclusion as in (a): reject the null hypothesis.

    set.seed(15)
    IC.VC <- filter(IC.df, Flavor == "1" | Flavor == "2")
    PermsOut <- IC.VC %>%
      rep_sample_n(size = nrow(IC.VC), reps = 1000, replace = FALSE) %>%
      mutate(IC.VC_perm = sample(Puzzle)) %>%
      group_by(replicate, Flavor) %>%
      summarize(prop_IC.df_perm = mean(IC.VC_perm),
                mean_IC.df = mean(Puzzle)) %>%
      summarize(diff_perm = diff(prop_IC.df_perm),
                diff_orig = diff(mean_IC.df))

    ## `summarise()` has grouped output by 'replicate'. You can override using the
    ## `.groups` argument.

    PermsOut

    ## # A tibble: 1,000 × 3
    ##    replicate diff_perm diff_orig
    ##        <int>     <dbl>     <dbl>
    ##  1         1    2.95       -4.71
    ##  2         2   -0.896      -4.71
    ##  3         3    0.440      -4.71
    ##  4         4   -5.06       -4.71
    ##  5         5    0.0580     -4.71
    ##  6         6   -2.39       -4.71
    ##  7         7    0.662      -4.71
    ##  8         8    1.52       -4.71
    ##  9         9    1.30       -4.71
    ## 10        10   -5.51       -4.71
    ## # … with 990 more rows

    # As written, this counts replicates that are NOT beyond diff_orig in each tail,
    # because diff_orig is negative; a two-sided count would be
    # sum(abs(diff_perm) >= abs(diff_orig)).
    (countout <- PermsOut %>%
      summarize(count = sum(diff_orig <= diff_perm) +
                        sum(diff_perm <= -diff_orig)))

    ## # A tibble: 1 × 1
    ##   count
    ##   <int>
    ## 1  1987

    (pvalue1 <- 1987 / 1000)

    ## [1] 1.987

    origdiff <- PermsOut$diff_orig[1]
    p1 <- ggplot(data = PermsOut, aes(x = diff_perm)) +
      geom_histogram(bins = 13) +
      xlab("Permuted difference in mean puzzle score") +
      geom_vline(xintercept = origdiff, col="Red") +
      geom_vline(xintercept = abs(origdiff), col="Red")
    p1
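The two-sided permutation p-value in Part 3b can be sketched in base R as follows. This is a minimal sketch and not the homework's own code: the helper name `perm_test_mean_diff` and the simulated toy data are assumptions made for illustration. The count uses absolute values, so both tails are tallied regardless of the sign of the observed difference and the result can never exceed 1.

```r
# Two-sided permutation test for a difference in group means (base R only).
# `perm_test_mean_diff` and the toy data are hypothetical, for illustration.
perm_test_mean_diff <- function(y, g, reps = 1000) {
  g <- factor(g)
  stopifnot(nlevels(g) == 2, length(y) == length(g))
  # Observed difference: mean of the second level minus mean of the first level.
  obs <- unname(diff(tapply(y, g, mean)))
  # Shuffle the responses while keeping the group labels fixed,
  # then recompute the difference in means for each shuffle.
  perm <- replicate(reps, unname(diff(tapply(sample(y), g, mean))))
  # Count permuted differences at least as extreme as the observed one in either tail.
  count <- sum(abs(perm) >= abs(obs))
  list(obs = obs, count = count, p.value = count / reps)
}

set.seed(15)
toy_y <- c(rnorm(30, mean = 52, sd = 10), rnorm(30, mean = 47, sd = 10))
toy_g <- rep(c("1", "2"), each = 30)
res <- perm_test_mean_diff(toy_y, toy_g)
res$p.value  # a proportion, so it always lies between 0 and 1
```

Applied to the homework data, the call would take the Puzzle scores and the Flavor labels of the vanilla and chocolate groups as `y` and `g`.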
Part 3c)
I think the statistical test results from parts (a) and (b) do not have practical significance.

Problem 4

Part 4a)
p1: the proportion of low birth weight babies for mothers that smoked
p2: the proportion of low birth weight babies for mothers that did not smoke
Define the statistical hypotheses as:
H0: p1 = p2 vs. Ha: p1 > p2 (one-tailed test)
significance level: alpha = 0.05
test statistic: 1.639706
critical z-score: +1.645
by-hand p-value: 0.05053316
Since the test statistic z = 1.6397 < 1.645, we fail to reject the null hypothesis. Equivalently, since the p-value of 0.051 is greater than 0.05, we fail to reject the null hypothesis. There is insufficient evidence to conclude that the proportion of low birth weight babies is higher for mothers that smoked (p = 0.051).

    (BW.S <- BW.df %>% filter(Smoker == "1"))

    ##      ID Length BirthWt HeadCicr Gestation Smoker MAge MNumCig MHeight MPPWt
    ## 1   792     53    3.64       38        40      1   20       2     170    59
    ## 2  1388     51    3.14       33        41      1   22       7     160    53
    ## 3   575     50    2.78       30        37      1   19       7     165    60
    ## 4   569     50    2.51       35        39      1   22       7     159    52
    ## 5  1363     48    2.37       30        37      1   20       7     163    47
    ## 6   300     46    2.05       32        35      1   41       7     166    57
    ## 7   431     48    1.92       30        33      1   20       7     161    50
    ## 8  1764     58    4.57       39        41      1   32      12     173    70
    ## 9   532     53    3.59       34        40      1   31      12     163    49
    ## 10  752     49    3.32       36        40      1   27      12     152    48
    ## 11 1023     52       3       35        38      1   30      12     165    64
    ## 12   57     51    3.32       38        39      1   23      17     157    48
    ## 13 1522     50    2.74       33        39      1   21      17     156    53
    ## 14  223     50    3.87       33        45      1   28      25     163    54
    ## 15  272     52    3.86       36        39      1   30      25     170    78
    ## 16   27     53    3.55       37        41      1   37      25     161    66
    ## 17  365     52    3.53       37        40      1   26      25     170    62
    ## 18  619     52    3.41       33        39      1   23      25     181    69
    ## 19 1369     49    3.18       34        38      1   31      25     162    57
    ## 20 1262     53    3.19       34        41      1   27      35     163    51
    ## 21  516     47    2.66       33        35      1   20      35     170    57
    ## 22 1272     53    2.75       32        40      1   37      50     168    61
    ##    Fage FEdYrs FNumCig Fheight LowBWt MAgeGT35
    ## 1    24     12      12     185      0        0
    ## 2    24     16      12     176      0        0
    ## 3    20     14       0     183      0        0
    ## 4    23     14      25     200      1        0
    ## 5    20     10      35     185      1        0
    ## 6    37     14      25     173      1        1
    ## 7    20     10      35     180      1        0
    ## 8    38     14      25     180      0        0
    ## 9    41     12      50     191      0        0
    ## 10   37     12      25     170      0        0
    ## 11   38     14      50     180      0        0
    ## 12   32     12      25     169      0        0
    ## 13   24     12       7     179      0        0
    ## 14   30     16       0     183      0        0
    ## 15   40     16      50     178      0        0
    ## 16   46     16       0     175      0        1
    ## 17   30     10      25     181      0        0
    ## 18   23     16       2     181      0        0
    ## 19   32     16      50     194      0        0
    ## 20   31     16      25     185      0        0
    ## 21   23     12      50     186      1        0
    ## 22   31     16       0     173      0        1

    (BW.low <- BW.S %>% filter(LowBWt == "1"))

    ##     ID Length BirthWt HeadCicr Gestation Smoker MAge MNumCig MHeight MPPWt Fage
    ## 1  569     50    2.51       35        39      1   22       7     159    52   23
    ## 2 1363     48    2.37       30        37      1   20       7     163    47   20
    ## 3  300     46    2.05       32        35      1   41       7     166    57   37
    ## 4  431     48    1.92       30        33      1   20       7     161    50   20
    ## 5  516     47    2.66       33        35      1   20      35     170    57   23
    ##   FEdYrs FNumCig Fheight LowBWt MAgeGT35
    ## 1     14      25     200      1        0
    ## 2     10      35     185      1        0
    ## 3     14      25     173      1        1
    ## 4     10      35     180      1        0
    ## 5     12      50     186      1        0

    # Sample proportion of low birth weight babies among smokers (5 of 22)
    (S.LowBwt <- 5/22)
    ## [1] 0.2272727

    (BW.N <- BW.df %>% filter(Smoker == "0"))

    ##      ID Length BirthWt HeadCicr Gestation Smoker MAge MNumCig MHeight MPPWt
    ## 1  1360     56    4.55       34        44      0   20       0     162    57
    ## 2  1016     53    4.32       36        40      0   19       0     171    62
    ## 3   462     58     4.1       39        41      0   35       0     172    58
    ## 4  1187     53    4.07       38        44      0   20       0     174    68
    ## 5   553     54    3.94       37        42      0   24       0     175    66
    ## 6  1636     51    3.93       38        38      0   29       0     165    61
    ## 7   820     52    3.77       34        40      0   24       0     157    50
    ## 8  1191     53    3.65       33        42      0   21       0     165    61
    ## 9  1081     54    3.63       38        38      0   18       0     172    50
    ## 10  822     50    3.42       35        38      0   20       0     157    48
    ## 11 1683     53    3.35       33        41      0   27       0     164    62
    ## 12 1088     51    3.27       36        40      0   24       0     168    53
    ## 13 1107     52    3.23       36        38      0   31       0     164    57
    ## 14  755     53     3.2       33        41      0   21       0     155    55
    ## 15 1058     53    3.15       34        40      0   29       0     167    60
    ## 16  321     48    3.11       33        37      0   28       0     158    54
    ## 17  697     48    3.03       35        39      0   27       0     162    62
    ## 18  808     48    2.92       33        34      0   26       0     167    64
    ## 19 1600     53     2.9       34        39      0   19       0     165    57
    ## 20 1313     43    2.65       32        33      0   24       0     149    45
    ##    Fage FEdYrs FNumCig Fheight LowBWt MAgeGT35
    ## 1    23     10      35     179      0        0
    ## 2    19     12       0     183      0        0
    ## 3    31     16      25     185      0        1
    ## 4    26     14      25     189      0        0
    ## 5    30     12       0     184      0        0
    ## 6    31     16       0     180      0        0
    ## 7    31     16       0     173      0        0
    ## 8    21     10      25     185      0        0
    ## 9    20     12       7     172      0        0
    ## 10   22     14       0     179      0        0
    ## 11   37     14       0     170      0        0
    ## 12   29     16       0     181      0        0
    ## 13   35     16       0     183      0        0
    ## 14   25     14      25     183      0        0
    ## 15   30     16      25     182      0        0
    ## 16   39     10       0     171      0        0
    ## 17   27     14       0     178      0        0
    ## 18   25     12      25     175      0        0
    ## 19   23     14       2     193      0        0
    ## 20   26     16       0     169      1        0

    (BW.low1 <- BW.N %>% filter(LowBWt == "1"))

    ##     ID Length BirthWt HeadCicr Gestation Smoker MAge MNumCig MHeight MPPWt Fage
    ## 1 1313     43    2.65       32        33      0   24       0     149    45   26
    ##   FEdYrs FNumCig Fheight LowBWt MAgeGT35
    ## 1     16       0     169      1        0

    # Sample proportion of low birth weight babies among nonsmokers (1 of 20)
    (N.LowBwt <- 1/20)

    ## [1] 0.05
    (p1hat <- 0.2272727)

    ## [1] 0.2272727

    (p2hat <- 0.05)

    ## [1] 0.05

    # Pooled proportion under H0: p1 = p2
    (phat <- (5 + 1)/(22+20))

    ## [1] 0.1428571

    (qhat <- 1-phat)

    ## [1] 0.8571429

    # Two-proportion z statistic with the pooled standard error
    (z <- (p1hat-p2hat)/sqrt(phat*qhat*(1/22+1/20)))

    ## [1] 1.639706

    # One-tailed (upper) p-value
    (pvalue2 <- 1-pnorm(1.639706))

    ## [1] 0.05053316

Part 4b)
    prop.test(x=c(5,1), n=c(22,20), correct = FALSE, alternative = "greater")

    ## Warning in prop.test(x = c(5, 1), n = c(22, 20), correct = FALSE, alternative =
    ## "greater"): Chi-squared approximation may be incorrect

    ##
    ##  2-sample test for equality of proportions without continuity correction
    ##
    ## data:  c(5, 1) out of c(22, 20)
    ## X-squared = 2.6886, df = 1, p-value = 0.05053
    ## alternative hypothesis: greater
    ## 95 percent confidence interval:
    ##  0.009871231 1.000000000
    ## sample estimates:
    ##    prop 1    prop 2
    ## 0.2272727 0.0500000

    # The square root of X-squared recovers the by-hand z statistic
    (zscore <- sqrt(2.6886))

    ## [1] 1.639695

Part 4c)
I do not have confidence in the results of the tests from part (a) and part (b). R warns that the chi-squared approximation may be incorrect, which reflects the small counts in this sample (only 5 of 22 and 1 of 20 babies had low birth weight). Some studies show that mothers who smoke while pregnant are more likely to have lower birth weight babies, and in this sample the proportion of low birth weight babies is higher for mothers that smoked (0.23) than for mothers that did not smoke (0.05).

Part 4d)
In the context of this problem, a Type II error means that we fail to reject the null hypothesis (there is no difference between the proportions of low birth weight babies for mothers that are smokers and nonsmokers, p1 = p2) when the null hypothesis is actually false. I think this error has significant consequences: mothers might continue to smoke in the belief that it does not affect birth weight, and their babies would be more likely to have low birth weight.
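The by-hand calculation in Part 4a and the prop.test() result in Part 4b can be cross-checked with a short sketch. This assumes the counts reported in the output (5 of 22 smokers, 1 of 20 nonsmokers); the helper name `two_prop_z` is hypothetical. It uses the exact fractions rather than rounded proportions and shows that squaring z reproduces the X-squared statistic.

```r
# Pooled two-proportion z-test from counts, one-sided (Ha: p1 > p2).
# `two_prop_z` is a hypothetical helper written for this illustration.
two_prop_z <- function(x1, n1, x2, n2) {
  p1hat <- x1 / n1
  p2hat <- x2 / n2
  # Pooled proportion under H0: p1 = p2
  phat <- (x1 + x2) / (n1 + n2)
  se <- sqrt(phat * (1 - phat) * (1 / n1 + 1 / n2))
  z <- (p1hat - p2hat) / se
  list(z = z, chisq = z^2, p.value = pnorm(z, lower.tail = FALSE))
}

res4 <- two_prop_z(5, 22, 1, 20)
res4$z        # about 1.6397
res4$chisq    # about 2.6886, the X-squared value reported by prop.test()
res4$p.value  # about 0.0505
```

Because the uncorrected chi-squared statistic for a 2 × 2 table equals z squared, the two approaches must give the same one-sided p-value, which is why both report 0.0505.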
Problem 5

Part 5a)
mud: the population mean difference in serum cholesterol levels (corn flakes - oat bran)
Define the statistical hypotheses as:
H0: mud = 0 vs. Ha: mud > 0 (one-tailed test)
significance level: 5%
p-value: R reports the two-sided p-value of 0.024; because t = 2.61 is positive, in the direction of Ha, the one-sided p-value is half of that, about 0.012.
Since the p-value of 0.012 is less than 0.05, we reject the null hypothesis. There is sufficient evidence to conclude that a diet that includes oat bran decreases serum cholesterol (p = 0.012).

    C.df.C <- C.df %>% filter(Cereal == "Cornflk")
    C.df.O <- C.df %>% filter(Cereal == "OatBran")
    # The default alternative is two-sided
    t.test(C.df.C$Cholesterol, C.df.O$Cholesterol, paired = TRUE)

    ##
    ##  Paired t-test
    ##
    ## data:  C.df.C$Cholesterol and C.df.O$Cholesterol
    ## t = 2.6149, df = 11, p-value = 0.02405
    ## alternative hypothesis: true mean difference is not equal to 0
    ## 95 percent confidence interval:
    ##  0.05461324 0.63538676
    ## sample estimates:
    ## mean difference
    ##           0.345

Part 5b)
A 90% confidence interval (to match the one-sided test).
I chose this confidence level because part (a) uses a one-sided alpha of 5%: alpha = 1/2 (100% - CL%) = 1/2 (100% - 90%) = 5%.
(0.1081, 0.5819) mmol/L
We are 90% confident that the population mean difference in serum cholesterol (corn flakes minus oat bran) falls between 0.1081 and 0.5819 mmol/L. Zero is not in the interval, which agrees with rejecting the null hypothesis in part (a).

    t.test(x = C.df.C$Cholesterol, y = C.df.O$Cholesterol, paired=TRUE,
           conf.level = 0.90)$conf.int
    ## [1] 0.1080601 0.5819399
    ## attr(,"conf.level")
    ## [1] 0.9

Part 5c)
In the context of this problem, a Type I error means that we reject the null hypothesis (a diet that includes oat bran does not decrease serum cholesterol) when the null hypothesis is actually true. I think this error has significant consequences: people might rely on oat bran to maintain healthy cholesterol levels when it does not actually lower serum cholesterol.

Problem 6

Part 6a)
mu1: population mean (treatment)
mu2: population mean (placebo)
Hypotheses: H0: mu1 = mu2 vs. Ha: mu1 ≠ mu2 (two-tailed test)
significance level: alpha = 0.05
SE(pooled): 1.179924
df(pooled): 41
test statistic: t = -2.29
t-crit (two-tailed test): ±2.02
p-value: 0.03
Since the absolute value of the test statistic, |t| = 2.29, is greater than the critical t cutoff value of 2.02, we reject the null hypothesis. Equivalently, since the p-value of 0.03 is less than 0.05, we reject the null hypothesis. There is sufficient evidence that the treatment and placebo groups come from populations with different means (p = 0.03).

    # Pooled standard deviation
    (Sp <- sqrt(((17*(3.77)^2) + (24*(3.85)^2))/(18+25-2)))

    ## [1] 3.817033

    # Pooled standard error of the difference in means
    (SE2 <- Sp * sqrt(1/18 + 1/25))

    ## [1] 1.179924

    (t2 <- (22.5-25.2)/SE2)

    ## [1] -2.288284
    (df <- 18+25-2)

    ## [1] 41

    (tcrit2 <- qt(0.975, 41))

    ## [1] 2.019541

    # Two-tailed p-value; t2 is negative, so pt(t2, 41) is the lower-tail area
    (pvalue3 <- 2 * pt(t2, 41))

    ## [1] 0.02734739

Part 6b)
I agree with the equal-variance assumption made in part (a).
Rule of thumb for pooling variances: the ratio of the larger sample variance to the smaller is 1.04 < 3, so it is safe to assume equal variances and form a pooled SE.
The samples also have similar standard deviations (3.77 and 3.85). The larger sample (n = 25) produced the larger standard deviation (3.85).

    (ROT <- (3.85)^2 / (3.77)^2)

    ## [1] 1.042891
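The pooled two-sample calculation in Problem 6 can be collected into one function. This is a sketch under stated assumptions: the helper name `pooled_t` is hypothetical, and the inputs are the summary statistics that appear in the by-hand code (means 22.5 and 25.2, standard deviations 3.77 and 3.85, sample sizes 18 and 25). It also folds in the variance-ratio rule of thumb from Part 6b.

```r
# Pooled (equal-variance) two-sample t-test from summary statistics.
# `pooled_t` is a hypothetical helper written for this illustration.
pooled_t <- function(m1, s1, n1, m2, s2, n2, alpha = 0.05) {
  df <- n1 + n2 - 2
  # Pooled standard deviation and standard error of the difference in means
  sp <- sqrt(((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / df)
  se <- sp * sqrt(1 / n1 + 1 / n2)
  t <- (m1 - m2) / se
  list(
    sp = sp, se = se, t = t, df = df,
    tcrit = qt(1 - alpha / 2, df),
    # -abs(t) gives the correct two-tailed p-value whatever the sign of t
    p.value = 2 * pt(-abs(t), df),
    # Rule of thumb: larger variance over smaller variance should be below 3
    var.ratio = max(s1, s2)^2 / min(s1, s2)^2
  )
}

res6 <- pooled_t(22.5, 3.77, 18, 25.2, 3.85, 25)
res6$t          # about -2.2883
res6$p.value    # about 0.0273
res6$var.ratio  # about 1.0429
```

Using `-abs(t)` rather than `t` in the p-value is a small design choice: `2 * pt(t, df)` is only valid when the statistic is negative, as it happens to be here.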