Homework1Key

School: University of Michigan
Course: 522
Subject: Statistics
Date: Feb 20, 2024

Homework 1 Key
BIOSTAT 522

1-A. If a 99% confidence interval for μ_1 - μ_2 is (2.5, 3.5), which of the following conclusion(s) can be drawn based on this interval?
a. Reject H_0: μ_1 = μ_2 at α = 0.01 if the alternative is H_A: μ_1 ≠ μ_2.
b. Reject H_0: μ_1 = μ_2 at α = 0.01 if the alternative is H_A: μ_1 < μ_2.
c. Do not reject H_0: μ_1 = μ_2 at α = 0.05 if the alternative is H_A: μ_1 ≠ μ_2.
d. Do not reject H_0: μ_1 = μ_2 at α = 0.01 if the alternative is H_A: μ_1 ≠ μ_2.

1-B. Suppose that X̄_1 = 125.2 and X̄_2 = 125.4 are the mean systolic blood pressures for two samples of workers from different plants in the same industry. Suppose, further, that a test of H_0: μ_1 = μ_2 using these samples is rejected at α = 0.01. Which of the following conclusions is most reasonable?
a. There is a clinically meaningful difference in population means, but not a statistically significant difference.
b. The difference in population means is both statistically significant and clinically meaningful.
c. There is a statistically significant difference, but not a clinically meaningful difference, in population means.
d. The sample sizes used must have been quite small.
e. There is neither a statistically significant nor a clinically meaningful difference in population means.

1-C. Suppose that Y_i is the random variable representing weight, measured in pounds, and that Y_i ~ N(150, 10). Note that the notation is N(μ_Y, σ_Y), so 10 is the standard deviation, not the variance.
(a) What is the probability that an individual weighs at most 120 pounds?
P(Y_i ≤ 120) = P(Z ≤ (120 - 150)/10) = P(Z ≤ -3) = 0.0013
(b) What is the median weight, in kg? Note that 1 pound equals 0.45 kg.
The normal distribution is symmetric, so the median equals the mean: 150 pounds × 0.45 = 67.5 kg
(c) What is the probability that an individual weighs more than 80 kg?
P(Y_i > 80 kg) = P(Y_i > 177.78 pounds) = P(Z > 2.78) = 0.0027
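The reasoning in problems 1-A and 1-C can be checked numerically. The following is a minimal sketch in base R, not part of the original key. For 1-A it assumes the usual duality: the confidence interval and the two-sided test are built from the same standard error and reference distribution, so the test rejects at α = 0.01 exactly when 0 falls outside the 99% interval. All variable names are illustrative.

```r
# Problem 1-A: a 99% CI for mu_1 - mu_2 and the two-sided test at alpha = 0.01.
# Assumption: the interval and the test use the same standard error.
ci_99 <- c(lower = 2.5, upper = 3.5)            # interval given in the problem
zero_inside <- ci_99[["lower"]] <= 0 && 0 <= ci_99[["upper"]]
reject_two_sided_01 <- !zero_inside             # reject H_0 when 0 is outside the CI

# Problem 1-C: weight in pounds, mean 150 and standard deviation 10.
mu <- 150
sigma <- 10
lb_to_kg <- 0.45                                # conversion stated in the problem

p_at_most_120 <- pnorm(120, mean = mu, sd = sigma)
median_kg <- qnorm(0.5, mean = mu, sd = sigma) * lb_to_kg
p_over_80kg <- pnorm(80 / lb_to_kg, mean = mu, sd = sigma, lower.tail = FALSE)

print(reject_two_sided_01)
print(round(c(p_at_most_120, median_kg, p_over_80kg), 4))
```

Because the whole interval lies above 0, the same check also shows that a one-sided alternative μ_1 < μ_2 would not be supported. Software evaluates the tail probability at the unrounded z value (about 2.778), so the last digit can differ slightly from a z-table lookup.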

2-A. Obtain 95% and 98% confidence intervals for the mean age of the women who smoke during pregnancy.

2-B. Carry out a hypothesis test of whether the mean age equals 40. Write the null and alternative hypotheses, identify the appropriate test, report the test statistic, degrees of freedom, and p-value, and draw a conclusion.
H_0: μ_Age = 40
H_A: μ_Age ≠ 40
Test: one-sample t-test
t-statistic = -10.74
df = 19
p-value < 0.0001
Conclusion: We reject the null hypothesis and conclude that the mean age differs from 40.
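
The quantities reported in problem 2 can be reproduced by hand. The sketch below, in base R, recomputes the two-sided p-value from the reported t-statistic and degrees of freedom, and shows one way to build a t-based confidence interval. The helper `t_ci` and the vector `ages` are hypothetical: the ages are made-up values for illustration only and are not the smoke.csv data.

```r
# Two-sided p-value from the reported statistic (t = -10.74, df = 19).
t_stat <- -10.74
df <- 19
p_value <- 2 * pt(-abs(t_stat), df = df)

# Hypothetical helper (not from the original key): t-based CI for a mean.
t_ci <- function(x, level = 0.95) {
  x <- x[!is.na(x)]                               # drop missing values
  n <- length(x)
  half_width <- qt(1 - (1 - level) / 2, df = n - 1) * sd(x) / sqrt(n)
  c(lower = mean(x) - half_width, upper = mean(x) + half_width)
}

# Made-up ages, for illustration only; NOT the smoke.csv data.
ages <- c(24, 31, 28, 35, 22, 29, 33, 27, 30, 26)
ci_95 <- t_ci(ages, level = 0.95)
ci_98 <- t_ci(ages, level = 0.98)

print(p_value)
print(ci_95)
print(ci_98)
```

The 98% interval is wider than the 95% interval because a higher confidence level requires a larger t quantile.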

3-A. [Univariate Analysis] Find appropriate descriptive statistics for pain and anxiety, and describe what you find for each. Obtaining the mean and standard deviation is sufficient, but feel free to use graphs and other descriptive statistics if you want to practice using SAS/R.

3-B. [Bivariate Analysis: Graphical] The goal here is to assess whether anxiety is predictive of pain. Assess the relationship graphically by first selecting the appropriate Y and X variables. Describe in one sentence what you see in the graph, and submit the graph.
Based on the scatter plot, there seems to be a weak-to-moderate, positive, linear relationship between pain and anxiety.

3-C. [Bivariate Analysis: Model] Fit a regression model corresponding to the study goal. Write the fitted model and interpret the estimates of the intercept (β_0) and slope (β_1).
Model: predicted Pain_i = 2.696 + 0.485 × Anxiety_i
Interpretations:
Intercept: The estimated average pain score is 2.696 when the anxiety score is 0.
Slope: For every one-unit increase in anxiety score, the estimated average pain score increases by 0.485.

3-D. Find the N and % for the following questions. How many non-missing pain observations do you have in this dataset? How about those with non-missing anxiety data? And how many patients provided data for the plot described in problem 3-B? Lastly, how many observations are used in the regression model in problem 3-C?
Non-missing PAIN data: N = 206 (78.9%)
Non-missing ANXIETY data: N = 202 (77.4%)
Observations used in SCATTER PLOT and REGRESSION MODEL: N = 199 (76.2%)

SAS Code

proc import datafile = "C:\Users\amwallpa\Downloads\smoke.csv" /*Or wherever you saved the file*/
    out = smoke dbms = csv replace;
    getnames = yes;
run;

*95% confidence limits for the mean of A (age);
proc means data = smoke alpha = 0.05 mean clm;
    var A;
run;

*98% confidence limits for the mean of A (age);
proc means data = smoke alpha = 0.02 mean clm;
    var A;
run;

*one-sample t-test of mean age = 40;
proc ttest data = smoke h0 = 40;
    var A;
run;

libname homework 'C:\Users\amwallpa\Downloads';
DATA tmp;
    set homework.surgery;
RUN;

**3a);
PROC MEANS data = tmp n nmiss mean std;
    var pain anxiety;
    **descriptive statistics;
    **including number of missing values;
RUN;

ods select histogram; **keeps only the histograms and suppresses the tables;
PROC UNIVARIATE data = tmp noprint;
    var pain anxiety;
    histogram / normal;
    **shows the distribution of the variable;
    **i.e., whether or not it is normal or skewed;
RUN;

**3b);
PROC SGPLOT data = tmp;
    scatter x = anxiety y = pain;
    title 'Scatterplot of Pain vs. Anxiety';
RUN;

**3c);
PROC REG data = tmp;
    model pain = anxiety / clb;
    title; *clears the scatterplot title;
RUN;

R Code

# reading in the data
smoke = read.csv("~/Downloads/smoke.csv", header=TRUE)

# 95% CI for mean age (n = 20 observations, so df = 19)
mean(smoke$A) - qt(1-0.05/2, 19)*sd(smoke$A)/sqrt(20)
mean(smoke$A) + qt(1-0.05/2, 19)*sd(smoke$A)/sqrt(20)

# 98% CI for mean age
mean(smoke$A) - qt(1-0.02/2, 19)*sd(smoke$A)/sqrt(20)
mean(smoke$A) + qt(1-0.02/2, 19)*sd(smoke$A)/sqrt(20)

# t-test
t.test(smoke$A, mu=40)

# reading in the data
surgery = read.csv("~/Downloads/surgery.txt", header=TRUE)

# Question 3a
## PAIN ##
sum(complete.cases(surgery$pain))   # N
mean(surgery$pain, na.rm=TRUE)      # mean
sd(surgery$pain, na.rm=TRUE)        # sd

# Histogram with normal curve
x = surgery$pain[complete.cases(surgery$pain)]
h = hist(x, col="gray", right=FALSE, xlab="Pain", main="Histogram for Pain")
xfit = seq(min(x), max(x), length=40)
yfit = dnorm(xfit, mean=mean(x), sd=sd(x))
yfit = yfit*diff(h$mids[1:2])*length(x)   # rescale density to the count scale
lines(xfit, yfit, col="blue", lwd=2)

## ANXIETY ##
sum(complete.cases(surgery$anxiety))   # N
mean(surgery$anxiety, na.rm=TRUE)      # mean
sd(surgery$anxiety, na.rm=TRUE)        # sd

# Histogram with normal curve
x = surgery$anxiety[complete.cases(surgery$anxiety)]
h = hist(x, col="gray", right=FALSE, xlab="Anxiety", main="Histogram for Anxiety")
xfit = seq(min(x), max(x), length=40)
yfit = dnorm(xfit, mean=mean(x), sd=sd(x))
yfit = yfit*diff(h$mids[1:2])*length(x)   # rescale density to the count scale
lines(xfit, yfit, col="blue", lwd=2)

# Question 3b
# Scatterplot of Pain vs. Anxiety
plot(x=surgery$anxiety, y=surgery$pain, main='Scatterplot of Pain vs. Anxiety',
     xlab='Anxiety', ylab='Pain')

# Question 3c
# Model Y=pain and X=anxiety
summary(lm(pain ~ anxiety, data=surgery))

# Question 3d
# Number non-missing pain data
sum(complete.cases(surgery$pain))
# Number non-missing anxiety data
sum(complete.cases(surgery$anxiety))
# Obs used in scatter plot and regression model
sum(complete.cases(cbind(surgery$pain, surgery$anxiety)))
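
The counts in problem 3-D differ because the scatter plot and the regression can only use rows where both pain and anxiety are observed. The sketch below illustrates this with a small made-up data frame; `toy` and the helper `pct` are hypothetical names, and the values are NOT the surgery data. It assumes R's default `na.action` setting (`na.omit`), under which `lm()` silently drops incomplete rows.

```r
# Made-up data with missing values, for illustration only; NOT the surgery data.
toy <- data.frame(
  anxiety = c(1, 2, NA, 4, 5, 6, 7, NA),
  pain    = c(3.1, NA, 4.0, 4.6, 5.2, 5.4, NA, 6.5)
)

n_pain    <- sum(!is.na(toy$pain))       # non-missing pain
n_anxiety <- sum(!is.na(toy$anxiety))    # non-missing anxiety
n_both    <- sum(complete.cases(toy[, c("pain", "anxiety")]))  # both observed

# Assumption: default na.action (na.omit), so lm() uses complete cases only.
fit <- lm(pain ~ anxiety, data = toy)
n_model <- nobs(fit)                     # rows actually used in the fit

# Hypothetical helper: percentage of all rows in the data frame.
pct <- function(n) round(100 * n / nrow(toy), 1)

print(c(pain = n_pain, anxiety = n_anxiety, both = n_both, model = n_model))
print(c(pain = pct(n_pain), anxiety = pct(n_anxiety), both = pct(n_both)))
print(coef(fit))                         # intercept and slope estimates
```

The number of rows used by the model matches the complete-case count, which is why the key reports a single N for both the scatter plot and the regression.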