Review exercises_Final Exam_11_30_2022
THE UNIVERSITY OF TEXAS HEALTH SCIENCE CENTER AT HOUSTON SCHOOL OF PUBLIC HEALTH
PH1700 – INTERMEDIATE BIOSTATISTICS
EXERCISES IN PREPARATION FOR FINAL EXAM
Please read the problems carefully and identify what question is being asked. Try to identify the variable of interest, possible hypotheses for the question, and which test statistic you would need to evaluate the hypothesis. Write down assumptions you need to investigate prior to choosing the appropriate test. We recommend you use the flow chart in your textbook!!!
Focus on developing a strategy for solving the problems. If time allows, you can solve the problems now; otherwise, please solve them later and bring your answers and questions to the lecture on December 5, 2022. The following is a list of potential test statistics that you should use in the problems. Distribution tables are provided as an appendix.
List A
A. Contingency Chi-square Test / Chi-square Test of Heterogeneity / Chi-square Test of Independence / Test for RxC Tables
B. Analysis of Variance (ANOVA)
C. Kruskal-Wallis test
D. One sample t-test
E. Pearson correlation coefficient
F. Spearman correlation coefficient
G. Linear regression
H. Regression diagnosis
I. Logistic regression
J. Kaplan Meier Survival analysis
K. Odds ratio
L. Relative Risk
M. Fisher Exact Test
N. Test for homogeneity of variance
O. McNemar’s Test
P. Test for difference of proportions
Q. Log rank test
R. Wald test
S. Cox proportional hazard model(s)
T. Coefficient of determination
U. Multiple comparison using the Dunn Test
1. Suppose we wish to investigate the familial aggregation of respiratory disease according to the specific type of respiratory disease. 100 families in which the head of household or the spouse has asthma, referred to as type A families, and 200 families in which either the head of household or the spouse has non-asthmatic pulmonary disease, but neither has asthma, referred to as type B families, are identified. Suppose that in 15 of the type A families the first-born child has asthma, whereas in 3 other type A families the first-born child has some non-asthmatic respiratory disease. Furthermore, in 4 of the type B households the first-born child has asthma, whereas in 2 other type B households the first-born child has some non-asthmatic respiratory disease. Test the hypothesis $H_0: P_A = P_B$ versus $H_1: P_A \neq P_B$, where
$P_A$ = Pr(first-born child has asthma in a type A family)
$P_B$ = Pr(first-born child has asthma in a type B family)
Choose the appropriate test from List A (first page).
2. Construct the 2x2 table and compute the expected values in problem 1. Which one of the outputs below will you use after checking the assumptions regarding problem 1? [Note: Please compute the p-value by hand; assume we will give you copies of the distribution tables from the textbook.]
3. Compute the prevalence of non-asthmatic respiratory disease in the two types of families. State all hypotheses being tested. What would you conclude from this STATA output?
Output A  OUTPUT B  OUTPUT C

    . tabi 3 97\ 2 198, chi2 expected exact

    Key: frequency
         expected frequency

               |        col
           row |       1        2 |   Total
             1 |       3       97 |     100
               |     1.7     98.3 |   100.0
             2 |       2      198 |     200
               |     3.3    196.7 |   200.0
         Total |       5      295 |     300
               |     5.0    295.0 |   300.0

    Pearson chi2(1) = 1.6271   Pr = 0.202
    Fisher's exact = 0.338
    1-sided Fisher's exact = 0.208

4. The authors analyzed the results of their study, data below, for whether there was a difference in the proportion of ill between the workers and non-workers by using a chi-square statistic. Is this method of analysis reasonable for this table? If not, suggest an alternative from List A (first page).

                     Health Status
    Work Status      Ill    Well    Total
    Worked            10      12       22
    Did not work       2      26       28
    Total             12      38       50

The following outputs, some correct and some incorrect, are provided for your convenience and to prepare you for the exam. Please select the correct outputs to answer the following questions.
Output A. Output B.

    . tab workstatus healthstatus, row

    Key: frequency
         row percentage

               |    healthstatus
    workstatus |       0        1 |   Total
             0 |      10       12 |      22
               |   45.45    54.55 |  100.00
             1 |       2       26 |      28
               |    7.14    92.86 |  100.00
         Total |      12       38 |      50
               |   24.00    76.00 |  100.00

    . ttest healthstatus == 0

    One-sample t test
    Variable |  Obs   Mean   Std. Err.   Std. Dev.   [95% Conf. Interval]
    health~s |   50    .76   .0610119    .4314191    .637392    .882608
    mean = mean(healthstatus)                        t = 12.4566
    Ho: mean = 0                    degrees of freedom = 49
    Ha: mean < 0           Ha: mean != 0           Ha: mean > 0
    Pr(T < t) = 1.0000   Pr(|T| > |t|) = 0.0000   Pr(T > t) = 0.0000
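The choice between a chi-square test and Fisher's exact test in problems 1 to 4 rests on the expected cell counts. The following is a minimal sketch, assuming the usual rule of thumb that every expected count should be at least 5; the function names are illustrative and not part of the course material. It uses the two tables entered in the `tabi` commands that appear in these exercises.

```python
def expected_counts(table):
    """Expected cell counts under independence for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def pearson_chi2(table):
    """Uncorrected Pearson chi-square statistic: sum of (O - E)^2 / E."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


def min_expected(table):
    """Smallest expected count, to compare against the rule-of-five threshold."""
    return min(min(row) for row in expected_counts(table))


# Tables as entered in the Stata commands "tabi 3 97\ 2 198" and "tabi 10 12\2 26".
problem3 = [[3, 97], [2, 198]]
problem4 = [[10, 12], [2, 26]]

for name, table in [("problem 3", problem3), ("problem 4", problem4)]:
    print(name,
          "min expected = %.2f" % min_expected(table),
          "Pearson chi2 = %.4f" % pearson_chi2(table))
```

Under that assumed threshold, the problem 3 table has an expected count below 5, which points toward Fisher's exact test, while the problem 4 table does not.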
Output C. Output D. Output E.
5. The hypothesis for problem 4 is $H_0: p_1 = p_2$; $H_1: p_1 \neq p_2$, where $p_1$ = proportion of people who were ill in the work group and $p_2$ = proportion of people who were ill in the “did not work” group. Given a chi-square statistic of 7.92 and using the appropriate appendix in your textbook, please estimate the p-value for this test statistic by hand. What conclusion will you reach?

Percentiles $\chi^2_{d,p}$ for $d$ degrees of freedom:

    p        d = 49    d = 4    d = 1
    0.005     27.25     0.21     0.00
    0.01      28.94     0.30     0.00
    0.025     31.55     0.48     0.00
    0.05      33.93     0.71     0.00
    0.95      66.34     9.49     3.84
    0.975     70.22    11.14     5.02
    0.99      74.92    13.28     6.63
    0.995     78.23    14.86     7.88
    0.999     85.35    18.47    10.83

    . swilk healthstatus

    Shapiro-Wilk W test for normal data
        Variable |  Obs        W        V        z    Prob>z
    healthstatus |   50  0.93567    3.025    2.361   0.00912

    . tabi 10 12\2 26, chi2 expected exact

    Key: frequency
         expected frequency

               |        col
           row |       1        2 |   Total
             1 |      10       12 |      22
               |     5.3     16.7 |    22.0
             2 |       2       26 |      28
               |     6.7     21.3 |    28.0
         Total |      12       38 |      50
               |    12.0     38.0 |    50.0

    Pearson chi2(1) = 9.9140   Pr = 0.002
    Fisher's exact = 0.002
    1-sided Fisher's exact = 0.002

    . prtesti 50 .45 .07

    One-sample test of proportion          x: Number of obs = 50
    Variable |  Mean   Std. Err.   [95% Conf. Interval]
           x |   .45   .0703562    .3121043   .5878957
    p = proportion(x)                                z = 10.5312
    Ho: p = 0.07
    Ha: p < 0.07           Ha: p != 0.07           Ha: p > 0.07
    Pr(Z < z) = 1.0000   Pr(|Z| > |z|) = 0.0000   Pr(Z > z) = 0.0000
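The statistic of 7.92 quoted in problem 5 is consistent with a continuity-corrected (Yates) chi-square on the problem 4 table, whereas the `tabi` output reports the uncorrected Pearson value. The sketch below, with illustrative function names, computes both and the 1-df upper-tail probability, using the identity that a chi-square variable with 1 df is a squared standard normal. It is a check on the hand calculation, not a replacement for reading the table.

```python
import math


def chi2_2x2(a, b, c, d, yates=False):
    """Chi-square statistic for the 2x2 table [[a, b], [c, d]].

    With yates=True, n/2 is subtracted from |ad - bc| (continuity correction).
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(diff - n / 2, 0.0)
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def chi2_upper_tail_1df(x):
    """P(chi-square with 1 df > x), computed as P(|Z| > sqrt(x))."""
    return math.erfc(math.sqrt(x / 2))


uncorrected = chi2_2x2(10, 12, 2, 26)            # about 9.91, as in the Stata output
corrected = chi2_2x2(10, 12, 2, 26, yates=True)  # about 7.92, as quoted in problem 5
p_value = chi2_upper_tail_1df(corrected)

print("uncorrected = %.4f, corrected = %.4f, p = %.4f"
      % (uncorrected, corrected, p_value))
```

By hand, 7.92 falls between the tabulated 1-df percentiles 7.88 (p = 0.995) and 10.83 (p = 0.999), so the p-value lies between 0.001 and 0.005, which agrees with the computed value.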
6. Student’s t test with 40 df was used to analyze the results in the table below. Suppose we want to test $H_0: p_1 = p_2$ versus $H_1: p_1 \neq p_2$, where $p_1$ = proportion of hepatitis A in the “ate salad” group and $p_2$ = proportion of hepatitis A in the “did not eat salad” group. Do you agree that a t-test is the correct test? Why or why not? Suggest an alternative test from the list and conduct the test. Which of the following output(s) help you to determine if there is a difference in the proportion of illness between those who ate salad and those who did not eat salad?

                 Health status
    Salad     Ill    Well    Total
    No          3       6        9
    Yes        25       8       33
    Total      28      14       42

Output A. Output B. Output C.

    . tab salad health, row

    Key: frequency
         row percentage

               |      health
         salad |       0        1 |   Total
             0 |       3        6 |       9
               |   33.33    66.67 |  100.00
             1 |      25        8 |      33
               |   75.76    24.24 |  100.00
         Total |      28       14 |      42
               |   66.67    33.33 |  100.00
Output D. Output E. Output F.

    . prtesti 42 .67 .24

    One-sample test of proportion          x: Number of obs = 42
    Variable |  Mean   Std. Err.   [95% Conf. Interval]
           x |   .67   .0725554    .527794   .812206
    p = proportion(x)                                z = 6.5250
    Ho: p = 0.24
    Ha: p < 0.24           Ha: p != 0.24           Ha: p > 0.24
    Pr(Z < z) = 1.0000   Pr(|Z| > |z|) = 0.0000   Pr(Z > z) = 0.0000

    . ttest health==0

    One-sample t test
    Variable |  Obs       Mean   Std. Err.   Std. Dev.   [95% Conf. Interval]
      health |   42   .3333333    .073621    .4771187    .1846527   .482014
    mean = mean(health)                              t = 4.5277
    Ho: mean = 0                    degrees of freedom = 41
    Ha: mean < 0           Ha: mean != 0           Ha: mean > 0
    Pr(T < t) = 1.0000   Pr(|T| > |t|) = 0.0001   Pr(T > t) = 0.0000
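In the salad table of problem 6, the "No salad, Well" cell has an expected count of 9 × 14 / 42 = 3, so Fisher's exact test is a natural candidate from List A. The sketch below computes it directly from the hypergeometric distribution. The function name is illustrative, and the two-sided p-value uses the convention of summing all tables no more probable than the observed one, which is an assumption about the convention wanted here.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided and lower-tail Fisher exact p-values for [[a, b], [c, d]].

    The first cell follows a hypergeometric distribution once the margins
    are fixed. Two-sided p: total probability of all tables that are no
    more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        return comb(row1, k) * comb(row2, col1 - k) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    two_sided = sum(prob(k) for k in range(lo, hi + 1)
                    if prob(k) <= p_obs * (1 + 1e-9))
    lower_tail = sum(prob(k) for k in range(lo, a + 1))
    return two_sided, lower_tail


# Rows: did not eat salad, ate salad. Columns: ill, well.
two_sided, lower_tail = fisher_exact_2x2(3, 6, 25, 8)
print("two-sided p = %.4f, one-sided p = %.4f" % (two_sided, lower_tail))
```

The two-sided value is just below 0.05, so the conclusion at the 5% level is sensitive to the choice of test and should be stated together with the test used.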
7. The authors of the data from problem 6 claim that there is a significant difference between the proportion with hepatitis A among those who did and did not eat salad. Does their claim match the conclusions reached using the following STATA output?
8. The data in the table below are for 27 patients with acute dilated cardiomyopathy. LVEF indicates left ventricular ejection fraction. Researchers are interested in evaluating the linear association between age and LVEF. Regress Age on left ventricular ejection fraction and check any assumptions associated with linear regression. Please answer the questions below, which will help you with the thinking process.
8.1. What will be the parameter of interest? What is your dependent and what is your independent variable if you are interested in the effect of age on LVEF?
8.2. Which of the outputs helps you to answer the question?
8.3. Name two methods for determining the linear association of age and LVEF.
8.4. What is the regression equation evaluating the effect of age on LVEF?
8.5. What would be our estimate for the average LVEF for a 45-year-old patient with this condition using this regression line?
8.6. Name two methods for determining the linear association of age and LVEF.
8.7. What is your conclusion about the F-statistic? Report the p-value by hand. [Note: Please compute the p-value by hand; assume we will give you copies of the distribution tables from the textbook.]
8.8. What is your conclusion about the slope and intercept of the line using the t-statistic? Report the p-value by hand. [Note: Please compute the p-value by hand; assume we will give you copies of the distribution tables from the textbook.]
8.9. What are the standard errors of the slope and intercept for the regression line?
8.10. Identify the 95% confidence interval for the slope.
8.11. Identify the 95% confidence interval for the intercept.
8.12. What is the coefficient of determination ($R^2$) for the regression line in Exercise 8.4?
8.13. If we want to evaluate the linear association of age and LVEF, would you recommend Pearson or Spearman?
The following outputs are for your convenience:
Output A. Output B. Output C. Output D.

    . swilk LVEF

    Shapiro-Wilk W test for normal data
    Variable |  Obs        W        V        z    Prob>z
        LVEF |   27  0.96943    0.899   -0.220   0.58694

    . swilk age

    Shapiro-Wilk W test for normal data
    Variable |  Obs        W        V        z    Prob>z
         age |   27  0.93805    1.821    1.231   0.10907

[Figures: two plots, one with horizontal axis age (20 to 70) and one with horizontal axis lvef (0 to .4)]
Output E. Output F. Output G.

    . summarize age, det

                             age
          Percentiles    Smallest
     1%        19            19
     5%        23            23
    10%        23            23       Obs                 27
    25%        26            23       Sum of Wgt.         27
    50%        42                     Mean          42.11111
                          Largest     Std. Dev.      16.2536
    75%        56            63
    90%        65            65       Variance      264.1795
    95%        65            65       Skewness      .2906468
    99%        75            75       Kurtosis      1.815696

    . regress lvef age

      Source |         SS    df          MS      Number of obs =      27
                                                 F(1, 25)      =    1.55
       Model | .009688829     1  .009688829      Prob > F      =  0.2248
    Residual | .156363028    25  .006254521      R-squared     =  0.0583
                                                 Adj R-squared =  0.0207
       Total | .166051857    26   .00638661      Root MSE      =  .07909

        lvef |     Coef.   Std. Err.      t    P>|t|    [95% Conf. Interval]
         age |  .0011877   .0009542    1.24    0.225    -.0007776    .003153
       _cons |  .1740596   .0429702    4.05    0.000     .0855608   .2625583
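Several parts of problem 8 can be read off the `regress lvef age` output with a little arithmetic. The sketch below copies the coefficients and sums of squares from that output. The variable and function names are illustrative, and it assumes this output (LVEF as the dependent variable) is the one relevant to the effect of age on LVEF.

```python
# Values copied from the "regress lvef age" output.
intercept = 0.1740596
slope_age = 0.0011877
se_slope = 0.0009542
ss_model = 0.009688829
ss_total = 0.166051857


def predict_lvef(age):
    """Fitted mean LVEF at a given age from the estimated regression line."""
    return intercept + slope_age * age


pred_45 = predict_lvef(45)        # estimated average LVEF for a 45-year-old
r_squared = ss_model / ss_total   # coefficient of determination
t_slope = slope_age / se_slope    # t statistic for the slope
f_stat = t_slope ** 2             # in simple regression, F(1, n-2) = t^2

print("predicted LVEF at 45 = %.4f" % pred_45)
print("R-squared = %.4f, t = %.2f, F = %.2f" % (r_squared, t_slope, f_stat))
```

The identity F = t² holds only with a single predictor; it explains why the slope's p-value (0.225) and the overall F-test p-value (0.2248) agree in this output.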
Output H. Output I. Output J.

    . regress age LVEF

      Source |         SS    df          MS      Number of obs =      27
                                                 F(1, 25)      =    1.55
       Model | 400.774433     1  400.774433      Prob > F      =  0.2248
    Residual | 6467.89223    25  258.715689      R-squared     =  0.0583
                                                 Adj R-squared =  0.0207
       Total | 6868.66667    26  264.179487      Root MSE      =  16.085

         age |     Coef.   Std. Err.      t    P>|t|    [95% Conf. Interval]
        LVEF |  49.12789   39.47203    1.24    0.225    -32.16628   130.4221
       _cons |  31.10282   9.370703    3.32    0.003      11.8035   50.40215

9. Suppose that 20 married couples, each in the age group 25-34, have their systolic blood pressures taken, with the data listed below. What statistic from list A would be useful for determining whether male and female SBP are related?

    Couple   Male   Female      Couple   Male   Female
       1      136     110         11      156     135
       2      121     112         12       98     115
       3      128     128         13      132     125
       4      100     106         14      142     130
       5      110     127         15      138     132
       6      116     100         16      126     146
       7      127      98         17      124     127
       8      150     142         18      137     128
       9      180     143         19      160     135
      10      172     150         20      125     110
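As a check on the couples data in problem 9, the Pearson correlation and its t statistic with n − 2 degrees of freedom can be computed directly. This is a minimal sketch with illustrative names; the t-test is appropriate only if the normality assumption behind the Pearson coefficient is judged reasonable, which is part of what the exercise asks you to decide.

```python
import math

# Systolic blood pressure for couples 1 to 20, in the order listed in the table.
male = [136, 121, 128, 100, 110, 116, 127, 150, 180, 172,
        156, 98, 132, 142, 138, 126, 124, 137, 160, 125]
female = [110, 112, 128, 106, 127, 100, 98, 142, 143, 150,
          135, 115, 125, 130, 132, 146, 127, 128, 135, 110]


def pearson_r(x, y):
    """Sample Pearson correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_for_correlation(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


r = pearson_r(male, female)
t_stat = t_for_correlation(r, len(male))
print("r = %.4f, t(%d) = %.3f" % (r, len(male) - 2, t_stat))
```

The same t formula applied to a Spearman coefficient gives the large-sample test for rank correlation, which is how a rho of 0.7169 with 20 pairs leads to a t of about 4.36.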
10. If this were your STATA output, what would you conclude?
Output A.

    . pwcorr Male Female, sig obs

                 |     Male   Female
            Male |   1.0000
                 |       20
          Female |   0.6807   1.0000
                 |   0.0010
                 |       20       20

11. Which is the appropriate test for the correlation coefficient?
Output B. Output C.

    spearman2 male female
    Number of obs = 20
    Spearman's rho = 0.7169 ( 0.1643 )
    Test of Ho: male and female are independent
    t( 18 ) = 4.3623
    Prob > |t| = 0.0004

12. Assess whether there is any overall difference in mean fluorescence level by retinopathy grade. Assume that the variable is normal. What test would you use, from list A?

    Retinopathy grade    n    Mean ± sd      Nephropathy grade    n    Mean ± sd
            0           11    447 ± 17               0           28    487 ± 24
            1           16    493 ± 30               1            6    481 ± 16
            2           14    551 ± 35               2            7    567 ± 24

13. A hypothesis has been suggested that a principal benefit of physical activity is to prevent sudden death from heart attack. The following study was designed to test this hypothesis: 100 men who died from a first heart attack and 100 men who survived a first heart attack in the age group 50-59 were identified and their wives were each given a detailed questionnaire concerning their husbands’ physical activity in the year preceding their heart attacks. The men were then classified as active or inactive. Suppose that 30 of the 100 who survived and 10 of the 100 who died were physically active. Construct the 2x2 table.
Output A Output B Output C

    . csi 30 70 10 90, or woolf

                     |   Exposed   Unexposed |   Total
               Cases |        30          70 |     100
            Noncases |        10          90 |     100
               Total |        40         160 |     200
                Risk |       .75       .4375 |      .5

                     |  Point estimate  |  [95% Conf. Interval]
     Risk difference |     .3125        |  .1578542   .4671458
          Risk ratio |   1.714286       |  1.334072   2.202862
     Attr. frac. ex. |   .4166667       |   .250415   .5460451
     Attr. frac. pop |      .125        |
          Odds ratio |   3.857143       |  1.766603    8.42156  (Woolf)

    chi2(1) = 12.50   Pr>chi2 = 0.0004
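The Woolf interval reported in the `csi` output can be reproduced by hand: take the log of the odds ratio, add and subtract 1.96 standard errors, and exponentiate. The sketch below uses the four cell counts as entered in that command; the function name is illustrative, and 1.96 is the rounded normal quantile, so the last digits differ slightly from Stata's.

```python
import math


def odds_ratio_woolf(a, b, c, d, z=1.96):
    """Odds ratio ad/bc with a Woolf (log-based) confidence interval.

    SE of ln(OR) is sqrt(1/a + 1/b + 1/c + 1/d).
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Cell counts in the order used by "csi 30 70 10 90".
odds_ratio, lower, upper = odds_ratio_woolf(30, 70, 10, 90)
print("OR = %.4f, 95%% CI = (%.4f, %.4f)" % (odds_ratio, lower, upper))
```

Whether this cell layout matches the 2x2 table the problem intends is part of the exercise; swapping which group is treated as the cases inverts neither the counts nor the odds ratio here, but it does change the risk-based measures.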
14. Which STATA output is correct for problem 13? What would you conclude about the Odds Ratio?
15. What formula would you use to compute a 95% CI for the odds ratio for problem 14?
16. An issue of current interest is the effect of delayed childbearing on pregnancy outcome. In a recent paper a population of first deliveries was assessed for low-birthweight deliveries (<2500g) according to the woman’s age and prior pregnancy history. Estimate the odds ratio relating age to the prevalence of low-birthweight deliveries, for both pregnancy history and no history of pregnancy groups. Report both 2x2 tables and both odds ratios and their 95% CI. Below are some outputs that may help you to answer this question. Please select the correct test from the list and the outputs below.

    Age     History     n     Percentage low birthweight
    ≥ 30    No         225    3.56
    ≥ 30    Yes         88    6.82
    < 30    No         906    3.31
    < 30    Yes        153    1.31

OUTPUT A:
OUTPUT B: OUTPUT C: OUTPUT D:

    . mhodds age bw, by(history)

    Maximum likelihood estimate of the odds ratio
    Comparing bw==1 vs. bw==0
    by history

    history |  Odds Ratio    chi2(1)    P>chi2    [95% Conf. Interval]
          0 |    5.524390       5.27    0.0218     1.06708    28.60031
          1 |    1.076498       0.03    0.8556     0.48647     2.38215

    Mantel-Haenszel estimate controlling for history

    Odds Ratio    chi2(1)    P>chi2    [95% Conf. Interval]
      1.546753       1.62    0.2035    0.785529    3.045648
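The 2x2 tables for problem 16 have to be rebuilt from the group sizes and percentages before any odds ratio can be computed. The following sketch does that reconstruction, assuming the percentages are rounded values of whole-number counts; the dictionary layout and function names are illustrative.

```python
# (n, percentage low birthweight) by pregnancy history and age group, from the table.
groups = {
    "No": {">=30": (225, 3.56), "<30": (906, 3.31)},
    "Yes": {">=30": (88, 6.82), "<30": (153, 1.31)},
}


def low_bw_count(n, percent):
    """Recover the whole-number count of low-birthweight deliveries."""
    return round(n * percent / 100)


def stratum_table_and_or(history):
    """Return (a, b, c, d) and the odds ratio for age >=30 versus <30.

    a, b: low / normal birthweight among women aged >= 30
    c, d: low / normal birthweight among women aged < 30
    """
    n_old, pct_old = groups[history][">=30"]
    n_young, pct_young = groups[history]["<30"]
    a = low_bw_count(n_old, pct_old)
    c = low_bw_count(n_young, pct_young)
    b, d = n_old - a, n_young - c
    return (a, b, c, d), (a * d) / (b * c)


table_no, or_no = stratum_table_and_or("No")
table_yes, or_yes = stratum_table_and_or("Yes")
print("no history: ", table_no, "OR = %.4f" % or_no)
print("history:    ", table_yes, "OR = %.4f" % or_yes)
```

Both values appear among the stratum-specific odds ratios in the `mhodds` output; matching each one to the correct history code is left to the reader, since the coding of `history` is not stated in the text.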
OUTPUT E:
17. Estimate the odds ratio relating pregnancy history to the prevalence of low-birthweight deliveries, for both under 30, and 30+ groups. Report both 2x2 tables and both odds ratios. You may want to use the outputs from the previous problem.
18. A study was done to evaluate if a treatment plan has an effect on a specific condition. The following 2x2 table shows the symptoms before and after the treatment plan among 670 participants. Does the treatment plan have a statistically significant effect on the condition in question? Test this hypothesis using a type I error level of 0.01. Do not forget to report the correct 2x2 table. Please report the p-value by hand. [Note: you can use the tables in the Appendix.]

                       After: Present    After: Absent    Total
    Before: Present         206               259          465
    Before: Absent          110                95          205
    Total                   316               354          670

Output A:
Output B: OUTPUT C:
Output D
19. Suppose we are interested in comparing the effectiveness of two different antibiotics, A and B, in treating gonorrhea. Each person receiving antibiotic A is matched with an equivalent person (age within 5 years, same sex), to whom antibiotic B is given. These people are asked to return to the clinic within 1 week to see if the gonorrhea has been eliminated. Carry out an appropriate test for comparing the relative effectiveness of the two antibiotics. Suppose the results are below. Make sure you know how to create the 2x2 table for analysis:
(1) For 40 pairs of people, both antibiotics are successful.
(2) For 50 pairs of people, antibiotic A is effective whereas antibiotic B is not.
(3) For 25 pairs of people, antibiotic B is effective whereas antibiotic A is not.
(4) For 10 pairs of people, neither antibiotic is effective.
Which output below is the correct one for this question?

                        A effective: Yes    A effective: No
    B effective: Yes           40                  25
    B effective: No            50                  10

Output A
Output B
20. Suppose we are interested in comparing the effectiveness of two different antibiotics, A and B, in treating gonorrhea. Each person receiving antibiotic A is matched with an equivalent person (age within 5 years, same sex), to whom antibiotic B is given. These people are asked to return to the clinic within 1 week to see if the gonorrhea has been eliminated. Carry out an appropriate test for comparing the relative effectiveness of the two antibiotics. Suppose the results are as follows:
(1) For 16 pairs of people, both antibiotics are successful.
(2) For 8 pairs of people, antibiotic A is effective whereas antibiotic B is not.
(3) For 9 pairs of people, antibiotic B is effective whereas antibiotic A is not.
(4) For 18 pairs of people, neither antibiotic is effective.
Output A
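Problems 18 to 20 are matched-pair (or before/after) designs, so only the discordant pairs carry information about the difference. The sketch below implements McNemar's test under one common convention, assumed here rather than taken from the text: the continuity-corrected chi-square when there are at least 20 discordant pairs, and the exact binomial test otherwise. Names are illustrative.

```python
from math import comb


def mcnemar(n_first_only, n_second_only):
    """McNemar's test from the two discordant-pair counts.

    Returns ("normal-theory", chi-square statistic with 1 df) when there are
    at least 20 discordant pairs, else ("exact", two-sided binomial p-value).
    """
    n_disc = n_first_only + n_second_only
    if n_disc >= 20:
        stat = (abs(n_first_only - n_second_only) - 1) ** 2 / n_disc
        return "normal-theory", stat
    k = min(n_first_only, n_second_only)
    tail = sum(comb(n_disc, i) for i in range(k + 1)) / 2 ** n_disc
    return "exact", min(1.0, 2 * tail)


problem18 = mcnemar(259, 110)  # present -> absent versus absent -> present
problem19 = mcnemar(50, 25)    # A only effective versus B only effective
problem20 = mcnemar(8, 9)      # only 17 discordant pairs, so the exact test
print(problem18, problem19, problem20)
```

For the normal-theory cases the statistic is compared with the 1-df chi-square percentiles, for example 6.63 at the 0.01 level and 3.84 at the 0.05 level.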
Output B
21. A group of researchers published the data regarding the use of three methods to repair a meniscus from different individuals who only needed the repair in a single knee (P. Borden, J. Nyland, D.N.M. Caborn, D. Pienkowski (2003). "Biomechanical Comparison of the FasT-Fix Meniscal Repair Suture System with Vertical Mattress Sutures and Meniscus Arrows," The American Journal of Sports Medicine, Vol. 31, #3, pp. 374-378.). The data was modified, reported below and provided in the dataset MENISCUS. They compared 3 methods for Meniscal Repair (1=Vertical Suture, 2=Meniscus Arrow, 3=FasT-Fix) on 3 Outcome measures as the change before and after the repair: Load at failure (N), Displacement (mm) and Stiffness (N/mm). What assumptions would you check before analysis? Are there statistically significant differences in the mean of the change in the displacement of the knee between these three methods? What test statistic will you use to answer this question and what will you inform these researchers? The data is shown below as well as some outputs that may help you to answer the questions:
OUTPUT A: OUTPUT B: OUTPUT C:
OUTPUT D: OUTPUT E:
Output F OUTPUT G:
OUTPUT H: OUTPUT I: OUTPUT J:
OUTPUT K: OUTPUT L: OUTPUT M:
OUTPUT N:
21.b. Please report the p-value by hand for testing if there is homogeneity of variance for the displacement outcome.
22. Using the data and outputs from Problem 21, are there statistically significant differences in the change in the mean load at failure of the knee between the three methods? What assumptions will you check? What test statistic will you use to answer this question and what will you inform these researchers?
23. Using the data and outputs from Problem 21, are there statistically significant differences in the change in the mean stiffness between these three methods? What assumptions will you check? What test statistic will you use to answer this question and what will you inform these researchers?
24. We are conducting a taste test to determine which type of asthma treatment has the better effect. 80 asthma patients were randomly assigned to two asthma drugs and drug types. They then completed a survey evaluating the drug effect. Our two independent variables are both categorical. We have two drugs (A and B) and two drug types (Tablet and Aerolizer) in our analysis. Our model with the interaction term is presented below with the STATA code. Please indicate whether the interaction term needs to be reported for the readers of these results and explain why or why not. Write the equation of the model below. Is this a good model?
Satisfaction = drug drugtype drug*drugtype

    generate treatment = 0
    replace treatment = 1 if drug=="B"
    generate drugtype = 0
    replace drugtype = 1 if type=="Aerolizer"
    gen interaction = treatment*drugtype
    regress satisfaction i.treatment i.drugtype i.interaction
25. A study investigated the effects of two treatments (budesonide, nedocromil) on pulmonary function as measured by normalized FEV. Does hemoglobin have a significant effect in the model?
Report the partial correlation between FEV and treatment after controlling for hemoglobin (g/dl) and interpret.
26. The following data contain times to tumor relapse (in months) among 10 women with breast cancer. The right-censored observations are indicated by a +.
Genotype A women: 12.5+, 3.3, 5.1+, 4.0, 5.0+
Genotype B women: 10.0, 2.5, 8.5+, 5.5+, 7.1+
Conduct a log-rank test to examine whether time to relapse depends on genotype.
26.1. Interpret the STATA outputs provided.
26.2. Do the confidence intervals for Genotype A and Genotype B women overlap?
26.3. What is the survival probability for women with breast cancer at 5 months? What is the 95% CI for this survival probability?
26.4. Please evaluate whether the Cox proportional hazards model with genotype satisfies the proportional hazards assumption.
26.5. What is the hazard ratio and its associated 95% CI when exploring if there are differences between the survival time for women with breast cancer between genotype A and B?
26.6. Does the Cox proportional hazards model indicate that there are differences between the survival time for women with breast cancer between genotype A and B?
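The Kaplan-Meier estimates behind problem 26 are small enough to verify by hand, and the sketch below does the same product-limit calculation in code. The function names are illustrative; event indicators are 1 for an observed relapse and 0 for a censored time (the values marked with + in the data). Confidence intervals are not computed here.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: list of (event time, S(t)) pairs.

    events[i] is 1 if times[i] is an observed relapse, 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        tied = [e for time, e in data if time == t]
        failures = sum(tied)
        if failures:
            survival *= 1 - failures / at_risk
            curve.append((t, survival))
        at_risk -= len(tied)   # failures and censored subjects leave the risk set
        i += len(tied)
    return curve


def survival_at(curve, t):
    """Value of the step function S(t) at time t."""
    s = 1.0
    for time, value in curve:
        if time <= t:
            s = value
    return s


times_a, events_a = [12.5, 3.3, 5.1, 4.0, 5.0], [0, 1, 0, 1, 0]
times_b, events_b = [10.0, 2.5, 8.5, 5.5, 7.1], [1, 1, 0, 0, 0]

s5_a = survival_at(kaplan_meier(times_a, events_a), 5)
s5_b = survival_at(kaplan_meier(times_b, events_b), 5)
s5_all = survival_at(kaplan_meier(times_a + times_b, events_a + events_b), 5)
print("S(5): genotype A = %.2f, genotype B = %.2f, pooled = %.2f"
      % (s5_a, s5_b, s5_all))
```

Which of these three values answers 26.3 depends on whether the question is read as pooled or by genotype, so both readings are shown.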
[Figure: Kaplan-Meier survival estimates, with curves for genotype = 0 and genotype = 1; horizontal axis: analysis time (0 to 15); vertical axis: 0.00 to 1.00]
27. Returning to Problem 16, a researcher wants to explore the association of both age and pregnancy history with low-birthweight deliveries (<2500g).
27.1. Please run and report a simple logistic regression of low-birthweight delivery on age and report the equation using the logistic regression coefficients.
27.2. Please run and report a simple logistic regression of low-birthweight delivery on pregnancy history and report the equation using the logistic regression coefficients.
27.3. Please run and report a multiple logistic regression of low-birthweight delivery on age and pregnancy history and report the equation using the logistic regression coefficients. What do you conclude from this model? Interpret the odds ratio(s).
OUTPUT A: OUTPUT B: OUTPUT C: OUTPUT D:
OUTPUT E: OUTPUT F:
28. The following output is from an analysis of the Smoke.dat.dta data with an additional variable that splits the sample into two groups: high log CO adjusted, and low log CO adjusted, using a median cutpoint. The analysis compared recidivism of high versus low log CO adjusted scores. Looking at the output, what do you conclude about the recidivism of the two groups? [Write 1-2 sentences interpreting the KM plot and 1-2 sentences interpreting the log rank test.] Do the confidence intervals for the two groups overlap?
STATA output:

         failure event:  smoked == 1
    obs. time interval:  (0, day_abs]
     exit on or before:  failure

        234  total observations
         13  observations end on or before enter()
        221  observations remaining, representing
        188  failures in single-record/single-failure data
     18,712  total analysis time at risk and under observation
                                     at risk from t =         0
                          earliest observed entry t =         0
                               last observed exit t =       365

             failure _d:  smoked == 1
       analysis time _t:  day_abs

    Log-rank test for equality of survivor functions

                 |   Events     Events
    medianlogc~j |  observed   expected
               0 |        89     100.46
               1 |        99      87.54
           Total |       188     188.00

    chi2(1) = 2.96
    Pr>chi2 = 0.0854

29. A researcher fits a multiple linear regression model with response y = body fat and predictors X1 = triceps skinfold thickness (cms) and X2 = thigh circumference (cms). The regression model is with independent errors. The outcome of fitting the model is shown below.
29.1. Provide the equation of the estimated regression line.
29.2. What are the null and alternative hypotheses corresponding to the p-value = 0.000001 in the ANOVA table?
Ho: ______________________________________
Ha: __________________________________________________
29.3. Compute a 95% confidence interval for the regression coefficient associated with triceps skinfold thickness.
29.4. If a person with X1 = 49.8 has an estimated expected body fat of y-hat = 22.8, what is the estimated expected body fat value for another patient with the same thigh circumference (same X2) but 2 cms more of triceps skinfold thickness, that is, X1(another patient) = 49.8 + 2?
30. An experiment was conducted to determine whether a test designed to identify a certain form of mental illness could be easily interpreted with little psychological training. The test was given to 100 people (half of whom had the illness, and half didn't) and fifteen people were asked to evaluate them. The fifteen judges were five staff members of a mental hospital, five trainees at the hospital, and five undergraduate psychology majors. The results in the table give the number of the 100 tests correctly classified by each judge. The data are analyzed with the Kruskal-Wallis Test. What can we conclude?
31. Below are four 2 × 2 tables relating in-hospital mortality to surfactant use in individual birthweight strata. We first test whether the odds ratio is the same in all strata (Woolf test of homogeneity, eq. 13.19, page 665 in the textbook) and obtained: We then computed: And check if the common odds ratio (across all strata) is equal to 1.
31.1. Using a Type I error level of 0.05, is there a significant effect modification? In other words, is there evidence to claim that the odds ratios in at least two strata differ from each other? Explain.
31.2. Using a Type I error level of 0.05, can you conclude that the common odds ratio is different from 1? Explain.
31.3. What does the value imply? In other words, is mortality higher or lower in the exposure group “Surfactant +”? Explain. (Note: “Mortality +” means mortality yes.)
32. Below is the output of a logistic regression of data containing a set of maternal personal, demographic, medical history, and health care related variables as risk factors for low birthweight among infants born at the Baystate Medical Center in Springfield, MA. In this analysis the dependent variable is a binary variable ‘low’ indicating whether the infant is born with low birthweight (coded as y=1) or not (coded as y=0). As independent variables, mother’s age, smoking status, weight at last menstrual period, history of preterm labor, history of hypertension, and history of uterine irritability are included in the model.
32.1. Write the appropriate logistic regression model with the estimated intercept and coefficients in the output above.
32.2. Is the model significant? Please explain your answer with the test statistic and p-value.
32.3. Is mother’s age (variable named age) a significant factor for having a low birthweight infant? Report the p-value of variable age and interpret its estimated coefficient.
33. A proportional hazards model was obtained based on the Framingham Heart Study data. The exam when the 1st event occurred for an individual was used as the time of failure (exams 5, 6, 7, 8, or 9, where the exams are approximately 2 years apart) (YRCOMP). The output is below. We will focus on the variable SMOKEM13 = average number of cigarettes smoked daily. From the table we have that beta(Smoking) = 0.017 and SE(beta smoking) = 0.006.
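For problem 33, the Wald statistic, the hazard ratio for a difference of Δ cigarettes per day, and its confidence interval all follow from the reported coefficient and standard error. The sketch below uses only those two numbers; the function name is illustrative and 1.96 is the rounded 97.5th normal percentile.

```python
import math

beta_smoking = 0.017   # reported coefficient per cigarette per day
se_smoking = 0.006     # reported standard error


def hazard_ratio(delta, z=1.96):
    """Hazard ratio for a difference of `delta` cigarettes per day.

    The interval is built on the log scale: delta * (beta +/- z * SE),
    then exponentiated.
    """
    point = math.exp(delta * beta_smoking)
    lower = math.exp(delta * (beta_smoking - z * se_smoking))
    upper = math.exp(delta * (beta_smoking + z * se_smoking))
    return point, lower, upper


wald_z = beta_smoking / se_smoking      # compare with 1.96 for alpha = 0.05
hr_20, lower_20, upper_20 = hazard_ratio(20)
print("z = %.3f" % wald_z)
print("HR for 20 cigarettes/day = %.3f, 95%% CI = (%.3f, %.3f)"
      % (hr_20, lower_20, upper_20))
```

Because the interval is computed on the log scale, it is not symmetric around the point estimate, and it excludes 1 exactly when the Wald test rejects at the same level.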
33.1. Does smoking have a statistically significant (at type I error level α = 0.05) effect on the time of failure of the patients, yes or no? Explain.
33.2. What is the point estimate of the relative risk (also called the hazard ratio) between two individuals with exactly the same covariate values except that the first one smokes $\Delta$ cigarettes (daily on average) and the other zero? Interpret this value in the context of $\Delta = 20$ in this study.
33.3. Construct a 95% confidence interval for the relative risk computed in 33.2.
34. The authors analyzed the results of their study, data below, for whether there was a difference in the proportion of “health statuses” between students who studied or did not study for a test by using a chi-square statistic. Is this method of analysis reasonable for this table? If not, suggest an alternative from the given outputs. Please state the hypothesis for this question, interpret the chosen output, and conclude the results.

                            Health Status
    Study Status      Ill (1)   Moderate (2)   Well (3)   Total
    Studied (1)          9           2            10        21
    Did not Study       17           6             6        29
    Total               26           8            16        50

The following outputs, some correct and some incorrect, are provided for your convenience and to prepare you for the exam. Please select outputs that help you to answer the question.
Output A. Output B. Output C.
Output D. Output E.
35. Student’s t test with 41 df was used to analyze the results in the table below. This study is interested in determining if there is a difference in the proportion of illness between those who ate salad, those who did not eat salad, and those who are allergic to salads. Do you agree that a t-test is the correct test? Why or why not? Suggest an alternative test from the list and conduct the test.

                        Health Status
    Salad Eating    Ill    Moderate    Well    Total
    No                6        3          1      10
    Yes               3        6          2      11
    Allergic          6        4         11      21
    Total            15       13         14      42

Please state the hypothesis for this question and interpret the chosen output and conclude the results.
Output A. Output B.
Output C. Output D. Output E.
Output F.