PH1700_Answer_Key_Exercises in Preparation for Midterm_10_05_2022_03
School
University of Texas Health Science Center at Houston School of Nursing
Course
1700
Subject
Statistics
Date
Feb 20, 2024
THE UNIVERSITY OF TEXAS HEALTH SCIENCE CENTER AT HOUSTON SCHOOL OF PUBLIC HEALTH.
PH1700 INTERMEDIATE BIOSTATISTICS
ANSWER KEY TO EXERCISES IN PREPARATION FOR THE MIDTERM

1. Suppose we have a random variable X distributed around a mean of 40 with a variance of 21. If samples of size 50 are selected and the values of $\bar{X}$ calculated, then a sampling distribution for $\bar{X}$ will be formed.

1.1) Find $E(\bar{X})$ and $\sigma_{\bar{X}}$.
It has been proven mathematically that the expected value of the sampling distribution of means is equal to the population mean: $E(\bar{X}) = E(X)$. Thus $E(\bar{X}) = 40$.

1.2) Is the sampling distribution of $\bar{X}$ normally distributed? Explain. Do any assumptions need to be made?
Yes. By the central limit theorem the sampling distribution of the mean is approximately normal, because the sample size assumption (n > 30) has been met.
The standard deviation of the sampling distribution of means (the standard error) is $\sigma/\sqrt{n}$:
$\sigma^2 = 21$, so $\sigma = 4.58$
$n = 50$, so $\sqrt{n} = 7.07$
$\sigma/\sqrt{n} = 4.58/7.07 = 0.648$

1.3) Calculate the following: $P(\bar{X} = 40)$
$P(\bar{X} = 40) = 0$. The probability that a continuous variable takes any single value is 0.

2. There are twenty-four families in a neighborhood in Brownsville. Each family has a mother, a father and three children. Conduct a test of hypothesis to evaluate if Brownsville is reporting a different number of flu cases in comparison to the seasonal number of flu cases in the general population, using a type I error level of 0.05. N = 24.

2.1. What is the expected number of mothers with flu?
E(mothers with flu) = 24*0.15 = 3.6

2.2. Nine mothers had flu. Is this usual?
Ho: p = 0.15
Ha: p ≠ 0.15
Is $n p_0 q_0 < 5$? Yes: 24*0.15*0.85 = 3.06, so the normal approximation is not valid and we need to compute the p-value for the exact binomial test. Because p-hat = 9/24 = 0.375 is greater than 0.15, we need to estimate the probability P(X >= 9).
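The upper-tail probability P(X >= 9) called for here can be cross-checked numerically. The following is a minimal sketch in Python (standard library only) rather than the course's Stata workflow; the helper names `binom_pmf` and `upper_tail` are hypothetical and introduced only for illustration, and the doubling of the tail follows the two-sided convention used in this answer key.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def upper_tail(x: int, n: int, p: float) -> float:
    """P(X >= x): sum the point probabilities from x up to n."""
    return sum(binom_pmf(k, n, p) for k in range(x, n + 1))


n, p0, x = 24, 0.15, 9               # families, null prevalence, mothers with flu
expected = n * p0                    # expected number of cases under Ho
one_sided = upper_tail(x, n, p0)     # exact P(X >= 9)
two_sided = min(2 * one_sided, 1.0)  # doubled tail, capped at 1

print(f"E(mothers with flu) = {expected:.2f}")
print(f"P(X >= {x}) = {one_sided:.4f}")
print(f"two-sided p-value = {two_sided:.4f}")
```

Summing the terms in code avoids the rounding that accumulates when each probability is rounded by hand before adding.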
Computing each probability in Stata to estimate the p-value, we can start with:

P(mothers with flu = 9) = $\binom{24}{9}\, 0.15^{9}\, 0.85^{24-9}$ = 0.0044
P(mothers with flu = 10) = $\binom{24}{10}\, 0.15^{10}\, 0.85^{24-10}$ = 0.00116
P(mothers with flu = 11) = $\binom{24}{11}\, 0.15^{11}\, 0.85^{24-11}$ = 0.00026
P(mothers with flu = 12) = $\binom{24}{12}\, 0.15^{12}\, 0.85^{24-12}$ = 0.00004991
P(mothers with flu = 13) = $\binom{24}{13}\, 0.15^{13}\, 0.85^{24-13}$ = 0.0000081
P(mothers with flu = 14) = $\binom{24}{14}\, 0.15^{14}\, 0.85^{24-14}$ = 0.0000011
P(mothers with flu = 15) = $\binom{24}{15}\, 0.15^{15}\, 0.85^{24-15}$ = 0.00000013

The remaining probabilities, P(mothers with flu = 16) through P(mothers with flu = 24), are computed the same way, but they are very close to zero.

Adding these probabilities gives a p-value that is less than 0.05. Yes, this result is unusual: the p-value is lower than the type I error level of 0.05.

Using Stata:
The two-sided p-value using the exact binomial test is 0.0058*2 = 0.0116.

Using the CI: the 95% CI does not contain the null-hypothesis proportion of 0.15. Therefore, we reject the null hypothesis and conclude that 9 cases of flu among the mothers in Brownsville is unusual compared with what is expected from the general population.

2.3. A 99% CI for the probability of mothers getting the flu, based on the prevalence given in 6.2, is given by Equation 6.19:
If 9 mothers have the flu, p-hat = 9/24 = 0.375 and q-hat = 0.625. Checking the assumption, we have n*p-hat*q-hat = 24*(0.375)*(0.625) = 5.625, which is at least 5, so we can use the large-sample formula. We report both the exact interval and the normal approximation.
99% CI for parameter p = p-hat ± $z_{.995}$*sqrt(p-hat*q-hat/n) = 0.375 ± 2.58*sqrt(0.375*0.625/24) = [0.12; 0.63]
This approximates the exact method, which in Stata reports [0.146, 0.653]. The 99% CI for the probability of mothers getting the flu, based on the observed prevalence, is (0.146, 0.653) using the exact method.

3. We are concerned with the possible spread of diphtheria and wish to know how many cases we can expect to see in a particular year. The expected number of cases of diphtheria reported in the United States in a given year between 1980 and 1989 was 1.8.

3.1. What is the probability that no cases of diphtheria will be reported during a given year?
$P(X = 0) = \dfrac{e^{-1.8}(1.8)^{0}}{0!} = 0.165$

3.2. What is the probability that between two and three cases of diphtheria will be reported during a given year?
$P(2 \le X \le 3) = P(X = 2) + P(X = 3) = \dfrac{e^{-1.8}(1.8)^{2}}{2!} + \dfrac{e^{-1.8}(1.8)^{3}}{3!} = 0.268 + 0.161 = 0.429$

3.3. By Equation 6.23 and Table 8 in the Appendix, for confidence level $(1-\alpha) = .99$ and observed number $x = 4$, the exact 99% CI for the expected number of cases of diphtheria is (.672, 12.59) in terms of Mu. In terms of Lambda we then have:
0.672/18 = 0.0373
12.59/18 = 0.6994
The 99% CI for lambda is (0.0373; 0.6994), which coincides with the Stata output. The table in your textbook gives the confidence interval in terms of mu=lambda*t while stata expects the

4. Some researchers interested in the diagnosis of coronary artery disease used a procedure called cardiac fluoroscopy to determine whether there is calcification of coronary arteries and thereby to diagnose coronary artery disease. From the test, it can
be determined if 0, 1, 2 or 3 coronary arteries are calcified. Let T0, T1, T2 and T3 denote these events. Let D+ or D- denote the event that disease is present or absent, respectively. The researchers presented the following table:

i            0      1      2      3
P(Ti|D+)   0.42   0.24   0.20   0.14
P(Ti|D-)   0.96   0.02   0.02   0.00

Suppose my population of interest is males between the ages of 30 and 39 years old suffering from non-anginal chest pain. Suppose that in this population of interest the prevalence of having the disease is 0.05.

4.1. What is the probability that a person from this population has the disease given that the test found 0 arteries calcified?
$P(D+ \mid T_i) = \dfrac{P(T_i \mid D+)\,P(D+)}{P(T_i \mid D+)\,P(D+) + P(T_i \mid D-)\,P(D-)}$
P(T0|D+) = .42 (probability of the test showing 0 calcified arteries, given that the person has the disease)
P(D+) = .05 (prevalence of disease in the population)
P(T0|D-) = .96 (probability of the test showing 0 calcified arteries, given that the person does NOT have the disease)
P(D-) = .95 (prevalence of no disease in the population)
Plug the probabilities into the formula:
$P(D+ \mid T_0) = \dfrac{.42 \times .05}{.42 \times .05 + .96 \times .95} = 0.02$
This indicates that it is unlikely that the patient has coronary artery disease.

4.2. What is the probability that a person from this population has the disease given that the test found 1 artery calcified?
P(T1|D+) = .24 (probability of the test showing 1 calcified artery, given that the person has the disease)
P(D+) = .05 (prevalence of disease in the population)
P(T1|D-) = .02 (probability of the test showing 1 calcified artery, given that the person does NOT have the disease)
P(D-) = .95 (prevalence of no disease in the population)
$P(D+ \mid T_1) = \dfrac{.24 \times .05}{.24 \times .05 + .02 \times .95} = 0.39$

4.3. What is the sensitivity of the test for 0 artery calcified?
Sensitivity = 1 - false negative
Sensitivity = P(T+|D+) = 0.42 for no arteries calcified.

4.4. What is the specificity of the test for 1 artery calcified?
Specificity = 1 - false positive; specificity = P(T-|D-)
False positive = P(T+|D-) = 0.02 for 1 artery calcified.
Then specificity = 1 - 0.02 = 0.98

5. Suppose my population of interest is males between the ages of 50 and 59 years old suffering from typical angina. Suppose that in this population the prevalence of having the disease is 0.92.

i            0      1      2      3
P(Ti|D+)   0.42   0.24   0.20   0.14
P(Ti|D-)   0.96   0.02   0.02   0.00

5.1. What is the probability that a person from this population has the disease given that the test found 0 arteries calcified?
P(T0|D+) = .42 (probability of the test showing 0 calcified arteries, given that the person has the disease)
P(D+) = .92 (prevalence of disease in the population)
P(T0|D-) = .96 (probability of the test showing 0 calcified arteries, given that the person does NOT have the disease)
P(D-) = .08 (prevalence of no disease in the population)
$P(D+ \mid T_0) = \dfrac{.42 \times .92}{.42 \times .92 + .96 \times .08} = 0.83$

5.2. What is the probability that a person from this population has the disease given that the test found 1 artery calcified?
P(T1|D+) = .24 (probability of the test showing 1 calcified artery, given that the person has the disease)
P(D+) = .92 (prevalence of disease in the population)
P(T1|D-) = .02 (probability of the test showing 1 calcified artery, given that the person does NOT have the disease)
P(D-) = .08 (prevalence of no disease in the population)
$P(D+ \mid T_1) = \dfrac{.24 \times .92}{.24 \times .92 + .02 \times .08} = 0.99$

5.3. What is your conclusion from these statistics?
Comparing the two patients, we see the strong influence of the prevalence of the disease in the population of interest: the same test result (0 arteries calcified) gives a probability of disease of 0.02 when the prevalence is 0.05 but 0.83 when the prevalence is 0.92.

6. Calculate from the 2x2 table:
6.1. The probability of the false positive is 27/95 = 0.284
6.2. The probability of the false negative is 117/206 = 0.568
6.3. The sensitivity of the test is 68/185 = 0.368
6.4. The specificity of the test is 89/116 = 0.767
6.5. The probability of having prostate cancer given that the test is positive is 68/95 = 0.716
6.6. The odds ratio of prostate cancer when comparing people who had a positive prediction of cancer by DRE with people who did not have a positive prediction of cancer by DRE is (68 × 89)/(117 × 27) = 1.92

7. Calculate from the 2x2 table:
7.1. The probability of the false positive is 88/206 = 0.427, which is low.
7.2. The probability of the false negative is 20/95 = 0.211, which is low.
7.3. The sensitivity is 75/95 = 0.789, which is good.
7.4. The specificity is 118/206 = 0.573, which is low.
7.5. The probability of having cancer given that the test is positive is 75/163 = 0.460, which is low.
7.6. The odds ratio of prostate cancer when comparing people who had a prostate specific antigen density greater than 0.14 with people who have prostate specific antigen density less than or equal to 0.14 is (68 × 89)/(117 × 27) = 1.92
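The 2x2-table quantities in problem 6 all come from four cell counts, which the sketch below makes explicit. It is a minimal illustration in Python, assuming the cell counts implied by the fractions in problem 6 (68 true positives, 27 false positives, 117 false negatives, 89 true negatives); the function name `two_by_two` and its dictionary keys are hypothetical.

```python
def two_by_two(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Summary measures for a diagnostic 2x2 table given its four cell counts."""
    return {
        "sensitivity": tp / (tp + fn),     # P(T+ | D+)
        "specificity": tn / (tn + fp),     # P(T- | D-)
        "ppv": tp / (tp + fp),             # P(D+ | T+)
        "odds_ratio": (tp * tn) / (fn * fp),
    }


# Assumed counts, read off the fractions in problem 6
results = two_by_two(tp=68, fp=27, fn=117, tn=89)
for name, value in results.items():
    print(f"{name}: {value:.3f}")
```

Keeping the counts as named arguments makes it harder to mix up the denominators, which is the usual source of error in these problems.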
8.1. We compute: $\bar{d} = \dfrac{\sum d}{n} = 469/24 = 19.54$ mg/dl
8.2. Standard error = $s/\sqrt{n} = 16.81/\sqrt{24} = 3.43$
8.3. 95% CI = $\bar{d} \pm t_{n-1,.975}\, s/\sqrt{n} = 19.54 \pm t_{23,.975}(3.43) = 19.54 \pm 2.069(3.43)$ = (12.4; 26.6)
8.4. We can conclude that the cholesterol levels have changed significantly after adopting the diet because 0 is not within the 95% CI.

9. $\bar{X}$ = 3043, s = 852, n = 121, df = 120, α = 0.05
$\bar{X} \pm t_{df,\,1-\alpha/2}\,(s/\sqrt{n}) = 3043 \pm 1.98\,(852/\sqrt{121})$
3043 ± 153.35
(2889.65; 3196.35)

10.1
10.2
Identify the variable of interest: internet users who have searched for information on experimental treatments or medicines.
Identify the parameter of interest: proportion.
State the null and alternative hypotheses: Ho: Pi = 0.25, Ha: Pi different from 0.25.
Identify or determine the type I error level that you will use to test the hypothesis: 0.01.
Identify the test statistic: one-sample test for one proportion.
Identify the distribution of the test statistic: standard normal if the assumption is met, or binomial for the exact distribution.
Checking assumptions: n*Pi*(1-Pi) = 20*.25*.75 = 3.75. The normal approximation to the binomial is not valid, so the exact formula, which follows a binomial distribution, is required. See Equation 7.29 (page 253) for the exact formulae.
Determine the decision rule (do/describe a graph!): the exact method requires the p-value or the exact CI:
If $\hat{p} \le p_0$: $p = 2 \times \Pr(X \le x) = \min\left[\,2 \sum_{k=0}^{x} \binom{n}{k}\, p_0^{k} (1-p_0)^{n-k},\ 1\right]$
If $\hat{p} > p_0$: $p = 2 \times \Pr(X \ge x) = \min\left[\,2 \sum_{k=x}^{n} \binom{n}{k}\, p_0^{k} (1-p_0)^{n-k},\ 1\right]$
Calculate the test statistic and report the degrees of freedom: the test statistic is not computed, because in the exact method the probability is obtained directly through the p-value.
Compute the p-value for the test statistic by hand and in Stata:
• An exact 100%(1-α) upper one-sided confidence interval for the binomial parameter p that is always valid is given by $p > p_1$, where $p_1$ satisfies
$\Pr(X \ge x \mid p = p_1) = \alpha = \sum_{k=x}^{n} \binom{n}{k}\, p_1^{k} (1-p_1)^{n-k}$
In Stata:
Manually using Stata (you will not have access to Stata during the midterm; this is shown so that you can recall the Stata command):
Make a decision using your test statistic: not possible; a decision is only possible through the p-value.
Make a decision using your p-value: p-value > type I error level; therefore, we DO NOT reject Ho.
Conclude and interpret: We do not have enough evidence to conclude that the proportion of internet users who have searched for information on experimental treatments or medicines is different from 25%.

11. Survey Problem: Use Sample Size for One-Sample Binomial Test, Equation 7.46, p.250, with p0 = .35, p1 = .40:
$n = \dfrac{p_0 q_0 \left[ z_{1-\alpha/2} + z_{1-\beta} \sqrt{\dfrac{p_1 q_1}{p_0 q_0}} \right]^{2}}{(p_1 - p_0)^{2}}$
$n = \dfrac{.35 \times .65 \left( z_{.975} + z_{.80} \sqrt{\dfrac{.40 \times .60}{.35 \times .65}} \right)^{2}}{(.40 - .35)^{2}}$
n = 726 people are needed.
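The sample-size formula for the one-sample binomial test can be evaluated directly, as in the following Python sketch. It assumes a two-sided test with alpha = 0.05 and power = 0.80 (read from the z subscripts in the formula); the function name `n_one_sample_binomial` is hypothetical, and `statistics.NormalDist` from the standard library supplies the normal quantiles.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_one_sample_binomial(p0: float, p1: float,
                          alpha: float = 0.05, power: float = 0.80) -> int:
    """Sample size for a two-sided one-sample binomial test (normal-theory formula)."""
    z = NormalDist().inv_cdf          # standard normal quantile function
    q0, q1 = 1 - p0, 1 - p1
    bracket = z(1 - alpha / 2) + z(power) * sqrt(p1 * q1 / (p0 * q0))
    # Round up: a fractional participant is not possible
    return ceil(p0 * q0 * bracket**2 / (p1 - p0) ** 2)


n_needed = n_one_sample_binomial(p0=0.35, p1=0.40)
print(f"n = {n_needed}")
```

Rounding up rather than to the nearest integer keeps the achieved power at or above the target.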
12. Yes, these two results contradict each other. A p-value of 0.03 is less than 0.05, which means the test of hypothesis is significant, but the result is not significant when using the 95% CI (-0.6, 7.3), since 0 is contained within the CI.

13.
. sampsi 0 -3.2,sd(8.5) onesample a(.05) p(0.80)
Estimated sample size for one-sample comparison of mean
  to hypothesized value
Test Ho: m = 0, where m is the mean in the population
Assumptions:
         alpha =   0.0500  (two-sided)
         power =   0.8000
 alternative m =     -3.2
            sd =      8.5
Estimated required sample size:
             n =       56
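The required sample size reported by `sampsi` can be reproduced with the usual normal-theory formula for a one-sample test of a mean, n = sd^2 (z_{1-alpha/2} + z_{1-beta})^2 / (mu0 - mu1)^2. The Python sketch below is an independent check under that assumption, not a description of Stata's internals; the function name `n_one_sample_mean` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_one_sample_mean(mu0: float, mu1: float, sd: float,
                      alpha: float = 0.05, power: float = 0.80) -> int:
    """Sample size for a two-sided one-sample test of a mean (known-sd formula)."""
    z = NormalDist().inv_cdf          # standard normal quantile function
    n_exact = sd**2 * (z(1 - alpha / 2) + z(power)) ** 2 / (mu0 - mu1) ** 2
    return ceil(n_exact)              # round up to a whole participant


n_required = n_one_sample_mean(mu0=0.0, mu1=-3.2, sd=8.5)
print(f"n = {n_required}")
```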
14. Going back to problem 9, regarding the “God Loves Children Hospital”: the hospital took a sample of 121 babies and estimated summary statistics. The mean birthweight in this sample was 3043 g with a standard deviation of 852 g. Assume that a researcher wants to test the hypothesis that this hospital usually has low-birthweight babies. Provide the power of the test for the hypothesis that the mean birthweight in this hospital is different from 2000 g, with an effect size of 200 g and a type I error level of 0.01.

$n = 121,\ \mu_1 = 2200\,g,\ \sigma = 852\,g,\ \mu_0 = 2000\,g,\ \alpha = 0.01$

Power = $\Phi\left( -z_{1-\alpha/2} + \dfrac{|\mu_0 - \mu_1|}{\sigma/\sqrt{n}} \right)$
Power = $\Phi\left( -2.576 + \dfrac{|2000 - 2200|}{852/\sqrt{121}} \right)$
= Φ(-2.576 + 2.582) = Φ(0.006) = P(Z < 0.006) = 0.502 by hand
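The power expression above is a single evaluation of the standard normal CDF, which the following Python sketch carries out without rounding the intermediate z values. The function name `power_one_sample_mean` is hypothetical; the inputs are the values listed for problem 14.

```python
from math import sqrt
from statistics import NormalDist


def power_one_sample_mean(mu0: float, mu1: float, sd: float, n: int,
                          alpha: float = 0.05) -> float:
    """Power of a two-sided one-sample z test: Phi(-z_{1-alpha/2} + |mu0 - mu1| sqrt(n) / sd)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = abs(mu0 - mu1) * sqrt(n) / sd   # standardized effect times sqrt(n)
    return std_normal.cdf(-z_crit + shift)


power = power_one_sample_mean(mu0=2000, mu1=2200, sd=852, n=121, alpha=0.01)
print(f"power = {power:.4f}")
```

Because the argument of Φ is so close to zero here, the power sits very near one half, so small rounding differences in the z values barely move the answer.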
Or:
The formula in Rosner is close to the formula used in Stata. Please notice that the delta in Stata is the effect size of the difference divided by the standard deviation: (2200-2000)/821 = 0.243. The command DIFF computes the effect size discussed in class, 2200-2000 = 200 g.

15. Going back to problem 10, regarding the Pew Internet and American Life Project, which reported in 2003 that 18 percent of internet users have used the internet to search for information regarding experimental treatments or medicines. The sample consisted of 20 adult internet users, and information was collected from telephone interviews and the null
hypothesis is that the proportion is 25%. Using a type I error level of 0.01, what is the power of the test?

16.1 One-sided test:
$H_0: \mu = 191$ vs. $H_1: \mu > 191$
$t = \dfrac{195.5 - 191}{25.6/\sqrt{100}} = 1.76$
The $t_{99}$ critical value at the 0.05 significance level (one tail) is:
. display invttail(99,0.05)
1.6603912
Since t = 1.76 > 1.66, we reject the null hypothesis and conclude that the mean weight in men in 2006 is more than 191 pounds.

Two-sided test:
$H_0: \mu = 191$ vs. $H_1: \mu \ne 191$
$t = \dfrac{195.5 - 191}{25.6/\sqrt{100}} = 1.76$
The $t_{99}$ critical value at the 0.05 significance level (two tail) is:
. display invttail(99,0.025)
1.984217
Since t = 1.76 < 1.98, we fail to reject the null hypothesis and conclude that there is not enough evidence to support that the mean weight in men in 2006 is different from 191 pounds.
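The two decisions in 16.1 differ only in the critical value the same t statistic is compared against. The short Python sketch below recomputes the statistic and applies both rules; the critical values are copied from the `invttail` results quoted above rather than computed, since the Python standard library has no t distribution, and the variable names are hypothetical.

```python
from math import sqrt

xbar, mu0, s, n = 195.5, 191.0, 25.6, 100   # sample mean, null mean, sd, sample size

# One-sample t statistic: (sample mean - null mean) / standard error
t_stat = (xbar - mu0) / (s / sqrt(n))

t_crit_one_sided = 1.6603912   # invttail(99, 0.05), as quoted in the text
t_crit_two_sided = 1.984217    # invttail(99, 0.025), as quoted in the text

reject_one_sided = t_stat > t_crit_one_sided
reject_two_sided = abs(t_stat) > t_crit_two_sided

print(f"t = {t_stat:.2f}")
print(f"reject Ho (one-sided): {reject_one_sided}")
print(f"reject Ho (two-sided): {reject_two_sided}")
```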
16.2 There is no statistically significant difference in the mean weight in men in 2006, or the sample size of 100 participants is not large enough to detect the difference.

17.1 One-sample t-test
17.2 t distribution
17.3 Compute the p-value for the hypothesis test:
$H_0: \mu = 15$ vs. $H_1: \mu < 15$
$t = \dfrac{13 - 15}{4/\sqrt{10}} = -1.581$
The p-value is given by $\Pr(t_9 < -1.581)$. Since $t_{9,0.10} = -1.383$ and $t_{9,0.05} = -1.833$, the value -1.581 is between these two critical values. Therefore, 0.05 < p-value < 0.10.
18.1 Use a one-sample t-test.
18.2 Output A is correct. Therefore, we reject Ho and conclude that the mean creatinine level for analgesic abusers is significantly different from that of the normal population.

. ttest creatinine == 1
One-sample t test
Variable    Obs    Mean        Std. Err.   Std. Dev.   [95% Conf. Interval]
creati~e    15     1.273333    .1123204    .4350151    1.03243    1.514237
mean = mean(creatinine)                            t = 2.4335
Ho: mean = 1                      degrees of freedom = 14
Ha: mean < 1            Ha: mean != 1               Ha: mean > 1
Pr(T < t) = 0.9855      Pr(|T| > |t|) = 0.0289      Pr(T > t) = 0.0145

19.1 A two-sample test for paired samples, also known as the PAIRED t-TEST or the one-sample test of the difference, is needed here because we are comparing refractive error between the right and left eyes.
19.2 A two-sided test is needed here because we have no prior idea as to whether the right or left eye has the larger mean refraction.
19.3 We have the test statistic:
$t = \dfrac{\bar{d}}{\sqrt{s_d^{2}/n}} = \dfrac{-0.206}{\sqrt{1.150/17}} = -0.79 \sim t_{16}$
Since $t_{16,.975} = 2.120 > 0.79 = |t|$, it follows that p > .05 and there is no significant difference in mean spherical refractive error between the right and left eyes.
19.4 Output B is correct. As you can see from the output, t = -0.79; therefore, we cannot reject H0.
19.5 A two-sided 90% CI for the mean difference in spherical refractive error between right and left eyes is given by:
$\bar{d} \pm t_{n-1,.95}\, \dfrac{s_d}{\sqrt{n}} = -0.206 \pm t_{16,.95}(0.260) = -0.206 \pm 1.746(0.260) = (-0.660,\ 0.248)$
19.6 Output D is correct since we are looking for a 90% CI. Stata automatically gives a 95% CI unless told otherwise.

. ttest right == left
Paired t test
Variable    Obs    Mean         Std. Err.   Std. Dev.   [95% Conf. Interval]
right       17     -1.588235    .7100063    2.927431    -3.093381    -.0830891
left        17     -1.382353    .59645      2.459226    -2.64677     -.1179354
diff        17     -.2058824    .2601217    1.072509    -.7573156    .3455509
mean(diff) = mean(right - left)                        t = -0.7915
Ho: mean(diff) = 0                    degrees of freedom = 16
Ha: mean(diff) < 0        Ha: mean(diff) != 0         Ha: mean(diff) > 0
Pr(T < t) = 0.2201        Pr(|T| > |t|) = 0.4402      Pr(T > t) = 0.7799

. ttest right=left, level(90)
Paired t test
Variable    Obs    Mean         Std. Err.   Std. Dev.   [90% Conf. Interval]
right       17     -1.588235    .7100063    2.927431    -2.827824    -.3486468
left        17     -1.382353    .59645      2.459226    -2.423685    -.3410206
diff        17     -.2058824    .2601217    1.072509    -.6600245    .2482598
mean(diff) = mean(right - left)                        t = -0.7915
Ho: mean(diff) = 0                    degrees of freedom = 16
Ha: mean(diff) < 0        Ha: mean(diff) != 0         Ha: mean(diff) > 0
Pr(T < t) = 0.2201        Pr(|T| > |t|) = 0.4402      Pr(T > t) = 0.7799

20.1 We wish to test the hypothesis H0: μ1 = μ2 versus H1: μ1 ≠ μ2. Use test A or C from List A, depending on assumptions:
The two-sample t-test for unknown but equal variances
The two-sample t-test for unknown but unequal variances
20.2 There is enough sample size for the CLT to hold, so there is no need for a nonparametric test. To make an informed decision from List A and to be able to choose A or C, we need to conduct a test for homogeneity of variances:
$H_0: \sigma_1^{2} = \sigma_2^{2}$;  $H_1: \sigma_1^{2} \ne \sigma_2^{2}$
We will use the F test for the equality of two variances to decide whether to use a two-sample t-test with equal or unequal variances.
$F = \left(\dfrac{8.5}{4.8}\right)^{2} = 3.14 \sim F_{57,60}$, and $F = 3.14 > F_{24,60,.975} = 1.88 > F_{57,60,.975}$
Therefore, p < .05 and we must use the two-sample t test with unequal variances, so the answer is to use test C from List A.

20.3 To compute the appropriate df, we have:
$d' = \dfrac{\left( \dfrac{s_1^{2}}{n_1} + \dfrac{s_2^{2}}{n_2} \right)^{2}}{\dfrac{\left( s_1^{2}/n_1 \right)^{2}}{n_1 - 1} + \dfrac{\left( s_2^{2}/n_2 \right)^{2}}{n_2 - 1}} = \dfrac{(1.246 + 0.378)^{2}}{\dfrac{1.246^{2}}{57} + \dfrac{0.378^{2}}{60}} = \dfrac{2.635}{0.0296} = 89.0$
$d'' = 89$

20.4 To find a 95% CI for the difference in mean blood lead levels between the high and low groups at 24 months:
$\bar{x}_1 - \bar{x}_2 \pm t_{89,.975} \sqrt{\dfrac{s_1^{2}}{n_1} + \dfrac{s_2^{2}}{n_2}} = 7.7 - 5.4 \pm t_{89,.975}(1.274) = 2.3 \pm t_{89,.975}(1.274)$
We find $t_{89,.975}$ = 1.987 using the book, Stata or Excel. Therefore the 95% CI is given by:
$2.3 \pm 1.987(1.274) = 2.3 \pm 2.53 = (-0.23;\ 4.83)$

20.5 Because the confidence interval contains 0, we can conclude that the mean blood-lead levels do not significantly differ between the low and high cord blood-lead groups, or that we do not have enough sample size/power to detect a difference.
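The Satterthwaite degrees of freedom in 20.3 and the interval in 20.4 can be reproduced in a few lines. The Python sketch below starts from the per-group variance terms quoted in the solution (s1^2/n1 = 1.246 and s2^2/n2 = 0.378, with 57 and 60 degrees of freedom) and uses the critical value 1.987 quoted in 20.4 rather than computing it; the variable names are hypothetical.

```python
from math import floor, sqrt

v1, v2 = 1.246, 0.378        # s1^2/n1 and s2^2/n2, as quoted in 20.3
df1, df2 = 57, 60            # n1 - 1 and n2 - 1

# Satterthwaite approximation, rounded down to the nearest integer
d_prime = (v1 + v2) ** 2 / (v1**2 / df1 + v2**2 / df2)
d_rounded = floor(d_prime)

diff = 7.7 - 5.4             # difference in sample means from 20.4
se = sqrt(v1 + v2)           # standard error of the difference
t_crit = 1.987               # t_{89,.975}, as quoted in 20.4

ci_low, ci_high = diff - t_crit * se, diff + t_crit * se
contains_zero = ci_low < 0 < ci_high

print(f"d' = {d_prime:.1f}, d'' = {d_rounded}")
print(f"95% CI = ({ci_low:.2f}; {ci_high:.2f})")
print(f"CI contains 0: {contains_zero}")
```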
21.1) The power to estimate the difference between the observed blood-lead levels at 18 months between the low and high groups was 0.14.
21.2) The sample size to estimate the difference between the observed blood-lead levels at 6 months between the low and high groups, using a type I error level of 0.01, assuming a mean change between baseline and the 6-month measurements of 4.6 in the low group and 7.0 in the high group, and with a type II error level of 0.20, is 374.

22.1) The ophthalmology investigators have a power of 0.5433 for the study with the 17 participants.
22.2) The ophthalmology investigators need 247 participants, with spherical refraction measured in both eyes, under the following assumptions: detecting an effect size of 0.08 of difference between both eyes, a change in the mean spherical refraction of -0.206, an associated standard deviation of 1.07, a correlation between the mean spherical refraction of both eyes of 0.935, a type I error of 0.05, a type II error of 0.10 and a two-sided test.

23. Pulmonary Disease – Data. Group 1 consists of 589 nonsmoking children and Group 2 consists of 65 children who smoke. The forced expiratory volume (FEV), a measure of lung function, is measured for all children in each group. Given the research question: Does lung function differ in children who smoke versus children who don’t smoke? We want to check whether Group 1 and Group 2 have equal variances using Stata, and we obtained the output below. How do you compute the p-value of this test by hand?
These are the null and alternative hypotheses:
$H_0: \sigma_1^{2} = \sigma_2^{2}$ vs $H_1: \sigma_1^{2} \ne \sigma_2^{2}$
with test statistic: $F = \dfrac{s_1^{2}}{s_2^{2}}$
The result of the test statistic is F = 1.2861
To estimate the p-value we need to make a plot to see where the p-value lies. Our book has the 2 pictures below, and we need to select which one to use for our problem: (a) (b)
Because F = 1.2861 is greater than 1, we need to use panel (a) instead of panel (b). Now we need to place the result of our test statistic on the graph of panel (a).
Finding the degrees of freedom: n1-1 = 589-1 = 588 and n2-1 = 65-1 = 64. Now we need to find the area under the curve of the F distribution in our textbook: …
The rectangle is where we should find the value of our test statistic. The row with p = 0.90 indicates that the area above 1.29 is 0.10; this is one tail in the table. The result of the test statistic is 1.2861, which is < 1.29. Therefore, the one-sided p-value is > 0.10. Then $2 \times \Pr(F > 1.2861) > 2 \times 0.10 = 0.20$, which is similar to what is given by Stata.