Alexander Klemp - Biometrics Lab 5_Fall2023 WC.docx
School: Beloit College
Course: 247
Subject: Industrial Engineering
Date: Jan 9, 2024
Prof Cary & Werner
Fall 2023
Biometrics Lab 5 (100 pts)
Name(s): Alexander Klemp
You may work individually or collaboratively to develop your answers to this lab. If you work
collaboratively on any question, please clearly identify the contribution of each member to each
question. Submit one answer for your group and make sure that the file name clearly identifies
the group members.
Word identification:
Fill in the blank with the term that is defined. (2 points each)
1. Interaction hypothesis: The type of null hypothesis that tests for the dependence of the
effects of one factor on the levels of another factor.
2. Randomized block design: An experimental design in which all subjects within a block are
randomly assigned to each treatment level.
3. Tukey-Kramer test: A multiple comparison test used to determine which means are
significantly different from each other when sample sizes are not equal. This test maintains
the probability of committing a Type I error at the significance level.
4. Unreplicated two-factor ANOVA: A two-factor analysis of variance with only one individual
per cell.
5. Mixed-model ANOVA (Model III): A two-factor analysis of variance in which the levels of one
factor are specifically chosen and the levels of the other factor are random.
6. Visual examination of the data collected from an experiment measuring the effects of two
factors on wing length in birds (cm) convinces you that the data exhibit multiplicative effects.
Include any formulas in your answers below. (4 points)
a. In order to analyze the data using a two-factor analysis of variance, how should the data be
mathematically manipulated prior to analysis?
Apply a logarithmic transformation to each observation before running the analysis:
X' = log10(X + 1)
b. Your analysis identified the L1 and L2 of your 95% CI to be: 1.633 and 1.696, respectively.
Return this CI to its original units.
L1 = antilog(1.633) - 1 = 10^1.633 - 1 = 41.95364 cm
L2 = antilog(1.696) - 1 = 10^1.696 - 1 = 48.65923 cm
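The back-transformation in part b can be checked with a few lines of R. This is a minimal sketch that assumes the transformation was the base-10 form X' = log10(X + 1), which is what subtracting 1 after taking the antilog implies; the object names are illustrative.

```r
# Confidence limits on the transformed (log10) scale, from part b
ci_log <- c(L1 = 1.633, L2 = 1.696)

# Back-transform: antilog base 10, then undo the +1 shift
# (assumes the transformation was X' = log10(X + 1))
ci_cm <- 10^ci_log - 1

print(round(ci_cm, 2))
```

The back-transformed interval is not symmetric around the back-transformed mean, which is expected after a log transformation.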
7. Load in the birthwt dataset from the MASS package. The variable bwt measures the
birthweight in grams of the infants in the study. A group of researchers found that there was a
significant difference in bwt based on the smoking status of the mother. They want to know
whether adding information about a mother's history of hypertension will improve the model.
Part I: Visual review of your data (14 pts)
a. Identify the name of the factor(s) and levels in this model:
   Factor A: Smoking status (smoke)
      Levels: smoker (1) and non-smoker (0)
   Factor B: History of hypertension (ht)
      Levels: hypertension (1) and no hypertension (0)
b. Is this experimental design replicated or unreplicated?
The experimental design is replicated.
c. What is the appropriate parametric statistical test to analyze the results of this experiment?
Why is it appropriate? Be complete in your answer.
A two-way (two-factor) ANOVA with replication. It is appropriate because there is one continuous
response variable (birth weight) and two categorical factors (smoking status and history of
hypertension), and because replication within each cell allows us to test for an interaction
between the two factors in addition to their main effects.
d. What are the null hypotheses for this statistical test? Use notation specific to this
example.
H0: There is no interaction between smoking status and history of hypertension in their effects
on infant birth weight.
e. Use Excel to create a “publication quality” bar chart with variation that summarizes the
data in this experiment. You do not need to include a figure caption, but be sure that all
axes are labeled and that you’ve included an appropriate legend (key) within the graph.
[Bar chart of mean birth weight by smoking status and history of hypertension. Key: 1 = yes, 0 = no.]
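The means and error bars for the bar chart can be tabulated in R before plotting in Excel. This is a minimal sketch, assuming the standard error is the chosen measure of variation (the lab does not say which one to use) and that birthwt codes smoke and ht as 0 = no and 1 = yes; the object names are illustrative.

```r
library(MASS)  # provides the birthwt data set

# Standard error of the mean for one cell
se <- function(x) sd(x) / sqrt(length(x))

# 2 x 2 tables indexed by smoking status (rows) and hypertension history (columns)
cells     <- list(smoke = birthwt$smoke, ht = birthwt$ht)
cell_n    <- tapply(birthwt$bwt, cells, length)
cell_mean <- tapply(birthwt$bwt, cells, mean)
cell_se   <- tapply(birthwt$bwt, cells, se)

print(cell_n)
print(round(cell_mean, 1))
print(round(cell_se, 1))
```

The cell counts also show how unevenly the mothers are spread across the four groups, which is worth knowing before interpreting the error bars.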
f. Based on the graph, describe the results you expect to obtain from analyzing the data
(i.e., how does one mean compare to the other in relative terms? Do you expect that
means were significantly different? Do you expect an interaction?). No statistical analysis
is necessary to answer this question; simply review your graph and formulate an expected
result.
I expect mean birth weight to be lower for infants of mothers with a history of hypertension
than for those without, and lower for infants of mothers who smoke than for those who do not.
Part II: Test Assumption (16 pts)
To analyze the results of this experiment, you must first investigate the assumptions of the
test—you may assume that the samples were randomly and independently collected (Assumption
1).
Assumption 2:
a. Identify this assumption and write the H0 here:
H0: The variances of birth weights are equal across groups.
b. Run the appropriate analyses in R, then in sentence format, report your decision to
reject or accept the null hypotheses and your conclusions. You should include the test
statistics and the associated p-values. Include your R commands and output.
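One way to test equal variances across the four smoke × ht groups is Levene's test. The sketch below computes the median-centered (Brown-Forsythe) form with base R only, as an illustration and not as the analysis that was actually run; grp, abs_dev, and levene_tab are illustrative names.

```r
library(MASS)  # provides the birthwt data set

# One grouping factor with a level for each smoke x ht combination
grp <- interaction(birthwt$smoke, birthwt$ht, drop = TRUE)

# Absolute deviation of each birth weight from its group median
abs_dev <- abs(birthwt$bwt - ave(birthwt$bwt, grp, FUN = median))

# Levene's test is a one-way ANOVA on those absolute deviations
levene_tab <- anova(lm(abs_dev ~ grp))
print(levene_tab)
```

A p-value above 0.05 in the grp row would mean failing to reject H0 of equal variances. If the car package is installed, its leveneTest function runs the same test.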
Assumption 3:
a. Identify this assumption and write the H0 here:
H0: The data are normally distributed within each group.
b. Run the appropriate analyses in R, then in sentence format, report your decision to
reject or accept the null hypotheses and your conclusions. You should include the test
statistics and the associated p-values. Include your R commands and output.
> model <- lm(bwt ~ smoke, data = birthwt)
> shapiro.test(residuals(model))

	Shapiro-Wilk normality test

data:  residuals(model)
W = 0.99031, p-value = 0.2318
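The check above uses residuals from a model that contains smoke only. Below is a minimal sketch of the same Shapiro-Wilk check on residuals from a model with both factors and their interaction, assuming that is the model the normality assumption should refer to; full_model and sw are illustrative names.

```r
library(MASS)  # provides the birthwt data set

# Two-factor model with interaction; smoke and ht are converted to factors
full_model <- lm(bwt ~ factor(smoke) * factor(ht), data = birthwt)

# Shapiro-Wilk test on the pooled residuals (H0: residuals are normal)
sw <- shapiro.test(residuals(full_model))
print(sw)
```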
Part III: Run the statistical analysis (30 pts)
a. Using the conclusion from Part II, what is the appropriate omnibus statistical test to analyze
the results of this experiment?
A two-way ANOVA
b. Has your null hypothesis changed? If so, please restate it here.
There is no significant interaction between smoking status and history of hypertension on infant
birth weight. Additionally, there are no main effects of smoking status and history of
hypertension on infant birth weight.
c. Run your analysis using R and paste your command and output below.
> model <- lm(bwt ~ smoke + ht, data = birthwt)
> anova(model)
Analysis of Variance Table

Response: bwt
           Df   Sum Sq Mean Sq F value   Pr(>F)
smoke       1  3625946 3625946  7.1529 0.008151 **
ht          1  2056920 2056920  4.0577 0.045411 *
Residuals 186 94286790  506918
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
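The model above is additive, so its table has no row for the interaction. Below is a minimal sketch of the two-factor model with the interaction term, assuming smoke and ht should be treated as factors; bw, full_model, and aov_tab are illustrative names. Because the design is unbalanced, the sequential sums of squares that anova() reports depend on the order of the terms.

```r
library(MASS)  # provides the birthwt data set

# Recode the 0/1 indicators as labeled factors
bw <- transform(birthwt,
                smoke = factor(smoke, levels = c(0, 1), labels = c("no", "yes")),
                ht    = factor(ht,    levels = c(0, 1), labels = c("no", "yes")))

# smoke * ht expands to smoke + ht + smoke:ht
full_model <- lm(bwt ~ smoke * ht, data = bw)
aov_tab <- anova(full_model)
print(aov_tab)
```

The smoke:ht row of this table carries the F statistic, degrees of freedom, and p-value for the interaction null hypothesis.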
Answer the questions below.
d. Are the effects of smoking status and history of hypertension on infant birth weight
independent? Support your answer with the calculated value of the appropriate test statistic, its
degrees of freedom, its p value, and information on whether to reject the null hypothesis.
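The model fitted above (bwt ~ smoke + ht) has no smoke × ht interaction term, so its output
contains no test statistic for independence; the additive model simply assumes that the two
effects are independent. Deciding whether to reject the interaction null hypothesis requires
fitting the model with the interaction term (bwt ~ smoke * ht).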
e. Is there a main effect of smoking status on infant birth weight? Your answer should include
conclusions statements with biological context and be supported with the calculated value of the
appropriate test statistic, its degrees of freedom, its p value, and information on whether to reject
the null hypothesis. If the null hypothesis is rejected, include information about the differences
among the means (this may require an additional test).
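Yes. Smoking status has a significant main effect on infant birth weight (F = 7.1529; df = 1,
186; p = 0.008151). Because p < 0.05, we reject the null hypothesis that mean birth weight is
the same for infants of smoking and non-smoking mothers. The factor has only two levels, so no
additional multiple comparison test is needed.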
f. Is there a main effect of the history of hypertension on infant birth weight? Your answer should
include conclusions statements with biological context and be supported with the calculated
value of the appropriate test statistic, its degrees of freedom, its p value, and information on
whether to reject the null hypothesis. If the null hypothesis is rejected, include information about
the differences among the means (this may require an additional test).
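Yes. History of hypertension has a significant main effect on infant birth weight (F = 4.0577;
df = 1, 186; p = 0.045411). Because p < 0.05, we reject the null hypothesis that mean birth
weight is the same for infants of mothers with and without a history of hypertension. The
factor has only two levels, so no additional multiple comparison test is needed.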
g. How did your expectations from Part I compare to the results you obtained in Part III (d, e, f)?
Part IV. Scientific Writing (20 pts)
a.
Write a brief methods section that describes the data collection and the data analysis. You
may add plausible methodology details to ‘fill in the blanks’ for any information not
provided in the study description (located in R). (8 pts)
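We randomly sampled mothers and their newborns from the medical records of Baystate Medical
Center in Springfield, Massachusetts, during 1986. For each mother we recorded smoking status
during pregnancy and history of hypertension, and for each infant we recorded birth weight (g).
Birth weight was analyzed in R with a two-way ANOVA, with smoking status and history of
hypertension as factors, after checking homogeneity of variances (Levene's test) and normality
of residuals (Shapiro-Wilk test). Significance was set at α = 0.05.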
b. Write a brief results section that describes the results of your analysis. (7pts)
The assumption of homogeneity of variances was met (Levene's test, p > 0.05), and the residuals
did not depart from normality (Shapiro-Wilk test, W = 0.990, p = 0.232). The two-way ANOVA
showed a significant main effect of smoking status (F = 7.15; df = 1, 186; p = 0.008) and a
significant main effect of history of hypertension (F = 4.06; df = 1, 186; p = 0.045) on infant
birth weight. The fitted model did not include a smoking × hypertension interaction term, so
the independence of the two effects was not tested.
c. Include a figure in the results section (you may choose to reuse your figure from Part I or
make a different type). Use Excel/R to create a publication quality figure to illustrate your
data and conclusions; include an appropriate figure legend (caption) with a conclusion
statement supported by statistical output. Be sure to identify any statistical differences
among the data in your graph. (5 pts)
The bar chart illustrates the mean birth weights of infants by smoking status and history of
hypertension, with significance set at p < 0.05.
Part V. Effect size & Power (6 pts)
To review your findings and better design a future study, you decide to run a power analysis on
the effect of history of hypertension on infant birth weight (g). Answer the questions below by
running the appropriate analysis in R (include your R commands and output below).
a. Run the one-way ANOVA.
> modelone <- aov(bwt ~ ht, data = birthwt)
> summary(modelone)
             Df   Sum Sq Mean Sq F value Pr(>F)
ht            1  2130425 2130425   4.072  0.045 *
Residuals   187 97839231  523204
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
b. Determine the effect size.
> eta_squared(modelone)
Parameter | Eta2 |       95% CI
-------------------------------
ht        | 0.02 | [0.00, 1.00]
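Eta squared and Cohen's f follow directly from the sums of squares in the one-way table. Below is a minimal sketch using the values printed in part a; the object names are illustrative. Cohen's f, not eta squared, is the effect size that pwr.anova.test expects in its f argument.

```r
# Sums of squares from the one-way ANOVA table in part a
ss_between <- 2130425
ss_within  <- 97839231

# Eta squared: proportion of total variation explained by ht
eta_sq <- ss_between / (ss_between + ss_within)

# Cohen's f, the effect size used in ANOVA power calculations
cohens_f <- sqrt(eta_sq / (1 - eta_sq))

print(round(c(eta_sq = eta_sq, cohens_f = cohens_f), 3))
```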
c. Determine the power of the analysis you conducted and report it here.
d. Use a power of 0.85 to determine the number of samples needed to detect a difference in means in
the test that you just ran. How many samples would the researchers need per group?
> pwr.anova.test(k= 4, n =NULL , f = .02, sig.level = 0.05, power = 0.85)

     Balanced one-way analysis of variance power calculation

              k = 4
              n = 7689.178
              f = 0.02
      sig.level = 0.05
          power = 0.85

NOTE: n is number in each group
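The required group size can also be found from the noncentral F distribution in base R. This is a minimal sketch, assuming k = 2 groups (ht has two levels) and Cohen's f computed from the sums of squares in part a instead of eta squared; power_for_n and the other names are illustrative, and the balanced-design formula is only an approximation for groups as unequal as these.

```r
# Inputs: two groups (ht = 0 or 1), significance level, target power
k      <- 2
alpha  <- 0.05
target <- 0.85

# Cohen's f from the one-way ANOVA sums of squares in part a
cohens_f <- sqrt(2130425 / 97839231)

# Power of a balanced one-way ANOVA with n observations per group
power_for_n <- function(n) {
  df1    <- k - 1
  df2    <- k * (n - 1)
  lambda <- k * n * cohens_f^2          # noncentrality parameter
  f_crit <- qf(alpha, df1, df2, lower.tail = FALSE)
  pf(f_crit, df1, df2, ncp = lambda, lower.tail = FALSE)
}

# Solve power_for_n(n) = target for n, then round up to whole subjects
n_exact   <- uniroot(function(n) power_for_n(n) - target, c(2, 1e4))$root
n_per_grp <- ceiling(n_exact)
print(n_per_grp)
```

Rounding up guarantees that the achieved power is at least the target. Evaluating power_for_n() at a given group size returns the power for that design, which is one way to approach part c under the same balanced-design assumption.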