Alexander Klemp - Biometrics Lab 5_Fall2023 WC.docx

School

Beloit College

Course

247

Subject

Industrial Engineering

Date

Jan 9, 2024

Prof Cary & Werner Fall 2023
Biometrics Lab 5 (100 pts)
Name(s): Alexander Klemp

You may work individually or collaboratively to develop your answers to this lab. If you work collaboratively on any question, please clearly identify the contribution of each member to each question. Submit one answer for your group and make sure that the file name clearly identifies the group members.

Word identification: Fill in the blank with the term that is defined. (2 points each)

1. Interaction hypothesis
The type of null hypothesis that tests for the dependence of the effects of one factor on the levels of another factor.

2. Randomized block design
An experimental design in which all subjects within a block are randomly assigned to each treatment level.

3. Tukey-Kramer test
A multiple comparison test used to determine which means are significantly different from each other when sample sizes are not equal. This test maintains the probability of committing a Type I error at the significance level.

4. Unreplicated two-factor design (two-factor ANOVA without replication)
A two-factor analysis of variance with only one individual per cell.

5. Mixed-effects (Model III) ANOVA
A two-factor analysis of variance in which the levels of one factor are specifically chosen and the levels of the other factor are random.

6. Visual examination of the data collected from an experiment measuring the effects of two factors on wing length in birds (cm) convinces you that the data exhibit multiplicative effects. Include any formulas in your answers below. (4 points)

a. In order to analyze the data using a two-factor analysis of variance, how should the data be mathematically manipulated prior to analysis?
Apply a logarithmic transformation: X' = log10(X + 1)

b. Your analysis identified the L1 and L2 of your 95% CI to be: 1.633 and 1.696, respectively. Return this CI to its original units.
L1 = antilog(1.633) - 1 = 10^1.633 - 1 = 41.95364 cm
L2 = antilog(1.696) - 1 = 10^1.696 - 1 = 48.65923 cm
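The transformation and back-transformation in question 6 can be checked with a short R sketch. The wing lengths in `wing_cm` are hypothetical values chosen only to show the round trip; the two confidence limits are the ones given in the question, and base-10 logarithms are assumed because they reproduce the reported antilog values.

```r
# Hypothetical wing lengths (cm); these are NOT values from the lab data.
wing_cm <- c(41.2, 44.8, 47.5)

# Forward transform applied before the two-factor ANOVA: X' = log10(X + 1)
wing_log <- log10(wing_cm + 1)

# Back-transform to original units: X = 10^X' - 1
back_transform <- function(x_log) 10^x_log - 1

# Confidence limits from question 6b, on the log scale
ci_log <- c(L1 = 1.633, L2 = 1.696)
ci_cm <- back_transform(ci_log)

print(round(ci_cm, 5))
```

Because the back-transformation is nonlinear, the resulting interval is not symmetric around the back-transformed mean.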
7. Load in the birthwt dataset from the MASS package. The variable bwt measures the birthweight in grams of the infants in the study. A group of researchers found that there was a significant difference in bwt based on the smoking status of the mother. They want to know whether adding information about a mother's history of hypertension will better explain the model.

Part I: Visual review of your data (14 pts)

a. Identify the name of the factor(s) and levels in this model:
Factor A: Smoking status (smoke)
Levels: smoker (1) and non-smoker (0)
Factor B: History of hypertension (ht)
Levels: hypertension (1) and no hypertension (0)

b. Is this experimental design replicated or unreplicated?
The experimental design is replicated.

c. What is the appropriate parametric statistical test to analyze the results of this experiment? Why is it appropriate? Be complete in your answer.
A two-way ANOVA that includes the interaction between smoking status and hypertension history. It is appropriate because there are two categorical factors and one continuous response (birth weight), and because the design is replicated, which allows the interaction between smoking and hypertension to be tested.

d. What are the null hypotheses for this statistical test? Use notation specific to this example.
H0: There is no interaction between smoking status and hypertension history in their effect on infant birth weight.

e. Use Excel to create a "publication quality" bar chart with variation that summarizes the data in this experiment. You do not need to include a figure caption, but be sure that all axes are labeled and that you've included an appropriate legend (key) within the graph.
[Figure: bar chart not reproduced.] Key: 1 = yes, 0 = no.
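The replication claim in Part I and the values summarized by the bar chart can be inspected with a minimal R sketch. It assumes the MASS package is installed; the factor labels and the object names `bw`, `cell_n`, `cell_mean`, and `cell_sd` are introduced here for illustration and do not appear in the original lab.

```r
library(MASS)  # provides the birthwt data frame

bw <- birthwt
bw$smoke <- factor(bw$smoke, levels = c(0, 1),
                   labels = c("non-smoker", "smoker"))
bw$ht <- factor(bw$ht, levels = c(0, 1),
                labels = c("no hypertension", "hypertension"))

# Observations per smoke x ht cell; more than one per cell
# means the design is replicated
cell_n <- table(smoke = bw$smoke, ht = bw$ht)

# Cell means and standard deviations of birth weight (g)
cell_mean <- tapply(bw$bwt, list(bw$smoke, bw$ht), mean)
cell_sd <- tapply(bw$bwt, list(bw$smoke, bw$ht), sd)

print(cell_n)
print(round(cell_mean))
print(round(cell_sd))
```

The cell counts also show how unbalanced the design is, which matters when interpreting the ANOVA table.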
f. Based on the graph, describe the results you expect to obtain from analyzing the data (i.e., how does one mean compare to the other in relative terms? Do you expect that means were significantly different? Do you expect an interaction?). No statistical analysis is necessary to answer this question; simply review your graph and formulate an expected result.
I expect mean birth weight to be lower for mothers with a history of hypertension than for mothers without one, and lower for mothers who smoke than for mothers who do not.

Part II: Test Assumption (16 pts)

To analyze the results of this experiment, you must first investigate the assumptions of the test; you may assume that the samples were randomly and independently collected (Assumption 1).

Assumption 2:
a. Identify this assumption and write the H0 here:
Homogeneity of variances. H0: The variances of birth weights are equal across groups.
b. Run the appropriate analyses in R, then in sentence format, report your decision to reject or accept the null hypotheses and your conclusions. You should include the test statistics and the associated p-values. Include your R commands and output.

Assumption 3:
a. Identify this assumption and write the H0 here:
Normality. H0: The data are normally distributed within each group.
b. Run the appropriate analyses in R, then in sentence format, report your decision to reject or accept the null hypotheses and your conclusions. You should include the test statistics and the associated p-values. Include your R commands and output.

> model <- lm(bwt ~ smoke, data = birthwt)
> shapiro.test(residuals(model))

	Shapiro-Wilk normality test

data:  residuals(model)
W = 0.99031, p-value = 0.2318

Because p = 0.2318 is greater than 0.05, we fail to reject H0: the residuals do not deviate significantly from a normal distribution (W = 0.99031).

Part III: Run the statistical analysis (30 pts)

a. Using the conclusion from Part II, what is the appropriate omnibus statistical test to analyze the results of this experiment?
A two-way ANOVA

b. Has your null hypothesis changed? If so, please restate it here.
H0: There is no interaction between smoking status and history of hypertension in their effect on infant birth weight. Additionally, there is no main effect of smoking status and no main effect of history of hypertension on infant birth weight.
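The two assumption checks from Part II could be run on the full two-factor layout roughly as follows. This is a sketch under stated assumptions, not the analysis the lab ran: it uses Bartlett's test from base R as a stand-in for the Levene's test named in the Results (Levene's test is available as `leveneTest()` in the car package, which is not assumed to be installed here), and it checks normality on residuals from a model containing both factors. The names `bw`, `group`, `var_test`, and `norm_test` are illustrative.

```r
library(MASS)  # provides the birthwt data frame

bw <- birthwt
bw$smoke <- factor(bw$smoke, levels = c(0, 1), labels = c("no", "yes"))
bw$ht <- factor(bw$ht, levels = c(0, 1), labels = c("no", "yes"))

# One grouping factor with a level for each smoke x ht cell
bw$group <- interaction(bw$smoke, bw$ht)

# Assumption 2: equal variances across the four cells (Bartlett's test)
var_test <- bartlett.test(bwt ~ group, data = bw)
print(var_test)

# Assumption 3: normality of residuals from the two-factor model
fit <- lm(bwt ~ smoke * ht, data = bw)
norm_test <- shapiro.test(residuals(fit))
print(norm_test)
```

Bartlett's test is more sensitive to non-normality than Levene's test, so the two can disagree on skewed data.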
c. Run your analysis using R and paste your command and output below.

> model <- lm(bwt ~ smoke + ht, data = birthwt)
> anova(model)
Analysis of Variance Table

Response: bwt
           Df   Sum Sq Mean Sq F value   Pr(>F)
smoke       1  3625946 3625946  7.1529 0.008151 **
ht          1  2056920 2056920  4.0577 0.045411 *
Residuals 186 94286790  506918
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Answer the questions below.

d. Are the effects of smoking status and history of hypertension on infant birth weight independent? Support your answer with the calculated value of the appropriate test statistic, its degrees of freedom, its p value, and information on whether to reject the null hypothesis.
The model above (bwt ~ smoke + ht) is additive and contains no smoke:ht interaction term, so its output cannot show whether the two effects are independent. Answering this question requires refitting the model as bwt ~ smoke * ht and reporting the F statistic, degrees of freedom, and p-value from the smoke:ht row.

e. Is there a main effect of smoking status on infant birth weight? Your answer should include conclusions statements with biological context and be supported with the calculated value of the appropriate test statistic, its degrees of freedom, its p value, and information on whether to reject the null hypothesis. If the null hypothesis is rejected, include information about the differences among the means (this may require an additional test).
Yes. Maternal smoking status has a significant main effect on infant birth weight (F = 7.1529, df = 1, 186, p = 0.008151). Because p < 0.05, we reject the null hypothesis of no main effect of smoking status. The factor has only two levels, so no additional multiple comparison test is needed.

f. Is there a main effect of the history of hypertension on infant birth weight? Your answer should include conclusions statements with biological context and be supported with the calculated value of the appropriate test statistic, its degrees of freedom, its p value, and information on whether to reject the null hypothesis. If the null hypothesis is rejected, include information about the differences among the means (this may require an additional test).
Yes. Maternal history of hypertension has a significant main effect on infant birth weight (F = 4.0577, df = 1, 186, p = 0.045411). Because p < 0.05, we reject the null hypothesis of no main effect of hypertension history. The factor has only two levels, so no additional multiple comparison test is needed.

g. How did your expectations from Part I compare to the results you obtained in Part III (d, e, f)?
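Question d asks about independence of the two effects, which corresponds to the smoke:ht interaction row of a two-way ANOVA table. A minimal sketch of a model that includes that term is shown below; it assumes the MASS package is installed, and the names `bw`, `full_model`, and `full_table` are illustrative rather than taken from the lab.

```r
library(MASS)  # provides the birthwt data frame

bw <- birthwt
bw$smoke <- factor(bw$smoke)
bw$ht <- factor(bw$ht)

# smoke * ht expands to smoke + ht + smoke:ht
full_model <- lm(bwt ~ smoke * ht, data = bw)
full_table <- anova(full_model)
print(full_table)

# The smoke:ht row holds the interaction F statistic and p-value.
# anova() reports sequential (Type I) sums of squares, so with
# unequal cell sizes the main-effect rows depend on term order.
```

If the interaction is not significant, the additive model `bwt ~ smoke + ht` is a reasonable basis for interpreting the main effects.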
Part IV. Scientific Writing (20 pts)

a. Write a brief methods section that describes the data collection and the data analysis. You may add plausible methodology details to 'fill in the blanks' for any information not provided in the study description (located in R). (8 pts)
We randomly sampled mothers and their newborns from the medical records of Baystate Medical Center in Springfield, Massachusetts, during 1986. Infant birth weight (g) was analyzed in R with a two-way ANOVA, using maternal smoking status and history of hypertension as factors. Homogeneity of variances was checked with Levene's test and normality of residuals with a Shapiro-Wilk test, with significance set at p < 0.05.

b. Write a brief results section that describes the results of your analysis. (7 pts)
The assumption of homogeneity of variances was met (Levene's test, p > 0.05), and the residuals did not deviate from normality (Shapiro-Wilk test, W = 0.990, p = 0.232). The two-way ANOVA (bwt ~ smoke + ht) showed a significant main effect of smoking status (F = 7.15, df = 1, 186, p = 0.008) and a significant main effect of history of hypertension (F = 4.06, df = 1, 186, p = 0.045). The fitted model did not include an interaction term, so the independence of the two effects was not tested.

c. Include a figure in the results section (you may choose to reuse your figure from Part I or make a different type). Use Excel/R to create a publication quality figure to illustrate your data and conclusions; include an appropriate figure legend (caption) with a conclusion statement supported by statistical output. Be sure to identify any statistical differences among the data in your graph. (5 pts)
The bar chart illustrates the mean birth weights of infants grouped by maternal smoking status and history of hypertension, with significance set at p < 0.05.
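A grouped bar chart of the kind described in the caption could be drawn in base R along the following lines. This is a sketch, assuming the MASS package is installed and using standard error bars as the measure of variation (the lab does not say which measure was plotted); the labels and the names `bw`, `cell_mean`, `cell_se`, and `mids` are illustrative.

```r
library(MASS)  # provides the birthwt data frame

bw <- birthwt
bw$smoke <- factor(bw$smoke, levels = c(0, 1),
                   labels = c("Non-smoker", "Smoker"))
bw$ht <- factor(bw$ht, levels = c(0, 1),
                labels = c("No hypertension", "Hypertension"))

# Rows = hypertension history, columns = smoking status
cell_mean <- tapply(bw$bwt, list(bw$ht, bw$smoke), mean)
cell_se <- tapply(bw$bwt, list(bw$ht, bw$smoke),
                  function(x) sd(x) / sqrt(length(x)))

# barplot() returns the x positions of the bar midpoints,
# which are reused to place the error bars
mids <- barplot(cell_mean, beside = TRUE,
                ylim = c(0, 1.3 * max(cell_mean + cell_se)),
                xlab = "Maternal smoking status",
                ylab = "Mean birth weight (g)",
                legend.text = rownames(cell_mean),
                args.legend = list(x = "topright",
                                   title = "History of hypertension"))

# Error bars: mean +/- 1 standard error
arrows(mids, cell_mean - cell_se, mids, cell_mean + cell_se,
       angle = 90, code = 3, length = 0.05)
```

When run with `Rscript`, the plot is written to the default PDF device; in an interactive session it opens in the plot window.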
Part V. Effect size & Power (6 pts)

To review your findings and better design a future study, you decide to run a power analysis on the effect of history of hypertension on infant birth weight (g). Answer the questions below by running the appropriate analysis in R (include your R commands and output below).

a. Run the one-way ANOVA.

> modelone <- aov(bwt ~ ht, data = birthwt)
> summary(modelone)
             Df   Sum Sq Mean Sq F value Pr(>F)
ht            1  2130425 2130425   4.072  0.045 *
Residuals   187 97839231  523204
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

b. Determine the effect size.

> eta_squared(modelone)
Parameter | Eta2 |       95% CI
-------------------------------
ht        | 0.02 | [0.00, 1.00]

c. Determine the power of the analysis you conducted and report it here.

d. Use a power of 0.85 to determine the number of samples needed to detect a difference in means in the test that you just ran. How many samples would the researchers need per group?

> pwr.anova.test(k = 4, n = NULL, f = .02, sig.level = 0.05, power = 0.85)

     Balanced one-way analysis of variance power calculation

              k = 4
              n = 7689.178
              f = 0.02
      sig.level = 0.05
          power = 0.85

NOTE: n is number in each group
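Two inputs to the sample-size call above deserve a second look: ht has two levels, which suggests k = 2, and `pwr.anova.test()` expects Cohen's f, which is related to eta squared by f = sqrt(eta2 / (1 - eta2)), rather than eta squared itself. The sketch below recomputes both from the sums of squares in the one-way ANOVA table and solves for n with `power.anova.test()` from base R, so no extra package is assumed. The conversion `between.var = k * f^2 / (k - 1)` with `within.var = 1` is used because it gives the same noncentrality parameter as Cohen's f; the equivalent call in the pwr package would be `pwr.anova.test(k = 2, f = cohen_f, sig.level = 0.05, power = 0.85)`. Variable names are illustrative.

```r
# Sums of squares copied from the one-way ANOVA table (bwt ~ ht)
ss_ht <- 2130425
ss_resid <- 97839231

# Effect size: eta squared, then Cohen's f
eta2 <- ss_ht / (ss_ht + ss_resid)
cohen_f <- sqrt(eta2 / (1 - eta2))

k <- 2  # ht has two levels

# power.anova.test() takes variances rather than f. With within.var = 1,
# between.var = k * f^2 / (k - 1) yields the same noncentrality
# parameter (k * n * f^2) that a calculation based on Cohen's f uses.
needed <- power.anova.test(groups = k,
                           between.var = k * cohen_f^2 / (k - 1),
                           within.var = 1,
                           sig.level = 0.05,
                           power = 0.85)
print(needed)

# Round up to whole subjects per group
n_per_group <- ceiling(needed$n)
print(n_per_group)
```

Both functions assume a balanced design, so the result is an approximation for a study in which the hypertension groups are very different in size.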