Nichols-assignment-g-anova

School

University Of Arizona *

Course

376

Subject

Mechanical Engineering

Date

Jan 9, 2024

Uploaded by EarlMule3901

BME 376: Assignment G - Nichols

• This assignment is focused on hypothesis testing using Analysis of Variance (ANOVA). All answers should be entered in the space provided below each question.
• Upload a completed Word document to D2L.

1. [6 points] Patients suffering from rheumatic diseases such as osteoporosis often suffer critical losses in bone mineral density (BMD). Alendronate is one medication prescribed to build or prevent further loss of BMD. A study looked at 96 women taking alendronate to determine if a difference existed in the mean percent change in BMD among five different primary diagnosis classifications. Group 1 patients were diagnosed with rheumatoid arthritis (RA). Group 2 patients were patients with diseases including lupus and other vasculitis diseases (LUPUS). Group 3 patients had polymyalgia rheumatica or temporal arteritis (PMRTA). Group 4 patients had osteoarthritis (OA), and group 5 patients had osteoporosis (O) with no other rheumatic diseases identified in the medical record. Changes in BMD among the five groups are shown in rheumatic.csv (download from D2L). Do these data provide sufficient evidence to indicate that the mean bone mineral density is different across the five patient groups? Assume the level of significance, α = 0.05. Also, if there are significant differences in mean BMD across groups, determine which groups differ significantly from each other.

Step 0: Identify and state an appropriate statistical test
One-Way ANOVA

Step 1: Identify the dependent and independent variables and state the null hypothesis and alternate hypothesis
Note: Write in full sentences, i.e., write out hypotheses in the context of the question provided.
Independent (treatment) variable: Group
Dependent (outcome) variable: BMD Value
Null hypothesis H0: $\mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$
Alternate hypothesis HA: Not all means are equal.
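As a minimal sketch of how a one-way ANOVA variance ratio is assembled, the block below partitions variability into among-group and within-group sums of squares and compares the hand-computed F with the one from `anova(lm(...))`. The data frame `toy`, its columns, and the group sizes are made up for illustration; they are not the rheumatic.csv data.

```r
# Hypothetical data: three groups of unequal size (NOT the rheumatic.csv data).
set.seed(1)
toy <- data.frame(
  group = factor(rep(c("g1", "g2", "g3"), times = c(5, 4, 6))),
  y     = c(rnorm(5, mean = 4), rnorm(4, mean = 5), rnorm(6, mean = 7))
)

grand_mean <- mean(toy$y)
group_n    <- tapply(toy$y, toy$group, length)  # observations per group
group_mean <- tapply(toy$y, toy$group, mean)    # mean per group
group_var  <- tapply(toy$y, toy$group, var)     # sample variance per group

k <- nlevels(toy$group)  # number of groups
N <- nrow(toy)           # total number of observations

# Among-group SS: group size times squared deviation of group mean from grand mean.
ss_among  <- sum(group_n * (group_mean - grand_mean)^2)
# Within-group SS: (n_j - 1) times the sample variance of each group.
ss_within <- sum((group_n - 1) * group_var)

ms_among  <- ss_among / (k - 1)
ms_within <- ss_within / (N - k)
f_manual  <- ms_among / ms_within

# Cross-check against the built-in ANOVA table and compute the p-value.
f_builtin <- anova(lm(y ~ group, data = toy))[["F value"]][1]
p_value   <- pf(f_manual, df1 = k - 1, df2 = N - k, lower.tail = FALSE)
print(c(manual = f_manual, builtin = f_builtin, p = p_value))
```

With five groups and 96 observations, the same formulas give degrees of freedom of 4 and 91, which are the divisors used in the manual calculation for this question.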
Step 2: Compute test statistic (Variance Ratio)
Note: You are encouraged to use R to perform any calculations required for the test statistic (variance ratio). You may then verify the manually computed test statistic with the one generated in the next step. Also, copy-and-paste R commands (avoid including screenshots).

# Include R commands here
    library(dplyr)  # provides filter()
    rheumatic <- read.csv("rheumatic.csv")
    rheumatic1 <- filter(rheumatic, GROUP == 1)
    rheumatic2 <- filter(rheumatic, GROUP == 2)
    rheumatic3 <- filter(rheumatic, GROUP == 3)
    rheumatic4 <- filter(rheumatic, GROUP == 4)
    rheumatic5 <- filter(rheumatic, GROUP == 5)
    # Among-group sum of squares: group size times squared deviation from the grand mean
    rheumaticSSA <- (37 * (mean(rheumatic1$BMD) - mean(rheumatic$BMD))^2) +
      (9 * (mean(rheumatic2$BMD) - mean(rheumatic$BMD))^2) +
      (16 * (mean(rheumatic3$BMD) - mean(rheumatic$BMD))^2) +
      (24 * (mean(rheumatic4$BMD) - mean(rheumatic$BMD))^2) +
      (10 * (mean(rheumatic5$BMD) - mean(rheumatic$BMD))^2)
    rheumaticMSA <- rheumaticSSA / 4
    # Within-group sum of squares: (n - 1) times each group's sample variance
    rheumaticSSW <- var(rheumatic1$BMD) * 36 + var(rheumatic2$BMD) * 8 +
      var(rheumatic3$BMD) * 15 + var(rheumatic4$BMD) * 23 + var(rheumatic5$BMD) * 9
    rheumaticMSW <- rheumaticSSW / 91
    ratio <- rheumaticMSA / rheumaticMSW
    ratio
# Paste R outputs here
    2.277178

Step 3: Statistical decision
Use R to run the test.

# Include R commands here
    rheumatic <- read.csv("rheumatic.csv")
    summary(rheumatic)
# Paste R outputs here
          BMD              GROUP
     Min.   :-9.6460   Min.   :1.000
     1st Qu.: 0.8562   1st Qu.:1.000
     Median : 4.2445   Median :3.000
     Mean   : 4.8240   Mean   :2.594
     3rd Qu.: 7.4357   3rd Qu.:4.000
     Max.   :25.6550   Max.   :5.000

    sapply(rheumatic, class)
          BMD     GROUP
    "numeric" "integer"

    rheumatic$GROUP <- factor(rheumatic$GROUP)
    anova(lm(rheumatic$BMD ~ rheumatic$GROUP))
    Analysis of Variance Table

    Response: rheumatic$BMD
                    Df Sum Sq Mean Sq F value  Pr(>F)
    rheumatic$GROUP  4  355.5  88.864  2.2772 0.06697 .
    Residuals       91 3551.1  39.024
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    oneway.test(rheumatic$BMD ~ rheumatic$GROUP)
    One-way analysis of means (not assuming equal variances)

    data:  rheumatic$BMD and rheumatic$GROUP
    F = 2.0513, num df = 4.000, denom df = 30.536, p-value = 0.1118

Step 4: Final decision
Note: Write in full sentences, i.e., expand on what rejecting or not rejecting the null hypothesis means in the context of the question provided.
We fail to reject the null hypothesis because the p-value (0.06697) is greater than 0.05. The data do not provide sufficient evidence that the mean percent change in BMD differs across the five patient groups. Because the overall test is not significant, no pairwise comparisons between groups are made.

2. [6 points] A study reported the effects of chewing one piece of nicotine gum (containing 2 mg nicotine) on tic frequency in patients with Tourette's syndrome. Note: Tourette syndrome is a neurological disorder characterized by repetitive, stereotyped, involuntary movements and vocalizations called tics. Tic frequencies were measured on each patient under four conditions: (1) baseline, (2) 30 minutes of gum chewing, (3) 0 to 30 minutes after gum chewing, (4) 30 to 60 minutes after gum chewing. A total of 10 patients were included in the study. Can we conclude that the mean number of tics differs among the four conditions? Assume the level of significance, α = 0.05. Download data from D2L (ticfreq.csv).

Step 0: Identify and state an appropriate statistical test.
Repeated Measures ANOVA

Step 1: Identify the dependent and independent variables and state the null hypothesis and alternate hypothesis
Note: Write in full sentences, i.e., write out hypotheses in the context of the question provided.
Independent (treatment) variable: Time After Chewing Gum
Dependent (outcome) variable: Number of Tics
Null hypothesis H0: $\mu_{\text{baseline}} = \mu_2 = \mu_3 = \mu_4$
Alternate hypothesis HA: Not all means are equal.

Step 2: Compute the test statistics.
Note: You are encouraged to use R to perform any calculations required for the test statistic (variance ratio). You may then verify the manually computed test statistic with the one generated in the next step. Also, copy-and-paste R commands (avoid including screenshots).
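For a design in which every subject is measured under every condition, the variance ratio for the condition effect comes from splitting the total sum of squares into condition, subject, and error parts. The block below is a minimal sketch of that partition on made-up data (5 hypothetical subjects, 3 hypothetical conditions; not ticfreq.csv). It checks the hand-computed F against the additive model `y ~ cond + subject`, which yields the same F for the condition effect when the design is complete and balanced.

```r
# Hypothetical data: 5 subjects, each measured under 3 conditions (NOT ticfreq.csv).
set.seed(2)
n_subj <- 5
n_cond <- 3
toy <- expand.grid(subject = factor(1:n_subj), cond = factor(c("a", "b", "c")))
toy$y <- rnorm(nrow(toy), mean = 10) + as.integer(toy$subject) + 2 * as.integer(toy$cond)

grand_mean <- mean(toy$y)

# Total SS is taken over the response variable itself.
ss_total <- sum((toy$y - grand_mean)^2)
# Condition SS: each condition mean is based on n_subj observations.
ss_cond  <- n_subj * sum((tapply(toy$y, toy$cond, mean) - grand_mean)^2)
# Subject SS: each subject mean is based on n_cond observations.
ss_subj  <- n_cond * sum((tapply(toy$y, toy$subject, mean) - grand_mean)^2)
# Error SS is what remains.
ss_error <- ss_total - ss_cond - ss_subj

df_cond  <- n_cond - 1
df_error <- (n_cond - 1) * (n_subj - 1)
f_manual <- (ss_cond / df_cond) / (ss_error / df_error)

# Cross-check: both identifiers must be factors for the degrees of freedom to be right.
f_builtin <- anova(lm(y ~ cond + subject, data = toy))[["F value"]][1]
print(c(manual = f_manual, builtin = f_builtin))
```

With 10 patients and 4 conditions, the same bookkeeping gives 3 degrees of freedom for the condition effect and 27 for error. When `aov()` is used with an `Error()` term, the subject identifier likewise needs to be a factor for the strata to have the expected degrees of freedom.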
# Include R commands here
    ticfreq <- read.csv("ticfreq.csv")
    ticfreq$TIME <- factor(ticfreq$TIME)
    # note: ticfreq has no SUBJ column; its columns are PATIENT, TIME, TIC
    ticfreq$SUBJ <- factor(ticfreq$SUBJ)
    baseline <- subset(ticfreq, TIME == "BASELINE")
    gum <- subset(ticfreq, TIME == "GUM")
    min30 <- subset(ticfreq, TIME == "MIN30")
    min60 <- subset(ticfreq, TIME == "MIN60")
    # note: computed from PATIENT rather than TIC, so this is not the
    # total sum of squares of the tic counts
    ticstotal <- var(ticfreq$PATIENT) * (nrow(ticfreq) - 1)
    # Sum of squares for time: 10 patients per condition
    ticsstime <- ((mean(baseline$TIC) - mean(ticfreq$TIC))^2 +
      (mean(gum$TIC) - mean(ticfreq$TIC))^2 +
      (mean(min30$TIC) - mean(ticfreq$TIC))^2 +
      (mean(min60$TIC) - mean(ticfreq$TIC))^2) * 10
    patient1 <- subset(ticfreq, PATIENT == 1)
    patient2 <- subset(ticfreq, PATIENT == 2)
    patient3 <- subset(ticfreq, PATIENT == 3)
    patient4 <- subset(ticfreq, PATIENT == 4)
    patient5 <- subset(ticfreq, PATIENT == 5)
    patient6 <- subset(ticfreq, PATIENT == 6)
    patient7 <- subset(ticfreq, PATIENT == 7)
    patient8 <- subset(ticfreq, PATIENT == 8)
    patient9 <- subset(ticfreq, PATIENT == 9)
    patient10 <- subset(ticfreq, PATIENT == 10)
    # Sum of squares for subjects: 4 conditions per patient
    ticssubject <- ((mean(patient1$TIC) - mean(ticfreq$TIC))^2 +
      (mean(patient2$TIC) - mean(ticfreq$TIC))^2 +
      (mean(patient3$TIC) - mean(ticfreq$TIC))^2 +
      (mean(patient4$TIC) - mean(ticfreq$TIC))^2 +
      (mean(patient5$TIC) - mean(ticfreq$TIC))^2 +
      (mean(patient6$TIC) - mean(ticfreq$TIC))^2 +
      (mean(patient7$TIC) - mean(ticfreq$TIC))^2 +
      (mean(patient8$TIC) - mean(ticfreq$TIC))^2 +
      (mean(patient9$TIC) - mean(ticfreq$TIC))^2 +
      (mean(patient10$TIC) - mean(ticfreq$TIC))^2) * 4
    ticsse <- ticstotal - ticsstime - ticssubject
    ticmstime <- ticsstime / 3
    ticmssubject <- ticssubject / 9
    ticmserror <- ticsse / 27
    ticvar <- ticmstime / ticmserror
    ticvar
# Paste R outputs here
    -0.7489465

Step 3: Statistical decision
Use R to run the test.

# Include R commands here
    ticfreq <- read.csv("ticfreq.csv")
    sapply(ticfreq, class)
# Paste R outputs here
      PATIENT        TIME         TIC
    "integer" "character"   "integer"

    ticfreq$TIME <- factor(ticfreq$TIME)
    ticfreq$SUBJ <- factor(ticfreq$SUBJ)
    summary(aov(ticfreq$TIC ~ ticfreq$TIME + Error(ticfreq$PATIENT / ticfreq$TIME)))

    Error: ticfreq$PATIENT
              Df Sum Sq Mean Sq F value Pr(>F)
    Residuals  1  11898   11898

    Error: ticfreq$PATIENT:ticfreq$TIME
                 Df Sum Sq Mean Sq
    ticfreq$TIME  3  65190   21730

    Error: Within
                 Df  Sum Sq Mean Sq F value Pr(>F)
    ticfreq$TIME  3  155097   51699   0.905  0.449
    Residuals    32 1827335   57104

Step 4: Final decision
Note: Write in full sentences, i.e., expand on what rejecting or not rejecting the null hypothesis means in the context of the question provided.
The p-value (0.449) is greater than 0.05, so we fail to reject the null hypothesis. The data do not provide sufficient evidence that mean tic frequency differs among the four conditions.

3. [6 points] A study examined the beta-leucocyte count ($\times 10^9$ per L) in 51 subjects with colorectal cancer and 19 healthy controls. The cancer patients were also classified into Dukes' classification (A, B, C) for colorectal cancer, which gives doctors a guide to the risk, following surgery, of the cancer coming back or spreading to other parts of the body. An additional category (D) identified patients with disease that had not been completely resected. So, overall, we have five groups: A, B, C, D, and control.
Perform an analysis of these data (colorectal.csv on D2L) in which you identify the sources of variability and determine if these data provide sufficient evidence to indicate that, on average, leucocyte counts differ among the five categories. Assume α = 0.01. Also, use Tukey's procedure to test for significant differences between individual pairs of sample means.
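Tukey's procedure for pairwise comparisons is available in base R as `TukeyHSD()`, applied to a fitted `aov` object. The block below is a minimal sketch on made-up data (three hypothetical groups of eight; not colorectal.csv), using a 99% family-wise confidence level to match α = 0.01. The names `toy`, `count`, and `group` are assumptions for illustration.

```r
# Hypothetical data: three groups of eight observations (NOT colorectal.csv).
set.seed(3)
toy <- data.frame(
  group = factor(rep(c("A", "B", "control"), each = 8)),
  count = c(rnorm(8, mean = 8, sd = 1.5),
            rnorm(8, mean = 8.5, sd = 1.5),
            rnorm(8, mean = 6, sd = 1.5))
)

# TukeyHSD() needs an aov fit with the grouping variable stored as a factor.
fit   <- aov(count ~ group, data = toy)
tukey <- TukeyHSD(fit, "group", conf.level = 0.99)

# One row per pair of groups: difference in means, interval bounds, adjusted p-value.
pair_table  <- tukey$group
significant <- pair_table[, "p adj"] < 0.01
print(pair_table)
print(significant)
```

A pair is declared different at the 0.01 level when its adjusted p-value is below 0.01, which is the same as its 99% interval excluding zero. With five groups there would be ten such pairs.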

Step 0: Identify and state an appropriate statistical test.
One-Way ANOVA

Step 1: Identify the dependent and independent variables and state the null hypothesis and alternate hypothesis
Note: Write in full sentences, i.e., write out hypotheses in the context of the question provided.
Independent (treatment) variable: Classification of Colorectal Cancer
Dependent (outcome) variable: Leucocyte Count
Null hypothesis H0: $\mu_{\text{healthy}} = \mu_A = \mu_B = \mu_C = \mu_D$
Alternate hypothesis HA: Not all means are equal.

Step 2: Compute the test statistics.
Note: You are encouraged to use R to perform any calculations required for the test statistic (variance ratio). You may then verify the manually computed test statistic with the one generated in the next step. Also, copy-and-paste R commands (avoid including screenshots).

# Include R commands here
    library(dplyr)  # provides filter()
    colorectal <- read.csv("colorectal.csv")
    A <- filter(colorectal, Group == "A")
    B <- filter(colorectal, Group == "B")
    C <- filter(colorectal, Group == "C")
    D <- filter(colorectal, Group == "D")
    healthy <- filter(colorectal, Group == "healthy")
    # Among-group sum of squares: group size times squared deviation from the grand mean
    crssa <- (8 * (mean(A$Count) - mean(colorectal$Count))^2) +
      (18 * (mean(B$Count) - mean(colorectal$Count))^2) +
      (16 * (mean(C$Count) - mean(colorectal$Count))^2) +
      (9 * (mean(D$Count) - mean(colorectal$Count))^2) +
      (24 * (mean(healthy$Count) - mean(colorectal$Count))^2)
    crmsa <- crssa / 4
    # Within-group sum of squares: (n - 1) times each group's sample variance
    crssw <- var(A$Count) * 7 + var(B$Count) * 17 + var(C$Count) * 15 +
      var(D$Count) * 8 + var(healthy$Count) * 23
    crmsw <- crssw / 70
    crmsa / crmsw
# Paste R outputs here
    7.038628

Step 3: Statistical decision
Use R to run the test.

# Include R commands here
    colorectal <- read.csv("colorectal.csv")
    summary(colorectal)
# Paste R outputs here
         Count           Group
     Min.   : 2.600   Length:75
     1st Qu.: 5.600   Class :character
     Median : 6.900   Mode  :character
     Mean   : 7.147
     3rd Qu.: 8.300
     Max.   :13.700

    sapply(colorectal, class)

          Count       Group
      "numeric" "character"

    anova(lm(colorectal$Count ~ colorectal$Group))
    Analysis of Variance Table

    Response: colorectal$Count
                     Df  Sum Sq Mean Sq F value    Pr(>F)
    colorectal$Group  4  86.725 21.6812  7.0386 8.026e-05 ***
    Residuals        70 215.622  3.0803
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    oneway.test(colorectal$Count ~ colorectal$Group)
    One-way analysis of means (not assuming equal variances)

    data:  colorectal$Count and colorectal$Group
    F = 7.3463, num df = 4.000, denom df = 24.993, p-value = 0.0004684

Step 4: Final decision
Note: Write in full sentences, i.e., expand on what rejecting or not rejecting the null hypothesis means in the context of the question provided.
We reject the null hypothesis because the p-value (8.026e-05) is less than 0.01. This means that the mean leucocyte count is not the same across the five categories: at least one group mean differs from the others.

4. [6 points] A study of pulmonary effects on guinea pigs exposed 18 ovalbumin-sensitized guinea pigs and 18 non-sensitized guinea pigs to three treatments: regular air, benzaldehyde, and acetaldehyde. At the end of exposure, the guinea pigs were anesthetized, and allergic responses were assessed in bronchoalveolar lavage (BAL). Data (guinea.csv) shows the alveolar cell count ($\times 10^6$) by treatment group for the ovalbumin-sensitized and non-sensitized guinea pigs. After eliminating sensitization effects, can we conclude, based on these data, that mean BAL significantly varies between the three treatment groups? Assume the level of significance, α = 0.05.

Hints:
• Sensitization is a blocking factor, implying we have 2 blocks (sensitized and non-sensitized), 18 subjects per block. This would mean that variability across blocks is characterized as (sum of squares of each block mean from the grand mean) * (18 subjects in each block).
• There are 12 subjects per treatment. So, variability across treatments is characterized as (sum of squares of each treatment group mean from the grand mean) * (12 subjects in each treatment group).
• Since we have more than one subject in each block, the residual degrees of freedom are: (total_df - treatment_df - block_df).

Step 0: Identify and state an appropriate statistical test.
Two-Way ANOVA

Step 1: Identify the dependent and independent variables and state the null hypothesis and alternate hypothesis
Note: Write in full sentences, i.e., write out hypotheses in the context of the question provided.
Independent (treatment) variable: Treatment (Regular Air, Benzaldehyde, Acetaldehyde)

Dependent (outcome) variable: Alveolar Cell Count
Null hypothesis H0: The mean BAL is the same for all treatment groups.
Alternate hypothesis HA: The mean BAL is not the same for all treatment groups.

Step 2: Compute the test statistics.
Note: You are encouraged to use R to perform any calculations required for the test statistic (variance ratio). You may then verify the manually computed test statistic with the one generated in the next step. Also, copy-and-paste R commands (avoid including screenshots).

# Include R commands here
    guinea <- read.csv("guinea.csv")
    guinea$Sens <- factor(guinea$Sens, labels = c("0", "1"))
    guinea$Treat <- factor(guinea$Treat, labels = c("1", "2", "3"))
    sapply(guinea, class)
# Paste R outputs here
         Sens     Treat     Count
     "factor"  "factor" "numeric"

    sstotal <- var(guinea$Count) * (nrow(guinea) - 1)
    desense <- subset(guinea, Sens == 0)
    sense <- subset(guinea, Sens == 1)
    # Block sum of squares: 18 subjects per block
    ssblock <- ((mean(desense$Count) - mean(guinea$Count))^2 +
      (mean(sense$Count) - mean(guinea$Count))^2) * 18
    act <- subset(guinea, Treat == 1)
    air <- subset(guinea, Treat == 2)
    benz <- subset(guinea, Treat == 3)
    # Treatment sum of squares: 12 subjects per treatment
    sstreat <- ((mean(act$Count) - mean(guinea$Count))^2 +
      (mean(air$Count) - mean(guinea$Count))^2 +
      (mean(benz$Count) - mean(guinea$Count))^2) * 12
    sse <- sstotal - ssblock - sstreat
    # note: the hint's formula gives residual df = 35 - 2 - 1 = 32, the value aov() reports in Step 3
    mse <- sse / 34
    mstreat <- sstreat / 2
    ratio <- mstreat / mse
    ratio
    10.70355

Step 3: Statistical decision
Use R to run the test.

# Include R commands here
    summary(aov(Count ~ Treat + Sens, data = guinea))
# Paste R outputs here
                Df Sum Sq Mean Sq F value   Pr(>F)
    Treat        2   7689    3844   10.07 0.000404 ***
    Sens         1   7906    7906   20.72 7.28e-05 ***
    Residuals   32  12212     382
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Step 4: Final decision
Note: Write in full sentences, i.e., expand on what rejecting or not rejecting the null hypothesis means in the context of the question provided.
We reject the null hypothesis because the p-value for treatment (0.000404) is less than 0.05. This means that, after accounting for sensitization, the mean BAL alveolar cell count of the guinea pigs is not the same across the three treatment groups.
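The third hint gives the residual degrees of freedom as total_df - treatment_df - block_df. For 36 animals, 3 treatments, and 2 blocks that is 35 - 2 - 1 = 32, which matches the Residuals row of the `aov()` table. The block below is a minimal sketch of the full blocked calculation on made-up data with the same layout (2 blocks of 18, 3 treatments of 12). The data frame `toy` and its values are hypothetical and are not the guinea.csv data.

```r
# Hypothetical data with the same layout as the question: 2 blocks x 3 treatments x 6 animals.
set.seed(4)
toy <- expand.grid(
  rep   = 1:6,
  treat = factor(c("air", "benz", "acet")),
  block = factor(c("non", "sens"))
)
toy$count <- 50 + 10 * as.integer(toy$treat) + 15 * as.integer(toy$block) +
  rnorm(nrow(toy), sd = 8)

grand_mean <- mean(toy$count)
ss_total <- sum((toy$count - grand_mean)^2)
# Block SS: squared deviations of block means, times 18 subjects per block.
ss_block <- 18 * sum((tapply(toy$count, toy$block, mean) - grand_mean)^2)
# Treatment SS: squared deviations of treatment means, times 12 subjects per treatment.
ss_treat <- 12 * sum((tapply(toy$count, toy$treat, mean) - grand_mean)^2)
ss_error <- ss_total - ss_block - ss_treat

df_total <- nrow(toy) - 1                     # 36 - 1 = 35
df_treat <- nlevels(toy$treat) - 1            # 3 - 1 = 2
df_block <- nlevels(toy$block) - 1            # 2 - 1 = 1
df_error <- df_total - df_treat - df_block    # 35 - 2 - 1 = 32

f_manual <- (ss_treat / df_treat) / (ss_error / df_error)

# Cross-check against the additive two-factor model.
f_builtin <- summary(aov(count ~ treat + block, data = toy))[[1]][["F value"]][1]
print(c(manual = f_manual, builtin = f_builtin, df_error = df_error))
```

Dividing the error sum of squares by any other number of degrees of freedom changes the variance ratio, so the manual and built-in F values agree only when 32 is used.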

5. [1 point] Statement of Attribution and Challenges: Please list and describe any resource or external support used to complete this assignment. You should clearly specify and attribute any source of support, including but not limited to, the use of online and physical materials, software tools and websites, human tutors, digital tutors, programming assistants, and any other form of artificial intelligence (AI) tools. You are also welcome to include questions or general comments/challenges that came up while you were doing this assignment.

Bella and I worked together on this. I also used the notes very closely to assist with my work.