19BCE1567_EDA_LAB4
19BCE1567 03/02/2022
SARA KULKARNI
L21+L22 SLOT: PROF LAKHSMI PATHI
EDA LAB 4
Tasks for Week-4: Analysis of Variance (ANOVA)
Aim:
Perform an ANOVA test and determine whether there are statistically significant differences between the means of the individual groups in the data.
ALGORITHM:
1. Start
2. Read the data into the data variable
3. Group the data by color using the group_by command in the dplyr library, and summarize the count and mean of the response column
4. Fit the ANOVA model using the aov command and display its summary
5. If the p-value (Pr(>F)) is less than 0.05, reject the null hypothesis that all group means are equal
6. Using Tukey HSD (Tukey Honest Significant Differences), compare the p adj values for the pairwise differences between the 3 groups (red, blue, green)
STATISTICS:
> head(data)
block color response
1 a red 1.9
2 b red 2.6
3 c red 3.4
4 d red 0.8
5 e red 5.3
6 f red 1.5
> group_by(data,color) %>% summarise(count = n(),mean = mean(response, na.rm = TRUE))
# A tibble: 3 x 3
color count mean
<chr> <int> <dbl>
1 blue 24 10.6
2 green 24 8.53
3 red 24 2.49
> ANOVA <- aov(response~color, data = data)
> summary(ANOVA)
            Df Sum Sq Mean Sq F value   Pr(>F)
color        2  857.2   428.6   14.81 4.44e-06 ***
Residuals   69 1996.4    28.9
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
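As a check on the table above, the F value can be recomputed by hand from the sums of squares and degrees of freedom. This is the standard one-way ANOVA definition, written out here as a sketch; the symbols k (number of groups) and N (total observations) are notation introduced for this illustration, with k = 3 and N = 72 taken from the group counts.

```latex
\begin{aligned}
\mathit{df}_{\text{color}} &= k - 1 = 3 - 1 = 2, \qquad
\mathit{df}_{\text{resid}} = N - k = 72 - 3 = 69 \\[4pt]
F &= \frac{MS_{\text{color}}}{MS_{\text{resid}}}
   = \frac{SS_{\text{color}} / \mathit{df}_{\text{color}}}{SS_{\text{resid}} / \mathit{df}_{\text{resid}}}
   = \frac{857.2 / 2}{1996.4 / 69}
   \approx \frac{428.6}{28.93}
   \approx 14.81
\end{aligned}
```

The rounded Mean Sq shown in the table (28.9) gives a slightly different quotient; the unrounded residual mean square reproduces the reported 14.81.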
> TukeyHSD(ANOVA)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = response ~ color, data = data)
$color
diff lwr upr p adj
green-blue -2.101667 -5.821045 1.617711 0.3709119
red-blue -8.140417 -11.859795 -4.421039 0.0000049
red-green -6.038750 -9.758128 -2.319372 0.0006628
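The lwr and upr columns above can be roughly reproduced from the ANOVA table alone. The sketch below assumes the balanced design shown in the group summary (24 observations per color) and uses the usual balanced-design Tukey half-width, q * sqrt(MSE / n). The variable names are illustrative and not part of the original program.

```r
# Values copied from the summary(ANOVA) table
ss_resid    <- 1996.4  # residual sum of squares
df_resid    <- 69      # residual degrees of freedom
n_groups    <- 3       # red, blue, green
n_per_group <- 24      # observations per color (balanced design)

# Residual mean square (MSE)
mse <- ss_resid / df_resid

# Critical value of the studentized range at the 95% family-wise level
q_crit <- qtukey(0.95, nmeans = n_groups, df = df_resid)

# Half-width of each Tukey interval for a balanced design
half_width <- q_crit * sqrt(mse / n_per_group)

# Differences in group means, copied from the TukeyHSD output
diffs <- c("green-blue" = -2.101667,
           "red-blue"   = -8.140417,
           "red-green"  = -6.038750)

# A pair differs significantly when its interval excludes zero,
# i.e. when the absolute difference exceeds the half-width
significant <- abs(diffs) > half_width

print(round(half_width, 3))
print(significant)
```

Each interval in the table is diff minus and plus this half-width, so only the pairs involving red exclude zero.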
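INFERENCE:
The ANOVA gives a p-value of 4.44e-06, which is less than 0.05, so we reject the null hypothesis that the mean response is the same for all three colors.
The Tukey HSD (Tukey Honest Significant Differences) test then compares the groups pairwise using the p adj values: red differs significantly from both blue (p adj = 0.0000049) and green (p adj = 0.0006628), while green and blue do not differ significantly (p adj = 0.3709119).
PROGRAM:
# To clear the environment
rm(list=ls())
# To read the data
data <- read.csv("color-anova.csv")
head(data)
# dplyr provides group_by(), summarise() and the %>% pipe
library(dplyr)
# To group the data by color and report the count and mean response per group
group_by(data, color) %>%
  summarise(count = n(), mean = mean(response, na.rm = TRUE))
# ANOVA: one-way model of response by color
ANOVA <- aov(response ~ color, data = data)
summary(ANOVA)
# Pairwise comparisons between the color groups
TukeyHSD(ANOVA)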