Assignment #8

School: University of Toronto
Course: 343
Subject: Statistics
Date: Feb 20, 2024

1. Independent variables: The first independent variable is Age, which has two levels: younger and older. The second independent variable is Film, which has two levels: “Despicable Me” and “Memento.”

Dependent variable: The dependent variable is engagement.

Null hypotheses: Mean engagement does not differ between the younger and older groups, mean engagement does not differ between the two films, and there is no Age × Film interaction on engagement.

Alternative hypotheses: Mean engagement differs between the age groups, differs between the films, or the effect of one variable on engagement depends on the level of the other.

Structure of the ANOVA: This is a 2 (Age) × 2 (Film) factorial ANOVA. I will use the factor() function to turn the two independent variables into labeled factors so that R can run the test easily. Both are categorical variables with different levels, and factor() lets me specify those levels and their order, which makes it simple to examine and compare the labeled group means.

2.
> AGEmovies = factor(Movies$age, levels = c("young", "old"))
> AGEmovies
 [1] young young young young young young young young young young young young young young young young young
[18] young young young old   old   old   old   old   old   old   old   old   old   old   old   old   old
[35] old   old   old   old   old   old
Levels: young old

> FILMmovies = factor(Movies$film, levels = c("Despicable me", "Memento"))
> FILMmovies
 [1] Despicable me Despicable me Despicable me Despicable me Despicable me Despicable me Despicable me
 [8] Despicable me Despicable me Despicable me Memento       Memento       Memento       Memento
[15] Memento       Memento       Memento       Memento       Memento       Memento       Despicable me
[22] Despicable me Despicable me Despicable me Despicable me Despicable me Despicable me Despicable me
[29] Despicable me Despicable me Memento       Memento       Memento       Memento       Memento
[36] Memento       Memento       Memento       Memento       Memento
Levels: Despicable me Memento
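As a minimal sketch of what the levels argument of factor() does, the lines below use small made-up vectors (age_raw and film_raw are hypothetical and are not the Movies data). They illustrate two points that matter for the factors above: the order of the levels follows the levels argument rather than the alphabet, and any label that is not listed in levels silently becomes NA.

```r
# Hypothetical raw labels for illustration only (not the Movies data)
age_raw <- c("old", "young", "young", "old")

# Without `levels`, factor() sorts the labels alphabetically
age_default <- factor(age_raw)
levels(age_default)

# With `levels`, the stated order is used, so "young" becomes level 1
age_ordered <- factor(age_raw, levels = c("young", "old"))
levels(age_ordered)
as.integer(age_ordered)

# A label missing from `levels` (here a misspelled film title) becomes NA
film_raw <- c("Memento", "Momento")
film_checked <- factor(film_raw, levels = c("Despicable me", "Memento"))
is.na(film_checked)
```

Checking for NA values after calling factor() is a cheap way to catch spelling differences between the raw data and the levels vector.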
> by(Movies$engagement, AGEmovies, stat.desc)
AGEmovies: young
     nbr.val     nbr.null       nbr.na          min          max        range          sum       median
  20.0000000    0.0000000    0.0000000   10.0000000   37.0000000   27.0000000  430.0000000   20.5000000
        mean      SE.mean CI.mean.0.95          var      std.dev     coef.var
  21.5000000    1.6598985    3.4742076   55.1052632    7.4232919    0.3452694
---------------------------------------------------------------------------------
AGEmovies: old
     nbr.val     nbr.null       nbr.na          min          max        range          sum       median
  20.0000000    0.0000000    0.0000000    3.0000000   36.0000000   33.0000000  371.0000000   17.5000000
        mean      SE.mean CI.mean.0.95          var      std.dev     coef.var
  18.5500000    2.0176524    4.2229949   81.4184211    9.0232157    0.4864267

> by(Movies$engagement, FILMmovies, stat.desc)
FILMmovies: Despicable me
     nbr.val     nbr.null       nbr.na          min          max        range          sum       median
  20.0000000    0.0000000    0.0000000    3.0000000   24.0000000   21.0000000  296.0000000   15.0000000
        mean      SE.mean CI.mean.0.95          var      std.dev     coef.var
  14.8000000    1.2806248    2.6803786   32.8000000    5.7271284    0.3869681
---------------------------------------------------------------------------------
FILMmovies: Memento
     nbr.val     nbr.null       nbr.na          min          max        range          sum       median
  20.0000000    0.0000000    0.0000000   14.0000000   37.0000000   23.0000000  505.0000000   24.5000000
        mean      SE.mean CI.mean.0.95          var      std.dev     coef.var
  25.2500000    1.5941918    3.3366817   50.8289474    7.1294423    0.2823542

These descriptives show how engagement differs across the levels of each variable. For example, mean engagement is 21.5 for younger participants and 18.55 for older participants, so on average the younger participants were more engaged than the older participants. Additionally, the standard deviation of engagement is smaller for younger participants (7.4232919) than for older participants (9.0232157). This shows me that the scores of the younger participants sit closer to their group mean than the scores of the older participants do.
Moreover, mean engagement is 14.8 for Despicable Me and 25.25 for Memento.
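The SE.mean, CI.mean.0.95 and coef.var columns can be recovered from the mean, standard deviation and group size. The sketch below does this for the younger group using only values copied from the output above. It assumes the usual definitions: the standard error is the standard deviation divided by the square root of n, the CI column is the half-width of a t-based 95% interval, and the coefficient of variation is the standard deviation divided by the mean.

```r
# Values copied from the stat.desc output for the younger group
n        <- 20
mean_eng <- 21.5
sd_eng   <- 7.4232919

# Standard error of the mean
se_mean <- sd_eng / sqrt(n)

# Half-width of the 95% confidence interval, using t with n - 1 df
ci_half <- qt(0.975, df = n - 1) * se_mean

# Coefficient of variation: spread relative to the mean
coef_var <- sd_eng / mean_eng

round(c(SE.mean = se_mean, CI.mean.0.95 = ci_half, coef.var = coef_var), 4)

# The interval itself is the mean plus or minus the half-width
c(lower = mean_eng - ci_half, upper = mean_eng + ci_half)
```

The three computed values agree with the reported 1.6598985, 3.4742076 and 0.3452694 to the displayed precision.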
To get the descriptives for the DV across every combination of the two IVs, I run the by() function with both variables.

> by(Movies$engagement, list(Movies$age, Movies$film), stat.desc, basic = FALSE)
: old
: Despicable me
      median         mean      SE.mean CI.mean.0.95          var
  14.0000000   12.4000000    1.8330303    4.1466026   33.6000000
     std.dev     coef.var
   5.7965507    0.4674638
------------------------------------------------------
: young
: Despicable me
      median         mean      SE.mean CI.mean.0.95          var
  17.0000000   17.2000000    1.5114379    3.4191100   22.8444444
     std.dev     coef.var
   4.7795862    0.2778829
------------------------------------------------------
: old
: Memento
      median         mean      SE.mean CI.mean.0.95          var
  24.0000000   24.7000000    2.3288051    5.2681232   54.2333333
     std.dev     coef.var
   7.3643284    0.2981509
------------------------------------------------------
: young
: Memento
      median         mean      SE.mean CI.mean.0.95          var
   25.500000    25.800000     2.289105     5.178314    52.400000
     std.dev     coef.var
    7.238784     0.280573

Despicable me - old: Mean = 12.4000000, Std.dev = 5.7965507
Despicable me - young: Mean = 17.2000000, Std.dev = 4.7795862
Memento - old: Mean = 24.7000000, Std.dev = 7.3643284
Memento - young: Mean = 25.800000, Std.dev = 7.238784

> leveneTest(Movies$engagement, AGEmovies, center = median)
Levene's Test for Homogeneity of Variance (center = median)
      Df F value Pr(>F)
group  1  0.8761 0.3552
      38

DF = 1, F value = 0.8761, p value = 0.3552 - The p value is larger than 0.05, so I fail to reject the null hypothesis that the variance of engagement is equal across the age groups. The assumption of homogeneity of variance is met.

> leveneTest(Movies$engagement, FILMmovies, center = median)
Levene's Test for Homogeneity of Variance (center = median)
      Df F value Pr(>F)
group  1  1.8051 0.1871
      38

DF = 1, F value = 1.8051, p value = 0.1871 - The p value is larger than 0.05, so I fail to reject the null hypothesis that the variance of engagement is equal across the two films. The assumption of homogeneity of variance is met.

> leveneTest(Movies$engagement, interaction(AGEmovies, FILMmovies), center = median)
Levene's Test for Homogeneity of Variance (center = median)
      Df F value Pr(>F)
group  3  0.8311 0.4856
      36

DF = 3, F value = 0.8311, p value = 0.4856 - The p value is larger than 0.05, so I fail to reject the null hypothesis that the variance of engagement is equal across the four age-by-film groups. The assumption of homogeneity of variance is met.
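To make clear what Levene's test with center = median is testing, here is a minimal base-R sketch of the same idea: take each score's absolute deviation from its own group median, then run a one-way ANOVA on those deviations. The score and group vectors are made up for illustration and are not the Movies data. The test asks whether the typical deviation differs between groups, which is a question about variances, not about group means.

```r
# Hypothetical scores for two groups (illustration only, not the Movies data)
score <- c(10, 12, 15, 19, 22, 8, 14, 20, 27, 31)
group <- factor(rep(c("a", "b"), each = 5))

# Absolute deviation of each score from the median of its own group
abs_dev <- abs(score - ave(score, group, FUN = median))

# One-way ANOVA on the deviations: a significant F would mean the groups
# differ in spread, i.e. homogeneity of variance is violated
levene_fit <- anova(lm(abs_dev ~ group))
levene_F   <- levene_fit[["F value"]][1]
levene_p   <- levene_fit[["Pr(>F)"]][1]

c(Df_group = levene_fit$Df[1], Df_resid = levene_fit$Df[2],
  F = levene_F, p = levene_p)
```

A p value above 0.05 here would be read the same way as in the report: no evidence that the group variances differ.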
3.
> BOXPLOT <- ggplot(Movies, aes(x = factor(age, levels = c("young", "old")), y = engagement))
> BOXPLOT + geom_boxplot() + facet_wrap(~film) + labs(x = "age", y = "engagement")

4. The medians, represented by the central lines within the boxes, are further apart for "Despicable Me" than for "Memento." The boxes for "Despicable Me" overlap less, which points to a larger age difference in engagement for that film, whereas the boxes for "Memento" overlap more, which points to similar engagement in the two age groups. Despite the lesser degree of overlap for "Despicable Me," the age groups still share a common range of scores, so from the plot alone I would not expect a significant age difference within either film. For "Despicable Me," both young and old participants display shorter boxes, suggesting less varied engagement ratings in both age groups. In contrast, for "Memento," the box is longer for younger participants, indicating more varied engagement ratings, while the shorter box for older participants suggests more uniform ratings. Levene's tests are in line with these observations about spread: none of them shows a significant difference in variance between the groups, so the assumption of homogeneity is met. Overlapping boxes can only hint at non-significance, but with no clear age difference visible within either film, the plot is consistent with the null hypothesis of no difference between the age groups.

5. Yes, I will run follow-up tests. Because Levene's tests for the equality of variances were non-significant, the assumption of homogeneity is upheld and the variances are approximately equal across the groups, so conducting t-tests without adjustments for unequal variance is appropriate. I believe that running follow-up tests is necessary most of the time and should be done regardless of the results of the ANOVA.

> Followup = aov (Movies$engagement ~ Movies$age + Movies$film + Movies$age*Movies$film, data = Movies)
> summary(Followup)
                       Df Sum Sq Mean Sq F value   Pr(>F)
Movies$age              1   87.0    87.0   2.135    0.153
Movies$film             1 1092.0  1092.0  26.785 8.78e-06 ***
Movies$age:Movies$film  1   34.2    34.2   0.839    0.366
Residuals              36 1467.7    40.8
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

> emmeans(Followup, specs = ~age*film)
 age   film          emmean   SE df lower.CL upper.CL
 old   Despicable me   12.4 2.02 36      8.3     16.5
 young Despicable me   17.2 2.02 36     13.1     21.3
 old   Memento         24.7 2.02 36     20.6     28.8
 young Memento         25.8 2.02 36     21.7     29.9

Confidence level used: 0.95

> pairs(emmeans, adjust = "holm", simple = "age")
film = Despicable me:
 contrast    estimate   SE df t.ratio p.value
 old - young     -4.8 2.86 36  -1.681  0.1014

film = Memento:
 contrast    estimate   SE df t.ratio p.value
 old - young     -1.1 2.86 36  -0.385  0.7023
> pairs(emmeans, adjust = "holm", simple = "film")
age = old:
 contrast                estimate   SE df t.ratio p.value
 Despicable me - Memento    -12.3 2.86 36  -4.307  0.0001

age = young:
 contrast                estimate   SE df t.ratio p.value
 Despicable me - Memento     -8.6 2.86 36  -3.012  0.0047
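The F values and t ratios above can be checked by hand from the reported sums of squares and cell means. The sketch below is only arithmetic on numbers copied from the output, not a re-analysis of the Movies data. It assumes 10 participants per cell (40 observations spread evenly over the four age-by-film groups, as the factor listings show) and uses the residual mean square from the ANOVA as the pooled error variance, which is how the shared SE of 2.86 arises.

```r
# Values copied from summary(Followup)
ss        <- c(age = 87.0, film = 1092.0, interaction = 34.2)
df_effect <- 1
ss_resid  <- 1467.7
df_resid  <- 36

# Residual mean square (pooled error variance), about 40.8
ms_resid <- ss_resid / df_resid

# F ratio for each effect and its upper-tail p value
f_ratio <- (ss / df_effect) / ms_resid
p_f     <- pf(f_ratio, df_effect, df_resid, lower.tail = FALSE)

# Standard errors with n = 10 per cell (assumed equal cell sizes)
n_cell  <- 10
se_cell <- sqrt(ms_resid / n_cell)       # SE of one cell mean
se_diff <- sqrt(2 * ms_resid / n_cell)   # SE of a difference of two cell means

# Simple-effect estimates copied from the pairs() output
estimate <- c(dm_old_vs_young      = -4.8,
              memento_old_vs_young = -1.1,
              old_dm_vs_memento    = -12.3,
              young_dm_vs_memento  = -8.6)
t_ratio <- estimate / se_diff
p_t     <- 2 * pt(abs(t_ratio), df_resid, lower.tail = FALSE)

round(rbind(F = f_ratio, p = p_f), 4)
round(c(se_cell = se_cell, se_diff = se_diff), 2)
round(rbind(t = t_ratio, p = p_t), 4)
```

Small differences in the last decimal place are expected because the inputs are the rounded values printed in the tables. Each simple-effects family contains a single contrast, so the Holm adjustment leaves these p values unchanged.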