Q4

School: University of Waterloo
Course: 442
Subject: Statistics
Date: Apr 3, 2024
Type: pdf
Pages: 4

STAT 442/842 CM 762: Assignment 2

DUE: Wednesday November 2, 2022 by 11:59pm EST

NOTES

This assignment has four parts, all programming questions. Each part should be submitted as a separate PDF document.

Your assignment must be submitted by the due date listed at the top of this document, and it must be submitted electronically in .pdf format via Crowdmark. This means that your responses for different question parts should begin on separate pages of your .pdf file. Note that your .pdf solution file must have been generated by R Markdown.

Additionally:

Organization and comprehensibility are part of a full solution. Consequently, points will be deducted for solutions that are disorganized or incomprehensible.

Furthermore, if you submit your assignment to Crowdmark, but you do so incorrectly in any way (e.g., you upload your Question 2 solution in the Question 1 box), you will receive a 5% deduction (i.e., 5% of the assignment’s point total will be deducted from your point total).
Question 4 - Violin Plots with Embedded Statistical Comparisons (12 marks)

Subset the NBA Player Boxscore dataset to only include the players K. Towns, K. Irving, T. Young, L. James, and K. Durant.

    dat2 = subset(dat, athlete_short_name %in% c("K. Towns","K. Irving","T. Young","L. James","K. Durant"))

Make a side-by-side violin plot of these five players’ points (pts) scored in each game

  • with the ggbetweenstats command (4 marks)
  • with the dots matching the team_color of each player (5 marks) (This will need a dive into theme)
  • with an appropriate set of labels and titles (3 marks)

See: https://r-graph-gallery.com/web-violinplot-with-ggstatsplot.html for guidance on how to display comparison stats. (The package ggstatsplot takes a long time to install.)

Your graph should look like this with five violins instead of three.

    library(tidyverse)

    ## -- Attaching packages --------------------------------------- tidyverse 1.3.2 --
    ## v ggplot2 3.3.6      v purrr   0.3.4
    ## v tibble  3.1.8      v dplyr   1.0.10
    ## v tidyr   1.2.1      v stringr 1.4.1
    ## v readr   2.1.3      v forcats 0.5.2
    ## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
    ## x dplyr::filter() masks stats::filter()
    ## x dplyr::lag()    masks stats::lag()
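Before subsetting on athlete_short_name, it may be worth checking whether any short name maps to more than one player; "T. Young", for instance, could plausibly match both Trae Young and Thaddeus Young. The following is a minimal base-R sketch on a made-up toy data frame (the rows are illustrative and not taken from the dataset; only the column names athlete_short_name and athlete_display_name are assumed to match it).

```r
# Toy stand-in for the boxscore data: rows are invented for illustration only.
toy <- data.frame(
  athlete_short_name   = c("T. Young", "T. Young", "T. Young", "L. James"),
  athlete_display_name = c("Trae Young", "Trae Young", "Thaddeus Young", "LeBron James"),
  stringsAsFactors = FALSE
)

# Distinct (short name, display name) pairs.
name_pairs <- unique(toy[, c("athlete_short_name", "athlete_display_name")])

# Count how many different display names share each short name.
names_per_short <- tapply(name_pairs$athlete_display_name,
                          name_pairs$athlete_short_name,
                          length)

# Short names that are ambiguous and need an extra filter on the display name.
ambiguous <- names(names_per_short[names_per_short > 1])
print(ambiguous)
```

Any short name reported here would need a second condition (for example on the display name) so that only the intended player is kept.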
    library(ggstatsplot)

    ## You can cite this package as:
    ##      Patil, I. (2021). Visualizations with statistical details: The 'ggstatsplot' approach.
    ##      Journal of Open Source Software, 6(61), 3167, doi:10.21105/joss.03167

    library(palmerpenguins)

    data <- read.csv('NBA_Player_Boxscore_2021-22.csv')

    # Keep the five players; the short name "T. Young" also matches Thaddeus Young,
    # so he is dropped by display name.
    data2 <- subset(data, athlete_short_name %in% c("K. Towns", "K. Irving", "T. Young", "L. James", "K. Durant")) %>%
      filter(athlete_display_name != 'Thaddeus Young') %>%
      arrange(athlete_short_name)

    # One team colour per player, sorted by display name.
    colors = data2 %>%
      group_by(athlete_display_name) %>%
      summarise(team_color = first(team_color)) %>%
      arrange(athlete_display_name) %>%
      pull(team_color)

    # Prepend "#" so the values are valid hex colour strings.
    colors = paste0('#', colors)

    p4 <- ggbetweenstats(
      data = data2,
      x = athlete_display_name,
      y = pts,
      xlab = "Athlete Names",
      ylab = "Points",
      title = "Comparison of Points across top athletes"
    ) +
      scale_color_hue() +                  # replaced straight away by the manual scale below
      scale_color_manual(values = colors)

    ## Scale for 'colour' is already present. Adding another scale for 'colour',
    ## which will replace the existing scale.
    ## Scale for 'colour' is already present. Adding another scale for 'colour',
    ## which will replace the existing scale.

    p4
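The colour vector above is matched to the players by position, which depends on the sort order agreeing with the order of the x-axis levels. One possible alternative, sketched here on a made-up toy data frame, is to build a named vector so that each colour is tied to a player by name. The hex codes below are invented for illustration and are not the dataset's actual team_color values; player_colors is a hypothetical name.

```r
# Toy stand-in for data2: hex codes are invented, stored without the leading "#".
toy <- data.frame(
  athlete_display_name = c("Kevin Durant", "Kevin Durant", "LeBron James", "Trae Young"),
  team_color           = c("000000", "000000", "552583", "c8102e"),
  stringsAsFactors = FALSE
)

# Keep the first row seen for each player (one colour per player).
first_rows <- toy[!duplicated(toy$athlete_display_name), ]

# Named vector: names are the player labels, values are "#"-prefixed hex strings.
player_colors <- setNames(paste0("#", first_rows$team_color),
                          first_rows$athlete_display_name)
print(player_colors)
```

A named vector like this can be passed as values to ggplot2's scale_color_manual(), which matches names to the levels of the colour variable, so the result does not depend on how the rows were sorted.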
[Figure: violin plot titled "Comparison of Points across top athletes". x-axis: Athlete Names (Karl-Anthony Towns, n = 81; Kevin Durant, n = 59; Kyrie Irving, n = 33; LeBron James, n = 57; Trae Young, n = 82); y-axis: Points. Group means as printed: 24.17, 29.63, 26.70, 30.16, 27.38. Pairwise test: Games-Howell, bars shown: significant; Holm-adjusted p = 0.02 and p = 2.67e-03. Subtitle: Welch F(4, 126.63) = 5.97, p = 1.97e-04, partial omega-squared = 0.13, CI 95% [0.04, 1.00], n obs = 312. Caption: log e (BF 01) = -2.84, Bayesian posterior R 2 = 0.06, CI 95% HDI [0.00, 0.10], r Cauchy JZS = 0.71.]
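The pairwise comparisons in the figure are labelled as Holm-adjusted. As a rough illustration of what that adjustment does, the sketch below applies base R's p.adjust() to three made-up raw p-values; these are not the p-values behind the figure, and raw_p and holm_p are hypothetical names.

```r
# Made-up raw p-values for three pairwise comparisons (illustration only).
raw_p <- c(0.01, 0.04, 0.03)

# Holm's step-down adjustment: the smallest p-value is multiplied by 3,
# the next by 2, the largest by 1, and the results are made non-decreasing.
holm_p <- p.adjust(raw_p, method = "holm")
print(holm_p)
```

Here the sorted raw values 0.01, 0.03, 0.04 become 0.03, 0.06, 0.04 after multiplication, and enforcing monotonicity raises the last to 0.06, so in the original order the adjusted values are 0.03, 0.06, 0.06.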