School: University of Washington

Course: 200

Subject: Statistics

Date: Feb 20, 2024

---
title: "text"
author: "han wang"
date: "2024/1/31"
output:
  pdf_document: default
  word_document: default
  html_document:
    df_print: paged
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(tidyverse)
## Include the command to load the data here!!!!
```

* * *

## Question 1

### a)

Mean: 41; median: 28; mode: 26.

### b)

Spread (range): 55; sample standard deviation: 19.9; interquartile range: 27.

### c)

The distribution of the given numbers is skewed to the right. The mean (41) is greater than the median (28), and the tail of the distribution is longer on the right side.

### d)

Because the distribution is skewed to the right and has some outliers, the median is the better measure of center. The interquartile range (IQR) is the more robust measure of spread: it is less sensitive to outliers and gives a better sense of the spread of the middle 50% of the data.

### e)

New mean: 6.34; new median: 4.4; new sample standard deviation: 2.83; new interquartile range: 3.04.

## Question 2

### b)

```{r}
# Load the dataset once; later chunks reuse `data`
data <- read.csv("Handout1.csv")

# Calculate the mean, median, IQR, and sd of COMMIT
mean_commit <- mean(data$COMMIT)
median_commit <- median(data$COMMIT)
iqr_commit <- IQR(data$COMMIT)
sd_commit <- sd(data$COMMIT)

# Print the results
cat("Mean of COMMIT: ", mean_commit, "\n")
cat("Median of COMMIT: ", median_commit, "\n")
cat("Interquartile range of COMMIT: ", iqr_commit, "\n")
cat("Standard deviation of COMMIT: ", sd_commit, "\n")
```

### c)

```{r}
# Create the contingency table of school type by sex
# (named so that it does not shadow the base function table())
schtype_by_sex <- table(data$SCHTYPE, data$SEX)

# Print the table
print(schtype_by_sex)
```

### d)

```{r}
# Subset the data by sex
males <- data[data$SEX == "M", ]
females <- data[data$SEX == "F", ]

# Calculate the mean, median, IQR, and sd of COMMIT for males and females
mean_commit_males <- mean(males$COMMIT)
median_commit_males <- median(males$COMMIT)
iqr_commit_males <- IQR(males$COMMIT)
sd_commit_males <- sd(males$COMMIT)

mean_commit_females <- mean(females$COMMIT)
median_commit_females <- median(females$COMMIT)
iqr_commit_females <- IQR(females$COMMIT)
sd_commit_females <- sd(females$COMMIT)

# Print the results
cat("Males:\n")
cat("Mean of COMMIT: ", mean_commit_males, "\n")
cat("Median of COMMIT: ", median_commit_males, "\n")
cat("Interquartile range of COMMIT: ", iqr_commit_males, "\n")
cat("Standard deviation of COMMIT: ", sd_commit_males, "\n")

cat("\nFemales:\n")
cat("Mean of COMMIT: ", mean_commit_females, "\n")
cat("Median of COMMIT: ", median_commit_females, "\n")
cat("Interquartile range of COMMIT: ", iqr_commit_females, "\n")
cat("Standard deviation of COMMIT: ", sd_commit_females, "\n")

# Create the boxplots
boxplot(males$COMMIT, main = "Males", ylab = "COMMIT")
boxplot(females$COMMIT, main = "Females", ylab = "COMMIT")
```

## Question 3

### a)

The data suggest that the distribution of the number of friends of users of this new social media site is right-skewed. The average friend count is higher than the median friend count, which indicates that a few users with a very high number of friends are pulling the mean up.

### b)

The life span of the general population is unimodal and skewed to the left. Most people live to an old age, but some people die much younger, which produces a long tail on the left side of the distribution.

### c)

The standard deviation is exactly zero when all the values in the dataset are the same. The standard deviation cannot be negative because it is the square root of an average of squared deviations from the mean, and squared deviations are never negative.

### d)

The sampling method used to select microchips from a production line for inspection for bent probes is systematic sampling, because every 100th chip is selected for inspection. Systematic sampling is a type of probability sampling in which every kth element in the population is selected for inclusion in the sample, where k is a fixed interval.
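A minimal sketch of how an "every 100th chip" selection could be written in R. The production run of 1,000 chips and the names `chip_ids`, `k`, and `inspected` are hypothetical and chosen only for illustration. The sketch also assumes the selection starts at the 100th chip, whereas textbook systematic sampling often picks the starting position at random between 1 and k.

```r
# Hypothetical production run: 1,000 chips, identified by their position on the line
chip_ids <- seq_len(1000)

# Fixed sampling interval: inspect every 100th chip
k <- 100

# Systematic sample: take the chips at positions k, 2k, 3k, ...
inspected <- chip_ids[seq(from = k, to = length(chip_ids), by = k)]

print(inspected)
```

Each selected chip is exactly k positions after the previous one, which is what distinguishes this design from a simple random sample of the same size.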
