Assignment-2_F2023

Rmd

School

Toronto Metropolitan University

Course

123

Subject

Medicine

Date

Dec 6, 2023

Type

Rmd

Pages

6

---
title: 'CIND 123 - Data Analytics: Basic Methods'
author: 
output:
  html_document: default
  word_document: default
  pdf_document: default
---

<center> <h1> Assignment 2 (10%) </h1> </center>
<center> <h3> [Insert your full name] </h3> </center>
<center> <h3> [Insert course section & student number] </h3> </center>

---

## Instructions

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. Review this website for more details on using R Markdown <http://rmarkdown.rstudio.com>.

Use RStudio for this assignment. Complete the assignment by inserting your R code wherever you see the string "#INSERT YOUR ANSWER HERE".

When you click the **Knit** button, a document (PDF, Word, or HTML format) will be generated that includes both the assignment content as well as the output of any embedded R code chunks.

Submit **both** the Rmd and generated output files. Failing to submit both files will be subject to mark deduction.

## Sample Question and Solution

Use `seq()` to create the vector $(100, 97, \ldots, 4)$.

```{r}
seq(100, 3, -3)
```

## Question 1 (40 points)

The Titanic Passenger Survival Data Set provides information on the fate of passengers on the fatal maiden voyage of the ocean liner "Titanic." The dataset is available in several formats from the Department of Biostatistics at the Vanderbilt University School of Medicine (https://biostat.app.vumc.org/wiki/pub/Main/DataSets/titanic3.csv). Store the Titanic data set in `titanicDataset` using the following commands.

```{r}
# Install the titanic package only if it is missing, so knitting does not reinstall it every time.
if (!requireNamespace("titanic", quietly = TRUE)) install.packages("titanic")
library(titanic)
# Read titanic3.csv directly from the Vanderbilt Biostatistics site (no trailing space in the URL).
titanicDataset <- read.csv(file = "https://biostat.app.vumc.org/wiki/pub/Main/DataSets/titanic3.csv", stringsAsFactors = FALSE)
str(titanicDataset)
```

a) Extract the columns `cabin`, `age`, `embarked` and `pclass` into a new data frame named `titanicSubset` and display it. (5 points)

```{r}
#INSERT YOUR ANSWER HERE
titanicSubset <- titanicDataset[, c("cabin", "age", "embarked", "pclass")]
titanicSubset
```

b) Numerical data: Use the count() function from the `dplyr` package to display the total number of passengers that survived or not. (5 points)

HINT: To count the occurrences of survived or not in the titanicDataset data frame using the `dplyr` package, you can use the pipe operator (%>%) to chain operations.

```{r}
#INSERT YOUR ANSWER HERE
library(dplyr)
titanicDataset %>% count(survived)
```
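
As a quick way to sanity-check the counts in part (b), the sketch below tallies a small made-up data frame with base R's `table()`. The `toy` data frame and its values are hypothetical and are not taken from `titanic3.csv`; the only point is that `count(survived)` should report the same tallies that `table()` gives on the same column.

```r
# Hypothetical toy data, NOT the real Titanic data: 1 = survived, 0 = did not.
toy <- data.frame(survived = c(1, 0, 0, 1, 0))

# Base-R tally of each distinct value in the column.
survivedCounts <- table(toy$survived)
print(survivedCounts)

# Reshape the tally into a data frame with columns `survived` and `n`,
# which mirrors the layout that dplyr's count() returns.
survivedDf <- as.data.frame(survivedCounts, stringsAsFactors = FALSE)
names(survivedDf) <- c("survived", "n")
print(survivedDf)
```

Because the tallies always add up to the number of rows, `sum(survivedDf$n) == nrow(toy)` is a cheap consistency check that also applies to the real data frame.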

c) Categorical data: Use the count() and group_by() functions from the `dplyr` package to calculate the number of passengers by `embarked`. (5 points)

HINT: Use group_by() first, then pipe the result to count() to calculate the number of passengers.

```{r}
#INSERT YOUR ANSWER HERE
library(dplyr)
titanicDataset %>% group_by(embarked) %>% count()
```

d) Find the passengers in the data frame whose `embarked` value is an empty string (""), and replace it with the most frequent `embarked` value. (3 points)

```{r}
#INSERT YOUR ANSWER HERE
```

e) Use the aggregate() function to calculate the `survivalCount` for each `embarked` value, then calculate the survival rate for each `embarked` value. State which `embarked` value has the highest survival rate. (5 points)

```{r}
#INSERT YOUR ANSWER HERE
```

f) Use a boxplot to display the distribution of `fare` for each `pclass` and infer which passenger class is the most expensive. (5 points)

```{r}
#INSERT YOUR ANSWER HERE
```

g) Calculate the average fare for each of the three `pclass` values and describe whether the calculation agrees with the box plot. (5 points)

```{r}
#INSERT YOUR ANSWER HERE
```

h) Use the for loop and if control statements to list the menss failure is 0.05. We know that if one engine fails, the whole system stops.

a) What is the probability that the system operates without failure? (5 points)

```{r}
#INSERT YOUR ANSWER HERE
```

b) Use the Binomial approximation to calculate the probability that at least 3 engines are defective. (5 points)

```{r}
#INSERT YOUR ANSWER HERE
```

c) What is the probability that the second engine (B) is defective given that the first engine (A) is not defective, i.e., P(B is defective | A is not defective), when the first and second engines are known to be independent? (5 points)

```{r}
#INSERT YOUR ANSWER HERE
```

## Question 3 (25 points)

On average, John visits his parents 4 times a month.

a) Find the probabilities that John visits his parents 1 to 6 times in a month. (5 points)

```{r}
#INSERT YOUR ANSWER HERE
```
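
The engine and visit questions above map onto R's `dbinom()`/`pbinom()` and `dpois()`/`ppois()` functions. The following is a minimal sketch under stated assumptions, not a model answer: it assumes 0.05 is the failure probability of each engine, that the engines fail independently, and that John's visits follow a Poisson distribution with rate 4 per month. The number of engines is not given in this excerpt, so `nEngines <- 10` is a hypothetical placeholder, and the threshold `k` is likewise illustrative.

```r
# Assumption: each engine fails independently with probability 0.05.
pFail <- 0.05
nEngines <- 10  # HYPOTHETICAL: the engine count is not given in this excerpt

# The system runs only if no engine fails: P(X = 0) for X ~ Binomial(nEngines, pFail).
pAllWork <- (1 - pFail)^nEngines

# P(X >= 3) is the upper tail beyond 2, hence q = 2 with lower.tail = FALSE.
pAtLeast3 <- pbinom(2, size = nEngines, prob = pFail, lower.tail = FALSE)

# Assumption: visits per month ~ Poisson(lambda = 4).
lambda <- 4

# P(X = k) for each k from 1 to 6, one probability per count.
pEach <- dpois(1:6, lambda = lambda)
names(pEach) <- 1:6

# P(1 <= X <= 6) from the cumulative distribution, as a cross-check on sum(pEach).
pRangeCdf <- ppois(6, lambda) - ppois(0, lambda)

# P(X >= k) equals P(X > k - 1); k = 2 is an illustrative threshold only.
k <- 2
pAtLeastK <- ppois(k - 1, lambda, lower.tail = FALSE)

print(round(pEach, 4))
print(c(allWork = pAllWork, atLeast3 = pAtLeast3, atLeastK = pAtLeastK))
```

For a discrete distribution the off-by-one in the upper tail matters: `ppois(k, lambda, lower.tail = FALSE)` gives P(X > k), not P(X >= k). Under independence, conditioning on one engine's state does not change the other's failure probability, so no extra function call is needed for that case.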

b) Find the probability that John visits his parents 3 times or more in a month. (5 points)

```{r}
#INSERT YOUR ANSWER HERE
```

c) Compare the similarity between the Binomial and Poisson distributions. (15 points, 5 points each)

1) Create 100,000 samples for a Binomial random variable using the parameters described in Question 2.
2) Create 100,000 samples for a Poisson random variable using the parameters described in Question 3.
3) Illustrate how well the Poisson probability distribution approximates the Binomial probability distribution.

HINT: use multhist() from the 'plotrix' package

```{r}
#INSERT YOUR ANSWER HERE
```

## Question 4 (20 points)

Write a script in R to compute the following probabilities of a normal random variable with mean 9 and variance 25.

a) The probability that it lies between 8.2 and 17.3 (inclusive) (5 points)

```{r}
#INSERT YOUR ANSWER HERE
```

b) The probability that it is greater than 15.02 (5 points)

```{r}
#INSERT YOUR ANSWER HERE
```
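
One detail that is easy to get wrong in Question 4 is that `pnorm()` takes the standard deviation, not the variance, so a variance of 25 corresponds to `sd = 5`. The sketch below shows the three usual patterns (interval, upper tail, union of two tails) with illustrative bounds `lo` and `hi` set one standard deviation either side of the mean. These bounds are assumptions chosen because the result is easy to recognise, and they are not the values asked for in the assignment.

```r
# Parameters from the question: mean 9 and variance 25, so sd = sqrt(25) = 5.
mu <- 9
sigma <- sqrt(25)

# Illustrative bounds (NOT the assignment's): one standard deviation either side.
lo <- mu - sigma  # 4
hi <- mu + sigma  # 14

# Interval probability: difference of two cumulative probabilities.
pBetween <- pnorm(hi, mean = mu, sd = sigma) - pnorm(lo, mean = mu, sd = sigma)

# Upper tail: lower.tail = FALSE avoids computing 1 - pnorm(...) by hand.
pAbove <- pnorm(hi, mean = mu, sd = sigma, lower.tail = FALSE)

# Union of two disjoint tails: the probabilities simply add.
pOutside <- pnorm(lo, mean = mu, sd = sigma) + pAbove

# pBetween is about 0.6827, the familiar one-sigma rule.
print(round(c(between = pBetween, above = pAbove, outside = pOutside), 4))
```

Because the normal distribution is continuous, a single point has probability zero, so "inclusive" and "exclusive" bounds give the same result.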

c) The probability that it is less than or equal to 11.8 (5 points)

```{r}
#INSERT YOUR ANSWER HERE
```

d) The probability that it is less than 10 or greater than 13 (5 points)

```{r}
#INSERT YOUR ANSWER HERE
```

END of Assignment #2.