In RStudio, for lab: Normal distribution.
## Lab report
#### Load data:
```{r load-data}
#load(url("https://stat.duke.edu/~mc301/data/ames.RData"))
download.file("http://www.openintro.org/stat/data/bdims.RData", destfile = "bdims.RData")
load("bdims.RData")
head(bdims)
mdims <- subset(bdims, sex == 1)
fdims <- subset(bdims, sex == 0)
```
#### Set a seed:
```{r set-seed}
set.seed(42)
```
#### Exercise 1:
```{r ex1}
hist(mdims$hgt, col = 'red')
hist(fdims$hgt, col = 'blue')
mhgtmean <- mean(mdims$hgt)
fhgtmean <- mean(fdims$hgt)
mhgtsd <- sd(mdims$hgt)
fhgtsd <- sd(fdims$hgt)
```
#### Exercise 2:
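```{r ex2}
hist(fdims$hgt, probability = TRUE, ylim = c(0, 0.06))
x <- 140:190
y <- dnorm(x = x, mean = fhgtmean, sd = fhgtsd)
lines(x = x, y = y, col = "blue")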
```
#### Exercise 3:
```{r ex3}
qqnorm(fdims$hgt)
qqline(fdims$hgt)
sim_norm <- rnorm(n = length(fdims$hgt), mean = fhgtmean, sd = fhgtsd)
qqnorm(sim_norm)
qqline(sim_norm)
```
#### Exercise 4:
```{r ex4}
qqnormsim(fdims$hgt)
```
#### Exercise 5:
```{r ex5}
hist(fdims$wgt, col = 'blue')
qqnorm(fdims$wgt)
qqline(fdims$wgt)
```
Write out two probability questions that you would like to answer: one regarding female heights and
one regarding female weights. Calculate those probabilities using both the theoretical normal
distribution and the empirical distribution (four probabilities in all). Which variable, height
or weight, had closer agreement between the two methods? Why do you think this variable had
theoretical normal probabilities closer to the empirical probabilities?
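A minimal sketch of how the four probabilities might be computed. The helper `compare_probs`, the chunk label `ex6`, and the cutoffs (160 for height, 55 for weight) are illustrative assumptions, not part of the lab materials. The theoretical value comes from `pnorm` with the sample mean and standard deviation, and the empirical value is the proportion of observations below the cutoff.

```{r ex6}
# Hypothetical helper: compares P(X < cutoff) under a normal model fitted
# to x with the observed proportion of values in x below the cutoff.
compare_probs <- function(x, cutoff) {
  theoretical <- pnorm(q = cutoff, mean = mean(x), sd = sd(x))
  empirical <- mean(x < cutoff)
  c(theoretical = theoretical, empirical = empirical)
}

# The cutoffs below are placeholders chosen for illustration only.
# fdims is created in the load-data chunk; the guard keeps this chunk
# runnable on its own.
if (exists("fdims")) {
  print(compare_probs(fdims$hgt, cutoff = 160))  # female height below 160
  print(compare_probs(fdims$wgt, cutoff = 55))   # female weight below 55
}
```

The variable whose two values sit closer together is the one the normal model describes better, which can be checked against the Q-Q plots from Exercises 3 and 5.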