Prof. Garcia
SDS 201: Lecture notes
February 26th, 2018
Agenda
1. HW #4 due on Wednesday (will do 1 problem)
2. Initial Project Proposals due in one week
3. More on hypothesis testing
4. Simulation
Warm-up: Hypothesis Testing for the Mites
Our goal for this randomization simulation was to assess whether exposure to mites was associated, to a statistically significant degree, with a decrease in wilt disease after exposure to Verticillium, a fungus that causes wilt disease.
library(mosaic)
tally(outcome ~ treatment, data = Mites)
##          treatment
## outcome   mites no mites
##   no wilt    15        4
##   wilt       11       17
tally(outcome ~ treatment, data = Mites, format = "proportion")
##          treatment
## outcome       mites  no mites
##   no wilt 0.5769231 0.1904762
##   wilt    0.4230769 0.8095238
tbl <- tally(outcome ~ treatment, data = Mites, format = "proportion")
# Row 2 is "wilt"; column 2 is "no mites" and column 1 is "mites"
obs_diff_prop <- tbl[2, 2] - tbl[2, 1]
obs_diff_prop
## [1] 0.3864469
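As a check on the indexing, the same difference can be recomputed from the counts in the first table. This is a minimal base-R sketch: the matrix `counts` is typed in by hand from the tally output rather than read from the `Mites` data, and the names `counts`, `prop_wilt`, and `obs_diff_check` are illustrative, not part of the original analysis.

```r
# Counts copied by hand from the tally() output (assumption: not read from Mites)
counts <- matrix(c(15, 11, 4, 17), nrow = 2,
                 dimnames = list(outcome = c("no wilt", "wilt"),
                                 treatment = c("mites", "no mites")))

# Proportion of plants with wilt within each treatment group
prop_wilt <- counts["wilt", ] / colSums(counts)

# Difference in wilt proportions: no mites minus mites
obs_diff_check <- prop_wilt[["no mites"]] - prop_wilt[["mites"]]
round(obs_diff_check, 7)
```

Working from the counts (17/21 minus 11/26) makes the direction of the subtraction explicit: a positive value means wilt was more common among plants without mites.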
null_dist <- do(5000) * tally(outcome ~ shuffle(treatment), data = Mites)
null_dist <- null_dist %>%
  mutate(prop_wilt_nomites = wilt.no.mites / (wilt.no.mites + no.wilt.no.mites)) %>%
  mutate(prop_wilt_mites = wilt.mites / (wilt.mites + no.wilt.mites)) %>%
  mutate(diff_prop = prop_wilt_nomites - prop_wilt_mites)
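For intuition about what the shuffle-and-tally step and the `mutate()` steps above accomplish, the randomization can be sketched in base R. This is an illustration under stated assumptions, not the mosaic implementation: the vectors `treatment` and `outcome` are rebuilt by hand from the counts in the first table, and the function name `one_shuffle_diff`, the vector `sim_diffs`, and the seed are made up for the example.

```r
# Rebuild the 47 plants from the table counts (assumption: typed in by hand)
treatment <- rep(c("mites", "no mites"), times = c(26, 21))
outcome   <- c(rep(c("no wilt", "wilt"), times = c(15, 11)),  # mites group
               rep(c("no wilt", "wilt"), times = c(4, 17)))   # no mites group

# One randomization: shuffle the treatment labels, then recompute the
# difference in wilt proportions (no mites minus mites)
one_shuffle_diff <- function() {
  shuffled <- sample(treatment)
  mean(outcome[shuffled == "no mites"] == "wilt") -
    mean(outcome[shuffled == "mites"] == "wilt")
}

set.seed(201)  # arbitrary seed so the sketch is reproducible
sim_diffs <- replicate(5000, one_shuffle_diff())
```

Shuffling only the treatment labels keeps the number of wilted plants and the group sizes fixed, so each simulated difference reflects what chance alone would produce if treatment had no effect on outcome.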
ggplot(data = null_dist, aes(diff_prop)) +
  geom_histogram(bins = 10)
qdata(~diff_prop, p = c(0.025, 0.975), data = null_dist)
##         quantile     p
## 2.5%  -0.3021978 0.025
## 97.5%  0.3003663 0.975
2 * pdata(~diff_prop, q = obs_diff_prop, data = null_dist, lower.tail = FALSE)
## [1] 0.002
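The doubled upper-tail proportion can also be computed by hand, which may make the logic of the p-value easier to see. The base-R sketch below is an assumption-laden illustration: the data are rebuilt from the table counts, and `wilt`, `diff_prop_for`, `sims`, `upper_tail`, and `p_two_sided` are hypothetical names. It counts shuffles at least as large as the observed difference (`>=`); how ties are treated may differ slightly from `pdata()`, and because the shuffles are random, the result will not match the 0.002 above exactly.

```r
# Rebuild the data from the table counts (assumption: typed in by hand)
treatment <- rep(c("mites", "no mites"), times = c(26, 21))
wilt <- c(rep(c(FALSE, TRUE), times = c(15, 11)),  # mites group
          rep(c(FALSE, TRUE), times = c(4, 17)))   # no mites group

# Test statistic: proportion with wilt, no mites minus mites
diff_prop_for <- function(trt) {
  mean(wilt[trt == "no mites"]) - mean(wilt[trt == "mites"])
}

obs <- diff_prop_for(treatment)

set.seed(201)  # arbitrary seed for reproducibility
sims <- replicate(5000, diff_prop_for(sample(treatment)))

# Share of shuffles at least as large as the observed difference,
# doubled to give a two-sided p-value
upper_tail  <- mean(sims >= obs)
p_two_sided <- 2 * upper_tail
p_two_sided
```

Doubling the one-sided proportion treats the null distribution as roughly symmetric, which is consistent with the 2.5% and 97.5% quantiles reported above.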
1. What was the null hypothesis for your simulation?
2. What was the test statistic?
3. Where did the test statistic lie in the null distribution?
4. Did this evidence cause you to reject or fail to reject the null hypothesis?
5. Write one sentence to your grandpa summarizing what you’ve learned about mites and wilt disease.