lab01_solutions
October 4, 2022
1 Lab 1: Basics of Testing
Welcome to the first Data 102 lab!
The goals of this lab are to get familiar with concepts in decision theory. We will learn more about testing, p-values and FDR control.

The code you need to write is commented out with a message "TODO: fill…". There is additional documentation for each part as you go along.
1.1 Collaboration Policy
Data science is a collaborative activity. While you may talk with others about the labs, we ask that you write your solutions individually. If you do discuss the assignments with others, please include their names in the cell below.
1.2 Submission
To submit this assignment, rerun the notebook from scratch (by selecting Kernel > Restart & Run All), then print as a pdf (File > Download as > PDF) and submit it to Gradescope.

For full credit, this assignment should be completed and submitted before Friday, September 9, 2022 at 11:59 PM PST.
1.3 Collaborators
Write the names of your collaborators in this cell.
<Collaborator Name> <Collaborator e-mail>
2 Setup
Let's begin by importing the libraries we will use. You can find the documentation for the libraries here:

* matplotlib: https://matplotlib.org/3.1.1/contents.html
* numpy: https://docs.scipy.org/doc/
* pandas: https://pandas.pydata.org/pandas-docs/stable/
* seaborn: https://seaborn.pydata.org/
[1]: import matplotlib.pyplot as plt
     import numpy as np
     import pandas as pd
     import seaborn as sns
     import scipy.stats
     from scipy.stats import norm
     import hashlib

     %matplotlib inline
     sns.set(style="dark")
     plt.style.use("ggplot")

     def get_hash(num):  # <- helper function for assessing correctness
         return hashlib.md5(str(num).encode()).hexdigest()
3 Question 1: Hypothesis testing, LRT, decision rules, P-values.
The first question looks at the basics of testing. You will have to put yourself in the shoes of a detective who is trying to use 'evidence' to find the 'truth'. Given a piece of evidence $X$, your job will be to decide between two hypotheses. The two hypotheses you consider are:

The null hypothesis: $H_0 : X \sim \mathcal{N}(0, 1)$

The alternative hypothesis: $H_1 : X \sim \mathcal{N}(2, 1)$

Granted, you don't know the truth, but you have to make a decision that maximizes the True Positive Probability and minimizes the False Positive Probability.
In this exercise you will look at:

- The intuitive relationship between the Likelihood Ratio Test and decisions based on thresholding $X$.
- The performance of a level-$\alpha$ test.
- The distribution of p-values for samples from the null distribution as well as samples from the alternative.

Let's start by plotting the distributions under the null and alternative hypotheses.
[2]: # NOTE: you just need to run this cell to plot the pdf; don't change this code.
     def null_pdf(x):
         return norm.pdf(x, 0, 1)

     def alt_pdf(x):
         return norm.pdf(x, 2, 1)

     # Plot the distribution under the null and alternative
     x_axis = np.arange(-4, 6, 0.001)

     plt.plot(x_axis, null_pdf(x_axis), label='$H_0$')  # <- likelihood under the null
     plt.fill_between(x_axis, null_pdf(x_axis), alpha=0.3)
     plt.plot(x_axis, alt_pdf(x_axis), label='$H_1$')   # <- likelihood alternative
     plt.fill_between(x_axis, alt_pdf(x_axis), alpha=0.3)

     plt.xlabel("X")
     plt.ylabel("Likelihood")
     plt.title("Comparison of null and alternative likelihoods")
     plt.legend()
     plt.tight_layout()
     plt.show()
By inspecting the image above we can see that if the data lies towards the right, then it seems more plausible that the alternative is true. For example, $X \geq 1.64$ seems much less likely to belong to the null pdf than the alternative pdf.
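To make this concrete, here is a quick numeric check (a sketch using the null_pdf and alt_pdf helpers defined in the cell above):

     null_pdf(1.64)  # ~0.104: density of X = 1.64 under H_0
     alt_pdf(1.64)   # ~0.374: density under H_1, roughly 3.6 times larger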
3.0.1 Likelihood Ratio Test
In class we said that the optimal test is the Likelihood Ratio Test (LRT), which is the result of the celebrated Neyman-Pearson Lemma. It says that the optimal level-$\alpha$ test is the one that rejects the null (aka makes a discovery, favors the alternative) whenever:

$$LR(x) := \frac{f_1(x)}{f_0(x)} \geq \eta$$

where $\eta$ is chosen such that the false positive rate is equal to $\alpha$.
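Because $LR(x)$ is increasing in $x$ here, the level-$\alpha$ threshold $\eta$ can be found by pushing the corresponding $X$-threshold through the likelihood ratio. A minimal sketch, using the null_pdf and alt_pdf helpers defined above:

     alpha = 0.05
     tau = norm.ppf(1 - alpha)           # X-threshold with FPR = alpha under N(0, 1)
     eta = alt_pdf(tau) / null_pdf(tau)  # the matching LR threshold, ~3.63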
3.0.2 But how does this result fit with the intuition that we should set a decision threshold based on the value of $X$ directly?

This exercise will formalize that intuition.

Let's start by computing the ratio of the likelihoods. The likelihood of $X \sim \mathcal{N}(\mu, \sigma)$ is:
$$f_{\sigma,\mu}(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
Luckily scipy has a nifty function to compute the likelihood of Gaussians: scipy.stats.norm.pdf(x, mu, sigma)
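As a quick sanity check (a sketch, not part of the lab), the formula and the scipy call agree:

     mu, sigma, x = 2, 1, 1.64
     by_hand = 1 / (sigma * np.sqrt(2 * np.pi)) * np.exp(-(x - mu)**2 / (2 * sigma**2))
     np.isclose(by_hand, norm.pdf(x, mu, sigma))  # True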
3.1 Part 1.a: Calculate likelihood ratios
Complete the function below that computes the likelihood ratio for any value x.
[3]: # TODO: fill in the missing expression for the likelihood ratio in the function below
     def calculate_likelihood_ratio(x):
         """
         Computes the likelihood ratio between the alternative and null hypothesis.

         Inputs:
             x: value for which to compute the likelihood ratio

         Outputs:
             lr: the likelihood ratio at point x
         """
         L0 = null_pdf(x)
         L1 = alt_pdf(x)
         LR = L1 / L0  # likelihood ratio of the alternative to the null
         return LR
[4]: # Compute the likelihood ratio for X=1.64
     X = 1.64
     LR = calculate_likelihood_ratio(X)
     print(LR)

     assert(get_hash(LR) == 'f9983e1a6585502f3006cb6d1c1edec3')
     print("Test passed!")

3.59663972556928
Test passed!
Let's plot the likelihood ratios for different values of $X$:
[5]: # The code below plots the LR for different values of X
     # Once you've filled in `calculate_likelihood_ratio` run this cell and inspect the plot
     x_axis = np.arange(-1, 3, 0.001)
     plt.plot(x_axis, calculate_likelihood_ratio(x_axis))
     plt.vlines(X, 0, LR, linestyle="dotted", color='k')
     plt.hlines(LR, -1, X, linestyle="dotted", color='k')
     plt.scatter(X, LR, 30, color='k')
     plt.xlabel("X")
     plt.ylabel("Likelihood Ratio")
     plt.title("Comparison of null and alternative likelihoods")
     plt.tight_layout()
     plt.show()
The plot above illustrates that deciding based on the LRT with $\eta = 3.6$ (the dotted horizontal line) is equivalent to deciding in favor of the alternative whenever $X \geq 1.64$ (the dotted vertical line). The set $[1.64, +\infty)$ is called the rejection region of the test, because for all X values in the rejection region the test rejects the null in favor of the alternative. This illustrates that our intuition was correct.

When thinking in terms of likelihood ratios it seems very tricky to compute the False Positive Rate (FPR); however, in this case we can bypass that by testing based on the value of $X$.
The figure below illustrates pictorially the FPR when testing based on the threshold $X \geq 1.64$.
[6]: x_axis = np.arange(-4, 5, 0.001)
     plt.plot(x_axis, null_pdf(x_axis), label='$H_0$')  # <- likelihood under the null
     plt.plot(x_axis, alt_pdf(x_axis), label='$H_1$')   # <- likelihood alternative

     # truncate the true rejection region for plotting purposes
     rejection_region = np.arange(X, 5, 0.001)
     plt.fill_between(rejection_region, null_pdf(rejection_region), alpha=0.3, label="FPR")

     plt.xlabel("X")
     plt.ylabel("Likelihood")
     plt.title("Comparison of null and alternative likelihoods")
     plt.legend()
     plt.tight_layout()
     plt.show()
Under the null hypothesis it is still possible to observe values in the tail of the distribution. The probability of that happening is exactly the FPR (and it is illustrated by the shaded area under the null curve). Assume that we are using the test that rejects for all $X \geq \tau$ (in the example above, $\tau = 1.64$). The FPR can then be computed as:

$$FPR(\tau) = \mathbb{P}\{X > \tau \mid H_0 \text{ is true}\} = 1 - \mathbb{P}\{X < \tau \mid H_0 \text{ is true}\} = 1 - F(\tau)$$

where $F(\cdot)$ denotes the CDF of the null distribution, which in this case is the standard Gaussian.
3.2 Part 1.b: Calculate the probability of False Positives
In the cell below calculate the FPR for this test.

Hint: the CDF of a standard normal might come in handy for this function: scipy.stats.norm.cdf
[7]: # TODO: fill in the missing expression for FPR
     def calculate_fpr(tau):
         """
         Calculates the FPR for the test based on thresholding X >= tau.
         It assumes that the null distribution is the standard gaussian N(0,1)

         Inputs:
             tau: test threshold

         Outputs:
             fpr: false positive rate
         """
         fpr = 1 - norm.cdf(tau)  # P(X >= tau) under the standard normal
         return fpr
[8]: # Once you've filled `calculate_fpr` you can run this to test for correctness
     thresholds = [0, 0.5, 1, 2, 3]
     fpr_vals = calculate_fpr(thresholds)

     hash_list = ['d310cb367d993fb6fb584b198a2fd72c',
                  '77d8304f0ac6b94895e8061eff588d52',
                  'df26eb4f07782680a6d98b89313aadc1',
                  'e575e59c20eeeda5d6a29f5ed3e1e2c7',
                  'f233728ef2007eef8f97f8432837feee']
     assert [get_hash(fpr) == hash_list[i] for (i, fpr) in enumerate(fpr_vals)]
     print("Test passed!!!")

Test passed!!!
Let's now compute the FPR for a test that rejects whenever $X \geq 1.64$:

[9]: # Calculate the false positive rate
     X = 1.64
     fpr = calculate_fpr(X)
     print(fpr)

0.050502583474103746
Looks like we got really lucky!!! The threshold we chose has an FPR of $\sim 5\%$.
3.3 Part 1.c: Make a level-$\alpha$ decision rule
Complete the function below to create a decision rule that has $FPR = \alpha$. Given that we are working with Gaussians, this test is exactly the Z-score test that you might have learned in previous classes.

Hint: the inverse CDF of a standard normal might come in handy for this function: scipy.stats.norm.ppf
[10]: # TODO: complete this function
      def make_decision(X, alpha):
          """
          Makes a decision whether to reject the null hypothesis for point X at level alpha

          Inputs:
              X: point at which to test
              alpha: desired FPR rate for the decision rule (also known as significance level)

          Outputs:
              decision: {0, 1} or {False, True}
                  1/True: reject the null
                  0/False: fail to reject the null
          """
          # threshold for which the FPR of the test is equal to alpha
          threshold = norm.ppf(1 - alpha)
          # 1 stands for rejecting the null
          decision = X >= threshold
          return decision
[11]: # Once you've filled `make_decision` run this to perform this test
      # for a few values of X at different levels alpha
      X_vals = np.array([0, 0.5, 1, 2, 3])
      alphas = np.array([0.01, 0.05, 0.1, 0.2])

      for alpha in alphas:
          decisions = make_decision(X_vals, alpha)
          print(f'At FPR={alpha} the null hypothesis is rejected for these X values: {X_vals[decisions == 1]}')

At FPR=0.01 the null hypothesis is rejected for these X values: [3.]
At FPR=0.05 the null hypothesis is rejected for these X values: [2. 3.]
At FPR=0.1 the null hypothesis is rejected for these X values: [2. 3.]
At FPR=0.2 the null hypothesis is rejected for these X values: [1. 2. 3.]
[12]: # Running correctness tests: Do not modify
      hashes = ['cfcd208495d565ef66e7dff9f98764da', 'c4ca4238a0b923820dcc509a6f75849b']
      hashes_ids = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1]

      import itertools
      inputs = itertools.product(X_vals, alphas)
      outputs = [int(make_decision(*input)) for input in inputs]
      for (i, output) in enumerate(outputs):
          assert(get_hash(output) == hashes[hashes_ids[i]])
      print("Test passed!!!")

Test passed!!!
3.4 Part 1.d: Compute P-values
Let's take a step back and look at what we have accomplished. We came up with a decision rule that rejects the null hypothesis for a piece of evidence $X$. The test is parametrized at a level $\alpha$ chosen a priori to reflect our aversion to False Positives.

However, testing returns a binary output: Reject or Fail to reject (1 or 0). In the example above, at level $\alpha = 0.01$ we reject the null only for $X = 3$; however, at level $\alpha = 0.05$ we reject the null for $X = 2$ as well. We have already seen that increasing the FPR increases the rejection region of the test. However, you might wonder: for $X = 2$, what is the smallest $\alpha$ level such that the corresponding test rejects the null hypothesis in favor of the alternative?

P-values try to answer exactly that question: "Given a point $X$, and a family of tests parametrized by $\alpha$, what is the smallest $\alpha$ for which the test rejects the null?"

$$P(X) = \min\{\alpha : Decision_\alpha(X) = 1\}$$

Hence, P-values tell us something more than just a binary accept/reject answer. The P-value associated with the point $X$ quantifies the strength of the evidence in favor of rejecting the null. Small P-values suggest that the evidence is significant, while large P-values suggest that there is little evidence.
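To see the definition in action, here is a quick sketch (using make_decision from Part 1.c; the p-value expression anticipates the function you will write below) checking that the test rejects just above the p-value and fails to reject just below it:

      X = 2.0
      p = 1 - norm.cdf(X)         # the p-value of X, as computed in the next part
      make_decision(X, p + 1e-6)  # True: rejects once alpha exceeds the p-value
      make_decision(X, p - 1e-6)  # False: fails to reject for any smaller alpha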
3.4.1 In the cell below write a function that computes the P-value for a point $X$.

Hint: You already wrote that function in one of the previous exercises; it just had a different name.
[13]: # TODO: complete this function
      def calculate_p_value(X):
          """
          Calculates the P-values for the point X

          Inputs:
              X: data point

          Outputs:
              p_value: P(X)
          """
          p_value = 1 - norm.cdf(X)  # identical to calculate_fpr(X)
          return p_value
[14]: # Once you've filled `calculate_p_value`, run this to compute P-Values for a few X samples.
      X_vals = np.array([0, 0.5, 1, 2, 3])
      for X in X_vals:
          print(f'X = {X}, P(X) = {calculate_p_value(X)}')

X = 0.0, P(X) = 0.5
X = 0.5, P(X) = 0.3085375387259869
X = 1.0, P(X) = 0.15865525393145707
X = 2.0, P(X) = 0.02275013194817921
X = 3.0, P(X) = 0.0013498980316301035
[15]: # Running correctness tests: Do not modify
      p_vals = calculate_p_value(X_vals)
      hash_list = ['d310cb367d993fb6fb584b198a2fd72c',
                   '77d8304f0ac6b94895e8061eff588d52',
                   'df26eb4f07782680a6d98b89313aadc1',
                   'e575e59c20eeeda5d6a29f5ed3e1e2c7',
                   'f233728ef2007eef8f97f8432837feee']
      assert [get_hash(pv) == hash_list[i] for (i, pv) in enumerate(p_vals)]
      print("Test passed!!!")

Test passed!!!
3.5 Part 1.e: Distribution of p-values
Now, we are going to imagine that we have a bunch of samples (each drawn either from the null distribution or the alternative distribution). We want to predict whether each sample was generated from $H_0$ or $H_1$ by looking at its p-value. As a reminder, the two hypotheses to consider are:

The null hypothesis: $H_0 : X \sim \mathcal{N}(0, 1)$

The alternative hypothesis: $H_1 : X \sim \mathcal{N}(2, 1)$

Assume there are $n = 10000$ draws, approximately 80% of which are nulls (Reality = 0).
[16]: # NOTE: you just need to run this cell to instantiate variables; don't change this code.
      rs = np.random.RandomState(0)
      n = 10000

      # roughly 80% of the data comes from the null distribution
      # true_values is an n-dimensional array of indicators, where "1" means that x is from the alternative
      true_values = rs.binomial(1, 0.2, n)

      # null distribution is N(0, 1) and alternative distribution is N(2, 1)
      x_obs = rs.randn(n) + 2 * true_values

      sns.histplot(x_obs[np.where(true_values == 0)],
                   label="samples from null $H_0$ distribution", kde=False, color='orange')
      sns.histplot(x_obs[np.where(true_values == 1)],
                   label="samples from alt. $H_1$ distribution", kde=False, color='blue')
      plt.xlabel("x")
      plt.ylabel("frequency")
      plt.legend(bbox_to_anchor=(1, 1));
[17]: # NOTE: you just need to run this cell and understand what it does; no code to modify or write here.
      # calculate the p-values for each individual hypothesis
      p_values = calculate_p_value(x_obs)
      bins = np.linspace(0, 1, num=20)

      sns.histplot(p_values[np.where(true_values == 0)],
                   label="samples from null $H_0$ distribution", kde=False, bins=bins, color='orange')
      sns.histplot(p_values[np.where(true_values == 1)],
                   label="samples from alt. $H_1$ distribution", kde=False, bins=bins)
      plt.legend(bbox_to_anchor=(1, 1))
      plt.xlabel("p-value")
      plt.ylabel("frequency");
3.5.1 What do you notice?

TODO: fill in (<= 2 sentences) your observations. Your answer should connect what you see in the graph with the ideas we've been talking about in class.

The p-values are uniformly distributed under the null hypothesis. Since the distribution of the p-values drawn from $H_1$ is heavily skewed towards zero, we can be fairly certain they are not drawn from $H_0$ even without seeing the labels.
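One way to sanity-check the uniformity claim (a quick sketch, not part of the lab) is a Kolmogorov-Smirnov test of the null p-values against Uniform(0, 1):

      from scipy.stats import kstest
      null_pvals = p_values[true_values == 0]
      kstest(null_pvals, 'uniform')  # a large KS p-value is consistent with uniformity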
4 Question 2: Multiple Testing - Procedures to control false discovery rate.

In the previous example we looked primarily at controlling row-wise quantities. Specifically, we came up with a decision rule that controls the false positive rate at a desired level $\alpha$.
Now we are switching perspectives and are thinking about a column-wise quantity. Our goal is to control the probability of false discoveries in this decision-making process for multiple hypothesis testing.
We will use three methods for making discoveries:
1. Naive thresholding (ignoring that multiple testing is happening)
2. Using Bonferroni correction to account for multiple testing
3. The Benjamini-Hochberg procedure for multiple testing
For each method, we will assess the decisions made on a simulated data set.
4.1 Part 2.a: Fill in the following functions regarding confusion matrices.
These functions will be important for reporting your results in a standardized way; later code assumes that you have implemented them, so start here.
[18]: # TODO: complete this function
      def report_results(predicted_discoveries, truth):
          """
          Produces a dictionary with counts for the true positives, true negatives,
          false negatives, and false positives from the input `predicted_discoveries`
          and `truth` arrays.

          Inputs:
              predicted_discoveries: array of 0/1 values where 1 indicates a "discovery".
              truth: array of 0/1 values where 1 indicates a draw from the alternative.

          Outputs: a dictionary of TN, TP, FN, and FP counts.
          """
          # populate the following dictionary with counts (NOT rates)
          TP_count = sum(predicted_discoveries * truth)
          TN_count = sum((1 - predicted_discoveries) * (1 - truth))
          FP_count = sum(predicted_discoveries * (1 - truth))
          FN_count = sum((1 - predicted_discoveries) * truth)

          results_dictionary = {
              "TN_count": TN_count,
              "TP_count": TP_count,
              "FN_count": FN_count,
              "FP_count": FP_count,
          }

          # this function is defined for you below
          print_confusion_matrix(results_dictionary)
          return results_dictionary

      # TODO: complete this function
      def print_false_discovery_fraction(results_dictionary):
          total_predicted_discoveries = results_dictionary["FP_count"] + results_dictionary["TP_count"]
          false_predicted_discoveries = results_dictionary["FP_count"]

          # compute the false discovery fraction from the results dictionary
          false_discovery_frac = false_predicted_discoveries / total_predicted_discoveries

          print("total discoveries: {0}".format(total_predicted_discoveries))
          print("fraction of discoveries which were actually false: {0:.3f}".format(false_discovery_frac))
          return total_predicted_discoveries, false_discovery_frac

      def print_confusion_matrix(res_dict):
          # This is a helper function to print the confusion matrix. You don't need to modify this code.
          results_df = pd.DataFrame(data={"Decision = 0": [res_dict['TN_count'], res_dict['FN_count']],
                                          "Decision = 1": [res_dict['FP_count'], res_dict['TP_count']]},
                                    index=["Truth = 0", "Truth = 1"])
          print(results_df)
4.2 Part 2.b: Naive thresholding
Here we will investigate the result of using the threshold $\alpha = 0.05$ to test each hypothesis independently, ignoring that we are in a multiple testing scenario.

Fill in the code for the function below to test each hypothesis at significance level $\alpha$.

Hint: this is very similar to the make_decision function you wrote in Problem 1. There the input to the test was the sample value $X$; here, the input is the P-value $P(X)$.
[19]: # TODO: calculate decisions based on thresholding
      def naive_alpha_threshold(p_values, alpha):
          """
          Returns decisions on p-values using naive (uncorrected) thresholding.

          Inputs:
              p_values: array of p-values
              alpha: threshold (significance level)

          Returns:
              decisions: binary array of same length as p-values, where `decisions[i]` is 1
              if `p_values[i]` is deemed significant at level `alpha`, and 0 otherwise
          """
          decisions = p_values <= alpha  # reject whenever the p-value is at most alpha
          return decisions
[20]: # Once you've filled in `naive_alpha_threshold`, run this cell to print the results.
      # set alpha
      alpha = 0.05

      # Using the p-values from Part 1.e, we compute the decision according to the naive function
      naive_decisions = naive_alpha_threshold(p_values, alpha)

      results = report_results(naive_decisions, true_values)
      print()
      print_false_discovery_fraction(results)
      print()

           Decision = 0  Decision = 1
Truth = 0          7628           400
Truth = 1           679          1293

total discoveries: 1693
fraction of discoveries which were actually false: 0.236
4.3 Part 2.c: Bonferroni Correction
Here we will investigate the result of using Bonferroni-corrected p-values to declare discoveries. First, implement the Bonferroni procedure in the function below. Recall that for testing $n$ hypotheses with family-wise error rate (FWER) $\leq \alpha$, the resulting procedure is to test each hypothesis at significance level $\frac{\alpha}{n}$.
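The reason this works is a union bound over the (at most $n$) true nulls:

$$\text{FWER} = \mathbb{P}\Big(\bigcup_{i} \{\text{test } i \text{ makes a false discovery}\}\Big) \leq \sum_{i} \mathbb{P}\{\text{test } i \text{ makes a false discovery}\} \leq n \cdot \frac{\alpha}{n} = \alpha$$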
[21]: # TODO: calculate the decisions based on the bonferroni correction procedure.
      def bonferroni(p_values, alpha_total):
          """
          Returns decisions on p-values using the Bonferroni correction.

          Inputs:
              p_values: array of p-values
              alpha_total: desired family-wise error rate (FWER = P(at least one false discovery))

          Returns:
              decisions: binary array of same length as p-values, where `decisions[i]` is 1
              if `p_values[i]` is deemed significant, and 0 otherwise
          """
          n = len(p_values)
          decisions = p_values <= alpha_total / n  # test each hypothesis at level alpha_total / n
          return decisions
[22]: # Once you've filled in `bonferroni`, run this cell to print the results.
      bonferroni_decisions = bonferroni(p_values, alpha)
      results = report_results(bonferroni_decisions, true_values)
      print()
      print_false_discovery_fraction(results)
      print()

           Decision = 0  Decision = 1
Truth = 0          8028             0
Truth = 1          1957            15

total discoveries: 15
fraction of discoveries which were actually false: 0.000
4.4 Part 2.d: Benjamini-Hochberg
Now we will investigate the result of implementing the Benjamini-Hochberg procedure for multiple hypothesis testing. First, implement the Benjamini-Hochberg procedure in the function below.

Recall that for testing $n$ hypotheses with false discovery rate (FDR) $\leq \alpha$, the resulting procedure is to find the largest $k$ such that the $k^{th}$-smallest of the $n$ p-values is less than or equal to $\frac{k\alpha}{n}$:

$$P_{(k)} \leq \frac{k\alpha}{n}$$

We then declare a discovery for all p-values with value less than or equal to this $k^{th}$-smallest p-value.
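As a tiny worked example (with hypothetical p-values, not the lab data): for $n = 5$ and $\alpha = 0.05$ the cutoffs $k\alpha/n$ are 0.01, 0.02, 0.03, 0.04, 0.05, so:

      p = np.array([0.003, 0.011, 0.031, 0.22, 0.9])  # already sorted
      np.sort(p) <= np.arange(1, 6) * 0.05 / 5        # [True, True, False, False, False]
      # the largest k with P_(k) <= k*alpha/n is k = 2, so BH rejects 0.003 and 0.011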
[23]: # TODO: calculate decisions based on Benjamini-Hochberg procedure
      def benjamini_hochberg(p_values, alpha):
          """
          Returns decisions on p-values using Benjamini-Hochberg.

          Inputs:
              p_values: array of p-values
              alpha: desired FDR (FDR = E[# false positives / # positives])

          Returns:
              decisions: binary array of same length as p-values, where `decisions[i]` is 1
              if `p_values[i]` is deemed significant, and 0 otherwise
          """
          n = len(p_values)
          sorted_p = np.sort(p_values)
          # largest (0-indexed) k such that the (k+1)-th smallest p-value is <= (k+1) * alpha / n
          max_k = max([k for k in range(n) if sorted_p[k] <= (k + 1) * (alpha / n)])
          threshold = sorted_p[max_k]
          decisions = p_values <= threshold
          return decisions
Now, assess the result of applying the Benjamini-Hochberg procedure to the simulated data.
[24]: # Once you've filled in `benjamini_hochberg`, run this cell to print the results.
      bh_decisions = benjamini_hochberg(p_values, alpha)
      bh_results = report_results(bh_decisions, true_values)
      print()
      print_false_discovery_fraction(bh_results)

           Decision = 0  Decision = 1
Truth = 0          8011            17
Truth = 1          1581           391

total discoveries: 408
fraction of discoveries which were actually false: 0.042

[24]: (408, 0.041666666666666664)
4.5 Part 2.e: Conclusions
Finally, write a short (<= 4 sentences) summary comparing the three different methods from this problem.

TODO: fill in your comparison.

The naive version has a much larger than $\alpha$ false discovery rate due to multiple testing. Bonferroni is highly conservative for controlling FDR (because that's not what it controls), with only 15 discoveries but all of them correct. Under family-wise error rate control, we have a ~95% chance that all discoveries are simultaneously correct, so that's not surprising. B-H comes close to our desired FDR (because that's directly what it controls).
4.6 Final tests

If all the tests below pass you can assume you have successfully completed the testable parts of the lab. Don't worry about understanding the code below; just make sure no asserts fail.
[25]: def assert_discoveries(results,
                             true_values,
                             true_positives_hash,
                             false_positives_hash,
                             true_negatives_hash,
                             false_negatives_hash,
                             false_discovery_frac_hash):
          def get_hash(num):
              return hashlib.md5(str(num).encode()).hexdigest()
          res_dict = report_results(results, true_values)
          assert(get_hash(res_dict['TP_count']) == true_positives_hash)
          assert(get_hash(res_dict['FP_count']) == false_positives_hash)
          assert(get_hash(res_dict['TN_count']) == true_negatives_hash)
          assert(get_hash(res_dict['FN_count']) == false_negatives_hash)
          _, false_discovery_frac = print_false_discovery_fraction(res_dict)
          print(get_hash(false_discovery_frac))
          assert(get_hash(false_discovery_frac) == false_discovery_frac_hash)
          print()

      assert_discoveries(naive_decisions,
                         true_values,
                         true_positives_hash="7c82fab8c8f89124e2ce92984e04fb40",
                         false_positives_hash="18d8042386b79e2c279fd162df0205c8",
                         true_negatives_hash="7bb16972da003e87724f048d76b7e0e1",
                         false_negatives_hash="ca9c267dad0305d1a6308d2a0cf1c39c",
                         false_discovery_frac_hash="87600af3ad8560db2ef1ec43fc0e9877")

      assert_discoveries(bonferroni_decisions,
                         true_values,
                         true_positives_hash="9bf31c7ff062936a96d3c8bd1f8f2ff3",
                         false_positives_hash="cfcd208495d565ef66e7dff9f98764da",
                         true_negatives_hash="f8e918489f1e0a81ff11312f4d0630c1",
                         false_negatives_hash="277a78fc05c8864a170e9a56ceeabc4c",
                         false_discovery_frac_hash="30565a8911a6bb487e3745c0ea3c8224")

      assert_discoveries(bh_decisions,
                         true_values,
                         true_positives_hash="5a4b25aaed25c2ee1b74de72dc03c14e",
                         false_positives_hash="70efdf2ec9b086079795c442636b55fb",
                         true_negatives_hash="e1226495c14f1a62ae17aa76c1f0d457",
                         false_negatives_hash="88a199611ac2b85bd3f76e8ee7e55650",
                         false_discovery_frac_hash="eac9ed4c22d75c17b5211c4c2468bd52")

      print("All tests passed! You are awesome!!!")

           Decision = 0  Decision = 1
Truth = 0          7628           400
Truth = 1           679          1293
total discoveries: 1693
fraction of discoveries which were actually false: 0.236
87600af3ad8560db2ef1ec43fc0e9877

           Decision = 0  Decision = 1
Truth = 0          8028             0
Truth = 1          1957            15
total discoveries: 15
fraction of discoveries which were actually false: 0.000
30565a8911a6bb487e3745c0ea3c8224

           Decision = 0  Decision = 1
Truth = 0          8011            17
Truth = 1          1581           391
total discoveries: 408
fraction of discoveries which were actually false: 0.042
eac9ed4c22d75c17b5211c4c2468bd52

All tests passed! You are awesome!!!
[ ]: