data 8 hw07
School: University of California, Berkeley
Course: 8
Subject: Statistics
Date: Feb 20, 2024
Data 8 - hw07 - email@berkeley.edu
**Question 1.** Define the null hypothesis and alternative hypothesis for this investigation. *Hint: Don't forget that your null hypothesis should fully describe a probability model that we can use for simulation later.*
The null hypothesis is that the spammers choose each area code uniformly at random from all available area codes (200-999).
The alternative hypothesis is that the spammers pick Yanay's area code (781) on purpose, to trick him into thinking that
someone from his area is calling.
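The null hypothesis above is a complete probability model, so one simulated test statistic can be sketched directly. This is a minimal illustration using plain NumPy rather than the course's own helper functions. The sample size of 50 calls is an assumption borrowed from Question 16, and the function name and seed are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed, only for reproducibility

AREA_CODES = np.arange(200, 1000)  # 200 through 999 inclusive: 800 possible codes
NUM_CALLS = 50                     # assumed number of spam calls (see Question 16)
YANAY_CODE = 781

def simulate_count_under_null():
    """Draw NUM_CALLS area codes uniformly at random and count the 781s."""
    calls = rng.choice(AREA_CODES, size=NUM_CALLS)
    return int(np.count_nonzero(calls == YANAY_CODE))

one_count = simulate_count_under_null()
print(one_count)
```

Calling the function many times and collecting the results would give an empirical distribution of the statistic under the null hypothesis.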
**Question 5.** Using the results from Question 4, generate a histogram of the empirical distribution of the number of
times you saw the area code 781 in your simulation. **NOTE: Use the provided bins when making the histogram**
bins = np.arange(0,5,1) # Use these provided bins
# hist draws the plot and returns nothing, so there is no result to assign
Table().with_column("number of 781", test_statistics_under_null).hist(0, bins = bins)
**Question 7.** Suppose you use a P-value cutoff of 1%. What do you conclude from the hypothesis test? Why?
Since the p-value is 0.00185 (about 0.19%), which is smaller than the 1% cutoff, we reject the null hypothesis.
The data favor the alternative hypothesis that the spammers are intentionally using Yanay's area code.
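For illustration, an empirical p-value of this kind can be computed as the fraction of simulated statistics at least as large as the observed one. The sketch below is not the notebook's code: the helper name, the 50-call sample size, the number of repetitions, and especially `observed_count = 2` are assumptions chosen only to show the mechanics. Counting 781s among 50 uniform draws from 800 codes is equivalent to a binomial draw with success chance 1/800, which is what the simulation uses.

```python
import numpy as np

def empirical_p_value(simulated_stats, observed_stat):
    """Fraction of simulated statistics that are >= the observed statistic."""
    simulated_stats = np.asarray(simulated_stats)
    return np.count_nonzero(simulated_stats >= observed_stat) / len(simulated_stats)

rng = np.random.default_rng(seed=0)  # arbitrary seed

# Simulate the count of 781s under the null hypothesis (assumed 50 calls).
simulated = rng.binomial(n=50, p=1 / 800, size=100_000)

observed_count = 2   # hypothetical observed value, not taken from the source
cutoff = 0.01        # the 1% cutoff from Question 7

p_value = empirical_p_value(simulated, observed_count)
print(p_value, p_value < cutoff)
```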
**Question 8.** Define the null hypothesis and alternative hypothesis for this investigation.
*Reminder: Don't forget that your null hypothesis should fully describe a probability model that we can use for simulation
later.*
The null hypothesis is that the spammers choose each area code uniformly at random from all possible area codes
(200-999). The alternative hypothesis is that the spammers intentionally favor the area codes of the 8 places that
Yanay has recently been to (781, 617, 509, 510, 212, 858, 339, 626).
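Under this null hypothesis, each call has an 8 in 800 chance (1%) of carrying one of the visited area codes. A minimal NumPy sketch of one simulated statistic follows. As before, the 50-call sample size is an assumption taken from Question 16, and the function name and seed are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed

AREA_CODES = np.arange(200, 1000)    # 800 possible codes, 200 through 999
VISITED = np.array([781, 617, 509, 510, 212, 858, 339, 626])
NUM_CALLS = 50                       # assumed number of spam calls

# Chance that a single random call lands on a visited area code.
null_chance = len(VISITED) / len(AREA_CODES)

def simulate_visited_count_under_null():
    """Draw NUM_CALLS uniform area codes and count those in VISITED."""
    calls = rng.choice(AREA_CODES, size=NUM_CALLS)
    return int(np.count_nonzero(np.isin(calls, VISITED)))

one_visited_count = simulate_visited_count_under_null()
print(null_chance, one_visited_count)
```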
**Question 11.** Using the results from Question 10, generate a histogram of the empirical distribution of the number of
times you saw any of the area codes of the places Yanay has been to in your simulation. **NOTE: Use the provided bins
when making the histogram**
bins_visited = np.arange(0,6,1) # Use these provided bins
# hist draws the plot and returns nothing, so there is no result to assign
Table().with_column("numbers of area codes visited", visited_test_statistics_under_null).hist(0, bins = bins_visited)
**Question 13.** Suppose you use a P-value cutoff of 0.05% (**Note: that's 0.05%, not our usual cutoff of 5%**). What
do you conclude from the hypothesis test? Why?
The p-value of 0.2% is higher than the cutoff of 0.05%, so we fail to reject the null hypothesis.
The data are consistent with the spammers choosing area codes at random, although failing to reject does not prove that the null hypothesis is true.
**Question 14.** Is `p_value`:
* (a) the probability that the spam calls favored the visited area codes,
* (b) the probability that they didn't favor, or
* (c) neither
If you chose (c), explain what it is instead.
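c. `p_value` is neither. It is the probability, computed under the assumption that the null hypothesis is true, of seeing a test statistic at least as extreme as the one we observed. It is not the probability that either hypothesis is correct.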
**Question 15.** Is 0.05% (the P-value cutoff):
* (a) the probability that the spam calls favored the visited area codes,
* (b) the probability that they didn't favor, or
* (c) neither
If you chose (c), explain what it is instead.
c. The P-value cutoff of 0.05% is neither of these. It is a threshold that we choose before running the test: if the
p-value falls below it, we reject the null hypothesis. It is (approximately) the probability of rejecting the null hypothesis
when the null hypothesis is actually true, that is, the chance of wrongly accusing the spammers. It is not the probability that the null hypothesis is correct.
**Question 16.** Suppose you run this test for 4000 different people after observing each person's last 50 spam calls.
When you reject the null hypothesis for a person, you accuse the spam callers of favoring the area codes that person
has visited. If the spam callers were not actually favoring area codes that people have visited, can we compute how
many times we will incorrectly accuse the spam callers of favoring area codes that people have visited? If so, what is the
number? Explain your answer. Assume a 0.05% P-value cutoff.
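We can compute the expected number, though not the exact number. If the null hypothesis is true for every person, each test has about
a 0.05% chance of rejecting it by mistake, so across 4000 people we expect about 0.0005 × 4000 = 2 incorrect accusations. The actual count in any one run is random and may differ from 2.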
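The arithmetic can be checked with a short sketch. It treats the cutoff as the per-test chance of a false rejection, which is an approximation (for a discrete test statistic the true chance can be somewhat lower), and it models the 4000 tests as independent. The variable names, seed, and number of repetitions are illustrative assumptions.

```python
import numpy as np

num_people = 4000
cutoff = 0.05 / 100  # 0.05% expressed as a proportion

# Expected number of false accusations if the null is true for everyone.
expected_false_accusations = num_people * cutoff
print(expected_false_accusations)

# The realized count varies from run to run; simulate it many times,
# treating each person's test as an independent false-rejection chance.
rng = np.random.default_rng(seed=0)  # arbitrary seed
false_accusation_counts = rng.binomial(n=num_people, p=cutoff, size=10_000)
print(false_accusation_counts.mean())
```

The simulated counts average close to 2, but individual runs can produce 0, 1, 3, or more false accusations.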
**Question 19.** Generate 1,000 simulated test statistic values. Assign `test_stats` to an array that stores the result of
each of these trials. *Hint*: Use the function you defined in Question 18.
We also provide code that generates a histogram once you have the 1,000 simulated test statistic values.
trials = 1000
test_stats = make_array()
for i in np.arange(trials):
    test_stats = np.append(test_stats, simulate_one_stat())
# here's code to generate a histogram of values and the red dot is the observed value
Table().with_column("Simulated Proportion Difference", test_stats).hist("Simulated Proportion Difference");
plt.plot(observed_diff_proportion, 0, 'ro', markersize=15);