hw03

School: University of California, Berkeley
Course: 035
Subject: Statistics
Date: Feb 20, 2024
Pages: 14
Uploaded by CoachMandrill4154

Question 5. In the code cell below, create a visualization that will help us determine if there is an association between birth rate and death rate during this time interval. It may be helpful to create an intermediate table containing the birth and death rates for each state. (4 Points)

Things to consider:
  • What type of chart will help us illustrate an association between 2 variables?
  • How can you manipulate a certain table to help generate your chart?
  • Check out the Recommended Reading for this homework!

In [38]:
    # In this cell, use birth_rates and death_rates to generate your visualization
    birth_rates_2015 = pop.column('BIRTHS') / pop.column('2015')
    death_rates_2015 = pop.column('DEATHS') / pop.column('2015')
    birth_death = Table().with_column('Birth Rates', birth_rates_2015).with_column('Death Rates', d
    birth_death.scatter('Birth Rates', 'Death Rates')

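As a rough numerical companion to the scatter plot, the sketch below computes per-state rates and a Pearson correlation coefficient in plain Python. The figures are hypothetical and are not taken from the homework's pop table, and the helper name correlation is an illustration only; the homework itself asks only for the chart.

```python
from math import sqrt

# Hypothetical per-state figures (NOT from the homework's `pop` table).
births = [500, 1200, 300, 800]
deaths = [400, 700, 310, 500]
population_2015 = [40000, 100000, 20000, 60000]

# Rates are counts divided by the 2015 population, as in the cell above.
birth_rates = [b / p for b, p in zip(births, population_2015)]
death_rates = [d / p for d, p in zip(deaths, population_2015)]


def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


r = correlation(birth_rates, death_rates)
print(round(r, 3))
```

A value of r near 1 or -1 would be consistent with a strong linear association in the scatter plot, while a value near 0 would suggest little linear association. The plot is still needed to spot nonlinear patterns and outliers.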
Question 1. Produce a histogram that visualizes the distribution of all ride times in Boston using the given bins in equal_bins. (4 Points)

Hint: See Chapter 7.2 if you’re stuck on how to specify bins.

In [42]:
    equal_bins = np.arange(0, 120, 5)
    boston.hist('ride time', bins=equal_bins)

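To make the binning concrete, here is a minimal plain-Python sketch of how values fall into the edges that np.arange(0, 120, 5) produces. The ride times are hypothetical, and bin_counts is an illustrative helper, not part of the course library. It follows the convention NumPy's histogram uses (each bin includes its left edge and excludes its right edge, except the last bin, which includes both) and assumes the plotting call does the same.

```python
from bisect import bisect_right

# Same edges as np.arange(0, 120, 5): the stop value 120 is excluded,
# so the last edge is 115 and there are 23 bins of width 5.
edges = list(range(0, 120, 5))

# Hypothetical ride times in minutes (NOT the homework data).
ride_times = [3, 7, 12, 12, 18, 25, 115, 119]


def bin_counts(values, edges):
    """Count values per bin; values outside [edges[0], edges[-1]] are skipped."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        if v < edges[0] or v > edges[-1]:
            continue  # outside the binned range, so not drawn
        # Clamp so a value equal to the last edge lands in the last bin.
        i = min(bisect_right(edges, v) - 1, len(counts) - 1)
        counts[i] += 1
    return counts


counts = bin_counts(ride_times, edges)
width = edges[1] - edges[0]
total = sum(counts)

# Density scale: bar heights chosen so the bar areas sum to 1.
densities = [c / (total * width) for c in counts]
print(counts)
```

Because the stop value is excluded, the hypothetical 119-minute ride falls outside the last bin and is left out of the counts. Extending the stop (for example to 125) would be one way to include such rides, at the cost of departing from the bins the question specifies.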
Question 2. Now, produce a histogram that visualizes the distribution of all ride times in Manila using the given bins. (4 Points)

In [43]:
    equal_bins = np.arange(0, 120, 5)
    manila.hist('ride time', bins=equal_bins)
    # Don't delete the following line!
    plt.ylim(0, 0.05);

Question 6. Identify one difference between the histograms, in terms of the statistical properties. (4 Points)

Hint: Without performing any calculations, can you comment on the average or skew of each histogram?

One difference between the histograms is their skewness. Skewness is asymmetry in a distribution: if one histogram has a longer tail on one side than the other histogram does, that likely indicates it is more skewed in that direction.

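Although the question asks for a judgment by eye, a quick way to sanity-check a claim about skew is to compare the mean with the median: a long tail tends to pull the mean toward it. The sketch below uses hypothetical samples and an illustrative helper named skew_direction. It is a rule of thumb, not a formal skewness statistic, and it can mislead on unusual distributions.

```python
from statistics import mean, median


def skew_direction(values):
    """Heuristic: report which side the mean sits on relative to the median."""
    m, med = mean(values), median(values)
    if m > med:
        return "right"  # mean pulled toward a longer right tail
    if m < med:
        return "left"   # mean pulled toward a longer left tail
    return "none detected"


# Hypothetical ride times in minutes (NOT the homework data).
mostly_short_rides = [5, 6, 7, 8, 9, 10, 40]
mostly_long_rides = [1, 31, 32, 33, 34, 35, 36]

print(skew_direction(mostly_short_rides))
print(skew_direction(mostly_long_rides))
```

Equal mean and median does not prove a distribution is symmetric, which is why the helper reports "none detected" rather than "symmetric" in that case.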
Question 7. Why is your solution in Question 6 the case? Based on one of the following two readings, why are the distributions for Boston and Manila different? (4 Points)

Boston reading
Manila reading

Hint: Try thinking about external factors of the two cities that may be causing the difference! The readings provide some potential factors – try to connect them to the ride time data.

The difference in skewness between the histograms may be due to external factors specific to each city, as suggested by the Boston and Manila readings. Factors such as weather may affect ride times differently in each city, leading to distinct distribution patterns in the histograms.

Question 2. State at least one reason why you chose the histogram from Question 1. Make sure to clearly indicate which histogram you selected (ex: “I chose histogram A because …”). (5 Points)

I chose histogram B because the x column has no gap in its data and most of its values fall between -1 and 0. This is consistent with histogram B, so t.hist('x') is histogram B.

Question 4. State at least one reason why you chose the histogram from Question 3. Make sure to clearly indicate which histogram you selected (ex: “I chose histogram A because …”). (5 Points)

I chose histogram A because t.hist('y') shows a gap in the data between -0.5 and 0.5. I knew this because when I looked at the y column and read across horizontally, I saw a region with no data.

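The "gap" reasoning used in these answers can be checked mechanically by looking for empty bins that sit between occupied ones. The sketch below is a plain-Python illustration with hypothetical x and y values and an assumed bin width of 0.5; the helper name empty_interior_bins is not from the course library.

```python
from bisect import bisect_right


def empty_interior_bins(values, edges):
    """Return (left, right) edge pairs of empty bins lying between occupied bins."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        if v < edges[0] or v > edges[-1]:
            continue  # outside the binned range
        # Clamp so a value equal to the last edge lands in the last bin.
        i = min(bisect_right(edges, v) - 1, len(counts) - 1)
        counts[i] += 1
    occupied = [i for i, c in enumerate(counts) if c > 0]
    if not occupied:
        return []
    first, last = occupied[0], occupied[-1]
    return [(edges[i], edges[i + 1])
            for i in range(first, last + 1) if counts[i] == 0]


# Bin edges from -2 to 2 in steps of 0.5 (an assumption for illustration).
edges = [-2.0 + 0.5 * i for i in range(9)]

# Hypothetical columns (NOT the homework data).
x_values = [-1.2, -0.8, -0.3, 0.2, 0.7]       # no gap
y_values = [-1.5, -1.0, -0.7, 0.6, 0.9, 1.4]  # nothing between -0.5 and 0.5

x_gaps = empty_interior_bins(x_values, edges)
y_gaps = empty_interior_bins(y_values, edges)
print(x_gaps)
print(y_gaps)
```

With these made-up values, the x column yields no empty interior bins while the y column yields empty bins spanning -0.5 to 0.5, which is the kind of pattern the answers above rely on to tell the two histograms apart.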