Handout 2.2 Constructing Histograms

School

College of San Mateo

Course

20

Subject

Mathematics

Date

Feb 20, 2024

Math 80 – Statistics
Constructing Histograms
Handout 2.2

HISTOGRAMS OF A SINGLE QUANTITATIVE DATA SET

In the last handout, we analyzed data from the 2006–2007 NBA season, when the league changed to a new synthetic basketball. The NBA responded to pressure from the players by changing back to the traditional leather basketball on January 1, 2007. We also examined whether the change back to the traditional ball seemed to be associated with differences in the distribution of total points scored by the two teams in the games during the last week of 2006 and the first week of 2007. The change in basketballs might have been associated with a change in the difference in points scored by the teams, but there might be other explanations.

In this lesson, we will consider the distribution of a different variable. “Is there a home-court advantage in the NBA?” To answer this question, we will look at the difference in the teams’ scores, calculated by taking the home team’s score and subtracting the visiting team’s score. For example, if the final score of a game is 110 for the home team and 90 for the visiting team, then the difference is 20 (= 110 – 90). If the final score is 88 for the home team and 100 for the visiting team, then the difference is –12 (= 88 – 100).

The following tables show the differences in scores for the games during the last week of 2006 and the first week of 2007.

2006 Data

   16   13   11    1   19    5    2   23   –7    8   10
    8   –7    6   19   –3    7    6   10   15  –13   –7
    6   23  –16  –25    6  –13   15   25   29    2   –4
   10  –10   14   17    3    2   14   23    9   10   –1
   10   26    9  –10   23   10   14   22    1  –11   10

2007 Data

   –6   –8   –1    4   24  –11   –8    5   12   –3    7   23
   17    4    3    8    9  –15    2  –18   –2   11    3    9
  –24   –4   14   –3    5   –7  –14    4   19   –9   –9    2
    5   32   28   –5  –18   13   11    5  –12   –5

1 Based only on a visual examination of the data values above, can you determine if there’s a home-court advantage?

2 We will summarize the 2006 dataset by grouping it into bins.
The table of bins below will help you group the final score differences into intervals of five points each (for example, 5 to 9 points, 10 to 14 points, and so on). The bins start with the lowest final score difference of –25 and end with the highest final score difference of 29.

Last Week of 2006 Season Only

Score Diff     Frequency   Relative Frequency
–25 to –21         1       1/55 = 0.018 = 1.8%
–20 to –16         1       1.8%
–15 to –11         3       5.5%
–10 to –6          5       9.1%
–5 to –1           3       5.5%
0 to 4             6       10.9%
5 to 9            10       18.2%
10 to 14          12       21.8%
15 to 19
20 to 24
25 to 29

A For each value in the 2006 Data table, we will determine the bin it falls into. For example, the first final score difference of 16 belongs in the bin that represents the range 15 to 19.

The frequency is the number of values in a bin (interval). For example, the frequency is 5 for the bin labeled –10 to –6, as shown in the table.

The relative frequency is the proportion of the total number of observations that fall in a bin. There were 55 games played in the final week of 2006, so the first relative frequency is 1/55 = 0.018 = 1.8% for the bin labeled –25 to –21.

Now, complete the table. Can you think of a reason why we would want to examine relative frequency?

STATWAY™ STUDENT HANDOUT PAGE 1
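One way to check a completed table is to let a short program do the counting. The sketch below is written in Python, which this handout does not otherwise use; the function names `bin_lower_edge` and `frequency_table` are made up for this illustration. It assumes the same bins as the table above: width 5, starting at –25.

```python
from collections import Counter

# Score differences (home minus visitor) for the last week of 2006, from this handout.
diffs_2006 = [
    16, 13, 11, 1, 19, 5, 2, 23, -7, 8, 10,
    8, -7, 6, 19, -3, 7, 6, 10, 15, -13, -7,
    6, 23, -16, -25, 6, -13, 15, 25, 29, 2, -4,
    10, -10, 14, 17, 3, 2, 14, 23, 9, 10, -1,
    10, 26, 9, -10, 23, 10, 14, 22, 1, -11, 10,
]

BIN_START = -25  # the lowest bin begins at the lowest score difference
BIN_WIDTH = 5    # each bin covers five integer values, e.g. 5 to 9


def bin_lower_edge(value, start=BIN_START, width=BIN_WIDTH):
    """Return the lower edge of the bin that contains value."""
    return start + ((value - start) // width) * width


def frequency_table(values, start=BIN_START, width=BIN_WIDTH):
    """Return one (low, high, frequency, relative frequency) row per bin."""
    counts = Counter(bin_lower_edge(v, start, width) for v in values)
    total = len(values)
    rows = []
    low = start
    while low <= max(values):
        freq = counts.get(low, 0)
        rows.append((low, low + width - 1, freq, freq / total))
        low += width
    return rows


table_2006 = frequency_table(diffs_2006)
for low, high, freq, rel in table_2006:
    print(f"{low:>4} to {high:>4}   {freq:>2}   {rel:6.1%}")
```

Running it prints every row of the table, including the three left blank above, so it is best used after you have filled in the table by hand.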
The table created on the previous page is called a frequency distribution table. A table is one way to display data. Another is a graph.

[Blank graph: Frequency Histogram of 2006 Score Differences. Horizontal axis: Score Difference, –25 to 30; vertical axis: Frequency, 0 to 12]

B Use the frequency column and the score difference bins to construct a graph called a frequency histogram. Each grouping of score differences (–25 to –21, –20 to –16, and so on) from the table above is used to create a bar on the graph. Draw the bars for each group in the table on the blank graph. Look at the bars of the histogram carefully. What does the height of each bar represent?

3 A relative frequency histogram represents the relative frequencies for the bins instead of the frequencies. In other words, the height of each bar corresponds to the percentage of the total number of score differences in that bin. Below are the graphs of the frequency and relative frequency histograms. Do you notice any similarities between the two histograms?

In a case like this, where the distribution extends farther to the left of the peak than to the right, we say the shape of the distribution is skewed to the left (or left-skewed). If the distribution extends farther to the right, we say it is skewed to the right (or right-skewed). If the distribution looks similar on both sides of the center, we say its shape is symmetric.

4 The following questions relate to important features of a graph such as a histogram.

A How would you describe the shape of the distribution of final score differences in the last week of 2006? Is it symmetric, or does it extend more in one direction?

B Estimate the value for the center of the distribution, and the typical range (or spread) of the distribution.

It is often a good idea to imagine what the histogram might look like before you make the graph.
That way you’ll be less likely to be fooled by errors in the data or by accidentally graphing the wrong variable. Also, different features of the distribution may appear more obvious at different bin widths. When you use technology, it’s usually easy to vary the bin width, so you can make sure that a feature you think you see isn’t a consequence of one particular bin width.

[Figure: Relative Frequency Histogram of 2006 Score Difference. Horizontal axis: 2006 Score Difference, –25 to 30; vertical axis: Relative Frequency, 0 to 0.25]
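To see how much the picture depends on bin width, you can redraw the same data with several widths. The Python sketch below is only an illustration (the function name `text_histogram` is made up here), and it assumes each bin starts at a multiple of the bin width, which reproduces the bins of the table above when the width is 5. It draws each bar as a row of # marks, one mark per game.

```python
from collections import Counter

# Score differences (home minus visitor) for the last week of 2006, from this handout.
diffs_2006 = [
    16, 13, 11, 1, 19, 5, 2, 23, -7, 8, 10,
    8, -7, 6, 19, -3, 7, 6, 10, 15, -13, -7,
    6, 23, -16, -25, 6, -13, 15, 25, 29, 2, -4,
    10, -10, 14, 17, 3, 2, 14, 23, 9, 10, -1,
    10, 26, 9, -10, 23, 10, 14, 22, 1, -11, 10,
]


def text_histogram(values, width):
    """Return one text row per bin; bins run from k*width to k*width + width - 1."""
    # Assumption: bin edges sit at multiples of the width (floor division handles negatives).
    counts = Counter((v // width) * width for v in values)
    rows = []
    low = (min(values) // width) * width
    while low <= max(values):
        bar = "#" * counts.get(low, 0)
        rows.append(f"{low:>4} to {low + width - 1:>4} | {bar}")
        low += width
    return rows


for width in (10, 5, 2):
    print(f"\nBin width {width}")
    print("\n".join(text_histogram(diffs_2006, width)))
```

Compare the three printouts: a feature that shows up at only one width deserves some suspicion.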
Here are two more histograms created by choosing different bin widths (a wider one and a narrower one) for the score differences data from the last week of 2006.

[Figures: two graphs titled Histogram of 2006. Wider bins: horizontal axis 2006, –30 to 30 in steps of 10; vertical axis Frequency, 0 to 20. Narrower bins: horizontal axis 2006, –26 to 30 in steps of 2; vertical axis Frequency, 0 to 9]

Another important feature to look for in a histogram is outliers. We should always mention any stragglers, or outliers, that stand away from the body of the distribution. Outliers can affect almost every method we discuss in this course, so we’ll always be on the lookout for them. An outlier can be the most informative part of our data, or it might just be an error. Don’t throw it away without comment. Treat it specially and discuss it when you describe your data, or find the error and fix it if you are able. When we discuss variability, we’ll have a rule of thumb for deciding when a point might be considered an outlier.

HOMEWORK

1 Here is a histogram of the score differences for the first week of 2007.

[Figure: Histogram of 2007 Score Differences. Horizontal axis: Score Difference, –20 to 30; vertical axis: Frequency, 0 to 9]

A In how many games did the home team win by 10 or more points?
12 games

B Interpret the height of the third bar.
4. There were 4 games whose score difference was between –15 and –11.

C How many games were played during the first week of 2007?
47 games

D What percent of the games did the home team win by 10 or more points?
12/47 = 0.255 or 25.5%

2 The histogram below gives the distribution of the number of grams of fat for an item on the Burger King menu (source: Burger King’s Your Guide to Nutrition).

[Figure: Grams of Fat in Burger King's Menu. Horizontal axis: Bin, 0 to 60 in steps of 10, then More; vertical axis: Frequency, 0 to 10]

A How many items have 30 or fewer grams of fat?
19 items

B What percent of the items have more than 40 grams of fat?
2/25 = 0.08 or 8%

C Describe the histogram. When describing the histogram, make sure to discuss the center (use the word “typical” to represent the center), spread (use the word “majority” to represent the spread), and shape in the context of this problem. Use complete sentences.

3 StatCrunch practice

The February 2011 issue of Consumer Reports magazine provides the Overall Score ratings for exercise treadmills. There are two treadmill categories: non-folding and folding. Here are the Overall Score ratings for these two types of treadmill:

Non-folding Treadmills

   85   84   83   82   81   78   78   69   65   60

Folding Treadmills

   81   79   76   76   75   75   75   74   73   73   73
   72   71   71   71   70   70   70   70   69   66   66
   65   65   64   63   61   61   50   50   50

Suppose your friend is going to purchase a treadmill but cannot decide whether to purchase a non-folding or folding model. Regardless of the price, which type would you recommend for the highest quality? What evidence do you have to recommend one type over the other?

To answer the above questions, you will use StatCrunch to construct side-by-side histograms that compare the distributions of the Overall Score ratings for the two types of treadmills. First you will learn how to enter data into StatCrunch, and then you will create the histograms.

Go to www.statcrunch.com and log in. Select the tab Open StatCrunch. It will bring you to a blank spreadsheet.

Click on “var1” and replace it with “Type” for the two different types of treadmill. Click on “var2” and replace it with “Rating” for the treadmill rating. Now enter the dataset for the Non-folding treadmills, followed by the Folding treadmills. Your spreadsheet should look similar to the table below.

Row   Type          Rating   var3   var4
1     Non-folding   85
2     Non-folding   84
3
10    Non-folding   60
11    Folding       81
12    Folding       79

After you’ve entered all the data, you are ready to make histograms. Click on Graph and select Histogram. You should now see a dialog box. Under Select Column(s), click on Rating. Under Group by, select Type. Before you click on Compute, you can choose a different bin width. You can also make adjustments after you’ve viewed your histograms. Use the histograms to help you answer the questions above.
If you’d like, you can click on the tab Data and select Save to save this dataset.

Copy the histograms into an MS Word (or Google Docs) document and answer the following questions:

A Regardless of the price, which type would you recommend for the highest quality?

B What evidence do you have to recommend one type over the other?
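If you want to double-check the bar heights you see in StatCrunch, the grouping step can be sketched in Python. This is not StatCrunch and is not part of the assignment; the function name `relative_frequencies` and the bin width of 5 are assumptions made for this illustration. It reports relative frequencies rather than counts because the two groups have different sizes (10 and 31 treadmills).

```python
from collections import Counter

# Overall Score ratings from this handout, grouped by treadmill type.
ratings = {
    "Non-folding": [85, 84, 83, 82, 81, 78, 78, 69, 65, 60],
    "Folding": [
        81, 79, 76, 76, 75, 75, 75, 74, 73, 73, 73,
        72, 71, 71, 71, 70, 70, 70, 70, 69, 66, 66,
        65, 65, 64, 63, 61, 61, 50, 50, 50,
    ],
}

BIN_WIDTH = 5  # assumed bin width; StatCrunch lets you choose a different one


def relative_frequencies(values, width=BIN_WIDTH):
    """Map each bin's lower edge to the share of values that fall in that bin."""
    counts = Counter((v // width) * width for v in values)
    return {low: counts[low] / len(values) for low in sorted(counts)}


by_type = {kind: relative_frequencies(vals) for kind, vals in ratings.items()}

# Use one shared set of bins so the two columns line up row by row.
all_values = [v for vals in ratings.values() for v in vals]
first_low = (min(all_values) // BIN_WIDTH) * BIN_WIDTH

print(f"{'Rating':<10}" + "".join(f"{kind:>14}" for kind in ratings))
for low in range(first_low, max(all_values) + 1, BIN_WIDTH):
    label = f"{low} to {low + BIN_WIDTH - 1}"
    cells = "".join(f"{by_type[kind].get(low, 0):>14.1%}" for kind in ratings)
    print(f"{label:<10}" + cells)
```

Each column adds up to 100%, which makes the two types comparable even though far more folding models were rated.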