Written Homework Ch. 2

School

Southern Utah University

Course

1040

Subject

Statistics

Date

Feb 20, 2024

Type

docx

Pages

3

Uploaded by DrDanger6869

Written Homework Chapter 2

Due Friday, January 26 by 11:59 PM. Be sure to show work or explain how you got your answer. Always feel free to email with a question, come to office hours, or set up another time to meet with me outside of class.

1) This homework assignment will use Excel. An Excel guide is available on Canvas in the Additional Resources module that (hopefully) has information on everything we will use Excel for this semester. Go to Canvas and download this Excel guide and reference it for the problems involving Excel on this assignment. Nothing else is required for this first “problem.”

2) On Canvas, you will find the super_bowls.xlsx file. The Super Bowls data set includes information about the number of points scored by the winning team at Super Bowls I through LV (1 through 55).

a. Is the variable being measured here quantitative or categorical? If it is quantitative, is it continuous or discrete? If it is categorical, is it ordinal or nominal?
Quantitative and discrete.

b. Create an appropriate graphical display for these data. Either insert an image of the graph or draw a sketch of it.

c. Based on your graph from part (b), is the mean or median the preferred measure of center? Is the IQR or the standard deviation the preferred measure of spread? Explain.
The median is the preferred measure of center, and I believe the IQR, which is 12, is the most helpful measure of spread when interpreting the data. Both measures are the least influenced by the outlier.

d. Use Excel to find some summary statistics to fill in the table below.

Statistic     Value
Sample Size   55
Mean          30
SD            10
Median        31
Q1            23
Q3            35
Min           13
Max           55

e. In one or two complete sentences, describe the distribution of the points scored variable.
Be sure to include context, center, spread, shape, number of peaks, and any unusual features.
The points scored by the winning Super Bowl teams are grouped into score ranges in the graph. The distribution has one peak near the center, around 30 to 31 points, and the scores span from 13 to 55 points, with most falling in the middle. The graph shows a slight skew to the right, with a longer tail as the points increase; however, the difference between the mean and the median is small, so the distribution is roughly symmetric. The SD of 10 is fairly large, while the IQR of 12 shows that the middle half of the scores is concentrated near the center. There are some gaps among the higher scores, which indicates how rare scores in the high forties and fifties are.

3) This is a standard deviation contest. You must choose five numbers from the whole numbers 1 to 30, with repeats allowed.

a. Choose five numbers that have the smallest possible standard deviation.
10, 10, 10, 10, 10

b. Choose five numbers that have the largest possible standard deviation.
1, 1, 1, 30, 30

c. Is more than one choice possible in either (a) or (b)? Explain.
Yes. For (a), you could choose any number as long as it is repeated five times, because the standard deviation is then 0. For (b), 1, 1, 30, 30, 30 gives the same largest standard deviation as 1, 1, 1, 30, 30.

4) On Canvas, you will find the cars99.xlsx file. It contains information about the Cars data set. The Cars data set has information about many different vehicle models from 1999. We are going to investigate the highway mileage as it pertains to the type of vehicle.

a. Create either a bar plot or pie chart (whichever you prefer) for the “Type” variable. Include or sketch the graph. Which type of vehicle is most represented here? Which type is least represented?

b. Create side-by-side boxplots of the variable “Accel0_6,” the 0 to 60 mph acceleration time for the vehicle, grouped by “Type”. See the Excel guide for how to do this. You will need to create a few new columns to format the data correctly.
Be sure to change the style of the chart so it is clear which boxplot is for which car type.

c. Based on these boxplots, the individual vehicle with the slowest 0 to 60 acceleration time belongs to which group (remember slower speeds have higher values for 0 to 60 mph time)? How about the vehicle with the fastest 0-60 acceleration time? We do not need to know the specific vehicle, just which group it is in.
The slowest vehicle is in the Family group, and the fastest is in the Sports group.

d. Which car type tended to have the slowest 0 to 60 acceleration time? Indicate how you know. Note that if you hover over each boxplot, it will give you some numerical information.
Based on the box plot, the Family type has the slowest times, with a whisker reaching the 13-second range. Even apart from that extreme value, the box shows times slightly above the 12-second mark at Q3.

e. Which car type was the most consistent (least spread out) in terms of 0 to 60 acceleration time? Which was the least consistent? Explain how you know and give numeric evidence.
The Small car type is the most consistent: its box is the most compact, its whiskers are shorter, and it has the smallest IQR, 0.65. The least consistent is the Sports type, which has long whiskers and an IQR of 1.9.

f. Consider the Small car group. If the boxplot has been made correctly, you’ll notice there are outliers for this group. Which vehicle is the most extreme outlier and what is its 0 to 60 acceleration time? You can find the exact value by hovering your cursor over the point on the graph.
The outliers for the Small car group are at 8.1 and 9.3.

g. Give a two or three sentence comparison of the distributions of the 0 to 60 acceleration times for the Family and Sports vehicle types. Be sure to include context, center, spread, shape, and any unusual features. You do not need to mention any other car type for this part.
Comparing the two, it is clear that Family vehicles are slower than Sports vehicles, and the Sports type has a wider range between its minimum and maximum than the Family type.
The Family type has a more symmetrical shape, whereas the Sports type is skewed to the right.

5) A researcher wants to know the average GPA of every college student in the state. The researcher asks 93 college students their GPA and records the answers. In this scenario, identify each of the following:

a. The variable of interest.
The GPA of a college student.

b. The population of interest.
Every college student in the state.

c. The sample.
The 93 college students who were asked.

d. The parameter of interest.
The average GPA of all college students in the state.

e. The statistic of interest.
The average GPA of the 93 students in the sample.
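
The answers to problems 2 and 3 can be checked with a few lines of code. The following is a minimal Python sketch and is not part of the assignment, which uses Excel. The function names are hypothetical, the standard deviation is assumed to be the sample SD (dividing by n - 1, as Excel's STDEV.S does), and the outlier check assumes the common 1.5 times IQR rule, which may differ slightly from how Excel draws its boxplots.

```python
from itertools import combinations_with_replacement
from math import sqrt


def scaled_ss(values):
    """Return n times the sum of squared deviations from the mean.

    Kept as an exact integer so that ties between choices are detected
    without floating-point error.
    """
    n = len(values)
    return n * sum(v * v for v in values) - sum(values) ** 2


def sample_sd(values):
    """Sample standard deviation (assumption: divide by n - 1)."""
    n = len(values)
    return sqrt(scaled_ss(values) / (n * (n - 1)))


def extreme_sd_choices(low=1, high=30, k=5):
    """Try every choice of k whole numbers in [low, high], repeats allowed.

    Returns (choices with the smallest SD, choices with the largest SD).
    Order within a choice is ignored, so each choice is a sorted tuple.
    """
    min_ss = max_ss = None
    min_sets, max_sets = [], []
    for combo in combinations_with_replacement(range(low, high + 1), k):
        ss = scaled_ss(combo)
        if min_ss is None or ss < min_ss:
            min_ss, min_sets = ss, [combo]
        elif ss == min_ss:
            min_sets.append(combo)
        if max_ss is None or ss > max_ss:
            max_ss, max_sets = ss, [combo]
        elif ss == max_ss:
            max_sets.append(combo)
    return min_sets, max_sets


def iqr_fences(q1, q3):
    """Outlier fences under the assumed 1.5 times IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


if __name__ == "__main__":
    smallest, largest = extreme_sd_choices()
    print(len(smallest), "choices give the smallest SD")
    for choice in largest:
        print(choice, round(sample_sd(choice), 2))
    # Q1 and Q3 are the values reported in the table for problem 2(d).
    print("fences:", iqr_fences(23, 35))
```

Under these assumptions, the search finds 30 choices with a standard deviation of 0 (any number repeated five times) and exactly two choices that tie for the largest: 1, 1, 1, 30, 30 and 1, 1, 30, 30, 30, each with a sample SD of about 15.88. With Q1 = 23 and Q3 = 35 from problem 2(d), the fences are 5 and 53, so the maximum of 55 would be flagged as a high outlier while the minimum of 13 would not.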