Section 3 Lab Document Key

School

Clemson University

Course

3090

Subject

Statistics

Date

Jan 9, 2024

Pages

8

STAT 3090 SECTION 3 LAB, FALL 2023

NAME:

PURPOSE: It is as important to be able to communicate and explain data distributions as it is to be able to produce graphs and calculate statistics. The following questions will help you become proficient in using your calculator and software to calculate descriptive statistics. They will also help you learn ways to communicate information about data distributions.

OBJECTIVES: Upon successful completion of this activity, you will be able to:
  • Calculate descriptive statistics using a calculator and software
  • Describe a distribution of values
  • Find the 5-number summary of a data set and draw a box plot by hand
  • Use JMP to analyze data sets
  • Use JMP to draw multiple box plots on the same axis
  • Compare two distributions

DIRECTIONS: Please answer the following questions. When requested, show your work. When you are asked to explain an answer, use complete sentences. Provide any requested output from JMP. Instructions for creating several types of graphs, tables, and statistics can be found on Canvas in the file JMP Instructions.docx. Place your answers and output into this document. You may delete the JMP instructions prior to submitting.

Part One

1. (18 points) More Candy Bars. We will calculate several descriptive statistics for the Candy Bars data set from the Section 2 lab. We will do our calculations by hand and using the 1-Var Stats function in your calculator. We will then use JMP to double-check our work.

a) So that you have fewer data values to enter into your calculator, take a random sample of size 10 from the full data set. (2 pts)

Step 1: Open the Candy Bars data set in JMP and go to Tables >> Subset.
Step 2: In the section for Rows, select Random-sample size and enter 10 for the random sample size. Select OK.
Step 3: Take a screenshot of your data table showing at least the first 5 columns. Paste your data table below.

2 pts for table with at least 5 columns.
b) What is the mode of the variable sugars from your sample? If there is no mode, simply state that. Remember to include units. (2 pts)

The mode is 19 g.

c) Using your sample, calculate the mean amount of sugars in the sample of candy bars by hand. Include the symbol for a sample mean and units in your answer. Show your work by filling in the formula for the mean. You may type your formula or handwrite it and insert a picture of it into this document. (2 pts)

$\bar{x} = \frac{17 + 23 + 16 + 30 + 18 + 13 + 22 + 20 + 19 + 19}{10} = 19.7$ g

d) Using your sample, calculate the range. Include units in your answer. (2 pts)

The range is 30 g − 13 g = 17 g.

e) Show how to fill in the formula to calculate the variance of the variable sugars using your sample data. Include the symbol for a sample variance. You do not need to calculate this value by hand. In the next step we will use the calculator to find the value. (2 pts)

$s^2 = \frac{(17 - 19.7)^2 + (23 - 19.7)^2 + \cdots + (19 - 19.7)^2}{10 - 1}$

f) Carefully enter your sample data into a list in your calculator and use the 1-Var Stats function to help you calculate the sample variance. Include the symbol for the sample variance and units with your answer. (2 pts)

$s^2 = \frac{(17 - 19.7)^2 + (23 - 19.7)^2 + \cdots + (19 - 19.7)^2}{10 - 1} = 21.34\ \text{g}^2$

g) Use JMP to calculate the descriptive statistics for your sample. (2 pts)

Step 1: Go to Analyze >> Distribution. Make sure to run it on your sample, not the entire data set.
Step 2: Put the column for sugars into Y, Columns and select OK. (The screenshot used a different column; be sure to use the sugars column.)
Step 3: Change the orientation of the output to a horizontal format. To do this, select the triangle to the left of the word Distributions in your output and choose Stack.
Step 4: Change the summary statistics to display the variance and range as well. Select the red triangle next to the Summary Statistics table and then choose Customize Summary Statistics. Select the variance and range from the list of choices. Copy and paste the resulting Summary Statistics table below.

    Summary Statistics
    Mean                   19.7
    Std Dev                4.6200048
    Std Err Mean           1.4609738
    Upper 95% Mean         23.004952
    Lower 95% Mean         16.395048
    N                      10
    Variance               21.344444
    Median                 19
    Mode                   19
    Range                  17
    Interquartile Range    5.5

h) Check your calculations for parts c), d), and f) against your JMP results. If they do not match, figure out where you made your mistake and correct it. Type DONE below when you have checked your work and everything is correct. (2 pts)

Done

i) Report the value of the standard deviation of your sample and write a sentence explaining the meaning of the standard deviation in the context of the sample data set. (2 pts)

The standard deviation of my sample is 4.62 g. This means that, on average, the amount of sugars in a candy bar in the sample differs from the sample mean of 19.7 g by about 4.62 g.

2. WebAssign questions 1 - 3. (12 points)

Part Two

3. (20 points) Cheerio
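Looking back at Part One, question 1: the hand, calculator, and JMP results for the candy bar sample can also be cross-checked in a few lines of Python. This is only a sketch using the standard library, not part of the lab's required workflow. The variable names are illustrative, and the data are the ten sugars values from the answer to part c). The quartiles use the `exclusive` method of `statistics.quantiles`, which is based on (n + 1)p positions; that it matches JMP's reported interquartile range for this sample is an observation, not a statement about how JMP computes quartiles in general.

```python
import statistics

# Sugars (g) for the ten sampled candy bars, copied from the answer to part c).
sugars = [17, 23, 16, 30, 18, 13, 22, 20, 19, 19]

mean = statistics.mean(sugars)              # sample mean, x-bar
variance = statistics.variance(sugars)      # sample variance s^2 (divides by n - 1)
std_dev = statistics.stdev(sugars)          # sample standard deviation s
data_range = max(sugars) - min(sugars)      # range = max - min
mode = statistics.mode(sugars)              # most frequent value
median = statistics.median(sugars)

# Quartiles from (n + 1)p positions (the "exclusive" method).
q1, q2, q3 = statistics.quantiles(sugars, n=4, method="exclusive")
iqr = q3 - q1

print(f"mean = {mean:.1f} g")
print(f"variance = {variance:.2f} g^2")
print(f"std dev = {std_dev:.2f} g")
print(f"range = {data_range} g, mode = {mode} g, median = {median} g")
print(f"IQR = {iqr} g")
```

Each printed value can be compared against the Summary Statistics table pasted for part g).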
Many manufacturing processes produce data that are approximately normally distributed (mound or bell shaped, symmetric, and unimodal). Suppose the machine that fills Cheerios boxes is set to have a mean box weight of 21 oz and a standard deviation of 0.4 oz. The JMP file Cheerio Box Weights contains the weights of 115 randomly selected boxes of cereal from each of 2 shipments that Delectable Delights prepared.

a) Observe the column for Shipment 1. Note that the data in Shipment 1 are already sorted in increasing order. Find the 5-number summary of the Shipment 1 data set by hand. (2 pts)

Min: 20.02 oz
Q1: 20.66 oz, at position $L_p = (n + 1)p = 116(0.25) = 29$
Q2: 20.90 oz
Q3: 21.18 oz, at position $L_p = (n + 1)p = 116(0.75) = 87$
Max: 21.88 oz

b) Calculate the IQR, lower fence, and upper fence by hand using the 5-number summary from part a). Determine whether there are any outliers. (2 pts)

$IQR = Q_3 - Q_1 = 21.18 - 20.66 = 0.52$
$LF = Q_1 - 1.5(IQR) = 20.66 - 1.5(0.52) = 19.88$
$UF = Q_3 + 1.5(IQR) = 21.18 + 1.5(0.52) = 21.96$

There are no outliers, because the minimum (20.02 oz) is above the lower fence and the maximum (21.88 oz) is below the upper fence.

c) Draw a box plot by hand for this data set using what you obtained from parts a) and b). (2 pts)

[Hand-drawn box plot; horizontal axis: Weight (oz)]

d) Interpret the IQR you obtained in part b) in context. (2 pts)

The middle 50% of the box weights in Shipment 1 span a range of 0.52 oz.
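The quartile positions and fence calculations in parts a) and b) can be sketched in Python as a check on the hand arithmetic. The function names here are hypothetical, chosen only for illustration, and the inputs are the Shipment 1 values reported in part a); the raw 115 weights are not reproduced in this document, so the sketch works from the summary values alone.

```python
def quartile_position(n, p):
    """Position of the p-th quantile in sorted data, using L_p = (n + 1) * p."""
    return (n + 1) * p


def tukey_fences(q1, q3):
    """Return (IQR, lower fence, upper fence) using the 1.5 * IQR rule."""
    iqr = q3 - q1
    return iqr, q1 - 1.5 * iqr, q3 + 1.5 * iqr


# Shipment 1 values from part a), in ounces.
n = 115
minimum, q1, q3, maximum = 20.02, 20.66, 21.18, 21.88

pos_q1 = quartile_position(n, 0.25)
pos_q3 = quartile_position(n, 0.75)
iqr, lower_fence, upper_fence = tukey_fences(q1, q3)

# A value is an outlier only if it falls outside the fences, so checking
# the minimum and maximum is enough to rule outliers in or out.
has_outliers = minimum < lower_fence or maximum > upper_fence

print(f"Q1 position = {pos_q1:g}, Q3 position = {pos_q3:g}")
print(f"IQR = {iqr:.2f} oz, LF = {lower_fence:.2f} oz, UF = {upper_fence:.2f} oz")
print(f"outliers present: {has_outliers}")
```

Checking only the minimum and maximum against the fences is a shortcut that works because every other observation lies between them.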
e) Now let's use JMP to analyze the data. Use JMP to draw a histogram along with the quantiles table and summary statistics for the sample from Shipment 1. (2 pts)

Step 1, create a histogram: Go to Analyze >> Distribution. Put the column for Weight (oz) Shipment 1 into Y, Columns and select OK. Change the orientation of the output to a horizontal format. To do this, select the triangle to the left of the word Distributions in your output and choose Stack.

Step 2, modify axes/labels: For the y-axis, click the red triangle to the left of the column heading (in this case, Weight (oz) Shipment 1). Select Histogram Options >> Count Axis or Prob Axis. This places either the frequency or the relative frequency axis on the histogram, depending on what you want to plot. Click the red triangle once more and select Display Options >> Axes on Left to change the location of the y-axis. Change the label of the y-axis to Frequency instead of Count, or Relative Frequency instead of Probability, by single-clicking the default label. For the x-axis, right-click the values on the x-axis and select Add Axis Label, then enter the name "Weight (oz) Shipment 1".

(Optional) Step 3, aesthetics: If you would like to change the color of the bars, right-click in the white space around the bars and select Histogram Color. Note that we cannot separate the bars in a histogram for quantitative data as we did in the bar chart for qualitative data, because the x-axis is now a continuous axis.

Step 4, copy/paste your result: Copy and paste the histogram, quantiles, and summary statistics here. You can use a screenshot. (2 pts)

f) Using the Quantiles table in your output from part e), find the percentage of the boxes in Shipment 1 with weights between 20.532 oz and 21.514 oz. Show your calculation. (2 pts)

The percentile rank of 21.514 oz is 90%. The percentile rank of 20.532 oz is 10%. The percentage of boxes that weigh between these two values is 90% − 10% = 80%.
g) Describe the distribution of the weights from the sample of the boxes of cereal. Remember to discuss the shape, center, spread, and unusual values (if there are any). Be careful to use resistant or non-resistant measures as appropriate for the shape of the distribution. Include units where appropriate. (3 pts)

The distribution of the weights of the Cheerios boxes in Shipment 1 is approximately symmetric, with a mean of 20.95 oz and a standard deviation of 0.38 oz.

h) Suppose we believe the distribution is approximately symmetric. Use the empirical rule to fill in the following blanks: (3 pts)

- __68__% of the boxes in Shipment 1 will have weights between 20.57 oz and 21.33 oz.
- 95% of the boxes in Shipment 1 will have weights between __20.19__ oz and __21.71__ oz.

i) Do you believe that the machine that filled the boxes in this sample from Shipment 1 was working correctly? Why? (Hint: compare the statistics you got with the parameters for the machine. Also, recall that if a distribution is symmetric, as is the case with the normal distribution, the mean and median are the same.) (2 pts)

Example answer: I believe that the machine is working correctly for Shipment 1. The sample mean is 20.95 oz and the sample standard deviation is 0.38 oz. Those are very close to the parameters set for the machine, which were a mean of 21 oz and a standard deviation of 0.4 oz.

4. (12 points) Oh No Cheerio

Harold is an employee who works in the shipping department. As he was preparing Shipment 2, he felt that the boxes of cereal seemed lighter than usual. You are asked to compare a random sample of 115 boxes of cereal from Shipment 2 to the sample from Shipment 1.

a) (2 pts)

Step 1: Go to Graph >> Graph Builder. Choose Box Plot at the top of Graph Builder.
Step 2: Highlight both samples, then drag them to the x-axis. (How to highlight both samples: select one sample, then hold the Shift key to select the second.)
Step 3: On the left, select the box for the 5-number summary.
Step 4: Right-click the white space on the graph and select Edit >> Copy Graph, then paste it here.
[JMP box plots: Weight (oz) Shipment 1 & Weight (oz) Shipment 2; weight axis from 19 to 22 oz]

    5-number summary (oz)    Shipment 1    Shipment 2
    Max                      21.88         21.00
    Q3                       21.18         20.52
    Med                      20.90         20.18
    Q1                       20.66         19.93
    Min                      20.02         18.90

b) Write a few sentences to compare the shape, centers, spread, and any unusual features of the two box plots using the output provided by JMP. (8 pts)

Shipment 1 is roughly symmetric, while Shipment 2 is left skewed with outliers at 18.90, 18.96, 18.98, and 19.03 oz. Shipment 1 is centered at a median of 20.90 oz, which is larger than the median of Shipment 2 at 20.18 oz. The IQR of Shipment 1 is 0.52 oz, which is just slightly smaller than the IQR of Shipment 2 at 0.59 oz. Shipment 1 has no outliers, while Shipment 2 has four low outliers.

2 points for describing Shipment 1 as symmetric and Shipment 2 as left skewed
2 points for reporting the correct median value for each (0 points if the mean is reported)
2 points for reporting the correct IQR value for each (0 points if the standard deviation or another measure of variability is reported)
2 points for units in at least 2 of the reported statistics
Deduct 1 point if comparative language is not used at least twice.

c) Do you think that the machine that filled the boxes for the second shipment may be malfunctioning? Explain. (2 pts)

It appears that the machine that filled the boxes for the second shipment is malfunctioning. Since the process is supposed to have a symmetric distribution, the median box weight should be similar to the required mean of 21 oz. We see that Shipment 1 has a median of 20.90 oz, but Shipment 2 has a smaller median of 20.18 oz. For that matter, the maximum weight in Shipment 2 is 21 oz. If the machine were working correctly, we would expect the median, not the maximum, to be close to 21 oz. There are also 4 low outliers in Shipment 2.

2 points for reasonable justification of conclusion about the filling machine

5. Complete Section 3 WebAssign. (38 points)
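As a closing cross-check on question 4, the comparison of the two shipments can be sketched in Python from the 5-number summaries reported by JMP. This is a hedged illustration, not part of the lab: the dictionary layout, the helper name `lower_fence`, and the constant `TARGET_MEAN_OZ` are assumptions made for the example, and the four low values are the ones the answer to part b) lists as outliers.

```python
# 5-number summaries (oz) as reported for the two box plots in question 4.
shipment_1 = {"min": 20.02, "q1": 20.66, "median": 20.90, "q3": 21.18, "max": 21.88}
shipment_2 = {"min": 18.90, "q1": 19.93, "median": 20.18, "q3": 20.52, "max": 21.00}

# Low weights (oz) that the answer to part b) reports as outliers in Shipment 2.
reported_low_values = [18.90, 18.96, 18.98, 19.03]

# Mean box weight the filling machine is set to, from question 3.
TARGET_MEAN_OZ = 21.0


def lower_fence(summary):
    """Lower fence Q1 - 1.5 * IQR for a 5-number summary dictionary."""
    iqr = summary["q3"] - summary["q1"]
    return summary["q1"] - 1.5 * iqr


# Values below the lower fence are flagged as low outliers.
flagged = [w for w in reported_low_values if w < lower_fence(shipment_2)]

# How far each shipment's median sits below the machine's target mean.
median_gap_1 = TARGET_MEAN_OZ - shipment_1["median"]
median_gap_2 = TARGET_MEAN_OZ - shipment_2["median"]

print(f"Shipment 2 lower fence = {lower_fence(shipment_2):.3f} oz")
print(f"flagged low outliers: {flagged}")
print(f"median below target: shipment 1 by {median_gap_1:.2f} oz, "
      f"shipment 2 by {median_gap_2:.2f} oz")
```

The much larger gap between the Shipment 2 median and the 21 oz target is the same evidence the answer to part c) relies on.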