Caroline Hyde- JMP activity 1

School: University of South Carolina
Course: 3090
Subject: Statistics
Date: Jan 9, 2024
Pages: 9
Uploaded by Carohyde
STAT 3090 JMP ACTIVITY 1 SPRING 2022

NAME: Caroline Hyde

OBJECTIVES: The purpose of this project is to introduce the statistical software JMP. Upon successful completion of this project, you will be able to…
  • Identify the type of variable in the context of a problem
  • Create graphical displays in JMP
  • Use the output provided by JMP to answer relevant questions
  • Describe a distribution of values
  • Find and interpret a given percentile
  • Calculate a z-score
  • Find the 5-number summary of a data set
  • Use JMP to draw multiple box plots on the same axis
  • Compare two distributions
  • Select a random sample
  • Understand the possible effects when a random sample is not used in a study
  • Use JMP to draw two box plots on the same scale

Delectable Delights is a large consumer food manufacturer selling its products in retail stores nationwide. You have landed your first job after graduation from Clemson in their advertising division. Since you took statistics as a part of your coursework, you are often called upon to perform data analysis for the advertising division, as well as other divisions of the company.

DIRECTIONS: Answer the following questions using complete sentences as though you were presenting your analysis to the employees of Delectable Delights. Please provide any appropriate output and/or screenshots from JMP. Instructions for creating several types of graphs or tables and statistics can be found on Canvas in the file JMP Instructions.docx. Paste your answers and any output into this document. If you would like to type formulas and your calculations into Word, you may find this YouTube video on using the Microsoft Equation Editor helpful: https://www.youtube.com/watch?v=GdF5kYIoh-U

This assignment is worth 100 points.

1. M&M missing colors (15 pts)

Delectable Delights is a distributor for M&M’s ® candies. The packages are supposed to have 20% of each colored candy: Red, Blue, Green, Yellow, and Brown. Recently they have received word from their customers that there are not enough Green candies in the packages. Ray Holtz, in the customer care center, would like you to use a random sample of candies to see if the customer comments are valid. The file Candy, which is found on Canvas, contains the sample. Using the information found in the sample, send a report to Ray of your findings.
a) Open the file Candy in JMP. What type of variable (Qualitative or Quantitative) is Candy and what is its level of measurement? What is the sample size? (5 pts)

Type: Qualitative
Level of Measurement: Nominal
Sample Size: 100

b) Using this data set, create a graphical display in JMP that highlights the proportions of the different colors of candy in the sample. The type of graph or table you choose needs to quickly convey the proportion or percentage of each color. There is more than one type of graph you could use. Copy and paste your graph or table here. Make sure to include titles or legends. Use the information in the file JMP Instructions.docx to help you make the graph or table of your choice. (5 pts)

Hints:
i. If you are using the Graph Builder function in JMP, you can change some of the titles by double-clicking on the name given by JMP and typing your own title. You can make changes to the legend by double-clicking in the region where the legend is located.
ii. If you have selected Analyze >> Distribution to create your graph, you can double-click any of the titles to change the names. For example, you can change the title of the column Prob to Proportion or Relative Frequency, since that column represents the relative frequency of each color. Double-clicking on the horizontal axis allows you to make a variety of changes to that axis. Clicking the red triangle next to the name of the variable gives you other options, including the ability to order your data from largest to smallest categories (or vice versa).
iii. Explore the various options available in JMP to change the look of your graph or table. There are also many videos available on YouTube on how to use JMP.
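The Prob (relative frequency) column described in hint ii is just each color's count divided by the sample size. A minimal Python sketch of that calculation follows. The function name `relative_frequencies` and the ten example labels are hypothetical and are not the contents of the Candy file.

```python
from collections import Counter


def relative_frequencies(values):
    """Return {category: proportion} for a list of category labels."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}


# Hypothetical labels for illustration only; NOT the data in the Candy file.
example_colors = ["Red", "Blue", "Green", "Yellow", "Brown",
                  "Red", "Blue", "Brown", "Red", "Blue"]

example_props = relative_frequencies(example_colors)

# Print one line per color, as a percentage of the sample.
for color, proportion in sorted(example_props.items()):
    print(f"{color:<7}{proportion:.0%}")
```

With the real sample, each proportion would be compared against the 20% that every color is supposed to have.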
c) Using your graph or table from part b), report to Ray whether or not you suspect that there are too few Green candies. Be sure to reference your graph or table in your report. Keep in mind that this is a sample, not the entire population of M&M’s ®, so we don’t expect the proportions to be exact. Later in the course, when we come to methods for inferential statistics, we will learn how to determine if the proportion of green in the sample is reasonably close to the true percentage of green candy, which is 20%. (5 pts)

Hi Ray,
It does not look like our machines are seriously under-producing green. The graph shows green at 18% of the sample, which is slightly lower than the expected 20% but fairly close to it. However, the graph also shows yellow at 16%, which seems to be more of a problem.

2. Cheerios (60 pts)

Many manufacturing processes produce data that is approximately normally distributed (mound or bell shaped, symmetric and unimodal). The machine that fills the Cheerios boxes for Delectable Delights is set to have a mean box weight of 21 oz and a standard deviation of 0.4 oz. The JMP file Cheerio Box Weights contains the weights of 105 randomly selected boxes of cereal from each of 2 shipments that Delectable Delights prepared.

a) Use JMP to draw a histogram and provide basic summary statistics for the sample from shipment 1 (use Analyze >> Distribution). Change the orientation of the output to horizontal by selecting the red triangle next to the word Distributions and choosing Stack. Add the relative frequency (prob) axis to the histogram. Paste the
histogram, the quantiles table, and the summary statistics table below. You do not need to adjust any of the default settings; simply copy and paste the output. (5 pts)

b) Describe the distribution of the weights from the sample of 105 boxes of cereal. Remember to discuss the shape, center, and spread, and any unusual values (if there are any unusual values). Include units where appropriate. (5 pts)

The weights are approximately normally distributed with a bell shape. The distribution is unimodal, with a single peak and a center at approximately 21 oz, and it is roughly symmetric. The median is 20.88 oz and the mean box weight is 20.92 oz. The standard deviation is 0.37 oz.

c) Do you believe that the machine that filled the boxes in this sample from shipment 1 was working correctly? Why? (5 pts)

Yes, I believe that the machine that filled the boxes was working correctly, since the sample mean (20.92 oz) and standard deviation (0.37 oz) are fairly close to the settings of 21 oz and 0.4 oz. The distribution was also approximately normal: bell shaped and roughly symmetric.

d) The weights for shipment 1 have been sorted smallest to largest for your convenience. What is the weight of the box at the 5th percentile? Use the method on page 54 of the lecture guide to find the location of the value. (5 pts)

20.39 oz

e) What does it mean to be at the 5th percentile? (5 pts)

About 5% of the box weights fall below the value at the 5th percentile, and about 95% fall above it.
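The lecture guide's page 54 method is not reproduced in this document, so the following Python sketch only illustrates one common textbook locator rule and may not match the course method or JMP. It assumes the location is L = (p/100)·n, that a whole-number L means averaging the L-th and (L+1)-th sorted values, and that any other L is rounded up. The function name and the ten example weights are hypothetical.

```python
def percentile_by_location(sorted_values, p):
    """Value at the p-th percentile (integer p, 0 < p < 100) of sorted data.

    ASSUMPTION: locator rule L = (p/100) * n. If L is a whole number, average
    the L-th and (L+1)-th values; otherwise round L up to the next position.
    This is one common textbook rule, not necessarily the lecture guide's.
    """
    n = len(sorted_values)
    if n == 0 or not 0 < p < 100:
        raise ValueError("need non-empty data and 0 < p < 100")
    scaled = p * n  # integer arithmetic avoids floating-point surprises
    if scaled % 100 == 0:
        location = scaled // 100
        return (sorted_values[location - 1] + sorted_values[location]) / 2
    location = scaled // 100 + 1  # round a fractional location up
    return sorted_values[location - 1]


# Hypothetical sorted weights in oz; NOT the shipment 1 data.
example_weights = [20.1, 20.3, 20.5, 20.7, 20.8, 20.9, 21.0, 21.2, 21.4, 21.8]
print(percentile_by_location(example_weights, 25))  # L = 2.5, so the 3rd value
print(percentile_by_location(example_weights, 50))  # L = 5, so average 5th and 6th
```

Under this assumed rule, n = 105 and p = 5 give L = 5.25, which would point to the 6th sorted value.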
f) What is the z-score of the weight that you found in part d)? Recall that the machine that fills the Cheerios boxes is set to have a mean box weight of 21 oz and a standard deviation of 0.4 oz. Use these values of the mean and standard deviation to calculate the z-score. (5 pts)

z = (20.39 - 21) / 0.4 = -1.525

g) Find the 5-number summary of this data set by hand and include your work and all decimal places in your answer. Use the method on page 54 of the lecture guide to find the location of the values. (10 pts)

Minimum: 20.02
Q1: 20.66
Median: 20.88
Q3: 21.155
Maximum: 21.88

h) What value does JMP give for Q3? Include 3 decimal places in your answer. (5 pts)

21.155 oz

You will notice that JMP uses a slightly different method for calculating quantiles. You may also have learned a different method for calculating Q1 and Q3 in another statistics class. In other words, there is more than one way to calculate the quantiles or percentiles. In this course, please use the method taught in chapter 4 or the JMP output, depending on how a question is asked.

i) Harold is an employee who works in the shipping department. As he was preparing shipment 2, he felt that the boxes of cereal seemed lighter than usual. You are asked to compare a random sample of 105 boxes of cereal from shipment 2 to the sample from shipment 1. Using the Graph Builder function in JMP, draw two box plots on the same scale for shipments 1 and 2. Open Graph Builder and select both samples (select 1 sample, then hold down the shift key to select the second). Drag both samples to the x-axis. Choose the picture at the top of the graph that looks like 3 box plots. On the left, select the check box for the 5-number summary. Sometimes the 5-number summary appears on top of one of the box plots. If that is the case, you can switch the order of the box plots. To do this, select the triangle on
the left side next to the word “Variables”. This will list the two columns you are using in the box plot. Select the arrow next to one of the variables and move it up or down. This will change the order in which the box plots are displayed and make the 5-number summary more readable. Paste the resulting box plots below. (5 pts)
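The z-score in part f and the spread of shipment 1 can be checked with a few lines of Python, using only numbers reported in this activity. The 1.5 × IQR fences are an assumption here: they are the usual box plot convention for flagging outliers, not something this activity spells out.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that value lies from mean."""
    return (value - mean) / std_dev


# Part f: 5th percentile weight against the machine settings (21 oz, 0.4 oz).
z = z_score(20.39, 21.0, 0.4)
print(f"z = {z:.3f}")

# Part g: 5-number summary reported for shipment 1 (oz).
minimum, q1, median, q3, maximum = 20.02, 20.66, 20.88, 21.155, 21.88

iqr = q3 - q1
# ASSUMPTION: the conventional 1.5 * IQR rule for flagging outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
has_outliers = minimum < lower_fence or maximum > upper_fence

print(f"IQR = {iqr:.3f} oz, fences = ({lower_fence:.4f}, {upper_fence:.4f})")
print(f"Shipment 1 values outside the fences: {has_outliers}")
```

A z-score of about -1.5 means the 5th percentile box is roughly one and a half standard deviations lighter than the target weight.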
j) Using the box plots in part i), write a few sentences to compare the shape, centers and spread for the two box plots using the output provided by JMP. Be careful that you pay attention to the labels so you know which is shipment 1 and which is shipment 2. Do you think that the machine that filled the boxes for the second shipment may be malfunctioning? Why? (10 pts)

Shipment 1 is more symmetric, while shipment 2 is skewed slightly to the right. Using the median as the center, shipment 1 has a median of 20.88 oz and shipment 2 has a median of 20.17 oz. The IQR is 0.5 oz for shipment 1 and 0.49 oz for shipment 2, so the spreads are similar. I do think the machine that filled shipment 2 was malfunctioning, because the boxes in shipment 2 are lighter overall. In addition, shipment 2 has three outliers that are underfilled boxes, which shows that the machine made several large errors.

3. Off to the Movies (25 points)

In this question, we will explore a possible effect of using a self-selected sample (which is a type of convenience sample) rather than a random sample. Take a look at the list of the 200 top grossing movies of 2018. The list can be found on Canvas in the file Top 200 Movies of 2018.docx.

(a) Select a sample of 10 movies that you saw (or wanted to see) in theaters and write the titles of those 10 movies in the table below, along with the amount they grossed in 2018. This is your self-selected sample. Notice that the listing of movies gives the gross income in millions of dollars, so that a movie listed as earning $191.5
really grossed $191,500,000. The order in which you write the movies in the table below does not matter. (5 points)

Movie Title                          Gross Income (Millions)
Sicario: Day of the Soldado          50.1
Bohemian Rhapsody                    216.3
A Star is Born (2018)                215.3
Crazy Rich Asians                    174.5
Mamma Mia! Here We Go Again          120.6
Black Panther                        700.1
Beautiful Boy (2018)                 7.6
Ralph Breaks the Internet            201.1
Incredibles 2                        608.6
Spider-Man: Into The Spider-Verse    190.2

(b) Compute the average gross income for the sample of movies you selected. Show how you computed the average and include units in your answer. (5 points)

Add up the values: 2484.4. Then divide by the number of movies (10): 2484.4 / 10 = 248.44.
Average Gross Income = 248.44 million dollars ($248,440,000)

(c) Now you will select a random sample of 10 movies. Open Microsoft Excel and enter the formula =RANDBETWEEN(1,200) into 10 cells. This will generate 10 random integers between 1 and 200. Alternatively, you can go to the website Random.org to generate 10 random integers between 1 and 200. The link to the random number generator is: https://www.random.org/integers. Record the random numbers in the ID column in the table below. If you have a repeated number, generate another random integer so that there are no duplicates in the ID column. Select the 10 movies from the population of 200 movies that correspond to these ID numbers. Record the Movie Title and Gross Income of these 10 movies in the table below. (5 points)
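For readers without Excel, the draw in part (c) can also be sketched in Python. This is an alternative to the RANDBETWEEN approach, not part of the assignment: `random.sample` draws without replacement, so duplicate IDs cannot occur and no redraw step is needed. The function name `draw_movie_ids` and the seed value are hypothetical.

```python
import random


def draw_movie_ids(population_size=200, sample_size=10, seed=None):
    """Draw distinct ID numbers from 1..population_size, without replacement."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return sorted(rng.sample(range(1, population_size + 1), sample_size))


# Hypothetical seed, chosen only so that rerunning gives the same IDs.
ids = draw_movie_ids(seed=2018)
print(ids)
```

Each ID would then be matched to the corresponding row of the list of 200 movies.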
ID     Movie Title                      Gross Income (Millions)
70     Uncle Drew                       42.5
105    Show Dogs                        17.9
91     Slender Man                      30.6
52     Tag                              54.7
62     Breaking In (2018)               46.8
108    Paul, Apostle of Christ          17.6
121    The Darkest Minds                12.7
35     Disney’s Christopher Robin       99.2
77     The 15:17 to Paris               36.3
190    Bharat Ane Nenu                  2.7

(d) Compute the average gross income for the random sample of movies. Show how you computed the average and include units in your answer. (5 points)

Add up the values: 361. Then divide by the number of movies (10): 361 / 10 = 36.1.
Average Gross Income = 36.1 million dollars ($36,100,000)

(e) We often use statistics calculated from samples to estimate the true parameter value of a population. In this case we are considering the average gross income for the population of 200 movies, which is $55.1 million.

i. Which of your samples came closest to the true average? (2 points)

My random sample (sample #2) came closest to the true average. Its average gross income was 36.1 million dollars, which is much closer to 55.1 million than the 248.44 million dollars from the self-selected sample.

ii. Regardless of your answer in part i, when using the self-selected sample, are you likely to over-estimate or under-estimate the average gross? Why? (3 points)

I am likely to over-estimate the average gross income, because a self-selected sample leaves out the smaller, lower-grossing movies. This is a form of selection bias: most of the movies I chose for my self-selected sample were among the highest-grossing ones, so they pull the sample average well above the population average.
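The two sample averages and their distance from the population average can be verified with a short Python sketch. All numbers come from the tables and from part (e) above; only the variable names are introduced here.

```python
from statistics import mean

# Gross income in millions of dollars, copied from the two tables.
self_selected = [50.1, 216.3, 215.3, 174.5, 120.6,
                 700.1, 7.6, 201.1, 608.6, 190.2]
random_sample = [42.5, 17.9, 30.6, 54.7, 46.8,
                 17.6, 12.7, 99.2, 36.3, 2.7]

POPULATION_MEAN = 55.1  # average gross of all 200 movies, from part (e)

self_mean = mean(self_selected)
random_mean = mean(random_sample)

for label, sample_mean in (("Self-selected", self_mean),
                           ("Random", random_mean)):
    error = sample_mean - POPULATION_MEAN
    print(f"{label}: mean = {sample_mean:.2f} million, "
          f"error = {error:+.2f} million")
```

The self-selected sample overshoots the population average by far more than the random sample undershoots it, which is the pattern part (e) asks about.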