HW 6

School

Rose-Hulman Institute Of Technology

Course

445

Subject

Statistics

Date

Feb 20, 2024

Uploaded by Deerslayer0302

Homework Set 6

Names or Team Name: Team ChemE

EMGT 446, Six Sigma, Winter 22-23, 10 points; each problem is worth +0.5, unless stated otherwise.

Topic: MSA terminology, Hypothesis Testing, Confidence Interval Interpretation, Analysis of Baseline Operation Times

Instructions: Insert your answers directly into this Word document. You can submit one homework set per team. I’d like to see if you prefer the Instructions/Answers style or just one document for both. Resize the graphics so that they don’t take up an entire page. You can delete the intro below to save space.

1-4. On Homework Set 2, I asked you to identify the correct version of the six pop culture icons shown below. I used a check sheet to keep track of the number of students who got zero correct, one correct, two correct, …, six correct. The number of correct logo identifications out of six possible groups of logos is shown below for 37 students. No one missed all of them or got all of them correct. The mode, the most frequently occurring number of correct logo identifications, is 3.

Number of Correct Pop Icon Identifications from the article about the Mandela Effect

0:
1: 11111 (5)
2: 1111 (4)
3: 11111111111111 (14)
4: 11111111111 (11)
5: 111 (3)
6:

1. On average, how many correct logo identifications were there per student? Since the answer is an average, report the value correct to three decimal places.

Answer: 3.081

2. Determine the median of correct logo identifications.

Answer: 3
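The answers to Problems 1 and 2 can be checked directly from the check-sheet tally. The following is a minimal sketch using only the Python standard library; the variable names are illustrative and not part of the assignment.

```python
from statistics import mean, median

# Tally from the check sheet: number correct -> number of students.
tally = {0: 0, 1: 5, 2: 4, 3: 14, 4: 11, 5: 3, 6: 0}

# Expand the tally into one score per student.
scores = [x for x, count in tally.items() for _ in range(count)]

n_students = len(scores)           # should match the 37 students in the text
avg = round(mean(scores), 3)       # Problem 1: mean to three decimal places
med = median(scores)               # Problem 2: median
mode = max(tally, key=tally.get)   # most frequent number correct

print(n_students, avg, med, mode)
```

With 37 scores the median is the 19th value in sorted order, which falls inside the block of fourteen 3s.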
3. The number of correct logo identifications out of the n = 6 groups of logos (Curious George, Monopoly Man, …, c3po) is a discrete random variable: a person can only get x = 0, 1, …, 6 logos correct. If a person is just guessing which of three similar logos is the correct one, the probability that they will successfully pick it is 1/3. Let the discrete random variable X represent the number of correct logo identifications for the six groups of logos. We call X a binomial random variable; either the student guesses a logo correctly or not. The probability of correctly identifying x icons out of the six groups of logos is:

$p(x) = \binom{6}{x} \left(\frac{1}{3}\right)^{x} \left(\frac{2}{3}\right)^{6-x}$ for $x = 0, 1, 2, \ldots, 6$.

For example, the probability that a student would correctly identify two logos out of the group of six is:

$p(2) = \binom{6}{2} \left(\frac{1}{3}\right)^{2} \left(\frac{2}{3}\right)^{6-2} = \frac{6!}{4!\,2!} \left(\frac{1}{9}\right) \left(\frac{16}{81}\right) \approx 0.329$

What is the probability that a student correctly identifies four logos out of the group of six? Show your work. Provide your answer correct to three decimal places.

Answer: $p(4) = \binom{6}{4} \left(\frac{1}{3}\right)^{4} \left(\frac{2}{3}\right)^{6-4} = \frac{6!}{2!\,4!} \left(\frac{1}{81}\right) \left(\frac{4}{9}\right) \approx 0.082$

4. According to the binomial probability mass function from Problem 3, if a student was purely guessing to determine the correct logo out of the six groups, the approximate proportions of students getting x = 0, 1, …, 6 correct are:

x            0      1      2      3      4      5      6
Pr(X = x)    0.088  0.263  0.329  0.219  0.082  0.016  0.001

In Homework Set 2, we obtained the number of correct matches out of the six groups for 37 students.
If students were guessing, then we’d expect to see the following number of students getting x correct logo identifications:

x                                                          0      1      2       3      4      5      6
Expected values if guessing, E(X) = 37 · Pr(X = x)         3.256  9.731  12.173  8.103  3.034  0.592  0.037
Actual values from HW Set 2                                0      5      4       14     11     3      0

Using the above information, we can perform a chi-squared goodness-of-fit test to determine if the number of correct logo identifications matches that of a binomial distribution with probability of success 1/3.

$H_0$: The students were guessing to obtain the correct logo identifications out of the 6 groups. In other words, the observed data matches the expected data that we’d obtain if the students were guessing.

$H_a$: The students were not purely guessing to obtain the correct logo identifications out of the 6 groups. In other words, the observed data doesn’t match the expected data that we’d obtain if the students were guessing; the students had some knowledge about the correct logos and didn’t guess at them all.
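The test set up above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Minitab analysis the course uses: the significance level of 0.05 is assumed (the text does not state one), the expected counts use unrounded probabilities, so they differ slightly from the table (for example 3.248 rather than 3.256 for x = 0), and several expected counts are below 5, which strains the usual rule of thumb for the chi-squared approximation.

```python
from math import comb

N_LOGOS = 6        # groups of logos per student
P_GUESS = 1 / 3    # chance of guessing one group correctly
N_STUDENTS = 37

def binom_pmf(x, n=N_LOGOS, p=P_GUESS):
    """Probability of exactly x successes in n independent guesses."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Observed counts for x = 0, 1, ..., 6 from Homework Set 2.
observed = [0, 5, 4, 14, 11, 3, 0]
probs = [binom_pmf(x) for x in range(N_LOGOS + 1)]
expected = [N_STUDENTS * p for p in probs]

# Pearson chi-squared statistic: sum of (O - E)^2 / E over the 7 categories.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
dof = len(observed) - 1  # no parameters were estimated from the data

# Upper 5% point of the chi-squared distribution with 6 degrees of freedom.
# The 0.05 level is an assumption made for this sketch.
CRITICAL_05_DF6 = 12.592

print([round(p, 3) for p in probs])
print(round(chi_sq, 2), chi_sq > CRITICAL_05_DF6)
```

The x = 4 cell alone (11 observed against roughly 3 expected) contributes more than the critical value, so the statistic lands in the rejection region under these assumptions.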
Conclusion: Does the evidence suggest that the students were able to determine some of their correct logo identifications by knowing them rather than just guessing?

Yes / No

5-6. For the confidence interval bonus last week, I asked students to determine ten 90% confidence intervals for some unknown mean μ. I had made a column of random numbers from a normal distribution with mean μ = 123.456 and population standard deviation 1. By the end of the exercise, we had 200 confidence intervals in which 17 (~8.5%) of them did not contain μ = 123.456 and 183 (~91.5%) did. Click the link below to see the 200 intervals.

https://docs.google.com/document/d/1LhXUPywpTjSov-Y82nAFCyUalNOJVmoLzCkRW9t3F3A/edit

5. True or False. If I have each person in our class of 46 students build one hundred 90% confidence intervals using the same data that was used for the bonus problem (normal with mean μ = 123.456 and σ = 1), then on average, I’d expect approximately 4140 of them to contain μ = 123.456 and approximately 460 of them to not contain μ.

True / False

6. Suppose instead that I have each of our 46 students build one hundred 95% confidence intervals using that same data that was used for the bonus problem. On average, approximately how many intervals will contain μ = 123.456?

Answer: 4370

7. True or False. The greater the confidence we build into a confidence interval (e.g., go from building 90% CI’s to 95% CI’s), the tighter it becomes, thus being less likely to contain the true mean μ.

True / False

8. If we use a 1-sample t test to construct a non-100% confidence interval (e.g., 90%, 95%, 99%) for the true population mean μ, can we ever guarantee that it will contain μ? Note: I’m assuming we randomly select the data from the population of interest and the sample size n is greater than 1.

Yes / No

Baseline Data Collection Exercise: Time to remove your patient’s ailments.

You’ll be collecting data for your patient’s operations in the Google spreadsheet Winter 21-22 Operation Teams & Times.
The spreadsheet has two tabs at the bottom of the page. To understand how teams collect their data, watch the video: Operation: Data Collection VIDEO. Record your patient’s surgery times in the “Baseline Surgery Times” tab. L14&15: Operation Instructions contain operational definitions for the Key Process Input Variables (KPIV, x’s) and Key Process Output Variables (KPOV, Y’s). KPIV columns (the x’s) have green shading while the KPOV columns have reddish shading.
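Returning to the confidence interval counts in Problems 5 and 6: the expected number of intervals containing μ is the confidence level times the 46 × 100 = 4600 intervals built. The sketch below computes those counts and then simulates the coverage as a sanity check. The sample size of 10 per interval and the use of a z-interval with known σ = 1 are assumptions made for illustration; the text does not say how each interval was built.

```python
import random
from statistics import NormalDist

MU, SIGMA = 123.456, 1.0      # population values stated in the exercise
N_STUDENTS, PER_STUDENT = 46, 100
SAMPLE_SIZE = 10              # assumption: not given in the text

total = N_STUDENTS * PER_STUDENT
expected_90 = round(0.90 * total)   # expected intervals containing mu at 90%
expected_95 = round(0.95 * total)   # expected intervals containing mu at 95%

def coverage_count(conf_level, rng):
    """Simulate `total` z-intervals (known sigma) and count those containing MU."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    half_width = z * SIGMA / SAMPLE_SIZE ** 0.5
    hits = 0
    for _ in range(total):
        xbar = sum(rng.gauss(MU, SIGMA) for _ in range(SAMPLE_SIZE)) / SAMPLE_SIZE
        # The interval xbar +/- half_width contains MU exactly when this holds.
        if abs(xbar - MU) <= half_width:
            hits += 1
    return hits

rng = random.Random(2024)  # fixed seed so the run is repeatable
print(expected_90, total - expected_90, coverage_count(0.90, rng))
print(expected_95, total - expected_95, coverage_count(0.95, rng))
```

The simulated counts vary from run to run around the expected values, in the same way that the class saw 183 of 200 intervals (91.5%) contain μ rather than exactly 180.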
9-10. [+1] On the previous page, you can view the Key Process Input Variables that I chose for your operations. Briefly explain another realistic KPIV (x) that could be used in each of the following categories, e.g., did at least one team member have an 8 am class? YES or NO.

[+0.25] Binary: At least one team member wears glasses. Y or N
[+0.25] Ordinal: At least one team member is a junior at Rose. Y or N
[+0.25] Nominal: At least one team member is male. Y or N
[+0.25] Variable (measurement): Minimum food consumed the day of surgery. (calories)

11. Of the KPIV that you listed above, do you think any of them could realistically affect Y3 = Surgery Time (total time from start to finish without stopping your timer)? Briefly explain how one could affect Y3 or why none of them would affect Y3.

Answer: The amount of sleep could affect the time it takes to do surgery because if a person gets little to no sleep, then it will be harder for them to focus.

12. [+0] After completing your first round of surgeries, please specify at least one KPIV, KPOV, operational definition, procedure, or special case scenario that was not defined clearly enough for your team. I’m guessing there is something that every team had to decide on the spot since I wasn’t clear enough in some spots. What was some aspect of the surgery process that wasn’t clear? For example, I meant for you to start the timer when the surgeries began and to not stop it until the last surgery was complete. Did your team think you were supposed to stop the clock between surgeries? I meant to say, “round your sleep and coffee consumption to the nearest quarter hour.” I don’t think I said that anywhere, so the values for sleep and coffee that I’ve seen so far are rounded to the nearest hour or ounce, respectively.

Answer: Our team thought we were supposed to stop the timer in between each surgery so no extra time would be added.
+5.5

Problems 13-16. In Winter 19-20, I had two team members, Buzz and Naveen, record the removal times per ailment (seconds) for their surgeon, Ethan. Below is a snapshot of the data collected by the team Wookies for the surgeon, Ethan, and the two appraisers (timekeepers) Buzz and Naveen. Below are the organ removal times recorded by each appraiser for Ethan’s surgeries.

13. [+0] Are the removal times attribute (categorical) or variable (measurement) data?

Attribute / Variable

14. [+0] Would we use a Gage R&R or an Attribute Agreement Analysis when performing a measurement system analysis (MSA) between the appraisers, Buzz & Naveen, for Ethan’s operation times?

Gage R&R / Attribute Agreement Analysis

Whether correct or not, we’ll move forward assuming Gage R&R is the appropriate tool for comparing times.

15. By looking at the spreadsheet of data above, we can see that there is variation in Ethan’s times to remove organs. Write down the name of the part creating the largest percentage of part-to-part variation in removal times.

Answer: Sandy ball bearing

16. Imagine that you’re playing Operation with any of the characters from page 3, such as Cavity Sam (shown below). Provide one possible (realistic) reason for part-to-part variation in removal times.

Answer: The cavity for a smaller part could be much narrower than the cavity of a bigger part, which would make the times different since the narrower cavity is more difficult to conduct surgery on.

17. According to the definitions of Repeatability and Reproducibility, which one is being measured when the appraisers Buzz and Naveen are comparing their times? Note that each part is removed once by Ethan and the time is recorded by both appraisers.

Repeatability / Reproducibility

18. [+0] Which ailment is creating the largest percentage of “Rep … bility” (the answer you selected in Problem 17) variation in the appraisers’ data?

Answer: Memory chip

Figure. Star Wars robot BB-8.
+7

19. For another BB-8 surgery team, the two appraisers’ times for their surgeon removing ailments are noticeably different. For example,

Ailment              Surgeon          Appraiser 1: Luke   Appraiser 2: Princess Leia
Memory Chip          Obi-Wan Kenobi   10.45 secs          15.01 secs
Sprung Screw         Obi-Wan Kenobi   2.99 secs           4.57 secs
Cranky Crank Shaft   Obi-Wan Kenobi   8.51 secs           12.92 secs

There is a problem with their measurement system, and they should NOT move on to the data analysis phase of the project. What would you recommend that they do at this point with so much measurement error?

Answer: They need to strictly define when the timer needs to start and when it needs to end.

20-2x. Does the Wookies data suggest that the two appraisers, Buzz and Naveen, are interchangeable? So, for example, if Buzz is the only person available to track Ethan’s surgery times, can we assume his time is about the same as Naveen’s, statistically speaking? We’ll answer this question using hypothesis testing.

Answer: Yes

20. [+0] We have two columns of timekeeper data: one set of times from Buzz and the other from Naveen. We want to compare them to see if they are statistically the same. Why would running a 2-sample t-test be the incorrect method for doing this analysis? Hint: The two columns are NOT …

Answer: The two columns are not independent samples from different processes; both appraisers timed the same removals.

21. The two columns of surgery times, Buzz’s and Naveen’s, are paired by what? What “connects” the two times? Careful: ALL times are associated with the surgeon Ethan, but he is not the thing that pairs them.

The surgeon / The timekeepers / The parts / The instructor / The virtual classroom / My mom

+8
22. Use Minitab’s Calc > Calculator to make a column of Time Differences of Buzz’s and Naveen’s times. You can subtract the columns in either order; it doesn’t matter. Statistically speaking, do the differences appear to be from a normal distribution? Briefly explain your answer, which may be a Minitab graphic or appropriate p-value. As a reminder, the null and alternative hypotheses for a normality test are:

$H_0$: Data is from a normal distribution vs $H_a$: Data is not from a normal distribution

Answer: The p-value is high at 0.198, so we fail to reject the null hypothesis; the differences are consistent with a normal distribution.

23. We are trying to determine if the times are statistically the same for each timekeeper. We now have a column of differences. The data suggests that their times are the same if their mean difference is ____ (value)? Fill in the appropriate value (the blank space below) in the hypothesis test, where diff represents differences.

$H_0: \mu_{\text{diff}} = 0$ versus $H_a: \mu_{\text{diff}} \neq 0$

24. [+0] Perform the hypothesis test that you wrote down in Problem 23 in Minitab. Copy and paste the Minitab output below, including the test’s p-value. Make sure you perform the correct hypothesis test. Recall: You have n = 12 differences, you determined the differences were normal / not normal, and whether you decided this earlier or not, just assume the differences are IID.
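The test in Problems 23 and 24 is a paired t-test, which is a one-sample t-test on the column of differences. The sketch below shows the arithmetic outside Minitab. The times are HYPOTHETICAL placeholders, since the Wookies spreadsheet is not reproduced in this document, so the printed numbers will not match the team's Minitab output. Only the sample size of 12 and the hypotheses come from the text.

```python
from math import sqrt
from statistics import mean, stdev

# HYPOTHETICAL removal times in seconds for 12 parts; NOT the Wookies data.
buzz   = [3.12, 5.40, 2.87, 7.95, 4.33, 6.10, 2.25, 9.01, 3.78, 5.66, 4.90, 8.12]
naveen = [3.30, 5.21, 2.95, 7.80, 4.60, 6.02, 2.41, 8.75, 3.90, 5.70, 4.71, 8.30]

# Pair the two appraisers' times by part and take the difference for each part.
diffs = [b - n for b, n in zip(buzz, naveen)]

n = len(diffs)
d_bar = mean(diffs)                # sample mean of the differences
s_d = stdev(diffs)                 # sample SD (n - 1 in the denominator)
t_stat = d_bar / (s_d / sqrt(n))   # test statistic for H0: mu_diff = 0

# Two-sided critical value for alpha = 0.05 with n - 1 = 11 degrees of freedom.
T_CRIT_975_DF11 = 2.201
reject_h0 = abs(t_stat) > T_CRIT_975_DF11

print(round(d_bar, 3), round(s_d, 3), round(t_stat, 3), reject_h0)
```

Comparing |t| with the critical value is equivalent to comparing the two-sided p-value with α = 0.05, which is how Problem 25 phrases the decision.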
25. [+0] At significance level α = 0.05, does the data suggest that the timekeepers’ removal times are statistically the same? Why or why not? Briefly explain your answer. [Hint: p-value]

Answer: The data suggests that the times are statistically the same because the p-value is 0.994, which is greater than α = 0.05, so we fail to reject $H_0$.

+9

26. To further verify your conjecture, go to StatKey and enter the column of differences for a “CI for a single mean.” Make sure you have a header row, i.e., title. Bootstrap the differences at least 5000 times. Construct a 95% confidence interval for $\mu_{\text{diff}}$. Copy and paste your bootstrap confidence interval below.

Does the confidence interval contain 0? Yes / No

To agree with your result in Problem 25, should it contain 0? Yes / No

27. Is the standard deviation of the differences more than 0.33 seconds? Run the appropriate hypothesis test in Minitab and/or StatKey and copy and paste the results below, including the p-value or 95% confidence interval. Does the data suggest that σ is larger than 0.33?
Answer: Yes, the data suggests that the standard deviation is 0.357, which is higher than 0.33.

Did you prefer one document with both the instructions and answers on it or do you like having two different documents? It’s easier to grade having just the answers, but is it easier to do with all the info in one place?

+10

Answer: It was slightly quicker to have instructions and answers all on one sheet, but it was very wordy, so our team likes two separate documents for instructions and answers.
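For Problems 26 and 27, the following sketch shows one way to run both checks outside StatKey and Minitab. It rests on several assumptions: that 0.357 is the sample standard deviation of the n = 12 differences, that the differences are normal (needed for the chi-squared test on a variance), that α = 0.05, and that the list `diffs` is a HYPOTHETICAL stand-in for the team's column of differences, which is not reproduced here. Under those assumptions the test statistic for Problem 27 is about 12.87, below the one-sided critical value of 19.675, so a sample value above 0.33 would not by itself be significant evidence that σ exceeds 0.33.

```python
import random

# --- Problem 27: chi-squared test for one standard deviation ---
N = 12            # number of differences (stated in Problem 24)
S = 0.357         # assumed to be the sample SD quoted in the answer
SIGMA_0 = 0.33    # hypothesized standard deviation

# Test statistic for H0: sigma = 0.33 versus Ha: sigma > 0.33.
chi_sq = (N - 1) * S**2 / SIGMA_0**2
CHI_CRIT_95_DF11 = 19.675   # upper 5% point, 11 degrees of freedom
sd_significantly_larger = chi_sq > CHI_CRIT_95_DF11

# --- Problem 26: percentile bootstrap CI for the mean difference ---
# HYPOTHETICAL differences in seconds; NOT the Wookies data.
diffs = [-0.18, 0.19, -0.08, 0.15, -0.27, 0.08,
         -0.16, 0.26, -0.12, -0.04, 0.19, -0.18]

def bootstrap_ci(data, reps=5000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of `data`."""
    rng = random.Random(seed)
    k = len(data)
    # Resample with replacement and record the mean of each resample.
    means = sorted(sum(rng.choices(data, k=k)) / k for _ in range(reps))
    lo_idx = round(reps * (1 - level) / 2)
    hi_idx = round(reps * (1 + level) / 2) - 1
    return means[lo_idx], means[hi_idx]

lo, hi = bootstrap_ci(diffs)
print(round(chi_sq, 2), sd_significantly_larger)
print(round(lo, 3), round(hi, 3), lo <= 0 <= hi)
```

A bootstrap interval that contains 0 agrees with failing to reject $H_0: \mu_{\text{diff}} = 0$ in Problem 25.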