MNET 315 Ch 11 Text Nonparametric Tests (Missing from book)

pdf

School

New Jersey Institute Of Technology

Course

315

Subject

Statistics

Date

Jan 9, 2024

Type

pdf

Pages

53

Uploaded by LieutenantKookabura2603

CHAPTER 11
Nonparametric Tests

11.1 The Sign Test
11.2 The Wilcoxon Tests
     Case Study
11.3 The Kruskal-Wallis Test
11.4 Rank Correlation
11.5 The Runs Test
     Uses and Abuses
     Real Statistics–Real Decisions
     Technology

In a recent year, the most common form of reported identity theft was employment- or tax-related fraud, which accounted for 34% of cases. The second most common form was credit card fraud, which accounted for 33% of cases.
Where You’ve Been

Up to this point in the text, you have studied dozens of different statistical formulas and tests that can help you in a decision-making process. Specific conditions had to be satisfied in order to use these formulas and tests.

Suppose it is believed that as the number of fraud complaints in a state increases, the number of identity theft victims also increases. Can this belief be supported by actual data? The table below shows the numbers of fraud complaints and the numbers of identity theft victims for 25 randomly selected states in a recent year. (Source: Federal Trade Commission)

Fraud complaints         39,344  45,528  33,745  21,117    7593  117,189    5768    7800  14,635
Identity theft victims     4007    8748    6203    4933    1484   12,787     789    1348    2532

Fraud complaints           5642  48,594  107,557   4600  25,636     7525  112,006  77,213
Identity theft victims     1170    8251   17,430    711    3993     1352   20,205  11,009

Fraud complaints         20,350  22,385    7206    2775  51,036   12,750  40,423    9948
Identity theft victims     3337    4312    1216     503    5718     2540    8310    1093

[Figure: scatter plot titled “Number of Fraud Complaints and Identity Theft Victims for 25 States.” Horizontal axis x: fraud complaints (20,000 to 120,000). Vertical axis y: identity theft victims (5,000 to 25,000).]

Where You’re Going

In this chapter, you will study additional statistical tests that do not require the population distribution to meet any specific conditions. Each of these tests is useful in real-life applications.

With the data above, the number of fraud complaints \(F\) and the number of identity theft victims \(V\) can be related by the regression equation \(V = 0.145F + 429.103\). The correlation coefficient is approximately 0.915, so there is a strong positive correlation. You can determine that the correlation is significant by using Table 11 in Appendix B. Further analysis of the data, however, can show that the variables do not appear to have a bivariate normal distribution, which is one of the requirements for using the Pearson correlation coefficient.

So, although a simple correlation test might indicate a relationship between the number of fraud complaints and the number of identity theft victims, one might question the results because the data do not fit the requirements for the test. Similar tests you will study in this chapter, such as Spearman’s rank correlation test, will give you additional information. The Spearman’s rank correlation coefficient for these data is approximately 0.962. At \(\alpha = 0.01\), there is in fact a significant correlation between the number of fraud complaints and the number of identity theft victims for each state.
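The two correlation coefficients quoted for the 25-state data can be checked with a short script. The sketch below is only an illustration: it transcribes the table values, computes the Pearson coefficient directly, and computes Spearman's coefficient as the Pearson coefficient of the ranks, which is the standard definition rather than the Section 11.4 procedure (not shown in this excerpt). The helper names `pearson` and `simple_ranks` are hypothetical, and the ranking helper assumes there are no tied values, which holds for these data.

```python
from math import sqrt

# Data transcribed from the 25-state table in the chapter opener.
fraud = [39344, 45528, 33745, 21117, 7593, 117189, 5768, 7800, 14635,
         5642, 48594, 107557, 4600, 25636, 7525, 112006, 77213,
         20350, 22385, 7206, 2775, 51036, 12750, 40423, 9948]
victims = [4007, 8748, 6203, 4933, 1484, 12787, 789, 1348, 2532,
           1170, 8251, 17430, 711, 3993, 1352, 20205, 11009,
           3337, 4312, 1216, 503, 5718, 2540, 8310, 1093]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def simple_ranks(values):
    """Rank values from 1 (smallest) to n (largest); assumes no ties."""
    position = {v: i + 1 for i, v in enumerate(sorted(values))}
    return [position[v] for v in values]


r = pearson(fraud, victims)
# Spearman's coefficient: Pearson correlation applied to the ranks.
r_s = pearson(simple_ranks(fraud), simple_ranks(victims))
print(f"Pearson r = {r:.3f}, Spearman r_s = {r_s:.3f}")
```

Working on ranks is what frees the second coefficient from the bivariate normality requirement: only the ordering of the values matters, not their spacing.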
11.1 The Sign Test

What You Should Learn
  • How to use the sign test to test a population median
  • How to use the paired-sample sign test to test the difference between two population medians (dependent samples)

The Sign Test for a Population Median | The Paired-Sample Sign Test

The Sign Test for a Population Median

Many of the hypothesis tests studied so far have imposed one or more requirements for a population distribution. For instance, some tests require that a population must have a normal distribution, and other tests require that population variances be equal. What should you do when such requirements cannot be met? For these cases, statisticians have developed hypothesis tests that are “distribution free.” Such tests are called nonparametric tests.

DEFINITION
A nonparametric test is a hypothesis test that does not require any specific conditions concerning the shapes of population distributions or the values of population parameters.

Nonparametric tests are usually easier to perform than corresponding parametric tests. They are, however, usually less efficient than parametric tests. Stronger evidence is required to reject a null hypothesis using the results of a nonparametric test. Consequently, whenever possible, you should use a parametric test.

One of the easiest nonparametric tests to perform is the sign test. The only condition necessary to use a sign test is that the sample is randomly selected.

DEFINITION
The sign test is a nonparametric test that can be used to test a population median against a hypothesized value \(k\).

The sign test for a population median can be left-tailed, right-tailed, or two-tailed. The null and alternative hypotheses for each type of test are shown below.
Left-tailed test: \(H_0\): median \(\ge k\) and \(H_a\): median \(< k\)
Right-tailed test: \(H_0\): median \(\le k\) and \(H_a\): median \(> k\)
Two-tailed test: \(H_0\): median \(= k\) and \(H_a\): median \(\ne k\)

To use the sign test, first compare each entry in the sample with the hypothesized median \(k\). When the entry is below the median, assign it a - sign; when the entry is above the median, assign it a + sign; and when the entry is equal to the median, assign it a 0. Then compare the number of + and - signs. (The 0’s are ignored.) When there is a large difference between the number of + signs and the number of - signs, it is likely that the median is different from the hypothesized value and you should reject the null hypothesis.

Study Tip
For many nonparametric tests, statisticians test the median instead of the mean.
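The sign-assignment step described above can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed procedure: the function name `sign_counts` is hypothetical, and the five sample values are used only to show the mechanics of counting + signs, - signs, and 0's against a hypothesized median of 1500.

```python
def sign_counts(data, k):
    """Count entries above k (+ signs), below k (- signs), and equal to k (0's)."""
    plus = sum(1 for value in data if value > k)
    minus = sum(1 for value in data if value < k)
    zeros = sum(1 for value in data if value == k)
    return plus, minus, zeros


# Illustrative values compared with a hypothesized median k = 1500.
sample = [1469, 1462, 1634, 1602, 1500]
plus, minus, zeros = sign_counts(sample, 1500)

n = plus + minus        # the 0's are ignored when determining the sample size
x = min(plus, minus)    # smaller number of + or - signs
print(plus, minus, zeros, n, x)
```

Keeping the three counts separate makes it explicit that entries equal to the hypothesized median are dropped before anything else is computed.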
Table 8 in Appendix B lists the critical values for the sign test for selected levels of significance and sample sizes. When the sign test is used, the sample size \(n\) is the total number of + and - signs. When the sample size is greater than 25, you can use the standard normal distribution to find the critical values.

Test Statistic for the Sign Test
When \(n \le 25\), the test statistic for the sign test is \(x\), the smaller number of + or - signs.
When \(n > 25\), the test statistic for the sign test is

\[ z = \frac{(x + 0.5) - 0.5n}{\sqrt{n}/2} \]

where \(x\) is the smaller number of + or - signs and \(n\) is the sample size, i.e., the total number of + and - signs.

Because \(x\) is defined to be the smaller number of + or - signs, the rejection region is always in the left tail. Consequently, the sign test for a population median is always a left-tailed test or a two-tailed test. When the test is two-tailed, use only the left-tailed critical value. (When \(x\) is defined to be the larger number of + or - signs, the rejection region is always in the right tail. Right-tailed sign tests are presented in the exercises.)

GUIDELINES
Performing a Sign Test for a Population Median

1. In words: Verify that the sample is random.
2. In words: Identify the claim. State the null and alternative hypotheses.
   In symbols: State \(H_0\) and \(H_a\).
3. In words: Specify the level of significance.
   In symbols: Identify \(\alpha\).
4. In words: Determine the sample size \(n\) by assigning + signs, - signs, and 0’s to the sample data.
   In symbols: \(n\) = total number of + and - signs
5. In words: Determine the critical value.
   In symbols: When \(n \le 25\), use Table 8 in Appendix B. When \(n > 25\), use Table 4 in Appendix B.
6. In words: Find the test statistic.
   In symbols: When \(n \le 25\), use \(x\) = smaller number of + or - signs. When \(n > 25\), use \(z = \dfrac{(x + 0.5) - 0.5n}{\sqrt{n}/2}\).
7. In words: Make a decision to reject or fail to reject the null hypothesis.
   In symbols: If the test statistic is less than or equal to the critical value, then reject \(H_0\). Otherwise, fail to reject \(H_0\).
8. In words: Interpret the decision in the context of the original claim.

Study Tip
Because the 0’s are ignored, there are two possible outcomes when comparing a data entry with a hypothesized median: a + or a - sign. If the median is \(k\), then about half of the values will be above \(k\) and half will be below. As such, the probability for each sign is 0.5. Table 8 in Appendix B is constructed using the binomial distribution where \(p = 0.5\). When \(n > 25\), you can use the normal approximation (with a continuity correction) for the binomial. In this case, use \(\mu = np = 0.5n\) and \(\sigma = \sqrt{npq} = \sqrt{n}/2\).
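The large-sample test statistic follows from the binomial model in the Study Tip. As a short worked derivation, assuming only what the tip states (each sign has probability \(p = 0.5\), so \(q = 1 - p = 0.5\), and a continuity correction of 0.5 is added to \(x\) for the left tail):

```latex
\begin{align*}
\mu    &= np = 0.5n, \\
\sigma &= \sqrt{npq} = \sqrt{n(0.5)(0.5)} = \frac{\sqrt{n}}{2}, \\
z      &= \frac{(x + 0.5) - \mu}{\sigma}
        = \frac{(x + 0.5) - 0.5n}{\sqrt{n}/2}.
\end{align*}
```

The correction moves \(x\) half a unit toward the mean, so the normal curve approximates the binomial probability of observing \(x\) or fewer signs.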
EXAMPLE 1
Using the Sign Test

A website administrator for a company claims that the median number of visitors per day to the company’s website is no more than 1500. An employee doubts the accuracy of this claim. The numbers of visitors per day for 20 randomly selected days are listed below. At \(\alpha = 0.05\), can the employee reject the administrator’s claim?

1469 1462 1634 1602 1500
1463 1476 1570 1544 1452
1487 1523 1525 1548 1511
1579 1620 1568 1492 1649

SOLUTION
The claim is “the median number of visitors per day to the company’s website is no more than 1500.” So, the null and alternative hypotheses are

\(H_0\): median \(\le 1500\) (Claim) and \(H_a\): median \(> 1500\).

To compare each data entry with the hypothesized median 1500, subtract 1500 from each data entry and assign the appropriate sign or 0. For instance, here are the comparisons for the first row of data entries.

1469 - 1500 = -31, assign a - sign
1462 - 1500 = -38, assign a - sign
1634 - 1500 = +134, assign a + sign
1602 - 1500 = +102, assign a + sign
1500 - 1500 = 0, assign a 0

The results of comparing each data entry with the hypothesized median 1500 are shown.

- - + + 0
- - + + -
- + + + +
+ + + - +

You can see that there are 7 - signs and 12 + signs. So, \(n = 12 + 7 = 19\). Because \(n \le 25\), use Table 8 in Appendix B to find the critical value. The test is a one-tailed test with \(\alpha = 0.05\) and \(n = 19\). So, the critical value is 5. Because \(n \le 25\), the test statistic \(x\) is the smaller number of + or - signs. So, \(x = 7\). Because \(x = 7\) is greater than the critical value, the employee should fail to reject the null hypothesis.

Interpretation
There is not enough evidence at the 5% level of significance for the employee to reject the website administrator’s claim that the median number of visitors per day to the company’s website is no more than 1500.

TRY IT YOURSELF 1
A real estate agency claims that the median number of days a home is on the market in its city is greater than 120. A homeowner wants to verify the accuracy of this claim. The numbers of days on the market for 24 randomly selected homes are shown below. At \(\alpha = 0.025\), can the homeowner support the agency’s claim?

118 167 72 79 76 106 102 113 73 119 162 114
120 93 135 147 77 157 115 88 152 70 65 91

Answer: Page T1
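The steps of Example 1 can be reproduced in code. Appendix B is not part of this excerpt, so the sketch below rebuilds the critical value from the binomial distribution with p = 0.5 instead of looking it up. It assumes that Table 8 lists the largest c for which P(X ≤ c) ≤ α in one tail, an assumption that does reproduce the critical value of 5 quoted in the example. The function name `sign_test_critical_value` is hypothetical.

```python
from math import comb


def sign_test_critical_value(n, alpha):
    """Largest c with P(X <= c) <= alpha for X ~ Binomial(n, 0.5), or None.

    Assumption: this mirrors how a one-tailed sign-test table is built.
    For a two-tailed test, pass alpha / 2.
    """
    cumulative = 0.0
    critical = None
    for c in range(n + 1):
        cumulative += comb(n, c) / 2 ** n
        if cumulative <= alpha:
            critical = c
        else:
            break
    return critical


# Visitors per day from Example 1, hypothesized median 1500.
visitors = [1469, 1462, 1634, 1602, 1500, 1463, 1476, 1570, 1544, 1452,
            1487, 1523, 1525, 1548, 1511, 1579, 1620, 1568, 1492, 1649]
plus = sum(1 for v in visitors if v > 1500)
minus = sum(1 for v in visitors if v < 1500)

n = plus + minus                      # entries equal to 1500 are ignored
x = min(plus, minus)                  # test statistic for n <= 25
critical = sign_test_critical_value(n, 0.05)
reject = x <= critical                # reject H0 only if x <= critical value
print(n, x, critical, reject)
```

With these data the script arrives at the same decision as the worked solution: the test statistic exceeds the critical value, so the null hypothesis is not rejected.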
EXAMPLE 2
Using the Sign Test

An organization claims that the median annual attendance for museums in the United States is at least 39,000. A random sample of 125 museums reveals that the annual attendances for 79 museums were less than 39,000, the annual attendances for 42 museums were more than 39,000, and the annual attendances for 4 museums were 39,000. At \(\alpha = 0.01\), is there enough evidence to reject the organization’s claim? (Adapted from American Association of Museums)

SOLUTION
The claim is “the median annual attendance for museums in the United States is at least 39,000.” So, the null and alternative hypotheses are

\(H_0\): median \(\ge 39{,}000\) (Claim) and \(H_a\): median \(< 39{,}000\).

Because \(n > 25\), use Table 4 in Appendix B, the Standard Normal Table, to find the critical value. Because the test is a left-tailed test with \(\alpha = 0.01\), the critical value is \(z_0 = -2.33\). Of the 125 museums, there are 79 - signs and 42 + signs. When the 0’s are ignored, the sample size is \(n = 79 + 42 = 121\), and \(x = 42\). With these values, the test statistic is

\[ z = \frac{(42 + 0.5) - 0.5(121)}{\sqrt{121}/2} = \frac{-18}{5.5} \approx -3.27. \]

The figure shows the location of the rejection region and the test statistic \(z\). Because \(z\) is less than the critical value, it is in the rejection region. So, you reject the null hypothesis.

[Figure: standard normal curve with the rejection region (\(\alpha = 0.01\)) to the left of \(z_0 = -2.33\) and the test statistic at \(z \approx -3.27\).]

Interpretation
There is enough evidence at the 1% level of significance to reject the organization’s claim that the median annual attendance for museums in the United States is at least 39,000.

TRY IT YOURSELF 2
An organization claims that the median age of museum workers in the United States is 46 years old. A random sample of 95 museum workers reveals that 57 museum workers were less than 46 years old, 34 museum workers were more than 46 years old, and 4 museum workers were 46 years old. At \(\alpha = 0.10\), can you reject the organization’s claim? (Adapted from American Association of Museums)

Answer: Page T1

Picturing the World
For recent college graduates in the United States, a financial analyst claims that the median auto loan is $21,883. A random sample of recent college graduates reveals that the loans for 42 graduates were less than $21,883 and the loans for 35 graduates were greater than $21,883. (Adapted from lendedu.com)

Would you use a parametric test or a nonparametric test to test the claim that for recent college graduates in the United States, the median auto loan is $21,883? Explain your reasoning.

Study Tip
When performing a two-tailed sign test, remember to use only the left-tailed critical value.
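The large-sample calculation in Example 2 can be checked numerically. The sketch below is illustrative: `sign_test_z` is a hypothetical helper implementing the section's formula for n > 25, and the critical value is taken from Python's `statistics.NormalDist` rather than from Table 4, so it carries more decimal places than the tabled -2.33.

```python
from math import sqrt
from statistics import NormalDist


def sign_test_z(x, n):
    """Left-tailed sign test statistic for n > 25, with continuity correction."""
    return ((x + 0.5) - 0.5 * n) / (sqrt(n) / 2)


# Counts from Example 2 (the 4 museums at exactly 39,000 are ignored).
minus, plus = 79, 42
n = plus + minus
x = min(plus, minus)

z = sign_test_z(x, n)
z0 = NormalDist().inv_cdf(0.01)   # left-tailed critical value for alpha = 0.01
reject = z <= z0
print(f"n = {n}, x = {x}, z = {z:.2f}, z0 = {z0:.2f}, reject H0: {reject}")
```

Computing the critical value from the inverse normal CDF instead of a table does not change the decision here, because the test statistic lies well inside the rejection region.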
The Paired-Sample Sign Test

In Section 8.3, you learned how to use a t-test for the difference between means of dependent samples. That test required both populations to be normally distributed. When the parametric condition of normality cannot be satisfied, you can use the paired-sample sign test to test the difference between two population medians. To perform the paired-sample sign test for the difference between two population medians, these conditions must be met.

1. A sample must be randomly selected from each population.
2. The samples must be dependent (paired).

The paired-sample sign test can be left-tailed, right-tailed, or two-tailed. This test is similar to the sign test for a single population median. However, instead of comparing each data entry with a hypothesized median and recording a +, -, or 0, you find the difference between corresponding data entries and record the sign of the difference. Generally, to find the difference, subtract the entry representing the second variable from the entry representing the first variable. Then compare the number of + and - signs. (The 0’s are ignored.) When the number of + signs is approximately equal to the number of - signs, you should fail to reject the null hypothesis. When there is a large difference between the number of + signs and the number of - signs, you should reject the null hypothesis.

GUIDELINES
Performing a Paired-Sample Sign Test

1. In words: Verify that the samples are random and dependent.
2. In words: Identify the claim. State the null and alternative hypotheses.
   In symbols: State \(H_0\) and \(H_a\).
3. In words: Specify the level of significance.
   In symbols: Identify \(\alpha\).
4. In words: Determine the sample size \(n\) by finding the difference for each data pair. Assign a + sign for a positive difference, a - sign for a negative difference, and a 0 for no difference.
   In symbols: \(n\) = total number of + and - signs
5. In words: Determine the critical value.
   In symbols: Use Table 8 in Appendix B.
6. In words: Find the test statistic.
   In symbols: \(x\) = smaller number of + or - signs
7. In words: Make a decision to reject or fail to reject the null hypothesis.
   In symbols: If the test statistic is less than or equal to the critical value, then reject \(H_0\). Otherwise, fail to reject \(H_0\).
8. In words: Interpret the decision in the context of the original claim.
EXAMPLE 3
Using the Paired-Sample Sign Test

A psychologist claims that the number of repeat offenders will decrease when first-time offenders complete a particular rehabilitation course. You randomly select 10 prisons and record the number of repeat offenders during a two-year period. Then, after first-time offenders complete the course, you record the number of repeat offenders at each prison for another two-year period. The results are shown in the table below. At \(\alpha = 0.025\), can you support the psychologist’s claim?

Prison    1   2   3   4   5   6   7   8   9  10
Before   21  34   9  45  30  54  37  36  33  40
After    19  22  16  31  21  30  22  18  17  21

SOLUTION
To support the psychologist’s claim, use the null and alternative hypotheses below.

\(H_0\): The number of repeat offenders will not decrease.
\(H_a\): The number of repeat offenders will decrease. (Claim)

The table below shows the sign of the differences between the “before” and “after” data.

Prison    1   2   3   4   5   6   7   8   9  10
Before   21  34   9  45  30  54  37  36  33  40
After    19  22  16  31  21  30  22  18  17  21
Sign      +   +   -   +   +   +   +   +   +   +

You can see that there is 1 - sign and there are 9 + signs. So, \(n = 1 + 9 = 10\). Because the test is a one-tailed test with \(\alpha = 0.025\) and \(n = 10\), the critical value is 1. The test statistic \(x\) is the smaller number of + or - signs. So, \(x = 1\). Because \(x\) is equal to the critical value, you reject the null hypothesis.

Interpretation
There is enough evidence at the 2.5% level of significance to support the psychologist’s claim that the number of repeat offenders will decrease.

TRY IT YOURSELF 3
A medical researcher claims that a new vaccine will decrease the number of colds in adults. You randomly select 14 adults and record the number of colds each has in a one-year period. After giving the vaccine to each adult, you again record the number of colds each has in a one-year period. The results are shown in the table below. At \(\alpha = 0.05\), can you support the researcher’s claim?

Adult            1  2  3  4  5  6  7  8  9  10  11  12  13  14
Before vaccine   3  4  2  1  3  6  4  5  2   0   2   5   3   3
After vaccine    2  1  0  1  1  3  3  2  2   2   3   4   3   2

Answer: Page T1
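The paired-sample procedure of Example 3 can be sketched as follows. The code takes the sign of each before-minus-after difference, as the section describes, and then, instead of a table lookup, computes the exact left-tail binomial probability of seeing x or fewer of the rarer sign when each sign has probability 0.5. Comparing that probability with α is an alternative check under the binomial model, not the table-based decision rule used in the text. The variable names are illustrative.

```python
from math import comb

# Repeat offenders at 10 prisons, from Example 3.
before = [21, 34, 9, 45, 30, 54, 37, 36, 33, 40]
after = [19, 22, 16, 31, 21, 30, 22, 18, 17, 21]

# Subtract the second variable (after) from the first (before).
differences = [b - a for b, a in zip(before, after)]
plus = sum(1 for d in differences if d > 0)
minus = sum(1 for d in differences if d < 0)

n = plus + minus           # differences of 0 would be ignored
x = min(plus, minus)       # test statistic

# Exact probability of x or fewer of the rarer sign under p = 0.5.
p_left = sum(comb(n, i) for i in range(x + 1)) / 2 ** n
print(plus, minus, n, x, round(p_left, 4))
```

A left-tail probability no larger than α = 0.025 agrees with the example's conclusion to reject the null hypothesis.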
11.1 EXERCISES

For Extra Help: MyLab Statistics

Building Basic Skills and Vocabulary

1. What is a nonparametric test? How does a nonparametric test differ from a parametric test? What are the advantages and disadvantages of using a nonparametric test?
2. When the sign test is used, what population parameter is being tested?
3. Describe the test statistic for the sign test when the sample size \(n\) is less than or equal to 25 and when \(n\) is greater than 25.
4. In your own words, explain why the hypothesis test discussed in this section is called the sign test.
5. Explain how to use the sign test to test a population median.
6. List the two conditions that must be met in order to use the paired-sample sign test.

Using and Interpreting Concepts

Performing a Sign Test In Exercises 7–22, (a) identify the claim and state \(H_0\) and \(H_a\), (b) find the critical value, (c) find the test statistic, (d) decide whether to reject or fail to reject the null hypothesis, and (e) interpret the decision in the context of the original claim.

7. Credit Card Charges A financial service accountant claims that the median credit card balance of college students is more than $300. You randomly select the credit card accounts of 12 college students and record the balance for each account. The balances (in dollars) are listed below. At \(\alpha = 0.01\), can you support the accountant’s claim? (Adapted from Sallie Mae)

   346.71 382.59 255.03 202.17 309.80 265.88
   299.41 270.38 296.54 318.46 245.92 309.47

8. Temperature A meteorologist claims that the median daily high temperature for the month of July in Pittsburgh is 83° Fahrenheit. The high temperatures (in degrees Fahrenheit) for 15 randomly selected July days in Pittsburgh are listed below. At \(\alpha = 0.01\), is there enough evidence to reject the meteorologist’s claim? (Adapted from U.S. National Oceanic and Atmospheric Administration)

   74 79 81 86 90 79 81 83 81 74 78 76 84 82 85

9. Sales Prices of Homes A real estate agent claims that the median sales price of new privately owned one-family homes sold in a recent month is $253,000 or less. The sales prices (in dollars) of 10 randomly selected homes are listed below. At \(\alpha = 0.05\), is there enough evidence to reject the agent’s claim? (Adapted from National Association of Realtors)

   262,600 300,100 269,200 249,400 183,400
   253,500 325,600 223,500 241,300 271,300
10. Temperature During a weather report, a meteorologist claims that the median daily high temperature for the month of January in San Diego is 66° Fahrenheit. The high temperatures (in degrees Fahrenheit) for 16 randomly selected January days in San Diego are listed below. At \(\alpha = 0.01\), can you reject the meteorologist’s claim? (Adapted from U.S. National Oceanic and Atmospheric Administration)

    78 74 72 72 70 70 72 78 74 71 72 74 77 79 75 73

11. Credit Card Debt A financial services institution claims that the median amount of credit card debt for families holding such debts is at least $2300. In a random sample of 104 families with credit card debt, the debts of 60 families were less than $2300 and the debts of 44 families were greater than $2300. At \(\alpha = 0.02\), can you reject the institution’s claim? (Adapted from Board of Governors of the Federal Reserve System)

12. Financial Debt A financial services accountant claims that the median amount of financial debt for families holding such debts is less than $60,000. In a random sample of 70 families with financial debt, the debts of 24 families were less than $60,000 and the debts of 46 families were greater than $60,000. At \(\alpha = 0.025\), can you support the accountant’s claim? (Adapted from Board of Governors of the Federal Reserve System)

13. Social Media A research group claims that the median age of the users of a social media website is greater than 30 years old. In a random sample of 24 users, 11 are less than 30 years old, 10 are more than 30 years old, and 3 are 30 years old. At \(\alpha = 0.01\), can you support the research group’s claim? (Adapted from Pew Research Center)

14. Social Networking A research group claims that the median age of the users of a social networking website is less than 32 years old. In a random sample of 20 users, 5 are less than 32 years old, 13 are more than 32 years old, and 2 are 32 years old. At \(\alpha = 0.05\), can you support the research group’s claim? (Adapted from Pew Research Center)

15. Unit Size A renters’ organization claims that the median number of rooms in renter-occupied units is four. You randomly select 120 renter-occupied units and obtain the results shown below. At \(\alpha = 0.05\), can you reject the organization’s claim? (Adapted from U.S. Census Bureau)

    TABLE FOR EXERCISE 15
    Unit size             Number of units
    Fewer than 4 rooms    29
    4 rooms               38
    More than 4 rooms     53

16. Square Footage A renters’ organization claims that the median square footage of renter-occupied units is 1000 square feet. You randomly select 22 renter-occupied units and obtain the results shown below. At \(\alpha = 0.10\), can you reject the organization’s claim? (Adapted from U.S. Census Bureau)

    TABLE FOR EXERCISE 16
    Square footage     Number of units
    Less than 1000     13
    1000               2
    More than 1000     7

17. Hourly Wages A labor organization claims that the median hourly wage of computer systems analysts is $41.93. In a random sample of 45 computer systems analysts, 18 earn less than $41.93 per hour, 25 earn more than $41.93 per hour, and 2 earn $41.93 per hour. At \(\alpha = 0.01\), can you reject the labor organization’s claim? (Adapted from U.S. Bureau of Labor Statistics)
18. Hourly Wages A labor organization claims that the median hourly wage of podiatrists is at least $60.01. In a random sample of 23 podiatrists, 17 earn less than $60.01 per hour, 5 earn more than $60.01 per hour, and 1 earns $60.01 per hour. At \(\alpha = 0.05\), can you reject the labor organization’s claim? (Adapted from U.S. Bureau of Labor Statistics)

19. Lower Back Pain A physician claims that lower back pain intensity scores will decrease after receiving acupuncture treatment. The table shows the lower back pain intensity scores for eight patients before and after receiving acupuncture for eight weeks. At \(\alpha = 0.05\), is there enough evidence to support the physician’s claim? (Adapted from Archives of Internal Medicine)

    Patient                      1     2     3     4     5     6     7     8
    Intensity score (before)  59.2  46.3  65.4  74.0  79.3  81.6  44.4  59.1
    Intensity score (after)   12.4  22.5  18.6  59.3  70.1  70.2  13.2  25.9

20. Lower Back Pain A physician claims that lower back pain intensity scores will decrease after taking anti-inflammatory drugs. The table shows the lower back pain intensity scores for 12 patients before and after taking anti-inflammatory drugs for 8 weeks. At \(\alpha = 0.05\), is there enough evidence to support the physician’s claim? (Adapted from Archives of Internal Medicine)

    Patient                      1     2     3     4     5     6
    Intensity score (before)  71.0  42.1  79.1  57.5  64.0  60.4
    Intensity score (after)   60.1  23.4  86.2  62.1  44.2  49.7

    Patient                      7     8     9    10    11    12
    Intensity score (before)  68.3  95.2  48.1  78.6  65.4  59.9
    Intensity score (after)   58.3  72.6  51.8  82.5  63.2  47.9

21. Improving SAT Scores A tutoring agency claims that by completing a special course, students will improve their math SAT scores. In part of a study, 12 students take the math part of the SAT, complete the special course, then take the math part of the SAT again. The students’ scores are shown below. At \(\alpha = 0.05\), is there enough evidence to support the agency’s claim?

    Student                 1    2    3    4    5    6
    Score on first SAT    300  450  350  430  300  470
    Score on second SAT   300  520  400  410  300  480

    Student                 7    8    9   10   11   12
    Score on first SAT    530  200  200  350  360  250
    Score on second SAT   700  250  390  350  480  300
22. SAT Scores A guidance counselor claims that students who take the SAT twice will improve their scores the second time they take the SAT. The table shows both math SAT scores for 12 students who took the SAT twice. At \(\alpha = 0.01\), can you support the guidance counselor’s claim?

    Student                 1    2    3    4    5    6
    Score on first SAT    440  510  420  450  620  450
    Score on second SAT   440  570  510  470  610  450

    Student                 7    8    9   10   11   12
    Score on first SAT    350  470  320  510  630  570
    Score on second SAT   370  530  290  500  640  600

23. Feeling Your Age A research organization conducts a survey by randomly selecting adults and asking each, “How do you feel relative to your age?” The results are shown in the figure. (Adapted from Pew Research Center)

    [Figure data: Younger 11, Older 3, My age 9]

    (a) Use a sign test to test the null hypothesis that the proportion of adults who feel older is equal to the proportion of adults who feel younger. Assign a + sign to each adult who responded “older,” assign a - sign to each adult who responded “younger,” and assign a 0 to each adult who responded “my age.” Use \(\alpha = 0.05\).
    (b) What can you conclude?

24. Contacting Parents A research organization conducts a survey by randomly selecting adults and asking each, “How frequently do you contact your parents by phone?” The results are shown in the figure. (Adapted from Pew Research Center)

    [Figure data: Weekly 12, Daily 8, Other 6]

    (a) Use a sign test to test the null hypothesis that the proportion of adults who contact their parents by phone weekly is equal to the proportion of adults who contact their parents by phone daily. Assign a + sign to each adult who responded “weekly,” assign a - sign to each adult who responded “daily,” and assign a 0 to each adult who responded “other.” Use \(\alpha = 0.05\).
    (b) What can you conclude?
Extending Concepts

More on Sign Tests When you are using a sign test for \(n > 25\) and the test is left-tailed, you know you can reject the null hypothesis when the test statistic

\[ z = \frac{(x + 0.5) - 0.5n}{\sqrt{n}/2} \]

is less than or equal to the left-tailed critical value, where \(x\) is the smaller number of + or - signs. For a right-tailed test, you can reject the null hypothesis when the test statistic

\[ z = \frac{(x - 0.5) - 0.5n}{\sqrt{n}/2} \]

is greater than or equal to the right-tailed critical value, where \(x\) is the larger number of + or - signs.

In Exercises 25–28, use a right-tailed test and (a) identify the claim and state \(H_0\) and \(H_a\), (b) find the critical value, (c) find the test statistic, (d) decide whether to reject or fail to reject the null hypothesis, and (e) interpret the decision in the context of the original claim.

25. Weekly Earnings A labor organization claims that the median weekly earnings of female workers is less than or equal to $765. To test this claim, you randomly select 50 female workers and ask each to provide her weekly earnings. The table shows the results. At \(\alpha = 0.01\), can you reject the organization’s claim? (Adapted from U.S. Bureau of Labor Statistics)

    TABLE FOR EXERCISE 25
    Weekly earnings    Number of workers
    Less than $765     18
    $765               3
    More than $765     29

26. Weekly Earnings A labor organization claims that the median weekly earnings of male workers is greater than $950. To test this claim, you randomly select 70 male workers and ask each to provide his weekly earnings. The table shows the results. At \(\alpha = 0.01\), can you support the organization’s claim? (Adapted from U.S. Bureau of Labor Statistics)

    TABLE FOR EXERCISE 26
    Weekly earnings    Number of workers
    Less than $950     23
    $950               2
    More than $950     45

27. Ages of Brides A marriage counselor claims that the median age of brides at the time of their first marriage is less than or equal to 27 years old. In a random sample of 65 brides, 24 are less than 27 years old, 35 are more than 27 years old, and 6 are 27 years old. At \(\alpha = 0.05\), can you reject the counselor’s claim? (Adapted from U.S. Census Bureau)

28. Ages of Grooms A marriage counselor claims that the median age of grooms at the time of their first marriage is greater than 28 years old. In a random sample of 56 grooms, 33 are less than 28 years old and 23 are more than 28 years old. At \(\alpha = 0.05\), can you support the counselor’s claim? (Adapted from U.S. Census Bureau)
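The right-tailed statistic described under More on Sign Tests differs from the left-tailed one in two ways: x is the larger count of signs, and the continuity correction is subtracted rather than added. A minimal sketch, using hypothetical counts (40 of the larger sign out of n = 64) that are not taken from any exercise, and a hypothetical function name `right_tailed_sign_z`:

```python
from math import sqrt
from statistics import NormalDist


def right_tailed_sign_z(x_larger, n):
    """Right-tailed sign test statistic for n > 25; x_larger is the larger sign count."""
    return ((x_larger - 0.5) - 0.5 * n) / (sqrt(n) / 2)


# Hypothetical counts for illustration only.
n = 64
x_larger = 40
z = right_tailed_sign_z(x_larger, n)

alpha = 0.05
z0 = NormalDist().inv_cdf(1 - alpha)   # right-tailed critical value
reject = z >= z0
print(f"z = {z:.3f}, z0 = {z0:.3f}, reject H0: {reject}")
```

Subtracting 0.5 moves the larger count toward the mean, which mirrors the role of the added 0.5 in the left-tailed version.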
11.2 The Wilcoxon Tests

What You Should Learn
  • How to use the Wilcoxon signed-rank test to determine whether two dependent samples are selected from populations having the same distribution
  • How to use the Wilcoxon rank sum test to determine whether two independent samples are selected from populations having the same distribution

The Wilcoxon Signed-Rank Test | The Wilcoxon Rank Sum Test

The Wilcoxon Signed-Rank Test

In this section, you will study the Wilcoxon signed-rank test and the Wilcoxon rank sum test. Unlike the sign test from Section 11.1, the strength of these two nonparametric tests is that each considers the magnitude, or size, of the data entries.

In Section 8.3, you used a t-test together with dependent samples to determine whether there was a difference between two populations. To use the t-test to test such a difference, you must assume (or know) that the dependent samples are randomly selected from populations having a normal distribution. But what should you do when the normality assumption cannot be made? Instead of using the two-sample t-test, you can use the Wilcoxon signed-rank test.

DEFINITION
The Wilcoxon signed-rank test is a nonparametric test that can be used to determine whether two dependent samples were selected from populations having the same distribution.

GUIDELINES
Performing a Wilcoxon Signed-Rank Test

1. In words: Verify that the samples are random and dependent.
2. In words: Identify the claim. State the null and alternative hypotheses.
   In symbols: State \(H_0\) and \(H_a\).
3. In words: Specify the level of significance.
   In symbols: Identify \(\alpha\).
4. In words: Determine the sample size \(n\), which is the number of pairs of data for which the difference is not 0.
5. In words: Determine the critical value.
   In symbols: Use Table 9 in Appendix B.
6. In words: Find the test statistic \(w_s\).
   a. Complete a table using these headers: Sample 1, Sample 2, Difference, Absolute value, Rank, and Signed rank. Signed rank takes on the same sign as its corresponding difference.
   b. Find the sum of the positive ranks and the sum of the negative ranks.
   c. Select the smaller absolute value of the sums.
7. In words: Make a decision to reject or fail to reject the null hypothesis.
   In symbols: If \(w_s\) is less than or equal to the critical value, then reject \(H_0\). Otherwise, fail to reject \(H_0\).
8. In words: Interpret the decision in the context of the original claim.

Study Tip
Recall that the absolute value of a number is its value, disregarding its sign. A pair of vertical bars, \(|\ |\), is used to denote absolute value. For example, \(|3| = 3\) and \(|-7| = 7\).
EXAMPLE 1
Performing a Wilcoxon Signed-Rank Test
A golf club manufacturer claims that golfers can lower their scores by using the manufacturer’s newly designed golf clubs. The table shows the scores of 10 golfers while using the old design and while using the new design on the same golf course. At $\alpha = 0.05$, can you support the manufacturer’s claim?

Golfer:              1   2   3   4   5   6   7   8   9   10
Score (old design):  89  84  96  74  91  85  95  82  92  81
Score (new design):  83  83  92  76  91  80  87  85  90  77

SOLUTION
The claim is “golfers can lower their scores.” To test this claim, use the null and alternative hypotheses below.
$H_0$: The new design does not lower scores.
$H_a$: The new design lowers scores. (Claim)
This Wilcoxon signed-rank test is a one-tailed test with $\alpha = 0.05$, and because one data pair has a difference of 0, $n = 9$ instead of 10. From Table 9 in Appendix B, the critical value is 8. To find the test statistic $w_s$, complete a table as shown below.

Score (old design)  Score (new design)  Difference  Absolute value  Rank  Signed rank
89                  83                   6          6               8      8
84                  83                   1          1               1      1
96                  92                   4          4               5.5    5.5
74                  76                  -2          2               2.5   -2.5
91                  91                   0          0
85                  80                   5          5               7      7
95                  87                   8          8               9      9
82                  85                  -3          3               4     -4
92                  90                   2          2               2.5    2.5
81                  77                   4          4               5.5    5.5

The sum of the negative ranks is $-2.5 + (-4) = -6.5$.
The sum of the positive ranks is $8 + 1 + 5.5 + 7 + 9 + 2.5 + 5.5 = 38.5$.
The test statistic is the smaller absolute value of these two sums. Because $|-6.5| < |38.5|$, the test statistic is $w_s = 6.5$. Because the test statistic is less than the critical value, that is, $6.5 < 8$, you reject the null hypothesis.
Interpretation There is enough evidence at the 5% level of significance to support the claim that golfers can lower their scores by using the newly designed clubs.

Study Tip
Do not assign a rank to any difference of 0. In the case of a tie between data entries, use the average of the corresponding ranks. For instance, when two data entries are tied for the fifth rank, use the average of 5 and 6, which is 5.5, as the rank for both entries. The next data entry will be assigned a rank of 7, not 6. When three entries are tied for the fifth rank, use the average of 5, 6, and 7, which is 6, as the rank for all three data entries. The next data entry will be assigned a rank of 8.
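The signed-rank procedure in Example 1, including the tie-averaging rule from the Study Tip, can be sketched in a few lines of Python. This is only an illustrative sketch using the standard library; the function names `average_ranks` and `wilcoxon_signed_rank` are hypothetical helpers introduced here, not part of the text. The sum of the negative ranks is reported as an absolute value.

```python
def average_ranks(values):
    """Rank values from smallest to largest; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of entries tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = ((i + 1) + (j + 1)) / 2  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def wilcoxon_signed_rank(sample1, sample2):
    """Return (n, positive rank sum, |negative rank sum|, w_s) for paired samples."""
    # Differences of 0 are dropped and receive no rank.
    diffs = [a - b for a, b in zip(sample1, sample2) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])
    positive_sum = sum(r for d, r in zip(diffs, ranks) if d > 0)
    negative_sum = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return len(diffs), positive_sum, negative_sum, min(positive_sum, negative_sum)


# Golf scores from Example 1.
old = [89, 84, 96, 74, 91, 85, 95, 82, 92, 81]
new = [83, 83, 92, 76, 91, 80, 87, 85, 90, 77]
n, positive_sum, negative_sum, w_s = wilcoxon_signed_rank(old, new)
print(n, positive_sum, negative_sum, w_s)  # 9 38.5 6.5 6.5
```

The decision step is then a comparison of `w_s` with the critical value from Table 9 (8 in Example 1); the table lookup itself is not reproduced in the sketch.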
TRY IT YOURSELF 1
A quality control inspector wants to test the claim that a spray-on water repellent is effective. To test this claim, he selects 12 pieces of fabric, sprays water on each, and measures the amount of water repelled (in milliliters). He then applies the water repellent and repeats the experiment. The table shows the results. At $\alpha = 0.01$, can he conclude that the water repellent is effective?

Fabric:             1   2   3   4   5   6   7   8   9   10  11  12
No repellent:       8   7   7   4   6   10  9   5   9   11  8   4
Repellent applied:  15  12  11  6   6   8   8   6   12  8   14  8

Answer: Page T1

The Wilcoxon Rank Sum Test

In Sections 8.1 and 8.2, you used a z-test ($\sigma_1$ and $\sigma_2$ known) or a t-test ($\sigma_1$ and $\sigma_2$ unknown) together with independent samples to determine whether there was a difference between two populations. To use a z-test or a t-test to test such a difference, you must assume (or know) that the samples are random and independent, and either the populations are normally distributed or each sample size is at least 30. But, what should you do when the normality and sample size assumptions cannot be made? You can still compare the populations using the Wilcoxon rank sum test.

DEFINITION
The Wilcoxon rank sum test is a nonparametric test that can be used to determine whether two independent samples were selected from populations having the same distribution.

A requirement for the Wilcoxon rank sum test is that the sample size of each sample must be at least 10. When calculating the test statistic for the Wilcoxon rank sum test, let $n_1$ represent the sample size of the smaller sample and $n_2$ represent the sample size of the larger sample. When the two samples have the same size, it does not matter which one is $n_1$ or $n_2$.

When calculating the sum of the ranks $R$, combine both samples and rank the combined data. Then sum the ranks for the smaller of the two samples. When the two samples have the same size, you can use the ranks from either sample, but you must use the ranks from the sample you associate with $n_1$.

Test Statistic for the Wilcoxon Rank Sum Test
For two independent samples, the test statistic $z$ for the Wilcoxon rank sum test is
$$z = \frac{R - \mu_R}{\sigma_R}$$
where $R$ is the sum of the ranks for the smaller sample,
$$\mu_R = \frac{n_1(n_1 + n_2 + 1)}{2}, \quad \text{and} \quad \sigma_R = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}.$$

Picturing the World
To help determine when knee arthroscopy patients can resume driving after surgery, the driving reaction times (in milliseconds) of 10 right knee arthroscopy patients were measured before surgery and 4 weeks after surgery using a computer-linked car simulator. The table shows the results. (Adapted from Knee Surgery, Sports Traumatology, Arthroscopy Journal)

Patient  Reaction time before surgery  Reaction time 4 weeks after surgery
1        720                           730
2        750                           645
3        735                           745
4        730                           640
5        755                           660
6        745                           670
7        730                           650
8        725                           730
9        770                           675
10       700                           705

At $\alpha = 0.05$, can you conclude that the reaction times changed significantly four weeks after surgery?

Study Tip
Use the Wilcoxon signed-rank test for dependent samples and the Wilcoxon rank sum test for independent samples.
GUIDELINES
Performing a Wilcoxon Rank Sum Test
1. Verify that the samples are random and independent.
2. Identify the claim. State the null and alternative hypotheses. In symbols: State $H_0$ and $H_a$.
3. Specify the level of significance. In symbols: Identify $\alpha$.
4. Determine the critical value(s) and the rejection region(s). Use Table 4 in Appendix B.
5. Determine the sample sizes. In symbols: $n_1 \le n_2$
6. Find the sum of the ranks for the smaller sample. In symbols: $R$
   a. List the combined data in ascending order.
   b. Rank the combined data.
   c. Add the sum of the ranks for the smaller sample, $n_1$.
7. Find the test statistic and sketch the sampling distribution. In symbols: $z = \dfrac{R - \mu_R}{\sigma_R}$
8. Make a decision to reject or fail to reject the null hypothesis. In symbols: If $z$ is in the rejection region, then reject $H_0$. Otherwise, fail to reject $H_0$.
9. Interpret the decision in the context of the original claim.

EXAMPLE 2
Performing a Wilcoxon Rank Sum Test
The table shows the earnings (in thousands of dollars) of a random sample of 10 male and 12 female pharmaceutical sales representatives. At $\alpha = 0.10$, can you conclude that there is a difference between the males’ and females’ earnings?

Male earnings:    78  93  114  101  98  94  86  95  117  99
Female earnings:  86  77  101  93   85  98  91  87  84   97  100  90

SOLUTION
The claim is “there is a difference between the males’ and females’ earnings.” To test this claim, use the null and alternative hypotheses below.
$H_0$: There is no difference between the males’ and the females’ earnings.
$H_a$: There is a difference between the males’ and the females’ earnings. (Claim)
Because the test is a two-tailed test with $\alpha = 0.10$, the critical values are $-z_0 = -1.645$ and $z_0 = 1.645$. The rejection regions are $z < -1.645$ and $z > 1.645$.
The sample size for men is 10 and the sample size for women is 12. Because $10 < 12$, $n_1 = 10$ and $n_2 = 12$. Before calculating the test statistic, you must find the values of $R$, $\mu_R$, and $\sigma_R$. The table shows the combined data listed in ascending order and the corresponding ranks.

Ordered data  Sample  Rank
77            F       1
78            M       2
84            F       3
85            F       4
86            M       5.5
86            F       5.5
87            F       7
90            F       8
91            F       9
93            M       10.5
93            F       10.5
94            M       12
95            M       13
97            F       14
98            M       15.5
98            F       15.5
99            M       17
100           F       18
101           M       19.5
101           F       19.5
114           M       21
117           M       22

Because the smaller sample is the sample of males, $R$ is the sum of the male rankings.
$$R = 2 + 5.5 + 10.5 + 12 + 13 + 15.5 + 17 + 19.5 + 21 + 22 = 138$$
Using $n_1 = 10$ and $n_2 = 12$, you can find $\mu_R$ and $\sigma_R$ as follows.
$$\mu_R = \frac{n_1(n_1 + n_2 + 1)}{2} = \frac{10(10 + 12 + 1)}{2} = \frac{230}{2} = 115$$

Study Tip
Remember that in the case of a tie between data entries, use the average of the corresponding ranks.
$$\sigma_R = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} = \sqrt{\frac{(10)(12)(10 + 12 + 1)}{12}} = \sqrt{\frac{2760}{12}} = \sqrt{230} \approx 15.17$$
When $R = 138$, $\mu_R = 115$, and $\sigma_R \approx 15.17$, the test statistic is
$$z = \frac{R - \mu_R}{\sigma_R} \approx \frac{138 - 115}{15.17} \approx 1.52.$$
The figure shows the location of the rejection regions and the test statistic $z$. Because $z$ is not in the rejection region, you fail to reject the null hypothesis.

[Figure: standard normal curve with rejection regions $z < -z_0 = -1.645$ and $z > z_0 = 1.645$, each of area $\frac{1}{2}\alpha = 0.05$, central area $1 - \alpha = 0.90$, and the test statistic $z \approx 1.52$.]

Interpretation There is not enough evidence at the 10% level of significance to conclude that there is a difference between the males’ and females’ earnings.

TRY IT YOURSELF 2
You are investigating the automobile insurance claims paid (in thousands of dollars) by two insurance companies. The table shows a random sample of 12 claims paid by the two insurance companies. At $\alpha = 0.05$, can you conclude that there is a difference in the claims paid by the companies?

Company A:  6.2  10.6  2.5  4.5  6.5  7.4  9.9   3.0  5.8  3.9  6.0  6.3
Company B:  7.3  5.6   3.4  1.8  2.2  4.7  10.8  4.1  1.7  3.0  4.4  5.3

Answer: Page T1
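The arithmetic of Example 2 can be checked with a short Python sketch. It combines and ranks both samples (averaging tied ranks), sums the ranks of the smaller sample, and applies the formulas for $\mu_R$ and $\sigma_R$. The function names `average_ranks` and `wilcoxon_rank_sum` are hypothetical helpers chosen for this illustration, and the sketch assumes only the Python standard library.

```python
import math


def average_ranks(values):
    """Rank values from smallest to largest; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = ((i + 1) + (j + 1)) / 2
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def wilcoxon_rank_sum(sample_a, sample_b):
    """Return (R, mu_R, sigma_R, z), where R uses the smaller sample's ranks."""
    # sorted() is stable, so equal-sized samples keep sample_a as the n_1 sample.
    small, large = sorted((sample_a, sample_b), key=len)
    n1, n2 = len(small), len(large)
    ranks = average_ranks(list(small) + list(large))
    r = sum(ranks[:n1])  # ranks belonging to the smaller sample
    mu_r = n1 * (n1 + n2 + 1) / 2
    sigma_r = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return r, mu_r, sigma_r, (r - mu_r) / sigma_r


# Earnings (in thousands of dollars) from Example 2.
male = [78, 93, 114, 101, 98, 94, 86, 95, 117, 99]
female = [86, 77, 101, 93, 85, 98, 91, 87, 84, 97, 100, 90]
r, mu_r, sigma_r, z = wilcoxon_rank_sum(male, female)
print(r, mu_r, round(sigma_r, 2), round(z, 2))  # 138.0 115.0 15.17 1.52
```

Because $z \approx 1.52$ lies between $-1.645$ and $1.645$, the sketch agrees with the example's decision to fail to reject $H_0$.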
11.2 EXERCISES
For Extra Help: MyLab Statistics

Building Basic Skills and Vocabulary
1. How do you know whether to use a Wilcoxon signed-rank test or a Wilcoxon rank sum test?
2. What is the requirement for the sample size of each sample when using the Wilcoxon rank sum test?

Using and Interpreting Concepts
Performing a Wilcoxon Test In Exercises 3–8, (a) identify the claim and state $H_0$ and $H_a$, (b) decide whether to use a Wilcoxon signed-rank test or a Wilcoxon rank sum test, (c) find the critical value(s), (d) find the test statistic, (e) decide whether to reject or fail to reject the null hypothesis, and (f) interpret the decision in the context of the original claim.

3. Calcium Supplements and Blood Pressure In a study testing the effects of calcium supplements on blood pressure in men, 12 men were randomly chosen and given a calcium supplement for 12 weeks. The table shows the measurements for each subject’s diastolic blood pressure taken before and after the 12-week treatment period. At $\alpha = 0.01$, can you reject the claim that there was no reduction in diastolic blood pressure? (Adapted from The Journal of the American Medical Association)

Patient:           1    2    3    4    5    6    7    8    9    10   11   12
Before treatment:  108  109  120  129  112  111  117  135  124  118  130  115
After treatment:   99   115  105  116  115  117  108  122  120  126  128  106

4. Wholesale Trade and Manufacturing A private industry analyst claims that there is no difference in the salaries earned by workers in the wholesale trade and manufacturing industries. The table shows the salaries (in thousands of dollars) of a random sample of 10 wholesale trade workers and 10 manufacturing workers. At $\alpha = 0.10$, can you reject the analyst’s claim? (Adapted from U.S. Bureau of Economic Analysis)

Wholesale trade:  70  66  65  80  62  69  73  77  74  72
Manufacturing:    71  67  56  74  54  65  76  58  64  52
5. Earnings by Degree A college administrator claims that there is a difference in the earnings of people with bachelor’s degrees and those with advanced degrees. The table shows the earnings (in thousands of dollars) of a random sample of 11 people with bachelor’s degrees and 10 people with advanced degrees. At $\alpha = 0.05$, is there enough evidence to support the administrator’s claim? (Adapted from U.S. Census Bureau)

Bachelor’s degree:  62  58  71  84  78  58  52  64  68  60  62
Advanced degree:    88  91  99  85  90  91  98  98  95  87

6. Headaches A medical researcher wants to determine whether a new drug affects the number of headache hours experienced by headache sufferers. To do so, the researcher randomly selects seven patients and asks each to give the number of headache hours (per day) each experiences before and after taking the drug. The table shows the results. At $\alpha = 0.05$, can the researcher conclude that the new drug affects the number of headache hours?

Patient:                  1    2    3    4    5    6    7
Headache hours (before):  0.8  2.4  2.8  2.6  2.7  0.9  1.2
Headache hours (after):   1.6  1.3  1.6  1.4  1.5  1.6  1.7

7. Teacher Salaries A teacher’s union representative claims that there is a difference in the salaries earned by teachers in Wisconsin and Michigan. The table shows the salaries (in thousands of dollars) of a random sample of 11 teachers from Wisconsin and 12 teachers from Michigan. At $\alpha = 0.05$, is there enough evidence to support the representative’s claim? (Adapted from National Education Association)

Wisconsin:  55  59  49  56  51  61  55  61  53  47  52
Michigan:   61  65  55  62  57  67  61  67  59  53  58  76

8. Heart Rate A physician wants to determine whether an experimental medication affects an individual’s heart rate. The physician randomly selects 15 patients and measures the heart rate of each. The subjects then take the medication and have their heart rates measured after one hour. The table shows the results. At $\alpha = 0.05$, can the physician conclude that the experimental medication affects an individual’s heart rate?

Patient:              1   2   3   4   5   6   7   8   9   10  11  12  13  14  15
Heart rate (before):  72  81  75  76  79  74  65  67  76  83  66  75  76  78  68
Heart rate (after):   73  80  75  79  74  76  73  67  74  77  70  77  76  75  74
Extending Concepts
Wilcoxon Signed-Rank Test for $n > 30$ When you are performing a Wilcoxon signed-rank test and the sample size $n$ is greater than 30, you can use the Standard Normal Table and the formula below to find the test statistic.
$$z = \frac{w_s - \dfrac{n(n + 1)}{4}}{\sqrt{\dfrac{n(n + 1)(2n + 1)}{24}}}$$
In Exercises 9 and 10, perform the Wilcoxon signed-rank test using the test statistic for $n > 30$.

9. Fuel Additive A petroleum engineer wants to know whether a certain fuel additive improves a car’s gas mileage. To decide, the engineer records the gas mileages (in miles per gallon) of 33 randomly selected cars with and without the fuel additive. The table shows the results. At $\alpha = 0.10$, can the engineer conclude that the gas mileage is improved?

Car:               1     2     3     4     5     6     7     8     9     10    11
Without additive:  36.4  36.4  36.6  36.6  36.8  36.9  37.0  37.1  37.2  37.2  36.7
With additive:     36.7  36.9  37.0  37.5  38.0  38.1  38.4  38.7  38.8  38.9  36.3

Car:               12    13    14    15    16    17    18    19    20    21    22
Without additive:  37.5  37.6  37.8  37.9  37.9  38.1  38.4  40.2  40.5  40.9  35.0
With additive:     38.9  39.0  39.1  39.4  39.4  39.5  39.8  40.0  40.0  40.1  36.3

Car:               23    24    25    26    27    28    29    30    31    32    33
Without additive:  32.7  33.6  34.2  35.1  35.2  35.3  35.5  35.9  36.0  36.1  37.2
With additive:     32.8  34.2  34.7  34.9  34.9  35.3  35.9  36.4  36.6  36.6  38.3

10. Fuel Additive A petroleum engineer claims that a fuel additive improves gas mileage. The table shows the gas mileages (in miles per gallon) of 32 randomly selected cars measured with and without the fuel additive. Test the petroleum engineer’s claim at $\alpha = 0.05$.

Car:               1     2     3     4     5     6     7     8     9     10    11
Without additive:  34.0  34.2  34.4  34.4  34.6  34.8  35.6  35.7  30.2  31.6  32.3
With additive:     36.6  36.7  37.2  37.2  37.3  37.4  37.6  37.7  34.2  34.9  34.9

Car:               12    13    14    15    16    17    18    19    20    21    22
Without additive:  33.0  33.1  33.7  33.7  33.8  35.7  36.1  36.1  36.6  36.6  36.8
With additive:     34.9  35.7  36.0  36.2  36.5  37.8  38.1  38.2  38.3  38.3  38.7

Car:               23    24    25    26    27    28    29    30    31    32
Without additive:  37.1  37.1  37.2  37.9  37.9  38.0  38.0  38.4  38.8  42.1
With additive:     38.8  38.9  39.1  39.1  39.2  39.4  39.8  40.3  40.8  43.2
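The large-sample formula for the Wilcoxon signed-rank test (for $n > 30$) translates directly into code. The sketch below is illustrative only: `signed_rank_z` is a hypothetical helper name, and the inputs $w_s = 250$ and $n = 40$ are made-up values chosen to show the arithmetic, not data from the exercises.

```python
import math


def signed_rank_z(w_s, n):
    """Normal approximation to the Wilcoxon signed-rank statistic for n > 30."""
    if n <= 30:
        raise ValueError("the approximation is stated for n > 30")
    mean = n * (n + 1) / 4                           # n(n + 1) / 4
    std = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)  # sqrt(n(n + 1)(2n + 1) / 24)
    return (w_s - mean) / std


# Hypothetical inputs, chosen only to illustrate the calculation.
z = signed_rank_z(250, 40)
print(round(z, 2))
```

For these hypothetical inputs the mean is 410 and the standard deviation is about 74.4, so $z$ is roughly $-2.15$. The value would then be compared with a critical value from the Standard Normal Table.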
CASE STUDY
College Ranks
Each year, Forbes and the Center for College Affordability and Productivity (CCAP) release a list of the best colleges in the United States. Over 600 colleges and universities are ranked according to factors that fall into one of five categories.
1. Postgraduate success, which is based on salary of alumni by school and the alumni who appear on CCAP’s America’s Leaders list
2. Student debt, which is based on three components: average federal student loan debt load, student loan default rates, and predicted versus actual percent of students taking federal loans
3. Student satisfaction, which is based on student retention rates and student evaluations of professors
4. Graduation rate, which is based on how many students actually finish their degrees in four years and the actual versus predicted rate
5. Academic success, which is based on students who have won competitive scholarships and fellowships, and students who have gone on to earn Ph.D.s
The table shows the student populations for randomly selected colleges by region on the 2016 list.

Student populations
Northeast  Midwest  South   West
1,805      24,766   6,621   1,498
9,181      2,948    14,769  1,394
14,317     1,459    29,175  1,144
2,113      3,688    15,984  8,132
20,445     3,418    2,850   12,820
1,632      14,747   27,511  50,320
5,123      14,906   24,932  31,354
755        5,931    49,610  2,127
15,117     2,791    10,033  19,934
18,090     11,458   1,575   31,332

EXERCISES
1. Construct a side-by-side box-and-whisker plot for the four regions. Do any of the median student populations appear to be the same? Do any appear to be different?
In Exercises 2–5, use the sign test to test the claim. What can you conclude? Use $\alpha = 0.05$.
2. The median student population at a college in the Northeast is less than or equal to 7000.
3. The median student population at a college in the Midwest is greater than or equal to 8000.
4. The median student population at a college in the South is 10,000.
5. The median student population at a college in the West is different from 8000.
In Exercises 6 and 7, use the Wilcoxon rank sum test to test the claim. Use $\alpha = 0.01$.
6. There is no difference between student populations for colleges in the Midwest and colleges in the West.
7. There is a difference between student populations for colleges in the Northeast and colleges in the South.
11.3 The Kruskal-Wallis Test

What You Should Learn
  • How to use the Kruskal-Wallis test to determine whether three or more samples were selected from populations having the same distribution

The Kruskal-Wallis Test

In Section 10.4, you learned how to use one-way ANOVA techniques to compare the means of three or more populations. When using one-way ANOVA, you should verify that each independent sample is selected from a population that is normally, or approximately normally, distributed. When you cannot verify that the populations are normal, you can still compare the distributions of three or more populations. To do so, you can use the Kruskal-Wallis test.

DEFINITION
The Kruskal-Wallis test is a nonparametric test that can be used to determine whether three or more independent samples were selected from populations having the same distribution.

For a Kruskal-Wallis test, the null and alternative hypotheses are always similar to these statements.
$H_0$: All of the populations have the same distribution.
$H_a$: At least one population has a distribution that is different from the others.
The conditions for using the Kruskal-Wallis test are that the samples must be random and independent, and the size of each sample must be at least 5. If these conditions are met, then the sampling distribution for the Kruskal-Wallis test is approximated by a chi-square distribution with $k - 1$ degrees of freedom, where $k$ is the number of samples. You can calculate the Kruskal-Wallis test statistic using the formula below.

Test Statistic for the Kruskal-Wallis Test
For three or more independent samples, the test statistic for the Kruskal-Wallis test is
$$H = \frac{12}{N(N + 1)} \left( \frac{R_1^2}{n_1} + \frac{R_2^2}{n_2} + \cdots + \frac{R_k^2}{n_k} \right) - 3(N + 1)$$
where $k$ is the number of samples, $n_i$ is the size of the $i$th sample, $N$ is the sum of the sample sizes, and $R_i$ is the sum of the ranks of the $i$th sample.

Performing a Kruskal-Wallis test consists of combining and ranking the sample data. The data are then separated according to sample and the sum of the ranks of each sample is calculated.
These sums are then used to calculate the test statistic $H$, which is an approximation of the variance of the rank sums. When the samples are selected from populations having the same distribution, the sums of the ranks will be approximately equal, $H$ will be small, and you should fail to reject the null hypothesis. When the samples are selected from populations not having the same distribution, the sums of the ranks will be quite different, $H$ will be large, and you should reject the null hypothesis.

Because you only reject the null hypothesis when $H$ is significantly large, the Kruskal-Wallis test is always a right-tailed test.

GUIDELINES
Performing a Kruskal-Wallis Test
1. Verify that the samples are random and independent, and each sample size is at least 5.
2. Identify the claim. State the null and alternative hypotheses. In symbols: State $H_0$ and $H_a$.
3. Specify the level of significance. In symbols: Identify $\alpha$.
4. Identify the degrees of freedom. In symbols: d.f. $= k - 1$
5. Determine the critical value and the rejection region. Use Table 6 in Appendix B.
6. Find the sum of the ranks for each sample.
   a. List the combined data in ascending order.
   b. Rank the combined data.
7. Find the test statistic and sketch the sampling distribution. In symbols: $H = \dfrac{12}{N(N + 1)} \left( \dfrac{R_1^2}{n_1} + \dfrac{R_2^2}{n_2} + \cdots + \dfrac{R_k^2}{n_k} \right) - 3(N + 1)$
8. Make a decision to reject or fail to reject the null hypothesis. In symbols: If $H$ is in the rejection region, then reject $H_0$. Otherwise, fail to reject $H_0$.
9. Interpret the decision in the context of the original claim.
EXAMPLE 1
Performing a Kruskal-Wallis Test
You want to compare the numbers of crimes reported in three police precincts in a city. To do so, you randomly select 10 weeks for each precinct and record the numbers of crimes reported. The table shows the results. At $\alpha = 0.01$, can you conclude that the distribution of the numbers of crimes reported in at least one precinct is different from the others?

Number of crimes reported for the week
101st Precinct (Sample 1)  106th Precinct (Sample 2)  113th Precinct (Sample 3)
60                         65                         69
52                         55                         51
49                         64                         70
52                         66                         61
50                         53                         67
48                         58                         65
57                         50                         62
45                         54                         59
44                         70                         60
56                         62                         63

SOLUTION
You want to test the claim that the distribution of the numbers of crimes reported in at least one precinct is different from the others. The null and alternative hypotheses are as follows.
$H_0$: The distribution of the numbers of crimes reported is the same in all three precincts.
$H_a$: The distribution of the numbers of crimes reported in at least one precinct is different from the others. (Claim)
The test is a right-tailed test with $\alpha = 0.01$ and d.f. $= k - 1 = 3 - 1 = 2$. From Table 6 in Appendix B, the critical value is $\chi^2_0 = 9.210$. The rejection region is $\chi^2 > 9.210$. Before calculating the test statistic, you must find the sum of the ranks for each sample. The table shows the combined data listed in ascending order and the corresponding ranks.

Ordered data  Sample  Rank
44            101st   1
45            101st   2
48            101st   3
49            101st   4
50            101st   5.5
50            106th   5.5
51            113th   7
52            101st   8.5
52            101st   8.5
53            106th   10
54            106th   11
55            106th   12
56            101st   13
57            101st   14
58            106th   15
59            113th   16
60            101st   17.5
60            113th   17.5
61            113th   19
62            106th   20.5
62            113th   20.5
63            113th   22
64            106th   23
65            106th   24.5
65            113th   24.5
66            106th   26
67            113th   27
69            113th   28
70            106th   29.5
70            113th   29.5
The sum of the ranks for each sample is as follows.
$$R_1 = 1 + 2 + 3 + 4 + 5.5 + 8.5 + 8.5 + 13 + 14 + 17.5 = 77$$
$$R_2 = 5.5 + 10 + 11 + 12 + 15 + 20.5 + 23 + 24.5 + 26 + 29.5 = 177$$
$$R_3 = 7 + 16 + 17.5 + 19 + 20.5 + 22 + 24.5 + 27 + 28 + 29.5 = 211$$
Using these sums and the values $n_1 = 10$, $n_2 = 10$, $n_3 = 10$, and $N = 30$, the test statistic is
$$H = \frac{12}{30(30 + 1)} \left( \frac{77^2}{10} + \frac{177^2}{10} + \frac{211^2}{10} \right) - 3(30 + 1) \approx 12.521.$$
The figure shows the location of the rejection region and the test statistic $H$. Because $H$ is in the rejection region, you reject the null hypothesis.

[Figure: chi-square distribution with the rejection region $\chi^2 > \chi^2_0 = 9.210$ of area $\alpha = 0.01$ and the test statistic $H \approx 12.521$.]

Interpretation There is enough evidence at the 1% level of significance to support the claim that the distribution of the numbers of crimes reported in at least one precinct is different from the others.

TRY IT YOURSELF 1
You want to compare the salaries of veterinarians who work in Texas, Florida, and California. To compare the salaries, you randomly select several veterinarians in each state and record their salaries. The table shows the salaries (in thousands of dollars). At $\alpha = 0.05$, can you conclude that the distribution of the veterinarians’ salaries in at least one state is different from the others? (Adapted from U.S. Bureau of Labor Statistics)

Sample salaries (in thousands of dollars)
TX (Sample 1)  FL (Sample 2)  CA (Sample 3)
85.3 143.3 111.3 149.9 135.9 83.4 97.9 121.6 126.8 91.0 80.4 146.1 89.6 116.6 154.0 147.7 106.7 160.2 63.3 84.7 57.6 74.8 95.0 113.2 118.7 105.3 131.0 101.1

Answer: Page T1

Picturing the World
The randomly collected data below were used to compare the water temperatures (in degrees Fahrenheit) of cities bordering the Gulf of Mexico. (Adapted from National Oceanographic Data Center)

Cedar Key, FL (Sample 1)  Eugene Island, LA (Sample 2)  Dauphin Island, AL (Sample 3)
62 51 63 69 55 51 77 57 54 59 63 60 60 74 75 75 82 80 83 85 70 65 60 78 79 64 82 86 76 84 82 83 86

At $\alpha = 0.05$, can you conclude that at least one temperature distribution is different from the others?
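The Kruskal-Wallis calculation from Example 1 (the police precinct data) can be reproduced with a short Python sketch. It pools the samples, assigns average ranks to ties, sums the ranks per sample, and evaluates the formula for $H$ exactly as stated in this section. The helper names `average_ranks` and `kruskal_wallis_h` are hypothetical, and the sketch uses only the standard library.

```python
def average_ranks(values):
    """Rank values from smallest to largest; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = ((i + 1) + (j + 1)) / 2
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(samples):
    """Return (list of rank sums R_i, test statistic H) for a list of samples."""
    combined = [x for sample in samples for x in sample]
    ranks = average_ranks(combined)
    total_n = len(combined)
    rank_sums = []
    weighted = 0.0
    start = 0
    for sample in samples:
        r_i = sum(ranks[start:start + len(sample)])  # ranks of this sample
        rank_sums.append(r_i)
        weighted += r_i ** 2 / len(sample)           # R_i^2 / n_i
        start += len(sample)
    h = 12 / (total_n * (total_n + 1)) * weighted - 3 * (total_n + 1)
    return rank_sums, h


# Weekly crime counts from Example 1.
precinct_101 = [60, 52, 49, 52, 50, 48, 57, 45, 44, 56]
precinct_106 = [65, 55, 64, 66, 53, 58, 50, 54, 70, 62]
precinct_113 = [69, 51, 70, 61, 67, 65, 62, 59, 60, 63]
rank_sums, h = kruskal_wallis_h([precinct_101, precinct_106, precinct_113])
print(rank_sums, round(h, 3))  # [77.0, 177.0, 211.0] 12.521
```

Since $H \approx 12.521$ exceeds the critical value 9.210 from Table 6, the sketch matches the example's decision to reject $H_0$.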
11.3 EXERCISES
For Extra Help: MyLab Statistics

Building Basic Skills and Vocabulary
1. What are the conditions for using a Kruskal-Wallis test?
2. Explain why the Kruskal-Wallis test is always a right-tailed test.

Using and Interpreting Concepts
Performing a Kruskal-Wallis Test In Exercises 3–6, (a) identify the claim and state $H_0$ and $H_a$, (b) find the critical value and identify the rejection region, (c) find the test statistic $H$, (d) decide whether to reject or fail to reject the null hypothesis, and (e) interpret the decision in the context of the original claim.

3. Home Insurance The table shows the annual premiums for a random sample of home insurance policies in Connecticut, Massachusetts, and Virginia. At $\alpha = 0.05$, can you conclude that the distribution of the annual premiums in at least one state is different from the others? (Adapted from National Association of Insurance Commissioners)

State: Annual premium (in dollars)
Connecticut:    1303  1098  1263  1413  1538  1179  1320
Massachusetts:  1382  1302  1257  1572  1387  1166  1034
Virginia:       1035  950   766   845   1132  838   755

4. Hourly Rates A researcher wants to determine whether there is a difference in the hourly pay rates for registered nurses in Indiana, Kentucky, and Ohio. The researcher randomly selects several registered nurses in each state and records the hourly pay rate for each. The table shows the results. At $\alpha = 0.05$, can the researcher conclude that the distribution of the hourly pay rates of registered nurses in at least one state is different from the others? (Adapted from U.S. Bureau of Labor Statistics)

State: Hourly pay rate (in dollars)
Indiana:   28.83  29.28  27.68  28.43  31.27  26.13  30.47
Kentucky:  27.77  26.40  28.92  31.02  29.37  32.42  25.42
Ohio:      27.84  32.24  33.64  33.91  27.34  29.89

5. Annual Salaries The table shows the annual salaries for a random sample of private industry workers in Kentucky, North Carolina, South Carolina, and West Virginia. At $\alpha = 0.10$, can you conclude that the distribution of the annual salaries of private industry workers in at least one state is different from the others? (Adapted from U.S. Bureau of Labor Statistics)

State: Annual salary (in thousands of dollars)
Kentucky:        39.9  41.6  50.5  62.1  38.3  32.9  39.9
North Carolina:  48.8  47.2  41.9  59.6  40.8  44.9  48.8
South Carolina:  35.4  43.0  49.1  48.5  40.3  41.7  35.4
West Virginia:   34.8  45.9  36.6  45.1  50.3  38.1  34.8
6. Caffeine Content The table shows the amounts of caffeine (in milligrams) in 16-ounce servings for a random sample of beverages. At $\alpha = 0.01$, can you conclude that the distribution of the amounts of caffeine in at least one beverage is different from the others? (Adapted from Center for Science in the Public Interest)

Beverage: Amount of caffeine in 16-ounce serving (in milligrams)
Coffees:        320  300  206  150  266
Soft drinks:    95   96   56   51   71   72   47
Energy drinks:  200  141  160  152  154  166
Teas:           100  106  42   15   32   10

Extending Concepts
Comparing Two Tests In Exercises 7 and 8, (a) perform a Kruskal-Wallis test, (b) perform a one-way ANOVA test, assuming that each population is normally distributed and the population variances are equal, and (c) compare the results.

7. Hospital Patient Stays An insurance underwriter claims that the number of days patients spend in the hospital is different in at least one region of the United States. The table shows the numbers of days randomly selected patients spent in the hospital in four U.S. regions. At $\alpha = 0.01$, can you support the underwriter’s claim? (Adapted from U.S. National Center for Health Statistics)

Region: Number of days
Northeast:  8  6  6  3  5  11  3  8  1  6
Midwest:    5  4  3  9  1  4   6  3  4  7
South:      5  8  1  5  8  7   5  1
West:       2  3  6  6  5  4   3  6  5

8. Energy Consumption The table shows the energy consumed (in millions of Btu) in one year for a random sample of households from four U.S. regions. At $\alpha = 0.01$, can you conclude that the energy consumed is different in at least one region? (Adapted from U.S. Energy Information Administration)

Region: Energy consumed (in millions of Btu)
Northeast:  61  95   140  127  93   97   84   123  89  163
Midwest:    59  158  169  140  95   187  123  104  88  37  72
South:      86  35   67   86   142  69   65   62
West:       81  39   85   35   113  46   125  70   77  63
11.4 Rank Correlation

What You Should Learn
How to use the Spearman rank correlation coefficient to determine whether the correlation between two variables is significant

The Spearman Rank Correlation Coefficient

In Section 9.1, you learned how to measure the strength of the relationship between two variables using the Pearson correlation coefficient $r$. Two requirements for the Pearson correlation coefficient are that the variables are linearly related and that the variables have a bivariate normal distribution. When these requirements cannot be met, you can examine the relationship between two variables using the nonparametric equivalent to the Pearson correlation coefficient: the Spearman rank correlation coefficient.

The Spearman rank correlation coefficient has several advantages over the Pearson correlation coefficient. For instance, it can be used to describe the relationship between linear or nonlinear data. It can be used for data at the ordinal level. And it is easier to calculate by hand than the Pearson correlation coefficient.

DEFINITION
The Spearman rank correlation coefficient $r_s$ is a measure of the strength of the relationship between two variables. The Spearman rank correlation coefficient is calculated using the ranks of paired sample data entries. If there are no ties in the ranks of either variable, then the formula for the Spearman rank correlation coefficient is

$$r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$$

where $n$ is the number of paired data entries and $d$ is the difference between the ranks of a paired data entry. If there are ties in the ranks and the number of ties is small relative to the number of data pairs, then the formula can still be used to approximate $r_s$.

The values of $r_s$ range from $-1$ to $1$, inclusive. When the ranks of corresponding data pairs are exactly identical, $r_s$ is equal to $1$. When the ranks are in “reverse” order, $r_s$ is equal to $-1$. When the ranks of corresponding data pairs have no relationship, $r_s$ is equal to $0$.

After calculating the Spearman rank correlation coefficient, you can determine whether the correlation between the variables is significant. You can make this determination by performing a hypothesis test for the population correlation coefficient $\rho_s$. The null and alternative hypotheses for this test are listed below.

$H_0$: $\rho_s = 0$ (There is no correlation between the variables.)
$H_a$: $\rho_s \neq 0$ (There is a significant correlation between the variables.)

Table 10 in Appendix B lists the critical values for the Spearman rank correlation coefficient for selected levels of significance and sample sizes. The test statistic for the hypothesis test is the Spearman rank correlation coefficient $r_s$.
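The definition above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the text: the function names `average_ranks` and `spearman_rs` and the sample data are hypothetical. Tied entries receive the average of the ranks they occupy, and the formula is then applied directly, so with ties the result is the approximation the definition describes.

```python
def average_ranks(values):
    """Rank values from smallest (rank 1) to largest.

    Tied entries all receive the average of the ranks they occupy.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Find the end (j) of the block of entries tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Apply r_s = 1 - 6 * sum(d^2) / (n * (n^2 - 1)) to paired data."""
    n = len(x)
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical paired data: identical rank order gives 1, reversed order gives -1.
print(spearman_rs([1, 2, 3, 4], [10, 20, 30, 40]))  # 1.0
print(spearman_rs([1, 2, 3, 4], [40, 30, 20, 10]))  # -1.0
```

Comparing the absolute value of the result with the critical value from Table 10 then completes the hypothesis test; the table lookup itself is not reproduced here.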
GUIDELINES
Testing the Significance of the Spearman Rank Correlation Coefficient

In Words / In Symbols
1. Identify the claim. State the null and alternative hypotheses.
   In symbols: State $H_0$ and $H_a$.
2. Specify the level of significance.
   In symbols: Identify $\alpha$.
3. Determine the critical value.
   In symbols: Use Table 10 in Appendix B.
4. Find the test statistic.
   In symbols: $r_s = 1 - \dfrac{6 \sum d^2}{n(n^2 - 1)}$
5. Make a decision to reject or fail to reject the null hypothesis.
   In symbols: If $|r_s|$ is greater than the critical value, then reject $H_0$. Otherwise, fail to reject $H_0$.
6. Interpret the decision in the context of the original claim.

EXAMPLE 1
The Spearman Rank Correlation Coefficient

The table shows the school enrollments of males and females for a random sample of 10 colleges. At $\alpha = 0.05$, can you conclude that there is a significant correlation between the number of males and the number of females enrolled at a college?

Male    Female
1786    2182
4246    4415
1419    1537
1188    1236
2394    2182
1079    919
4049    4209
3595    3741
1102    1086
1345    1282

SOLUTION
The claim is “there is a significant correlation between the number of males and the number of females enrolled at a college.” The null and alternative hypotheses are listed below.

$H_0$: $\rho_s = 0$ (There is no correlation between the number of males and the number of females enrolled at a college.)
$H_a$: $\rho_s \neq 0$ (There is a significant correlation between the number of males and the number of females enrolled at a college.) (Claim)
Each data set has 10 entries. Because $\alpha = 0.05$ and $n = 10$, the critical value is 0.648. Before calculating the test statistic, you must find $\sum d^2$, the sum of the squares of the differences of the ranks of the data sets. You can use a table to calculate $\sum d^2$, as shown below.

Male    Rank    Female    Rank    d       d^2
1786    6       2182      6.5     -0.5    0.25
4246    10      4415      10      0       0
1419    5       1537      5       0       0
1188    3       1236      3       0       0
2394    7       2182      6.5     0.5     0.25
1079    1       919       1       0       0
4049    9       4209      9       0       0
3595    8       3741      8       0       0
1102    2       1086      2       0       0
1345    4       1282      4       0       0
                                          $\sum d^2 = 0.5$

When $n = 10$ and $\sum d^2 = 0.5$, the test statistic is

$$r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)} = 1 - \frac{6(0.5)}{10(10^2 - 1)} \approx 0.997.$$

Because $|r_s| \approx 0.997 > 0.648$, you reject the null hypothesis.

Interpretation There is enough evidence at the 5% level of significance to conclude that there is a significant correlation between the number of males and the number of females enrolled at a college.

TRY IT YOURSELF 1
The table shows the prices (in dollars per bushel) received for oat and wheat for a random sample of seven U.S. farmers. At $\alpha = 0.10$, can you conclude that there is a significant correlation between the oat and wheat prices? (Adapted from U.S. Department of Agriculture)

Oat     Wheat
1.84    3.67
1.97    3.49
2.03    3.68
2.25    3.88
2.35    3.91
2.31    4.02
2.40    4.15

Answer: Page T1

Picturing the World
The table shows the retail prices (in dollars per pound) for ground beef and fresh whole chicken for a random sample of nine U.S. grocery stores. (Adapted from U.S. Bureau of Labor Statistics)

Beef    Chicken
3.69    1.44
3.66    1.42
3.65    1.48
3.68    1.50
3.60    1.47
3.55    1.46
3.55    1.41
3.56    1.47
3.59    1.46

Does a significant correlation exist between ground beef and chicken prices in U.S. grocery stores? Use $\alpha = 0.10$.

Study Tip
Remember that in the case of a tie between data entries, use the average of the corresponding ranks.
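The arithmetic in Example 1 can be checked with a short Python sketch. The ranks below are copied from the example's table (the two tied female enrollments share the average rank 6.5), and 0.648 is the critical value the example reads from Table 10; the variable names are only illustrative.

```python
# Ranks copied from the Example 1 table (ties share the average rank 6.5).
male_ranks = [6, 10, 5, 3, 7, 1, 9, 8, 2, 4]
female_ranks = [6.5, 10, 5, 3, 6.5, 1, 9, 8, 2, 4]

n = len(male_ranks)
sum_d2 = sum((m - f) ** 2 for m, f in zip(male_ranks, female_ranks))
r_s = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

critical_value = 0.648  # quoted in the example for alpha = 0.05, n = 10
reject_h0 = abs(r_s) > critical_value

print(sum_d2)         # 0.5
print(round(r_s, 3))  # 0.997
print(reject_h0)      # True
```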
11.4 EXERCISES

For Extra Help: MyLab Statistics

Building Basic Skills and Vocabulary

1. What are some advantages of the Spearman rank correlation coefficient over the Pearson correlation coefficient?
2. Describe the ranges of the Spearman rank correlation coefficient and the Pearson correlation coefficient.
3. What does it mean when $r_s$ is equal to $1$? What does it mean when $r_s$ is equal to $-1$? What does it mean when $r_s$ is equal to $0$?
4. Explain, in your own words, what $r_s$ and $\rho_s$ represent in Example 1.

Using and Interpreting Concepts

Testing a Claim In Exercises 5–8, (a) identify the claim and state $H_0$ and $H_a$, (b) find the critical value, (c) find the test statistic $r_s$, (d) decide whether to reject or fail to reject the null hypothesis, and (e) interpret the decision in the context of the original claim.

5. Farming Expenses In an agricultural report, a commodities analyst claims that there is a significant correlation between purchased seed expenses and fertilizer and lime expenses in the farming business. The table shows the total purchased seed expenses and fertilizer and lime expenses for farms in eight randomly selected states for a recent year. At $\alpha = 0.05$, is there enough evidence to support the analyst’s claim? (Source: U.S. Department of Agriculture)

State             Purchased seed expenses      Fertilizer and lime expenses
                  (in millions of dollars)     (in millions of dollars)
Arkansas          490                          480
California        1530                         2060
Florida           490                          480
Kentucky          266                          402
Michigan          741                          642
North Carolina    380                          470
Ohio              879                          858
Washington        360                          560

6. Exercise Machines The table shows the overall scores and the prices for a random sample of nine different models of elliptical exercise machines. The overall score represents the ergonomics, exercise range, ease of use, construction, heart-rate monitoring, and safety. At $\alpha = 0.05$, can you conclude that there is a significant correlation between the overall score and the price? (Source: Consumer Reports)

Overall score         77    75    73    71   66    66    64    62    58
Price (in dollars)    3700  1700  1300  900  1000  1400  1800  1000  700
7. Crop Prices The table shows the prices (in dollars per bushel) received for barley and corn for a random sample of nine U.S. farmers. At $\alpha = 0.05$, can you conclude that there is a significant correlation between the barley and corn prices? (Adapted from U.S. Department of Agriculture)

Barley    4.89  4.52  4.85  4.97  5.12  4.91  5.08  4.98  4.87
Corn      3.21  3.22  3.29  3.23  3.33  3.40  3.44  3.49  3.43

8. Vacuum Cleaners The table shows the overall scores and the prices for a random sample of 12 different models of vacuum cleaners. The overall score represents cleaning, airflow, handling, noise, and emissions. At $\alpha = 0.10$, can you conclude that there is a significant correlation between the overall score and the price? (Source: Consumer Reports)

Overall score         65   71   69   47   55   38  47  47   47   57   34   65
Price (in dollars)    150  200  550  350  470  90  80  130  210  190  300  260

Test Scores and GNI In Exercises 9–12, use the table below. The table shows the average achievement scores of 15-year-olds in science and mathematics along with the gross national incomes (GNI) of nine randomly selected countries for a recent year. (The GNI is a measure of the total value of goods and services produced by the economy of a country.) (Source: Organization for Economic Cooperation and Development; The World Bank)

Country          Science average    Mathematics average    GNI (in billions of dollars)
Canada           528                516                    1,529
France           495                493                    2,458
Germany          509                506                    3,437
Italy            481                490                    1,815
Japan            538                532                    4,549
Mexico           416                408                    1,143
Spain            493                486                    1,192
Sweden           493                494                    503
United States    496                470                    18,496

9. Science and GNI At $\alpha = 0.10$, can you conclude that there is a significant correlation between science achievement scores and GNI?
10. Math and GNI At $\alpha = 0.10$, can you conclude that there is a significant correlation between mathematics achievement scores and GNI?
11. Science and Math At $\alpha = 0.10$, can you conclude that there is a significant correlation between science and mathematics achievement scores?
12. Writing a Summary Use the results from Exercises 9–11 to write a summary about the correlation (or lack of correlation) between test scores and GNI.
Extending Concepts

Testing the Spearman Rank Correlation Coefficient for $n > 30$ When you are testing the significance of the Spearman rank correlation coefficient and the sample size $n$ is greater than 30, you can use the expression below to find the critical value.

$$\pm \frac{z}{\sqrt{n - 1}}, \quad z \text{ corresponds to the level of significance}$$

In Exercises 13 and 14, test the Spearman rank correlation coefficient.

13. Work Injuries The table shows the average hours worked per week and the numbers of on-the-job injuries for a random sample of U.S. companies in a recent year. At $\alpha = 0.10$, can you conclude that there is a significant correlation between average hours worked and the number of on-the-job injuries?

Hours worked    46  43  41  40  41  42  45  45  42  45  44  44
Injuries        22  25  18  17  20  22  28  29  24  26  26  25

Hours worked    45  46  47  47  46  46  49  50  50  42  41  42
Injuries        27  29  29  30  29  29  30  30  30  23  22  23

Hours worked    41  41  41  41  40  39  38  39  39
Injuries        21  19  18  18  17  16  16  16  16

14. Work Injuries in Construction The table shows the average hours worked per week and the numbers of on-the-job injuries for a random sample of U.S. construction companies in a recent year. At $\alpha = 0.05$, can you conclude that there is a significant correlation between average hours worked and the number of on-the-job injuries?

Hours worked    38  38  37  38  38  40  39  39  39  40  39  41
Injuries        11  11  9   10  10  17  15  14  14  16  15  17

Hours worked    41  42  41  41  41  42  42  42  42  41  41  39
Injuries        17  21  18  18  18  22  21  19  21  18  17  12

Hours worked    38  38  39  39  36  37  36  37  37  37  37
Injuries        12  11  13  12  6   6   6   6   7   8   7
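The large-sample critical value introduced at the start of this exercise set can be sketched in Python as follows. This is a hedged illustration: the function name is hypothetical, the sample size 40 is made up, and the sketch assumes a two-tailed test (consistent with an alternative hypothesis of the form $\rho_s \neq 0$), so $z$ is taken as the standard normal value with area $\alpha/2$ in each tail.

```python
from math import sqrt
from statistics import NormalDist


def spearman_critical_value_large_n(n, alpha):
    """Positive critical value z / sqrt(n - 1) for n > 30 (two-tailed test assumed)."""
    if n <= 30:
        raise ValueError("this approximation is intended for n > 30")
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z with area alpha/2 in the upper tail
    return z / sqrt(n - 1)


# Hypothetical sample size, used only to show the call.
crit = spearman_critical_value_large_n(40, 0.10)
print(f"reject H0 when |r_s| > {crit:.3f}")
```

The critical values are then the positive and negative of the returned number, and the null hypothesis is rejected when $|r_s|$ exceeds it.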
11.5 The Runs Test

What You Should Learn
How to use the runs test to determine whether a data set is random

The Runs Test for Randomness

In obtaining a sample of data, it is important for the data to be selected randomly. But how do you know whether the sample data are truly random? One way to test for randomness in a data set is to use a runs test for randomness. Before using a runs test for randomness, you must first know how to determine the number of runs in a data set.

DEFINITION
A run is a sequence of data having the same characteristic. Each run is preceded by and followed by data with a different characteristic or by no data at all. The number of data in a run is called the length of the run.

EXAMPLE 1
Finding the Number of Runs

A liquid-dispensing machine has been designed to fill one-liter bottles. A quality control inspector decides whether each bottle is filled to an acceptable level and passes inspection (P) or fails inspection (F). Determine the number of runs for each sequence and find the length of each run.

1. P P P P P P P P F F F F F F F F
2. P F P F P F P F P F P F P F P F
3. P P F F F F P F F F P P P P P P

SOLUTION
In the sequences below, a vertical bar separates consecutive runs.

1. There are two runs. The first 8 P’s form a run of length 8 and the 8 F’s that follow form another run of length 8.
   P P P P P P P P | F F F F F F F F
2. There are 16 runs, each of length 1.
   P | F | P | F | P | F | P | F | P | F | P | F | P | F | P | F
3. There are 5 runs, the first of length 2, the second of length 4, the third of length 1, the fourth of length 3, and the fifth of length 6.
   P P | F F F F | P | F F F | P P P P P P
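Counting runs as in Example 1 is easy to automate. The sketch below is one possible approach, not something from the text: it uses Python's `itertools.groupby`, which groups consecutive equal items, and the function name `find_runs` is hypothetical.

```python
from itertools import groupby


def find_runs(sequence):
    """Return a (symbol, length) pair for each run, in order of appearance."""
    return [(symbol, sum(1 for _ in group)) for symbol, group in groupby(sequence)]


# Sequence 3 from Example 1, written without spaces.
runs = find_runs("PPFFFFPFFFPPPPPP")
print(len(runs))                       # 5
print([length for _, length in runs])  # [2, 4, 1, 3, 6]
```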
TRY IT YOURSELF 1
A machine produces engine parts. An inspector measures the diameter of each engine part and determines whether the part passes inspection (P) or fails inspection (F). The results are shown below. Determine the number of runs in the sequence and find the length of each run.

P P P F P F P P P P F F P F P P F F F P P P F P P P

Answer: Page T1

When each value in a set of data can be categorized into one of two separate categories, you can use the runs test for randomness to determine whether the data are random.

DEFINITION
The runs test for randomness is a nonparametric test that can be used to determine whether a sequence of sample data is random.

The runs test for randomness considers the number of runs in a sequence of sample data in order to test whether a sequence is random. When a sequence has too few or too many runs, it is usually not random. For instance, the sequence

P P P P P P P P F F F F F F F F

from Example 1, part 1, has too few runs (only 2 runs). The sequence

P F P F P F P F P F P F P F P F

from Example 1, part 2, has too many runs (16 runs). So, these sample data are probably not random.

You can use a hypothesis test to determine whether the number of runs in a sequence of sample data is too high or too low. The runs test is a two-tailed test, and the null and alternative hypotheses are listed below.

$H_0$: The sequence of data is random.
$H_a$: The sequence of data is not random.

When using the runs test, let $n_1$ represent the number of data that have one characteristic and let $n_2$ represent the number of data that have the second characteristic. It does not matter which characteristic you choose to be represented by $n_1$. Let $G$ represent the number of runs.

$n_1$ = number of data with one characteristic
$n_2$ = number of data with the other characteristic
$G$ = number of runs

Table 12 in Appendix B lists the critical values for the runs test for selected values of $n_1$ and $n_2$ at the $\alpha = 0.05$ level of significance. (In this text, you will use only the $\alpha = 0.05$ level of significance when performing runs tests.) When $n_1$ or $n_2$ is greater than 20, you can use the standard normal distribution to find the critical values.
You can calculate the test statistic for the runs test as follows.

Test Statistic for the Runs Test
When $n_1 \le 20$ and $n_2 \le 20$, the test statistic for the runs test is $G$, the number of runs.
When $n_1 > 20$ or $n_2 > 20$, the test statistic for the runs test is

$$z = \frac{G - \mu_G}{\sigma_G}$$

where

$$\mu_G = \frac{2 n_1 n_2}{n_1 + n_2} + 1 \quad \text{and} \quad \sigma_G = \sqrt{\frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}}.$$

GUIDELINES
Performing a Runs Test for Randomness

In Words / In Symbols
1. Identify the claim. State the null and alternative hypotheses.
   In symbols: State $H_0$ and $H_a$.
2. Specify the level of significance.
   In symbols: Identify $\alpha$. (Use $\alpha = 0.05$ for the runs test.)
3. Determine the number of data that have each characteristic and the number of runs.
   In symbols: Determine $n_1$, $n_2$, and $G$.
4. Determine the critical values.
   In symbols: When $n_1 \le 20$ and $n_2 \le 20$, use Table 12 in Appendix B. When $n_1 > 20$ or $n_2 > 20$, use Table 4 in Appendix B.
5. Find the test statistic.
   In symbols: When $n_1 \le 20$ and $n_2 \le 20$, use $G$. When $n_1 > 20$ or $n_2 > 20$, use $z = \dfrac{G - \mu_G}{\sigma_G}$.
6. Make a decision to reject or fail to reject the null hypothesis.
   In symbols: If $G$ is less than or equal to the lower critical value or greater than or equal to the upper critical value, then reject $H_0$. Otherwise, fail to reject $H_0$. Or, if $z$ is in the rejection region, then reject $H_0$. Otherwise, fail to reject $H_0$.
7. Interpret the decision in the context of the original claim.
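For the large-sample case ($n_1 > 20$ or $n_2 > 20$), the test statistic above can be sketched in Python. The function name `runs_test_z` and the counts passed to it are hypothetical and serve only to show the calculation; the cutoff 1.96 is the two-tailed standard normal critical value for $\alpha = 0.05$.

```python
from math import sqrt


def runs_test_z(n1, n2, G):
    """Return (mu_G, sigma_G, z) for the large-sample runs test."""
    total = n1 + n2
    mu_g = 2 * n1 * n2 / total + 1
    sigma_g = sqrt(
        2 * n1 * n2 * (2 * n1 * n2 - n1 - n2) / (total ** 2 * (total - 1))
    )
    z = (G - mu_g) / sigma_g
    return mu_g, sigma_g, z


# Hypothetical counts: 25 of one kind, 30 of the other, 20 runs observed.
mu_g, sigma_g, z = runs_test_z(n1=25, n2=30, G=20)
reject_h0 = abs(z) > 1.96  # two-tailed test at alpha = 0.05
print(round(mu_g, 2), round(sigma_g, 2), round(z, 2), reject_h0)
```

Because the formulas are symmetric in $n_1$ and $n_2$, swapping the two counts gives the same result, which matches the remark that it does not matter which characteristic is labeled $n_1$.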
EXAMPLE 2
Using the Runs Test

As people enter a concert, an usher records where they are sitting. The results for 13 people are shown, where L represents a lawn seat and P represents a pavilion seat. At $\alpha = 0.05$, can you conclude that the sequence of seat locations is not random?

L L L P P L P P P L L P L

SOLUTION
The claim is “the sequence of seat locations is not random.” To test this claim, use the null and alternative hypotheses below.

$H_0$: The sequence of seat locations is random.
$H_a$: The sequence of seat locations is not random. (Claim)

To find the critical values, first determine $n_1$, the number of L’s; $n_2$, the number of P’s; and $G$, the number of runs. In the sequence below, a vertical bar separates consecutive runs (1st through 7th).

L L L | P P | L | P P P | L L | P | L

$n_1$ = number of L’s = 7
$n_2$ = number of P’s = 6
$G$ = number of runs = 7

Because $n_1 \le 20$, $n_2 \le 20$, and $\alpha = 0.05$, use Table 12 to find the lower critical value 3 and the upper critical value 12. The test statistic is the number of runs $G = 7$. Because the test statistic $G$ is between the critical values 3 and 12, you fail to reject the null hypothesis.

Interpretation There is not enough evidence at the 5% level of significance to support the claim that the sequence of seat locations is not random. So, it appears that the sequence of seat locations is random.

TRY IT YOURSELF 2
The genders of 15 students as they enter a classroom are shown below, where F represents a female and M represents a male. At $\alpha = 0.05$, can you conclude that the sequence of genders is not random?

M F F F M M F F M F M M F F F

Answer: Page T1
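The counts and the decision in Example 2 can be reproduced with a brief Python sketch. The critical values 3 and 12 are the ones the example reads from Table 12; they are hard-coded here because the table itself is not reproduced, and the variable names are only illustrative.

```python
from itertools import groupby

seats = "LLLPPLPPPLLPL"  # sequence from Example 2, written without spaces

n1 = seats.count("L")              # number of lawn seats
n2 = seats.count("P")              # number of pavilion seats
G = sum(1 for _ in groupby(seats))  # each group of equal neighbors is one run

lower, upper = 3, 12  # critical values quoted in the example (Table 12)
reject_h0 = G <= lower or G >= upper

print(n1, n2, G)  # 7 6 7
print(reject_h0)  # False
```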
EXAMPLE 3
Using the Runs Test

You want to determine whether the selection of recently hired employees in a large company is random with respect to gender. The genders of 36 recently hired employees are shown below, where F represents a female and M represents a male. At $\alpha = 0.05$, can you conclude that the sequence of employees is not random?

M M F F F F M M M M M M
F F F F F M M M M M M M
F F F M M M M F M M F M

SOLUTION
The claim is “the sequence of employees is not random.” To test this claim, use the null and alternative hypotheses below.

$H_0$: The sequence of employees is random.
$H_a$: The sequence of employees is not random. (Claim)

To find the critical values, first determine $n_1$, the number of F’s; $n_2$, the number of M’s; and $G$, the number of runs. In the sequence below, a vertical bar separates consecutive runs (1st through 11th).

M M | F F F F | M M M M M M | F F F F F | M M M M M M M | F F F | M M M M | F | M M | F | M

$n_1$ = number of F’s = 14
$n_2$ = number of M’s = 22
$G$ = number of runs = 11

Because $n_2 > 20$, use Table 4 in Appendix B to find the critical values. Because the test is a two-tailed test with $\alpha = 0.05$, the critical values are $-z_0 = -1.96$ and $z_0 = 1.96$. Before calculating the test statistic, find the values of $\mu_G$ and $\sigma_G$, as follows.

$$\mu_G = \frac{2 n_1 n_2}{n_1 + n_2} + 1 = \frac{2(14)(22)}{14 + 22} + 1 = \frac{616}{36} + 1 \approx 18.11$$
$$\sigma_G = \sqrt{\frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}} = \sqrt{\frac{2(14)(22)\,[2(14)(22) - 14 - 22]}{(14 + 22)^2 (14 + 22 - 1)}} \approx 2.81$$

You can find the test statistic as follows.

$$z = \frac{G - \mu_G}{\sigma_G} \approx \frac{11 - 18.11}{2.81} \approx -2.53$$

The figure shows the location of the rejection regions and the test statistic $z$. Because $z$ is in the rejection region, you reject the null hypothesis.

[Figure: standard normal curve with critical values $-z_0 = -1.96$ and $z_0 = 1.96$, a rejection region of area $\tfrac{1}{2}\alpha = 0.025$ in each tail, a central area of $1 - \alpha = 0.95$, and the test statistic $z \approx -2.53$ in the left rejection region.]

Interpretation There is enough evidence at the 5% level of significance to support the claim that the sequence of employees with respect to gender is not random.

TRY IT YOURSELF 3
Let S represent a day in a small town in which it snowed and let N represent a day in the same town in which it did not snow. The snowfall results for the entire month of January are shown below. At $\alpha = 0.05$, can you conclude that the sequence is not random?

N N N S S N N S N S N N N N N S N S N S N N S N S S N N N N N

Answer: Page T1

When $n_1$ or $n_2$ is greater than 20, you can also use a $P$-value to perform a hypothesis test for the randomness of the data. In Example 3, you can calculate the $P$-value to be 0.0114. Because $P < \alpha$, you reject the null hypothesis.

Picturing the World
The sequence shows the National Football League conference of each winning team for the first 51 Super Bowls, where A represents the American Football Conference and N represents the National Football Conference. (Source: National Football League)

N N A A A N A A A A A N A A A N N A N N N N N N N N N N N N N A A N A A N A A A A N A N N N A N A A A

At $\alpha = 0.05$, can you conclude that the sequence of conferences of Super Bowl winning teams is random?
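The $P$-value quoted for Example 3 can be reproduced with Python's standard library. This sketch starts from the rounded test statistic $z \approx -2.53$ reported in the example and doubles the tail area because the runs test is two-tailed; starting from the unrounded statistic would give a slightly different value.

```python
from statistics import NormalDist

z = -2.53     # rounded test statistic from Example 3
alpha = 0.05

# Two-tailed P-value: twice the area beyond |z| under the standard normal curve.
p_value = 2 * NormalDist().cdf(-abs(z))

print(round(p_value, 4))  # 0.0114
print(p_value < alpha)    # True, so the null hypothesis is rejected
```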
11.5 EXERCISES

For Extra Help: MyLab Statistics

Building Basic Skills and Vocabulary

1. In your own words, explain why the hypothesis test discussed in this section is called the runs test.
2. Describe the test statistic for the runs test when the sample sizes $n_1$ and $n_2$ are less than or equal to 20 and when either $n_1$ or $n_2$ is greater than 20.

Using and Interpreting Concepts

Finding the Number of Runs In Exercises 3–6, determine the number of runs in the sequence. Then find the length of each run.

3. T F T F T T T F F F T F
4. U U D D U D U U D D U D U U
5. M F M F M F F F F F F M M M F F M M M M
6. A A A B B B A B B A A A A A A B A A B A B B

7. Find the values of $n_1$ and $n_2$ in Exercise 3.
8. Find the values of $n_1$ and $n_2$ in Exercise 4.
9. Find the values of $n_1$ and $n_2$ in Exercise 5.
10. Find the values of $n_1$ and $n_2$ in Exercise 6.

Finding Critical Values In Exercises 11–14, use the sequence and Table 12 in Appendix B to determine the number of runs that are considered too high and the number of runs that are considered too low for the data to be in random order.

11. T F T F T F T F T F T F
12. M F M M M M M M F F M M
13. N S S S N N N N N S N S N S S N N N
14. X X X X X X X Y Y Y Y Y Y Y Y Y Y Y Y Y Y

Performing a Runs Test In Exercises 15–20, (a) identify the claim and state $H_0$ and $H_a$, (b) find the critical values, (c) find the test statistic, (d) decide whether to reject or fail to reject the null hypothesis, and (e) interpret the decision in the context of the original claim. Use $\alpha = 0.05$.

15. Coin Toss A coach records the results of the coin toss at the beginning of each football game for a season. The results are shown, where H represents heads and T represents tails. The coach claimed the tosses were not random. Test the coach’s claim.

H T T T H T H H T T T T H T H H

16. Senate The sequence shows the majority party of the U.S. Senate after each election for a recent group of years, where R represents the Republican party and D represents the Democratic party. Can you conclude that the sequence is not random? (Source: U.S. Senate)

R D D D R R R R R R R D D D D D D D R D D R D D D D D D D D D D D D D R R R D D D D R R R D R R D D D D R R
17. Baseball The sequence shows the Major League Baseball league of each World Series winning team from 1969 to 2016, where N represents the National League and A represents the American League. Can you conclude that the sequence of leagues of World Series winning teams is not random? (Source: Major League Baseball)

N A N A A A N N A A N N N N A A A N A N A N A A A N A N A A A N A N A A N A N A N N N A N A N

18. Number Generator A number generator outputs the sequence of digits shown, where O represents an odd digit and E represents an even digit. Test the claim that the digits were not randomly generated.

O O O E E E E O O O O O E E E E O O E E E E O O O O E E E E O O

19. Dog Identifications A team of veterinarians record, in order, the genders of every dog that is microchipped at their pet hospital in one month. The genders of recently microchipped dogs are shown, where F represents a female and M represents a male. A veterinarian claims that the microchips are random by gender. Do you have enough evidence to reject the doctor’s claim?

M M F M F F F F F M M M F F F M F F F F F M F F F M F F F

20. Golf Tournament A golf tournament official records whether each past winner is American-born (A) or foreign-born (F). The results are shown for every year the tournament has existed. Can you conclude that the sequence is not random?

F F A F F A F F A F F A F F A F F A F F F F F F A F F A F F A F F A F F A F A F F A F F F F F A F F F F F A F F F A

Extending Concepts

Runs Test with Quantitative Data In Exercises 21–23, use the following information to perform a runs test. You can also use the runs test for randomness with quantitative data. First, calculate the median. Then assign a + sign to those values above the median and a − sign to those values below the median. Ignore any values that are equal to the median. Use $\alpha = 0.05$.

21. Daily High Temperatures The sequence shows the daily high temperatures (in degrees Fahrenheit) for a city during the month of July. Test the claim that the daily high temperatures do not occur randomly.

84 87 92 93 95 84 82 83 81 87 92 98 99 93 84 85 86 92 91 95 84 92 83 81 87 92 98 89 93 84 85

22. Exam Scores The sequence shows the exam scores of a class based on the order in which the students finished the test. Test the claim that the scores occur randomly.

83 94 80 76 92 89 65 75 82 87 90 91 81 99 97 72 72 89 90 92 87 76 74 66 88 81 90 92 89 76 80

23. Use technology to generate a sequence of 30 numbers from 1 to 99, inclusive. Test the claim that the sequence of numbers is not random.
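The conversion described for quantitative data (signs relative to the median, dropping values equal to the median) can be sketched in Python as follows. The function name `signs_about_median` and the seven data values are hypothetical and are not taken from any exercise.

```python
from itertools import groupby
from statistics import median


def signs_about_median(data):
    """Map each value to '+' (above the median) or '-' (below it).

    Values equal to the median are ignored, as the instructions describe.
    """
    m = median(data)
    return ["+" if value > m else "-" for value in data if value != m]


# Hypothetical data with median 6; the entry equal to 6 is dropped.
signs = signs_about_median([5, 8, 3, 9, 6, 2, 7])

n1 = signs.count("+")
n2 = signs.count("-")
G = sum(1 for _ in groupby(signs))  # number of runs in the sign sequence

print("".join(signs))  # -+-+-+
print(n1, n2, G)       # 3 3 6
```

From here, $n_1$, $n_2$, and $G$ are used exactly as in the runs test for categorical data.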
USES AND ABUSES
Statistics in the Real World

Uses

Nonparametric Tests Before you could perform many of the hypothesis tests you learned about in previous chapters, you had to ensure that certain conditions about the population were satisfied. For instance, before you could perform a $t$-test, you had to verify that the population was normally distributed or the sample size was at least 30. One advantage of the nonparametric tests shown in this chapter is that they are distribution free. That is, they do not require any particular information about the population or populations being tested. Another advantage of nonparametric tests is that they are easier to perform than their parametric counterparts. This means that they are easier to understand and quicker to use. Nonparametric tests can often be used when data are at the nominal or ordinal level.

Abuses

Insufficient Evidence Stronger evidence is needed to reject a null hypothesis in a nonparametric test than in a corresponding parametric test. That is, when you are trying to support a claim represented by the alternative hypothesis, you might need a larger sample when performing a nonparametric test. When the outcome of a nonparametric test results in failure to reject the null hypothesis, you should investigate the sample size used. It may be that a larger sample will produce different results.

Using an Inappropriate Test In general, when information about the population (such as the condition of normality) is known, it is more efficient to use a parametric test. When information about the population is not known, however, nonparametric tests can be helpful.

EXERCISES

1. Insufficient Evidence Give an example of a nonparametric test in which there is not enough evidence to reject the null hypothesis.
2. Using an Inappropriate Test Discuss the nonparametric tests described in this chapter and match each test with its parametric counterpart, which you studied in earlier chapters.
Chapter Summary 11

| What Did You Learn? | Example(s) | Review Exercises |
|---|---|---|
| Section 11.1 | | |
| How to use the sign test to test a population median: $z = \dfrac{(x + 0.5) - 0.5n}{\sqrt{n}/2}$ | 1, 2 | 1–3, 6 |
| How to use the paired-sample sign test to test the difference between two population medians (dependent samples) | 3 | 4, 5 |
| Section 11.2 | | |
| How to use the Wilcoxon signed-rank test and the Wilcoxon rank sum test to determine whether two samples are selected from populations having the same distribution: $z = \dfrac{R - \mu_R}{\sigma_R}$, $\mu_R = \dfrac{n_1(n_1 + n_2 + 1)}{2}$, $\sigma_R = \sqrt{\dfrac{n_1 n_2 (n_1 + n_2 + 1)}{12}}$ | 1, 2 | 7, 8 |
| Section 11.3 | | |
| How to use the Kruskal-Wallis test to determine whether three or more samples were selected from populations having the same distribution: $H = \dfrac{12}{N(N + 1)} \left( \dfrac{R_1^2}{n_1} + \dfrac{R_2^2}{n_2} + \cdots + \dfrac{R_k^2}{n_k} \right) - 3(N + 1)$ | 1 | 9, 10 |
| Section 11.4 | | |
| How to use the Spearman rank correlation coefficient to determine whether the correlation between two variables is significant: $r_s = 1 - \dfrac{6 \sum d^2}{n(n^2 - 1)}$ | 1 | 11, 12 |
| Section 11.5 | | |
| How to use the runs test to determine whether a data set is random: $G$ = number of runs, $z = \dfrac{G - \mu_G}{\sigma_G}$, $\mu_G = \dfrac{2 n_1 n_2}{n_1 + n_2} + 1$, $\sigma_G = \sqrt{\dfrac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}}$ | 1–3 | 13, 14 |

The table summarizes parametric and nonparametric tests. Always use the parametric test when the conditions for that test are satisfied.
| Test application | Parametric test | Nonparametric test |
|---|---|---|
| One-sample tests | z-test for a population mean; t-test for a population mean | Sign test for a population median |
| Two-sample tests, dependent samples | t-test for the difference between means | Paired-sample sign test; Wilcoxon signed-rank test |
| Two-sample tests, independent samples | z-test for the difference between means; t-test for the difference between means | Wilcoxon rank sum test |
| Tests involving three or more samples | One-way ANOVA | Kruskal-Wallis test |
| Correlation | Pearson correlation coefficient | Spearman rank correlation coefficient |
| Randomness | (No parametric test) | Runs test |
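As a rough illustration of the chapter-summary formulas that need only counts or a rank sum, the Python sketch below computes the large-sample z statistics for the sign test, the Wilcoxon rank sum test, and the runs test. The function names are illustrative choices, not the textbook's, and the inputs in the usage lines are made-up numbers. These are the large-sample forms only, so the sample-size conditions given in each section should be checked before relying on them.

```python
import math


def sign_test_z(x, n):
    """Sign test z (large n): x is the smaller number of + or - signs,
    n is the total number of + and - signs."""
    return ((x + 0.5) - 0.5 * n) / (math.sqrt(n) / 2)


def rank_sum_z(R, n1, n2):
    """Wilcoxon rank sum z: R is the sum of ranks for the smaller sample,
    n1 is the size of the smaller sample, n2 the size of the larger."""
    mu_R = n1 * (n1 + n2 + 1) / 2
    sigma_R = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (R - mu_R) / sigma_R


def count_runs(sequence):
    """G: the number of runs (maximal blocks of identical symbols)."""
    return sum(1 for i, s in enumerate(sequence)
               if i == 0 or s != sequence[i - 1])


def runs_test_z(G, n1, n2):
    """Runs test z: n1 and n2 are the counts of the two symbol types."""
    mu_G = 2 * n1 * n2 / (n1 + n2) + 1
    sigma_G = math.sqrt(
        2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
        / ((n1 + n2) ** 2 * (n1 + n2 - 1))
    )
    return (G - mu_G) / sigma_G


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to exercise the functions.
    print(round(sign_test_z(10, 30), 3))
    print(round(rank_sum_z(120, 10, 12), 3))
    print(count_runs("AABBBA"))
    print(round(runs_test_z(9, 22, 21), 3))
```

Each z value would then be compared with the critical value from the standard normal table, as in the worked examples of the corresponding section.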
Review Exercises 11

Section 11.1

In Exercises 1–6, use a sign test to test the claim by doing the following.
(a) Identify the claim and state $H_0$ and $H_a$.
(b) Find the critical value.
(c) Find the test statistic.
(d) Decide whether to reject or fail to reject the null hypothesis.
(e) Interpret the decision in the context of the original claim.

1. A store manager claims that the median number of customers per day is no more than 650. The numbers of customers per day for 17 randomly selected days are listed below. At α = 0.01, can you reject the manager’s claim?

   675 665 601 642 554 653 639 650 645 550 677 569 650 660 682 689 590

2. A company claims that the median credit score for U.S. adults is at least 710. The credit scores for 13 randomly selected U.S. adults are listed below. At α = 0.05, can you reject the company’s claim? (Adapted from Fair Isaac Corporation)

   750 782 805 695 700 706 625 589 690 772 745 704 710

3. A government agency claims that the median sentence length for all federal prisoners is 2 years. In a random sample of 180 federal prisoners, 65 have sentence lengths that are less than 2 years, 109 have sentence lengths that are more than 2 years, and 6 have sentence lengths that are 2 years. At α = 0.10, can you reject the agency’s claim? (Adapted from U.S. Sentencing Commission)

4. In a study testing the effects of calcium supplements on blood pressure in men, 10 randomly selected men were given a calcium supplement for 12 weeks. The table shows the measurements for each subject’s diastolic blood pressure taken before and after the 12-week treatment period. At α = 0.05, can you reject the claim that there was no reduction in diastolic blood pressure? (Adapted from the American Medical Association)

| Patient | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Before treatment | 107 | 110 | 123 | 129 | 112 | 111 | 107 | 112 | 136 | 102 |
| After treatment | 100 | 114 | 105 | 112 | 115 | 116 | 106 | 102 | 125 | 104 |
5. In a study testing the effects of an herbal supplement on blood pressure in men, 11 randomly selected men were given an herbal supplement for 12 weeks. The table shows the measurements for each subject’s diastolic blood pressure taken before and after the 12-week treatment period. At α = 0.05, can you reject the claim that there was no reduction in diastolic blood pressure? (Adapted from The Journal of the American Medical Association)

| Patient | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Before treatment | 123 | 109 | 112 | 102 | 98 | 114 | 119 | 112 | 110 | 117 | 130 |
| After treatment | 124 | 97 | 113 | 105 | 95 | 119 | 114 | 114 | 121 | 118 | 133 |

6. An association claims that the median annual salary of lawyers is $118,160. In a random sample of 125 lawyers, 76 were paid less than $118,160, and 49 were paid more than $118,160. At α = 0.05, can you reject the association’s claim? (Adapted from U.S. Bureau of Labor Statistics)

Section 11.2

In Exercises 7 and 8, use a Wilcoxon test to test the claim by doing the following.
(a) Identify the claim and state $H_0$ and $H_a$.
(b) Decide whether to use a Wilcoxon signed-rank test or a Wilcoxon rank sum test.
(c) Find the critical value(s).
(d) Find the test statistic.
(e) Decide whether to reject or fail to reject the null hypothesis.
(f) Interpret the decision in the context of the original claim.

7. A career placement advisor claims that there is a difference in the total times required to earn a doctorate degree by female and male graduate students. The table shows the total times (in years) to earn a doctorate for a random sample of 12 female and 12 male graduate students. At α = 0.01, can you support the advisor’s claim? (Adapted from Survey of Earned Doctorates)

   Female: 9 11 9 12 11 8 10 13 6 6 8 9
   Male: 8 7 8 10 9 7 7 9 10 8 9 7
8. A medical researcher claims that a new drug affects the number of headache hours experienced by headache sufferers. The numbers of headache hours (per day) experienced by eight randomly selected patients before and after taking the drug are shown in the table. At α = 0.05, can you support the researcher’s claim?

| Patient | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|
| Headache hours (before) | 0.9 | 2.3 | 2.7 | 2.4 | 2.9 | 1.9 | 1.2 | 3.1 |
| Headache hours (after) | 1.4 | 1.5 | 1.4 | 1.8 | 1.3 | 0.6 | 0.7 | 1.9 |

Section 11.3

In Exercises 9 and 10, use the Kruskal-Wallis test to test the claim by doing the following.
(a) Identify the claim and state $H_0$ and $H_a$.
(b) Find the critical value and identify the rejection region.
(c) Find the test statistic $H$.
(d) Decide whether to reject or fail to reject the null hypothesis.
(e) Interpret the decision in the context of the original claim.

9. The table shows the ages for a random sample of doctorate recipients in three fields of study. At α = 0.01, can you conclude that the distribution of the ages of the doctorate recipients in at least one field of study is different from the others? (Adapted from Survey of Earned Doctorates)

| Field of study | Age |
|---|---|
| Life sciences | 31 32 34 31 30 32 35 31 32 34 29 |
| Physical sciences | 30 31 32 31 30 29 31 30 32 33 30 |
| Social sciences | 32 35 31 33 34 31 35 36 32 30 33 |

10. The table shows the starting salaries for a random sample of college graduates in four fields of engineering. At α = 0.05, can you conclude that the distribution of the starting salaries in at least one field of engineering is different from the others? (Adapted from National Association of Colleges and Employers)

| Field of engineering | Starting salary (in thousands of dollars) |
|---|---|
| Chemical | 68.4 65.9 71.7 70.5 64.3 69.9 67.5 65.7 69.4 71.1 |
| Computer | 68.2 67.6 65.8 66.4 69.5 72.6 67.0 70.2 68.5 66.4 |
| Electrical | 66.9 65.5 66.1 64.4 67.6 67.3 68.9 68.1 67.1 67.4 |
| Mechanical | 65.5 64.8 65.6 63.7 65.6 65.3 68.1 68.6 64.9 62.7 |
Section 11.4

In Exercises 11 and 12, use the Spearman rank correlation coefficient to test the claim by doing the following.
(a) Identify the claim and state $H_0$ and $H_a$.
(b) Find the critical value.
(c) Find the test statistic $r_s$.
(d) Decide whether to reject or fail to reject the null hypothesis.
(e) Interpret the decision in the context of the original claim.

11. The table shows the overall scores and the prices for six randomly selected video disk players. The overall score is based mainly on picture quality. At α = 0.10, can you conclude that there is a significant correlation between the overall score and the price? (Source: Consumer Reports)

| Overall score | 93 | 91 | 90 | 87 | 85 | 69 |
|---|---|---|---|---|---|---|
| Price (in dollars) | 500 | 300 | 500 | 150 | 250 | 130 |

12. The table shows the overall scores and the prices per gallon for seven randomly selected interior paints. The overall score represents hiding, surface smoothness, and resistance to staining, scrubbing, gloss change, sticking, mildew, and fading. At α = 0.10, can you conclude that there is a significant correlation between the overall score and the price? (Adapted from Consumer Reports)

| Overall score | 46 | 73 | 64 | 56 | 94 | 86 | 50 |
|---|---|---|---|---|---|---|---|
| Price per gallon (in dollars) | 24 | 40 | 25 | 24 | 40 | 38 | 26 |

Section 11.5

In Exercises 13 and 14, (a) identify the claim and state $H_0$ and $H_a$, (b) find the critical values, (c) find the test statistic, (d) decide whether to reject or fail to reject the null hypothesis, and (e) interpret the decision in the context of the original claim. Use α = 0.05.

13. A highway patrol officer stops speeding vehicles on an interstate highway. The genders of the last 25 drivers who were stopped are shown, where F represents a female driver and M represents a male driver. Can you conclude that the stops were not random by gender?

   F M M M F M F M F F F M M F F F M M M F M M F F M

14. The sequence shows the departure status of the last 18 buses to leave a bus station, where T represents a bus that departed on time and L represents a bus that departed late. Can you conclude that the departure status of the buses is not random?

   T T T T L L L L T L L L T T T T T T
Chapter Quiz 11

Take this quiz as you would take a quiz in class. After you are done, check your work against the answers given in the back of the book.

In Exercises 1–5, (a) identify the claim and state $H_0$ and $H_a$, (b) decide which nonparametric test to use, (c) find the critical value(s), (d) find the test statistic, (e) decide whether to reject or fail to reject the null hypothesis, and (f) interpret the decision in the context of the original claim.

1. An organization claims that the median number of annual volunteer hours is 52. In a random sample of 75 people who volunteered last year, 47 volunteered for less than 52 hours, 23 volunteered for more than 52 hours, and 5 volunteered for 52 hours. At α = 0.05, can you reject the organization’s claim? (Adapted from U.S. Bureau of Labor Statistics)

2. A labor organization claims that there is a difference in the hourly earnings of union workers and nonunion workers in state and local governments. The table shows the hourly earnings (in dollars) for a random sample of 10 union workers and 10 nonunion workers in state and local governments. At α = 0.10, can you support the organization’s claim? (Adapted from U.S. Bureau of Labor Statistics)

   Union: 29.75 28.15 32.30 35.52 32.88 27.85 27.35 29.05 27.60 26.75
   Nonunion: 26.15 23.10 21.20 26.95 22.05 24.75 22.50 22.25 21.40 20.45

3. The table shows the sales prices for a random sample of apartment condominiums and cooperatives in four U.S. regions. At α = 0.01, can you conclude that the distribution of the sales prices in at least one region is different from the others? (Adapted from National Association of Realtors)

| Region | Sales price (in thousands of dollars) |
|---|---|
| Northeast | 257.3 250.3 242.7 275.0 270.7 254.8 264.2 243.4 |
| Midwest | 166.9 183.1 178.9 153.9 148.5 169.9 163.3 165.1 |
| South | 181.3 156.7 155.6 170.4 175.3 196.3 178.4 166.8 |
| West | 320.2 303.6 357.4 331.7 291.6 327.4 321.7 308.0 |

4. The table shows the numbers of emails sent and the numbers of emails received in a week for a random sample of nine people. At α = 0.01, can you conclude that there is a significant correlation between the number of emails sent and the number of emails received?

| Emails sent | 30 | 30 | 25 | 26 | 24 | 18 | 18 | 25 | 28 |
|---|---|---|---|---|---|---|---|---|---|
| Emails received | 32 | 36 | 21 | 22 | 20 | 20 | 22 | 23 | 23 |

5. A meteorologist wants to determine whether days with rain occur randomly in April in his hometown. To do so, the meteorologist records whether it rains for each day in April. The results are shown, where R represents a day with rain and N represents a day with no rain. At α = 0.05, can the meteorologist conclude that days with rain are not random?

   N R R N N N N R N R R N R R R N R R R R N N N N R N R N N R
Chapter Test 11

Take this test as you would take a test in class.

In Exercises 1–5, (a) identify the claim and state $H_0$ and $H_a$, (b) decide which nonparametric test to use, (c) find the critical value(s), (d) find the test statistic, (e) decide whether to reject or fail to reject the null hypothesis, and (f) interpret the decision in the context of the original claim.

1. The mayor called on council members at a town meeting in the sequence shown, where R represents a Republican council member and D represents a Democrat council member. At α = 0.05, can you conclude that the selection of members was not random?

   R D D D R R D R D D R D D D R R D R R R R D R R R D D D R D R D R R

2. An employment agency representative wants to determine whether there is a difference in the annual household incomes in four regions of the United States. The representative randomly selects several households in each region and records the annual household income for each. The table shows the results. At α = 0.01, can the representative conclude that the distribution of the annual household incomes in at least one region is different from the others? (Adapted from U.S. Census Bureau)

| Region | Household income (in thousands of dollars) |
|---|---|
| Northeast | 64.2 57.0 65.6 64.7 59.9 62.4 61.5 |
| Midwest | 56.0 61.1 51.9 55.2 57.4 58.5 58.7 |
| South | 49.3 50.5 54.1 46.4 51.3 54.1 51.9 |
| West | 64.0 61.9 58.6 60.7 59.6 61.2 63.1 |

3. An investment company claims that the median age of people with mutual funds is 51 years. The ages (in years) of 20 randomly selected mutual fund owners are listed below. At α = 0.01, is there enough evidence to reject the company’s claim? (Adapted from Investment Company Institute)

   46 34 33 27 58 64 54 36 38 42 26 51 49 44 46 50 39 34 51 63

4. An employment agency claims that there is a difference in the weekly earnings of workers who are union members and workers who are not union members. The table shows the weekly earnings (in dollars) for a random sample of 11 union members and 10 nonunion members. At α = 0.05, can you support the agency’s claim? (Adapted from U.S. Bureau of Labor Statistics)

   Member: 951 1090 788 896 980 1087 1136 1000 890 919 1026
   Nonmember: 850 783 954 649 747 906 895 730 790 687

5. The table shows the overall scores and the prices for a random sample of eight different suitcases. The overall score represents the ease of use, features, construction, and durability of a suitcase. At α = 0.05, can you conclude that there is a significant correlation between the overall score and the price? (Adapted from Consumer Reports)

| Overall score | 90 | 85 | 81 | 78 | 72 | 68 | 64 | 61 |
|---|---|---|---|---|---|---|---|---|
| Price (in dollars) | 495 | 230 | 190 | 160 | 350 | 230 | 260 | 200 |
REAL STATISTICS, REAL DECISIONS
Putting it all together

In a recent year, according to the Bureau of Labor Statistics, the median number of years that wage and salary workers had been with their current employer (called employee tenure) was 4.2 years. Information on employee tenure has been gathered since 1996 using the Current Population Survey (CPS), a monthly survey of about 60,000 households that provides information on employment, unemployment, earnings, demographics, and other characteristics of the U.S. population ages 16 and over. With respect to employee tenure, the questions measure how long workers have been with their current employers, not how long they plan to stay with their employers.

www.bls.gov

EXERCISES

1. How Would You Do It?
(a) What sampling technique would you use to select the sample for the CPS?
(b) Do you think the technique in part (a) will give you a sample that is representative of the U.S. population? Why or why not?
(c) Identify possible flaws or biases in the survey on the basis of the technique you chose in part (a).

2. Is There a Difference? A congressional representative claims that the median tenure for workers from the representative’s district is less than the national median tenure of 4.2 years. The claim is based on the representative’s data, which is shown in the table for Exercise 2. (Assume that the employees were randomly selected.)
(a) Is it possible that the claim is true? What questions should you ask about how the data were collected?
(b) How would you test the representative’s claim? Can you use a parametric test, or do you need to use a nonparametric test?
(c) State the null hypothesis and the alternative hypothesis.
(d) Test the claim using α = 0.05. What can you conclude?

3. Comparing Male and Female Employee Tenures A congressional representative claims that there is a difference between the median tenures for male workers and female workers. The claim is based on the representative’s data, which is shown in the table for Exercise 3. (Assume that the employees were randomly selected from the representative’s district.)
(a) How would you test the representative’s claim? Can you use a parametric test, or do you need to use a nonparametric test?
(b) State the null hypothesis and the alternative hypothesis.
(c) Test the claim using α = 0.05. What can you conclude?

TABLE FOR EXERCISE 2
Employee Tenure of 20 Workers
4.6 2.6 3.3 2.8 1.5 1.9 4.0 5.0 3.9 5.1 3.7 5.4 3.6 3.9 6.2 1.7 4.6 3.1 4.4 3.6

TABLE FOR EXERCISE 3
Employee tenure for a sample of male workers Employee tenure for a sample of female workers
3.9 4.4 4.4 4.9 4.7 5.4 4.3 4.3 4.9 4.0 3.8 1.8 3.6 5.1 4.7 5.1 2.3 3.3 6.5 2.2 0.9 5.2 5.1 3.0 1.3 4.0
TECHNOLOGY
U.S. Income and Economic Research

The National Bureau of Economic Research (NBER) is a private, nonprofit, nonpartisan research organization. The NBER provides information for better understanding of how the U.S. economy works. Researchers at the NBER concentrate on four types of empirical research: developing new statistical measurements, estimating quantitative models of economic behavior, assessing the effects of public policies on the U.S. economy, and projecting the effects of alternative policy proposals.

One of the NBER’s interests is the median income of people in different regions of the United States. The table below shows the annual incomes (in dollars) of a random sample of people (15 years and over) in a recent year in four U.S. regions: Northeast, Midwest, South, and West.

Annual income of people (in dollars)

| Northeast | Midwest | South | West |
|---|---|---|---|
| 45,481 | 25,781 | 19,946 | 37,922 |
| 31,922 | 28,326 | 35,140 | 31,198 |
| 27,750 | 26,910 | 33,323 | 24,129 |
| 23,179 | 34,609 | 36,008 | 32,194 |
| 24,304 | 32,945 | 18,030 | 34,924 |
| 32,216 | 32,119 | 24,251 | 22,491 |
| 30,393 | 30,990 | 24,581 | 28,668 |
| 28,897 | 44,317 | 32,005 | 42,207 |
| 25,981 | 18,021 | 37,091 | 24,465 |
| 20,439 | 42,193 | 33,866 | 20,776 |
| 40,562 | 25,054 | 21,746 | 28,521 |
| 48,863 | 27,703 | 26,324 | 37,422 |

EXERCISES

In Exercises 1–5, refer to the annual incomes of people in the table. Use α = 0.05 for all tests.

1. Construct a box-and-whisker plot for each region. Do the median annual incomes appear to differ between regions?
2. Use technology to perform a sign test to test the claim that the median annual income in the Midwest is greater than $30,000.
3. Use technology to perform a Wilcoxon rank sum test to test the claim that the median annual incomes in the Northeast and South are the same.
4. Use technology to perform a Kruskal-Wallis test to test the claim that the distributions of annual incomes for all four regions are the same.
5. Use technology to perform a one-way ANOVA to test the claim that the average annual incomes for all four regions are the same. Assume that the populations of incomes are normally distributed, the samples are independent, and the population variances are equal. How do your results compare with those in Exercise 4?
6. Repeat Exercises 1, 3, 4, and 5 using the data in the table below. The table shows the annual incomes (in dollars) of a random sample of families in a recent year in four U.S. regions: Northeast, Midwest, South, and West.

Annual income of families (in dollars)

| Northeast | Midwest | South | West |
|---|---|---|---|
| 70,225 | 67,357 | 61,072 | 70,527 |
| 128,686 | 97,795 | 63,918 | 80,168 |
| 91,252 | 45,198 | 54,699 | 59,137 |
| 127,864 | 64,479 | 99,562 | 76,928 |
| 79,411 | 84,647 | 61,082 | 61,302 |
| 62,529 | 60,658 | 39,088 | 90,710 |
| 56,461 | 79,352 | 66,672 | 69,716 |
| 80,559 | 72,338 | 42,988 | 98,707 |
| 59,332 | 75,972 | 71,434 | 99,676 |
| 88,559 | 66,853 | 58,433 | 47,719 |
| 54,603 | 72,805 | 85,764 | 76,136 |
| 79,256 | 69,636 | 56,547 | 54,417 |
| 70,807 | 82,608 | 65,464 | 71,171 |
| 87,708 | 71,869 | 49,965 | 76,402 |
| 69,976 | 91,479 | 61,471 | 53,273 |

Extended solutions are given in the technology manuals that accompany this text. Technical instruction is provided for Minitab, Excel, and the TI-84 Plus.
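For readers without Minitab, Excel, or a TI-84 Plus, the rank-based statistics in the exercises above could be sketched in plain Python as follows. This is a minimal sketch, not the procedure from the technology manuals: the function names are illustrative, tied values are assumed to receive the average of their ranks, and the small data lists in the usage lines are invented placeholders rather than values from the tables.

```python
def average_ranks(values):
    """Rank values from smallest (rank 1) to largest.
    Tied values share the mean of the ranks they would occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the block of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def rank_sum_smaller(sample_a, sample_b):
    """R for the Wilcoxon rank sum test: sum of ranks of the smaller sample
    (the first sample when both have the same size)."""
    if len(sample_a) <= len(sample_b):
        small, large = sample_a, sample_b
    else:
        small, large = sample_b, sample_a
    ranks = average_ranks(list(small) + list(large))
    return sum(ranks[:len(small)])


def kruskal_wallis_h(samples):
    """H = 12 / (N(N+1)) * sum(R_i^2 / n_i) - 3(N+1), ranks from pooled data."""
    pooled = [v for s in samples for v in s]
    ranks = average_ranks(pooled)
    N = len(pooled)
    total, start = 0.0, 0
    for s in samples:
        R = sum(ranks[start:start + len(s)])  # rank sum for this sample
        total += R ** 2 / len(s)
        start += len(s)
    return 12 / (N * (N + 1)) * total - 3 * (N + 1)


if __name__ == "__main__":
    # Placeholder data, not taken from the income tables.
    print(average_ranks([10, 20, 20, 30]))
    print(rank_sum_smaller([1, 2], [3, 4, 5]))
    print(kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```

To apply the sketch to an exercise, each region's column would be passed in as a list of numbers, and the resulting statistic compared with the critical value described in the corresponding section.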