unit3-practice-questions

School: Memorial University of Newfoundland
Course: 2501
Subject: Statistics
Date: Feb 20, 2024

CHI-SQUARE DISTRIBUTION

Exercise 1. If the number of degrees of freedom for a chi-square distribution is 25, what are the population mean and standard deviation?

Solution
mean = df = 25 and standard deviation = √(2·df) = √50 ≈ 7.0711

Exercise 2. If df > 90, the distribution is _____________. If df = 15, the distribution is ________________.

Solution
approximately normal; skewed right.

Exercise 3. An article in the New England Journal of Medicine discussed a study on smokers in California and Hawaii. In one part of the report, the self-reported ethnicity and smoking levels per day were given. Of the people smoking at most ten cigarettes per day, there were 9,886 African Americans, 2,745 Native Hawaiians, 12,831 Latinos, 8,378 Japanese Americans, and 7,650 whites. Of the people smoking 11 to 20 cigarettes per day, there were 6,514 African Americans, 3,062 Native Hawaiians, 4,932 Latinos, 10,680 Japanese Americans, and 9,877 whites. Of the people smoking 21 to 30 cigarettes per day, there were 1,671 African Americans, 1,419 Native Hawaiians, 1,406 Latinos, 4,715 Japanese Americans, and 6,062 whites. Of the people smoking at least 31 cigarettes per day, there were 759 African Americans, 788 Native Hawaiians, 800 Latinos, 2,305 Japanese Americans, and 3,970 whites.

Complete the table.

Smoking Level Per Day   African American   Native Hawaiian   Latino   Japanese Americans   White    Totals
1-10
11-20
21-30
31+
Totals
Table 11.26 Smoking Levels by Ethnicity (observed)

Solution
Smoking Level Per Day   African American   Native Hawaiian   Latino   Japanese Americans   White    Totals
1-10                    9,886              2,745             12,831   8,378                7,650    41,490
11-20                   6,514              3,062             4,932    10,680               9,877    35,065
21-30                   1,671              1,419             1,406    4,715                6,062    15,273
31+                     759                788               800      2,305                3,970    8,622
Totals                  18,830             8,014             19,969   26,078               27,559   100,450

State the hypotheses.
H0: _______
Ha: _______

Solution
H0: Smoking level is independent of ethnic group.
Ha: Smoking level is dependent on ethnic group.

Enter expected values in the table. Round to two decimal places.

Solution
Smoking Level Per Day   African American   Native Hawaiian   Latino    Japanese Americans   White
1-10                    7777.57            3310.11           8248.02   10771.29             11383.01
11-20                   6573.16            2797.52           6970.76   9103.29              9620.27
21-30                   2863.02            1218.49           3036.20   3965.05              4190.23
31+                     1616.25            687.87            1714.01   2238.37              2365.49

df = ______

Solution
12

χ² test statistic = ______

Solution
10,301.8

State the decision and conclusion (in a complete sentence) for the following preconceived level of α.
α = 0.05
a. Decision: ___________________
b. Reason for the decision: ___________________
c. Conclusion (write out in a complete sentence): ___________________

Solution
a. Reject the null hypothesis.
b. p-value < alpha
c. There is sufficient evidence to conclude that smoking level is dependent on ethnic group.

State the decision and conclusion (in a complete sentence) for the following preconceived level of α.
α = 0.01
a. Decision: ___________________
b. Reason for the decision: ___________________
c. Conclusion (write out in a complete sentence): ___________________

Solution
a. Reject the null hypothesis.
b. p-value < alpha
c. There is sufficient evidence to conclude that smoking level is dependent on ethnic group.

Exercise 4. The marital status distribution of the U.S. male population, ages 15 and older, is as shown in Table 11.35.

Marital Status       Percent   Expected Frequency
never married        31.3
married              56.1
widowed              2.5
divorced/separated   10.1
Table 11.35

Suppose that a random sample of 400 U.S. young adult males, 18 to 24 years old, yielded the frequency distribution in Table 11.36. We are interested in whether this age group of males fits the distribution of the U.S. adult population. Calculate the frequency one would expect when surveying 400 people. Fill in Table 11.35, rounding to two decimal places.

Marital Status       Frequency
never married        140
married              238
widowed              2
divorced/separated   20
Table 11.36

Solution
Marital Status       Percent   Expected Frequency
never married        31.3      125.2
married              56.1      224.4
widowed              2.5       10
divorced/separated   10.1      40.4
Table 11.62

a. H0: The data fit the distribution.
b. Ha: The data do not fit the distribution.
c. df = 3
d. chi-square distribution with df = 3
e. test statistic = 19.27
f. p-value = 0.0002
g. Check student's solution.
h. i. Alpha: 0.05
   ii. Decision: Reject the null hypothesis.
   iii. Reason for decision: p-value < alpha
   iv. Conclusion: The data do not fit the distribution.

Exercise 5. Table 11.37 contains information from a survey among 499 participants classified according to their age groups. The second column shows the percentage of obese people per age class among the study participants. The last column comes from a different study at the national level that shows the corresponding percentages of obese people in the same age classes in the USA. Perform a hypothesis test at the 5% significance level to determine whether the survey participants are a representative sample of the USA obese population.

Age Class (Years)   Obese (Percentage)   Expected USA Average (Percentage)
20-30               75.0                 32.6
31-40               26.5                 32.6
41-50               13.6                 36.6
51-60               21.9                 36.6
61-70               21.0                 39.7
Table 11.37

Solution
a. H0: Surveyed obese fit the distribution of expected obese.
b. Ha: Surveyed obese do not fit the distribution of expected obese.
c. df = 4
d. chi-square distribution with df = 4
e. test statistic = 54.01
f. p-value ≈ 0
g. Check student's solution.
h. i. Alpha: 0.05
   ii. Decision: Reject the null hypothesis.
   iii. Reason for decision: p-value < alpha
   iv. Conclusion: At the 5% level of significance, from the data, there is sufficient evidence to conclude that the surveyed obese do not fit the distribution of expected obese.

Exercise 6. Car manufacturers are interested in whether there is a relationship between the size of car an individual drives and the number of people in the driver's family (that is, whether car size and family size are independent). To test this, suppose that 800 car owners were randomly surveyed with the results in Table 11.39. Conduct a test of independence.

Family Size   Sub & Compact   Mid-size   Full-size   Van & Truck
1             20              35         40          35
2             20              50         70          80
3-4           20              50         100         90
5+            20              30         70          70
Table 11.39

Solution
a. H0: Car size is independent of family size.
b. Ha: Car size is dependent on family size.
c. df = 9
d. chi-square distribution with df = 9
e. test statistic = 15.8284
f. p-value = 0.0706
g. Check student's solution.
h. i. Alpha: 0.05
   ii. Decision: Do not reject the null hypothesis.
   iii. Reason for decision: p-value > alpha
   iv. Conclusion: At the 5% significance level, there is insufficient evidence to conclude that car size and family size are dependent.

Exercise 7. College students may be interested in whether or not their majors have any effect on starting salaries after graduation. Suppose that 300 recent graduates were surveyed as to their majors in college and their starting salaries after graduation. Table 11.40 shows the data. Conduct a test of independence.

Major         < $50,000   $50,000-$68,999   $69,000+
English       5           20                5
Engineering   10          30                60
Nursing       10          15                15
Business      10          20                30
Psychology    20          30                20
Table 11.40

Solution
a. H0: Salaries are independent of majors.
b. Ha: Salaries are dependent on majors.
c. df = 8
d. chi-square distribution with df = 8
e. test statistic = 33.55
f. p-value ≈ 0
g. Check student's solution.
h. i. Alpha: 0.05
   ii. Decision: Reject the null hypothesis.
   iii. Reason for decision: p-value < alpha
   iv. Conclusion: Major and starting salary are not independent events.

Exercise 8. Some travel agents claim that honeymoon hot spots vary according to the age of the bride. Suppose that 280 recent brides were interviewed as to where they spent their honeymoons. The information is given in Table 11.41. Conduct a test of independence.

Location         20-29   30-39   40-49   50 and over
Niagara Falls    15      25      25      20
Poconos          15      25      25      10
Europe           10      25      15      5
Virgin Islands   20      25      15      5
Table 11.41

Solution
a. H0: Honeymoon locations are independent of bride's age.
b. Ha: Honeymoon locations are dependent on bride's age.
c. df = 9
d. chi-square distribution with df = 9
e. test statistic = 15.7027
f. p-value = 0.0734
g. Check student's solution.
h. i. Alpha: 0.05
   ii. Decision: Do not reject the null hypothesis.
   iii. Reason for decision: p-value > alpha
   iv. Conclusion: At the 5% significance level, there is insufficient evidence to conclude that honeymoon location and bride age are dependent.

Exercise 9. A manager of a sports club keeps information concerning the main sport in which members participate and their ages. To test whether there is a relationship between the age of a member and his or her choice of sport, 643 members of the sports club are randomly selected. Conduct a test of independence.

Sport         18-25   26-30   31-40   41 and over
Racquetball   42      58      30      46
Tennis        58      76      38      65
Swimming      72      60      65      33
Table 11.42

Solution
a. H0: The age of a member and his or her main sport are independent.
b. Ha: The age of a member and his or her main sport are dependent.
c. df = 6
d. chi-square distribution with df = 6
e. test statistic = 25.21
f. p-value = 0.0003
g. Check student's solution.
h. i. Alpha: 0.05
   ii. Decision: Reject the null hypothesis.
   iii. Reason for decision: p-value < alpha
   iv. Conclusion: At the 5% significance level, there is sufficient evidence to conclude that sport and age are dependent.

Exercise 10. A major food manufacturer is concerned that the sales for its skinny french fries have been decreasing. As a part of a feasibility study, the company conducts research into
the types of fries sold across the country to determine if the type of fries sold is independent of the area of the country. The results of the study are shown in Table 11.43. Conduct a test of independence.

Type of Fries   Northeast   South   Central   West
skinny fries    70          50      20        25
curly fries     100         60      15        30
steak fries     20          40      10        10
Table 11.43

Solution
a. H0: The types of fries sold are independent of the location.
b. Ha: The types of fries sold are dependent on the location.
c. df = 6
d. chi-square distribution with df = 6
e. test statistic = 18.8369
f. p-value = 0.0044
g. Check student's solution.
h. i. Alpha: 0.05
   ii. Decision: Reject the null hypothesis.
   iii. Reason for decision: p-value < alpha
   iv. Conclusion: At the 5% significance level, there is sufficient evidence that types of fries and location are dependent.

Exercise 11. According to Dan Lenard, an independent insurance agent in the Buffalo, N.Y. area, the following is a breakdown of the amount of life insurance purchased by males in the following age groups. He is interested in whether the age of the male and the amount of life insurance purchased are independent events. Conduct a test for independence.

Age of Males   None   < $200,000   $200,000-$400,000   $400,001-$1,000,000   $1,000,000+
20-29          40     15           40                  0                     5
30-39          35     5            20                  20                    10
40-49          20     0            30                  0                     30
50+            40     30           15                  15                    10
Table 11.44

Solution
a. H0: The amount of life insurance is independent of age.
b. Ha: The amount of life insurance is dependent on age.
c. df = 12
d. chi-square distribution with df = 12
e. test statistic = 125.74
f. p-value ≈ 0
g. Check student's solution.
h. i. Alpha: 0.05
   ii. Decision: Reject the null hypothesis.
   iii. Reason for decision: p-value < alpha
   iv. Conclusion: At the 5% significance level, there is sufficient evidence to conclude that amount of life insurance and age are dependent.

Exercise 12. A psychologist is interested in testing whether there is a difference in the distribution of personality types for business majors and social science majors. The results of the study are shown in Table 11.49. Conduct a test of homogeneity. Test at a 5% level of significance.

                 Open   Conscientious   Extrovert   Agreeable   Neurotic
Business         41     52              46          61          58
Social Science   72     75              63          80          65
Table 11.49

Solution
a. H0: The distribution for personality types is the same for both majors.
b. Ha: The distribution for personality types is not the same for both majors.
c. df = 4
d. chi-square distribution with df = 4
e. test statistic = 3.01
f. p-value = 0.5568
g. Check student's solution.
h. i. Alpha: 0.05
   ii. Decision: Do not reject the null hypothesis.
   iii. Reason for decision: p-value > alpha
   iv. Conclusion: There is insufficient evidence to conclude that the distribution of personality types is different for business and social science majors.

Exercise 13. A sample of teenagers might be divided into male and female on one hand, and into those who are and are not currently studying for a statistics exam on the other. Suppose we want to test whether the proportion of studying students is higher among the women than among the men. The data might look like this:

               Men   Women   Row total
Studying       1     9       10
Not studying   11    3       14
Column total   12    12      24
The question we ask about these data is: Knowing that 10 of these 24 teenagers are studying and that 12 of the 24 are female, and assuming the null hypothesis that men and women are equally likely to study, what is the probability that these 10 teenagers who are studying would be so unevenly distributed between the women and the men? If we were to choose 10 of the teenagers at random, what is the probability that 9 or more of them would be among the 12 women and only 1 or fewer from among the 12 men?

By the hypergeometric distribution, the probability of the observed table is 0.001346. The only more extreme table with the same margins is

               Men   Women   Row total
Studying       0     10      10
Not studying   12    2       14
Column total   12    12      24

which has probability 0.000034.

To calculate the significance of the observed data, i.e. the total probability of observing data as extreme or more extreme if the null hypothesis is true, we calculate the probabilities of both these tables and add them together. This gives a one-tailed test with p approximately 0.001346 + 0.000034 = 0.001380. The same result in R:

> x <- matrix(c(1,11,9,3),2,2)
> fisher.test(x, alternative="less")

        Fisher's Exact Test for Count Data

data:  x
p-value = 0.00138
alternative hypothesis: true odds ratio is less than 1
95 percent confidence interval:
 0.0000000 0.3260026
sample estimates:
odds ratio
0.03723312
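
The two table probabilities in Exercise 13 can be reproduced directly from the hypergeometric distribution. The following is a minimal sketch using base R only; the variable names (n_men, n_women, n_studying, and the p_* results) are illustrative assumptions and do not appear in the original exercise.

```r
# Margins of the 2x2 table from Exercise 13 (variable names are illustrative).
n_men      <- 12   # column total: men
n_women    <- 12   # column total: women
n_studying <- 10   # row total: studying

# Treat the 10 studying teenagers as a random draw from all 24 and
# count how many of them are men.
p_one  <- dhyper(1, n_men, n_women, n_studying)  # observed table: 1 man studying
p_zero <- dhyper(0, n_men, n_women, n_studying)  # more extreme table: 0 men studying

# One-tailed p-value: sum over tables as extreme or more extreme than observed.
p_total <- p_one + p_zero

# The same lower-tail probability in one call, and via fisher.test().
p_tail   <- phyper(1, n_men, n_women, n_studying)
p_fisher <- fisher.test(matrix(c(1, 11, 9, 3), 2, 2),
                        alternative = "less")$p.value

print(round(c(p_one, p_zero, p_total, p_tail, p_fisher), 6))
```

The first two printed values should match the table probabilities quoted above (0.001346 and 0.000034), and the last three should all agree with the one-tailed p-value of about 0.00138.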