Day 17 Chi-Square Homogeneity NOTES


School

Rochester Institute of Technology

Course

146

Subject

Statistics

Date

Apr 3, 2024

Type

docx

Pages

10

Uploaded by AdmiralSeaLion3703

STAT 146 Intro to Statistics II
Day 17 Chi-Square Test for Homogeneity Notes

Table of Contents
I. The Test for Homogeneity
II. Minitab steps for generating a Test of Homogeneity Pearson Chi-square and P-value
III. Complete testing process for the Test for Homogeneity (Including Minitab steps)
IV. Requirements for a Chi-Square Test
V. Chi-Square Test for Homogeneity
VI. Examples and Completed Examples
I. The Test for Homogeneity

The Test for Homogeneity determines whether the distribution of one categorical variable (its frequency counts, compared as proportions) is the same across multiple populations. It is used to see whether several different samples come from populations with the same distribution.

II. Minitab steps for generating a Test of Homogeneity Pearson Chi-square and P-value

These are the same Minitab steps that you use for a Test for Association. (Your null and alternative hypotheses are very different, however.)

Use the appropriate Chi-square test in Minitab: Stat > Tables > Chi-Square Test for Association

If you have raw data (2 columns representing two variables of data in Minitab), choose ‘Raw data’. Place one variable in the Rows and one variable in the Columns. Click on ‘Statistics’ and select ‘Each cell’s contribution to chi-square’. Report the Pearson chi-square.

If you have summarized data in a two-way table (data that has already been counted and put into a contingency table), choose ‘Summarized data’. Click on ‘Statistics’ and select ‘Each cell’s contribution to chi-square’. Report the Pearson chi-square.
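To make the difference between raw data and summarized data concrete, here is a minimal Python sketch that tallies raw data (one row per sampled individual) into a two-way table of counts. This is not a Minitab feature or part of the course materials; the data values and the names `raw_rows` and `tally_two_way` are hypothetical and exist only for illustration.

```python
from collections import Counter

# Hypothetical raw data: one (population, category) pair per sampled individual.
# These values are made up for illustration and do not come from the notes.
raw_rows = [
    ("Group 1", "Yes"), ("Group 1", "No"), ("Group 1", "Yes"),
    ("Group 2", "No"), ("Group 2", "No"), ("Group 2", "Yes"),
    ("Group 2", "No"),
]

def tally_two_way(pairs):
    """Count how often each (row label, column label) pair occurs."""
    counts = Counter(pairs)
    row_labels = sorted({r for r, _ in pairs})
    col_labels = sorted({c for _, c in pairs})
    # Counter returns 0 for pairs that never occur, so empty cells are filled in.
    table = [[counts[(r, c)] for c in col_labels] for r in row_labels]
    return row_labels, col_labels, table

row_labels, col_labels, table = tally_two_way(raw_rows)
print(col_labels)
for label, row in zip(row_labels, table):
    print(label, row)
```

The resulting `table` is the kind of summarized two-way table that you would type into Minitab under ‘Summarized data’.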
III. The Complete Testing Process for Test for Homogeneity

Multiple populations are being studied. List the multiple populations: _____, _______, and ______
**NOTE: there will be at least 2 populations.
The ONE categorical variable being studied is ________.
The goal is to find evidence to show that the true proportion of ____________ is NOT the SAME for all ________ (state the populations).

METHOD

Ho: The true proportion of _______________________ (categorical variable being studied) is the SAME for ____________________ (state the multiple populations from which sampling took place).
Ha: The true proportion of _______________________ (categorical variable being studied) is NOT the SAME for ____________________ (state the multiple populations from which sampling took place).
OR you can state…
Ha: At least one proportion is different from the others.

Alpha = ____

Use the appropriate Chi-square test in Minitab: Stat > Tables > Chi-Square Test for Association

Check if the Chi-square requirement has been met: No more than 20% of expected counts < 5.

Chi-square = _______ df = ____
State the P-value = ____
Is the P-value less than alpha? Decide if you CAN or CANNOT reject the null.

At the __% level of significance, the sample does/does not provide sufficient support to say that the true proportion of ____________ is NOT the SAME for all ________ (state the populations).

Follow-up sentence, if you have rejected the null:
**If you conclude that the true proportions are NOT THE SAME for all of the populations, report which cell is the greatest contributor to Chi-square and whether the observed value is greater than or less than the expected value. Look at the Minitab output and identify the cell that has the greatest contribution to Chi-square. State this in a complete sentence. Then, identify whether the observed count is less than or greater than what was expected in that cell.
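For reference, the quantities that Minitab reports in this process can be written out as formulas. The notation is an assumption introduced here, not taken from the notes: O_ij is the observed count in row i and column j, R_i is the total for row i, C_j is the total for column j, n is the grand total, and r and c are the numbers of rows and columns.

```latex
E_{ij} = \frac{R_i \, C_j}{n}, \qquad
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}, \qquad
df = (r - 1)(c - 1)
```

Each term of the double sum is one cell's contribution to chi-square. For a table with 3 rows and 3 columns, df = (3 - 1)(3 - 1) = 4, which matches the df reported in both completed examples in these notes.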
IV. Requirements for a Chi-Square Test

Have the requirements been met? The test is valid if:
all expected frequencies are > 1, and
no more than 20% of expected frequencies are less than 5.

Look at your Minitab output and find the expected counts. Are they all greater than 5? If so, you have met both requirements above.

If there are cells with expected counts less than 5, Minitab will alert you with this message:

* NOTE * 2 cells with expected counts less than 5

Seeing this * NOTE * does not by itself mean that the requirements have failed. First determine whether the number of cells reported is more than 20% of all the cells in the table.
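The two rules above can be sketched in Python as a quick check. This is an illustration only, not how Minitab works internally; the function names and the 2 x 5 table of counts are hypothetical.

```python
def expected_counts(observed):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def check_requirements(observed):
    """Apply the two rules from the notes to the expected counts."""
    flat = [e for row in expected_counts(observed) for e in row]
    n_small = sum(1 for e in flat if e < 5)
    share_small = n_small / len(flat)
    all_above_one = all(e > 1 for e in flat)
    return {
        "cells_below_5": n_small,
        "share_below_5": share_small,
        "valid": all_above_one and share_small <= 0.20,
    }

# Hypothetical 2 x 5 table (made-up counts, not from the notes).
hypothetical = [[3, 20, 20, 20, 20],
                [3, 20, 20, 20, 20]]
print(check_requirements(hypothetical))
```

In this made-up table, 2 of the 10 cells have an expected count of 3, so Minitab would show the * NOTE * message, yet 2 out of 10 is exactly 20% and the requirement is still met. In a 3 x 3 table, by contrast, 2 flagged cells would be 2 out of 9 (about 22%), which is more than 20%.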
V. Chi-Square Test for Homogeneity
Here is a video from Prof. Coffey showing how to enter the data into Minitab, how to construct a stacked bar chart and how to run the test for homogeneity using the data from Example 1.
https://youtu.be/JKkoryw0FE8

VI. Examples and Completed Examples

Example 1) The incidences of three types of malaria were randomly collected from three tropical regions.

             Asia    Africa    South America    Totals
Malaria A     31       14           45             90
Malaria B      2        5           53             60
Malaria C     53       45            2            100
Totals        86       64          100            250

Investigate the possible differences among types of malaria by tropical region by completing the appropriate test.

a) Which Chi Square test should be used to study these data?
b) Run the appropriate test and show the complete testing process.
Completed Example 1

a) This is a Test for Homogeneity since multiple populations were sampled. The tropical regions are the multiple populations (Asia, Africa, South America), and the same categorical variable is studied across these regions. That variable is type of malaria.

b) Complete the appropriate hypothesis test and show the complete testing process.

Multiple populations are being studied and they are the three tropical regions: Asia, Africa and South America.
The ONE categorical variable being studied is type of malaria (type A, B or C).
The goal is to find evidence to show that the true proportion of malaria types is NOT the SAME for all three tropical regions.

Ho: The true proportion of malaria types is the SAME for all three tropical regions.
Ha: The true proportion of malaria types is NOT the SAME for all three tropical regions.

Alpha = .05

This test is valid since all expected frequencies are greater than 5.

Chi-square = 125.519
df = 4
P-value = very small

The P-value is less than alpha = .05 and we CAN reject the null hypothesis.

At the 5% level of significance, the sample DOES provide sufficient evidence to say that the true proportion of malaria types is NOT the same for all three tropical regions.

FOLLOW-UP since we rejected the null: The greatest contributor to Chi-square is Malaria C in South America, where what was observed (count = 2) is less than what was expected (40).
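The Chi-square value and the follow-up cell reported above can be checked by hand. The sketch below is a plain-Python illustration, not Minitab output: it rebuilds the expected counts from the Example 1 table, sums each cell's contribution, and picks the largest one. The variable names are chosen here for illustration.

```python
# Observed counts from the Example 1 table (rows: malaria type, columns: region).
regions = ["Asia", "Africa", "South America"]
types = ["Malaria A", "Malaria B", "Malaria C"]
observed = [
    [31, 14, 45],
    [2, 5, 53],
    [53, 45, 2],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Each cell's contribution to chi-square: (observed - expected)^2 / expected.
contributions = {}
for i, malaria_type in enumerate(types):
    for j, region in enumerate(regions):
        expected = row_totals[i] * col_totals[j] / n
        contributions[(malaria_type, region)] = (observed[i][j] - expected) ** 2 / expected

chi_square = sum(contributions.values())
df = (len(types) - 1) * (len(regions) - 1)
largest_cell = max(contributions, key=contributions.get)

print(round(chi_square, 3), df, largest_cell)
```

Rounded to three decimals, this reproduces the Chi-square of 125.519 with df = 4, and the largest contribution comes from Malaria C in South America, in line with the follow-up sentence.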
Example 2) A large corporation is interested in studying whether the levels of stress on the job are the same for random samples of employees with varying lengths of commuting times.

a) Which Chi Square test should be used to study these data? Explain.
b) Create a 100% stacked bar chart. Provide one sentence of analysis. (Minitab Web App users will not be able to get the bars the SAME height each time.) PRACTICE building this with the counts from the 2-way table:
c) Complete the appropriate hypothesis test using α = .05. (Show the complete testing process.) Note: You will type this table into Minitab.
Completed Example 2)

a) Which Chi Square test should be used to study these data? Explain.
This is a test of homogeneity since multiple samples were taken (one for each of the three commute times) and one categorical variable (job-related stress level) is being studied.

b) I have created a 100% stacked bar chart. Provide one sentence of analysis. PRACTICE building this with the counts from the 2-way table: (MAC users)
The over-45-minute commute time occurs much less often with high job stress.

c) Complete the appropriate hypothesis test.
(The groups are the three different commute times.)
(The variable is the job-related stress levels; 3 levels.)

GOAL: To test to see if the population proportion of job-related stress levels is NOT the same across the different commute times.

Ho: The population proportion of job-related stress levels is the same across the different commute times.
Ha: The population proportion of job-related stress levels is NOT the same across the different commute times.

Alpha = 0.05
By looking over the expected counts, I can see that we have met the requirements of a Chi-square test since all expected counts are greater than 5.

Chi-square test statistic = 9.831 with df = 4
P-value = 0.043

We will reject the null since the P-value is less than alpha (0.05).

At the 5% level of significance, the sample data DOES provide sufficient evidence to say that the population proportion of job-related stress levels is NOT the same across the different commute times. In other words, the level of stress changes depending on commute times.

The greatest contributor to Chi-square is the cell for those with a commute over 45 minutes and HIGH job stress, where what was observed (count = 7) was LESS than what was expected (14.16).
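The P-value above comes from Minitab, but for df = 4 it can be checked with a short closed-form calculation: the upper-tail area of a chi-square distribution with 4 degrees of freedom is exp(-x/2) * (1 + x/2). The sketch below uses that formula; the function name is hypothetical, and the formula applies to df = 4 only, not to other degrees of freedom.

```python
import math

def chi_square_p_value_df4(chi_square):
    """Upper-tail probability of a chi-square distribution with 4 degrees of freedom."""
    half = chi_square / 2
    return math.exp(-half) * (1 + half)

alpha = 0.05
p_value = chi_square_p_value_df4(9.831)  # test statistic from Completed Example 2
print(round(p_value, 3), p_value < alpha)
```

Rounded to three decimals this gives 0.043, which is less than alpha = 0.05, so the decision to reject the null matches the one above. Plugging in the Example 1 statistic (125.519) gives a value far below any usual alpha, which is what "very small" means there.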