Unit 9- BUSN 3000
School: University of Georgia
Course: BUSN 3000
Subject: Statistics
Date: Apr 3, 2024
Uploaded by AdmiralField13659
BUSN 3000
Unit 9: Models for Categorical Data
Unit 9: Models for Categorical Data – Chi-squared Tests
Goodness-of-fit test - one categorical variable, more than two categories
Example: Where do BUSN 3000 students live? Is the distribution the same as the UGA population?

| Housing | UGA population | Observed counts (sample data) | Expected counts (if H0 true) |
|---|---|---|---|
| Off-campus (not UGA-owned or affiliated) | 66% | 22 | 25.08 |
| Fraternity or Sorority housing | 5.5% | 14 | 2.09 |
| On-campus dorms or other UGA-owned or affiliated housing | 28.5% | 2 | 10.83 |
| Total | 100% | 38 | 38 |
Could these observed counts have occurred just by chance if the distribution of housing choices for BUSN 3000 students were really the same as the UGA population?
$H_0: p_{\text{off-campus}} = 0.66,\ p_{\text{Greek}} = 0.055,\ p_{\text{on-campus}} = 0.285$ (BUSN 3000 has the same distribution as the UGA population)
$H_A$: at least one $p$ is different from the value given
The chi-squared ($\chi^2$) test statistic compares the observed counts to the expected counts.
$$\chi^2 = \sum \frac{(\text{Observed count} - \text{Expected count})^2}{\text{Expected count}} = \frac{(22 - 25.08)^2}{25.08} + \frac{(14 - 2.09)^2}{2.09} + \frac{(2 - 10.83)^2}{10.83} \approx 75.45$$
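The arithmetic behind the statistic can be checked with a short script. This is a minimal Python sketch, not part of the course materials (the course uses JMP); the dictionary keys and variable names are illustrative assumptions, while the counts and proportions come from the housing table.

```python
# Housing example: observed counts and the UGA population proportions under H0.
observed = {"off-campus": 22, "Greek": 14, "on-campus": 2}
null_props = {"off-campus": 0.66, "Greek": 0.055, "on-campus": 0.285}

n = sum(observed.values())  # 38 students in the sample

# Expected count for each category if H0 is true: n * p
expected = {name: n * p for name, p in null_props.items()}

# Each category contributes (observed - expected)^2 / expected to the statistic.
contributions = {
    name: (observed[name] - expected[name]) ** 2 / expected[name]
    for name in observed
}
chi_sq = sum(contributions.values())

for name in observed:
    print(f"{name:>10}: expected {expected[name]:.2f}, "
          f"contribution {contributions[name]:.3f}")
print(f"chi-squared = {chi_sq:.2f}")  # 75.45
```

Almost all of the statistic comes from the Fraternity or Sorority category, where 14 students were observed against an expected count of 2.09.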
When the observed counts are somewhat close to the expected counts, $\chi^2$ is small and the evidence against the null hypothesis is weak.
When the observed counts differ greatly from the expected counts, $\chi^2$ is large and the evidence against the null hypothesis is strong.
Student housing (continued)
$H_0: p_{\text{off-campus}} = 0.66,\ p_{\text{Greek}} = 0.055,\ p_{\text{on-campus}} = 0.285$
$H_A$: at least one $p$ is different from these values.
P-values for chi-squared tests
How large must the chi-squared statistic be to convince us that the null hypothesis is not true?
df = number of categories - 1, so df = 3 - 1 = 2 for the housing example (always choose the second option for chi-squared)
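To see how the degrees of freedom turn a statistic into a p-value, here is a hedged Python sketch. The course reads the p-value from JMP; the function name `chi_sq_p_value` is hypothetical, and it only covers df = 1 and df = 2, where the upper-tail probability of the chi-squared distribution has a simple closed form. The value 75.45 is what the sum in the test statistic formula for the housing example evaluates to.

```python
import math


def chi_sq_p_value(chi_sq: float, df: int) -> float:
    """Upper-tail p-value for a chi-squared statistic (df = 1 or df = 2 only)."""
    if df == 1:
        # A chi-squared variable with 1 df is a squared standard normal.
        return math.erfc(math.sqrt(chi_sq / 2))
    if df == 2:
        # With 2 df the upper tail is exactly exp(-x / 2).
        return math.exp(-chi_sq / 2)
    raise ValueError("this sketch only handles df = 1 or df = 2")


categories = 3            # off-campus, Greek, on-campus
df = categories - 1       # 2
chi_sq = 75.45            # housing example statistic, rounded
p_value = chi_sq_p_value(chi_sq, df)

print(f"df = {df}, p-value = {p_value:.2e}")
print("reject H0 at alpha = 0.05" if p_value < 0.05 else "fail to reject H0")
```

With 2 degrees of freedom, any statistic above roughly 5.99 gives a p-value below 0.05, so a value of 75.45 is far beyond that cutoff.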
If the distribution of housing choices for BUSN students were really the same as (H0 true) that of UGA students overall, sample results like ours would be unlikely.
For $\alpha = 0.05$, state your conclusion in context.
- There is sufficient evidence to conclude that the distribution of housing choices for BUSN students is different from the overall UGA population.
Conducting a goodness-of-fit test using Analyze – Distribution in JMP
Using residuals as a follow-up analysis
Why is this follow-up necessary?
The test only tells us that at least one p is different from what is given. Which ones, and by how much?
The residuals show which individual categories have large differences between observed and expected counts.
A positive residual means the observed count is larger than expected.
A negative residual means the observed count is smaller than expected.
Values less than -2 or greater than 2 are unusual.
| Housing | UGA population | Observed counts | Expected counts | Deviation (obs - exp) | Standardized residual |
|---|---|---|---|---|---|
| Off-campus (not UGA-owned or affiliated) | 66% | 22 | 25.08 | -3.08 | -0.615 |
| Fraternity or Sorority housing | 5.5% | 14 | 2.09 | 11.91 | 8.238 |
| On-campus dorms or other UGA-owned or affiliated housing | 28.5% | 2 | 10.83 | -8.83 | -2.683 |
| Total | 100% | 38 | 38 | 0 | |
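The residual column can be reproduced by dividing each deviation by the square root of its expected count. The following Python sketch is illustrative only (the course computes these in JMP); the list names are assumptions, and the cutoff of 2 is the rule of thumb stated in these notes.

```python
import math

labels = ["Off-campus", "Fraternity/Sorority", "On-campus"]
observed = [22, 14, 2]
expected = [25.08, 2.09, 10.83]

# Standardized residual: (observed - expected) / sqrt(expected)
residuals = [(obs - exp) / math.sqrt(exp) for obs, exp in zip(observed, expected)]

for label, res in zip(labels, residuals):
    flag = "unusual" if abs(res) > 2 else "not unusual"
    print(f"{label:<20} residual = {res:7.3f}  ({flag})")
```

Two categories are flagged: far more students than expected live in Fraternity or Sorority housing, and fewer than expected live on campus.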
Checking conditions for a chi-squared test
1. Random: random selection means generalization to the population (our sample may not be representative of the BUSN 3000 population). Random assignment means causation.
2. Sample size large enough: expected counts must all be at least 5 (the sample size condition is not met because our smallest expected count is 2.09).
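The sample size condition is easy to check mechanically. A minimal Python sketch, using the expected counts from the housing example; the names `MIN_EXPECTED`, `too_small`, and `condition_met` are illustrative assumptions.

```python
# Expected counts from the housing example (38 students times each UGA proportion).
expected = {"off-campus": 25.08, "Greek": 2.09, "on-campus": 10.83}

MIN_EXPECTED = 5  # every expected count must be at least this large

# Collect the categories that violate the condition.
too_small = {name: count for name, count in expected.items() if count < MIN_EXPECTED}
condition_met = len(too_small) == 0

print("sample size condition met:", condition_met)
for name, count in too_small.items():
    print(f"  {name}: expected count {count} is below {MIN_EXPECTED}")
```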
Standardized residual formula: $\text{residual} = \dfrac{\text{observed count} - \text{expected count}}{\sqrt{\text{expected count}}}$
Two-way tables and segmented bar graphs
Example: Is more choice always better? A researcher set up a tasting booth for jams in a grocery store. She alternated the choice set: sometimes the booth would feature 6 different jams and sometimes 24.
Is there any evidence of a relationship between size of the choice set and whether the customer stopped?
p-hat small = 63/157 = 0.4013
p-hat large = 98/163 = 0.6012
There is a relationship between size of choice set and stopping in this sample.
Chi-squared tests for independence
To test the relationship between two categorical variables, we use the chi-squared test of independence.
$H_0$: size of choice set is not related to customers stopping in the population
$H_A$: size of choice set is related to customers stopping in the population
What counts would you expect if there were no relationship between size of the choice set and whether the customer stopped?
Overall = 161/320 = 50.31% stopped
50.31% of 157 = 78.99
expected count = (row total)(col total)/overall total
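The expected-count rule can be applied to every cell at once. This is a hedged Python sketch rather than the course's JMP workflow; the nested-list layout (rows are small and large choice sets, columns are stopped yes and no) and the variable names are assumptions, while the counts are those from the tasting-booth two-way table.

```python
# Rows: small choice set (6), large choice set (24). Columns: stopped yes, no.
observed = [
    [63, 94],
    [98, 65],
]

row_totals = [sum(row) for row in observed]        # [157, 163]
col_totals = [sum(col) for col in zip(*observed)]  # [161, 159]
total = sum(row_totals)                            # 320

# expected count = (row total)(col total) / overall total
expected = [[r * c / total for c in col_totals] for r in row_totals]

for name, row in zip(["small (6)", "large (24)"], expected):
    print(f"{name:<11} yes: {row[0]:.2f}   no: {row[1]:.2f}")
```

The first cell matches the hand calculation: 157 × 161 / 320 is the same as 50.31% of 157, about 78.99.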
Chi-squared test statistic and p-value
$$\chi^2 = \sum \frac{(\text{Observed count} - \text{Expected count})^2}{\text{Expected count}}$$
| Customer stopped? | Yes | No | Total |
|---|---|---|---|
| Small choice set (6) | 63 | 94 | 157 |
| Large choice set (24) | 98 | 65 | 163 |
| Total | 161 | 159 | 320 |
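Putting the pieces together for the tasting-booth table gives the test statistic and a p-value. The sketch below is an assumption-laden Python illustration, not the JMP output the course relies on. It uses the standard degrees-of-freedom rule for a two-way table, (rows - 1)(columns - 1), which these notes do not state explicitly, and the fact that for df = 1 the upper-tail chi-squared probability equals `erfc(sqrt(x / 2))`.

```python
import math

# Rows: small choice set (6), large choice set (24). Columns: stopped yes, no.
observed = [
    [63, 94],
    [98, 65],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Sum (observed - expected)^2 / expected over all four cells.
chi_sq = 0.0
for i, r in enumerate(row_totals):
    for j, c in enumerate(col_totals):
        exp = r * c / total
        chi_sq += (observed[i][j] - exp) ** 2 / exp

df = (len(row_totals) - 1) * (len(col_totals) - 1)  # (2 - 1)(2 - 1) = 1

# Closed-form upper tail for df = 1 only; other df values need a stats library.
p_value = math.erfc(math.sqrt(chi_sq / 2))

print(f"chi-squared = {chi_sq:.2f}, df = {df}, p-value = {p_value:.5f}")
```

Comparing the printed p-value with $\alpha = 0.05$ gives the decision for the test of independence.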
If the chances of stopping at the tasting booth were really the same for large and small choice sets, sample results like ours would be…
For $\alpha = 0.05$, state your conclusion in context.*
- There is sufficient / insufficient evidence to conclude that ________________
Conducting a test of independence using Analyze – Fit Y by X in JMP
Click the red arrow below the graph (next to Contingency Table) to show counts, percentages, etc.
* This isn’t the end of the story. Try using JMP to investigate how size of the choice set affects whether the customer ultimately purchased one of the jams.