STAT4610Project3
School: University of Denver
Course: 4610
Subject: Statistics
Date: Jun 12, 2024
Type: docx
Pages: 13
Uploaded by Deacon_Rose_Jackal10
STAT4610 Project 3
Statistical Tests
USE THE FOLLOWING SCENARIO AND DATA SET FOR PROBLEMS 1 AND 2.
As the Executive Vice President of Sales for ACME Nose Hair Trimmers Inc., your sales force is divided into four regions: NW, SW, NE, and SE. Your CFO and HR Director have determined that you can promote 400 of your sales force in the following year, and you allocate those promotions to your Regional VPs of Sales as follows: NW gets 80, SW gets 120, and NE and SE get 100 each. (This allocation is based on the relative sizes of the regional sales forces.)
Your employees identify themselves by hair color, and when the promotions have been completed, your HR Director presents you with a demographic break-down of regional promotions by hair color as follows:
                REGION
HAIR COLOR      NW    SW    NE    SE
Brunette        30    30    40    25
Brown           20    20    20    15
Blond           10    40    20    25
Red             10    20    10    15
Gray             5     0    10     5
Bald             5    10     0    15
1. You receive a number of complaints from your employees that this year’s promotions were not assigned fairly (in that some VPs favored different hair colors), so you decide to determine if the distribution of promotions differed by region. You conduct a hypothesis test to this effect.
a. State your null hypothesis for this test.
b. State your alternative hypothesis for this test.
c. Describe the Type I error for this test and what its implications are for ACME NHT Inc.
d. Describe the Type II error for this test and what its implications are for ACME NHT Inc.
e. What is the critical value of your test statistic if you are willing to reject your null hypothesis at the α = .05 level of significance? (Ensure you identify what type of statistic it is.)
f. What is the calculated value of your test statistic? (Show as much work as necessary to ensure partial credit if you are not confident of your answer.)
g. What is the p-value for this test? What does it represent?
h. What do you conclude about your hypothesis test, and why?
i. What action will you take as the Executive Vice President of Sales?
2. Because of further employee complaints, the ACLU becomes involved in a discrimination lawsuit against ACME NHT Inc., accusing you of discriminatory promotion practices within your sales force. You immediately ask your HR Director for the distribution of your total sales force by hair color (expressed as percentages). She provides you with the following data:
Brunette   Brown   Blond   Red    Gray   Bald
30%        20%     25%     12%    5%     8%
You decide to conduct a hypothesis test to determine if your total promotions follow the distribution of the demographics of your sales force.
a. State your null hypothesis.
b. State your alternative hypothesis.
c. Describe the Type I error for this test and what its implications are for ACME NHT Inc.
d. Describe the Type II error for this test and what its implications are for ACME NHT Inc.
e. Fill in the following table with your expected number of promotions for each hair color if H0 is true. (Recall you had a total of 400 promotions.)
Brunette   Brown   Blond   Red    Gray   Bald
f. What is the critical value of your test statistic if you are willing to reject your null hypothesis at the α = .10 level of significance? (Ensure you identify what type of statistic it is.)
g. What is the calculated value of your test statistic? (Show as much work as necessary to ensure partial credit if you are not confident of your answer.)
h. What is the p-value for this test? What does it represent?
i. What do you conclude about your hypothesis test, and why?
j. What action will you take as the Executive Vice President of Sales?
3. The ACME NHT Inc. Executive Vice President of Marketing is considering a national advertising campaign, and asks you if the averages of your quarterly sales differ significantly by region. You already have available your quarterly sales totals for each region (in millions of dollars). They are as follows:
           REGION
QUARTER    NW    SW    NE    SE
Q1          5     7     8     4
Q2          7     6     7     6
Q3          3     5     9     5
Q4          6     4     6     3
You decide to test if there is a statistical difference in your quarterly regional sales averages.
a. State your null hypothesis for this test.
b. State your alternative hypothesis for this test.
c. Describe the Type I error for this test and what its implications are for ACME NHT Inc.
d. Describe the Type II error for this test and what its implications are for ACME NHT Inc.
e. What is the critical value of your test statistic if you are willing to reject your null hypothesis at the α = .05 level of significance? (Ensure you identify what type of statistic it is.)
f. What is the calculated value of your test statistic? (Show as much work as necessary to ensure partial credit if you are not confident of your answer.)
g. What is the p-value for this test? What does it represent?
h. What do you conclude about your hypothesis test, and why?
i. What action will you take as the Executive Vice President of Sales?
j. Which two regions are most likely different from each other in sales performance?
4. In your attempt to “catch them all” you hunt the prized Pokemon, Mew and Mewtwo. As you battle them with your top 25 Pokemon, you lose every battle. However, after each battle, you notice that the hit points (HP) of Mew and Mewtwo have been reduced. The remaining HPs of Mew and Mewtwo are shown below after battling your respective Pokemon. (A higher HP remaining indicates more success in battle.) You suspect that Mewtwo is the tougher opponent, and you want to test this hypothesis at the 95% level of confidence.
Attacker      Mew    Mewtwo
Bulbasaur     402    377
Ivysaur       305    354
Venusaur      348    343
Charmander    350    341
Charmeleon    320    347
Charizard     340    401
Squirtle      396    406
Wartortle     311    364
Blastoise     367    381
Caterpie      309    402
Metapod       391    387
Butterfree    385    347
Weedle        390    349
Kakuna        313    402
Beedrill      396    403
Pidgey        394    389
Pidgeotto     329    395
Pidgeot       375    356
Rattata       364    340
Raticate      340    400
Spearow       365    405
Fearow        395    402
Ekans         365    377
Arbok         396    341
Pikachu       381    399
Assume that before each battle, Mew and Mewtwo started with the same number of HPs.
a. State the null hypotheses for this test.
b. State the alternative hypotheses for this test.
c. What type of test will you use to test this hypothesis?
d. Describe the Type I Error for this test, and explain its implication for you as you continue your quest.
e. Describe the Type II Error for this test, and explain its implication for you as you continue your quest.
f. What is the rejection region for this hypothesis test?
g. Calculate the test statistic for this hypothesis test. What is your conclusion?
h. If you reject the null hypothesis for this test, what is the probability that you have committed a Type I Error? (i.e., find the p-value for this test.)
i. What action will you take as you continue your quest based on your conclusion?
j. Which do you like better: Mew or Mewtwo?
5. You conduct an experiment where you collect the data in the table below. What population distribution does this sample come from? How confident are you of your conclusion?
14.79911   28.23858   22.7928    24.51667   22.50702
35.50089   21.75057   28.08752   23.47746   17.71201
26.95663   28.87379   30.59481   30.82701   24.77064
23.74223   25.52936   29.04185   36.87479   30.37362
27.93906   21.90754   26.42687   19.92638   22.93284
24.13246   20.7254    24.26107   18.39259   27.64909
34.6292    18.9683    13.42521   26.82254   31.90741
24.89723   27.22881   19.47299   29.76344   32.58799
16.85994   29.5746    27.5072    25.65619   15.6708
27.36716   27.79298   26.68962   22.06142   23.07119
31.60902   26.03893   32.98606   28.54943   20.34101
33.44006   26.55777   21.08564   23.87313   23.20798
29.21436   25.27635   28.70976   29.39177   22.21248
19.70519   23.34337   25.02365   16.32654   25.40277
22.36094   32.23156   22.65091   20.90823   33.97346
18.69098   19.22807   23.61038   30.16207   21.59024
28.39246   31.3317    26.16754   27.09202   20.13793
25.91088   20.53656   17.31394   21.42621   31.07193
24.64381   18.06844   26.29683   29.72477   24.00317
25.15      29.95899   21.25815   24.38912   25.78333
6. Going back to the home sales data for Delta County, CO (see your Project 1 work) you want to see when the market peaked and where it seemed to return to its previous levels. You consider the average sale price for the County for different years.
a. Can you determine if the market peaked in 2007 or 2008? (Is there a difference between the average prices in these two years?)
b. Can we conclude that the market had returned to 2006 levels in 2010 and 2011?
c. Did the market increase from 1995 to 2007? Are you sure?
7. For this problem, use the data file STAT4610Projecdt3ResidentialData2009. This file contains data on 75% of the homes sold in Colorado in 2009 (Sales2009 tab) and data on the listing real estate agents for those homes (Listing Broker Data tab). We are interested in investigating the differences in the “Big 5” Realtor Associations in Colorado (identified under the “Board” field):
Aurora Association of Realtors
Boulder County
Denver Metro Association of Realtors (Not Denver Metro Comm)
Douglas/Elbert Realtor Association
South Metro Denver Realtor Association
The data we are interested in is “Close Price,” i.e., the actual price the homes sold for.
Unfortunately, the Sales tab does not have the Realtor Association included, so the two files must be merged through a common key. (Fortunately, there is one!) The List Agent MLSID is a unique identifier for the listing agents, and it is included with the Sales data.
Filter the close price data by Association and answer the following questions.
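The merge described above can be sketched in plain Python. This is only an illustration under stated assumptions: the rows below are hypothetical stand-ins for the two spreadsheet tabs, and only the field names (List Agent MLSID, Board, Close Price) come from the problem statement.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows standing in for the "Listing Broker Data" tab.
listing_brokers = [
    {"List Agent MLSID": "A100", "Board": "Boulder County"},
    {"List Agent MLSID": "A200", "Board": "Aurora Association of Realtors"},
]

# Hypothetical rows standing in for the "Sales2009" tab.
sales_2009 = [
    {"List Agent MLSID": "A100", "Close Price": 350000},
    {"List Agent MLSID": "A100", "Close Price": 450000},
    {"List Agent MLSID": "A200", "Close Price": 200000},
    {"List Agent MLSID": "Z999", "Close Price": 999999},  # agent with no broker record
]

# Build a lookup from the common key to the Association (Board).
board_by_agent = {row["List Agent MLSID"]: row["Board"] for row in listing_brokers}

# Attach each sale to its Association and collect the close prices.
prices_by_board = defaultdict(list)
for sale in sales_2009:
    board = board_by_agent.get(sale["List Agent MLSID"])
    if board is not None:  # skip sales whose agent cannot be matched
        prices_by_board[board].append(sale["Close Price"])

average_close_price = {board: mean(prices) for board, prices in prices_by_board.items()}
print(average_close_price)
```

Once the close prices are grouped by Association, the grouped lists are the inputs a comparison of means would need.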
a. What are the average closing prices for each Association in 2009?
b. Are any of the averages different from the rest? (Justify your answer with an appropriate test and a p-value.)
c. Is there a difference between Boulder and South Metro? (Justify your answer with an appropriate test and a p-value.)
d. Is there a difference between Aurora and Douglas/Elbert? (Justify your answer with an appropriate test and a p-value.)
e. Is there a difference between South Metro and Denver Metro? (Justify your answer with an appropriate test and a p-value.)
f. Is there a difference between Boulder and Douglas/Elbert? (Justify your answer with an appropriate test and a p-value.)
1/
a. The null hypothesis (H0): The distribution of promotions across hair colors is the same for all regions.
b. The alternative hypothesis (H1): The distribution of promotions across hair colors differs for at least one region.
c. The Type I error in this test would be rejecting the null hypothesis when it is actually true, meaning concluding that the distribution of promotions differs across regions when, in fact, it does not. The implication of this error for ACME NHT Inc. would be taking unnecessary actions or making changes to the promotion process, which could lead to additional costs and potential legal issues.
d. The Type II error in this test would be failing to reject the null hypothesis when it is false, meaning concluding that the distribution of promotions is the same across regions when, in reality, it differs. The implication of this error for ACME NHT Inc. would be failing to address any potential discrimination or unfair promotion practices, which could lead to legal issues, employee dissatisfaction, and potential fines or penalties.
e. The appropriate test statistic is the chi-square statistic for a test of independence.
α = 0.05 and df = (6-1)*(4-1) = 15
The critical value is 24.996
f. Calculated Test Statistic
NW:
Brunette: (80 * 125) / 400 = 25
Brown: (80 * 75) / 400 = 15
Blond: (80 * 95) / 400 = 19
Red: (80 * 55) / 400 = 11
Gray: (80 * 20) / 400 = 4
Bald: (80 * 30) / 400 = 6
SW:
Brunette: (120 * 125) / 400 = 37.5
Brown: (120 * 75) / 400 = 22.5
Blond: (120 * 95) / 400 = 28.5
Red: (120 * 55) / 400 = 16.5
Gray: (120 * 20) / 400 = 6
Bald: (120 * 30) / 400 = 9
NE:
Brunette: (100 * 125) / 400 = 31.25
Brown: (100 * 75) / 400 = 18.75
Blond: (100 * 95) / 400 = 23.75
Red: (100 * 55) / 400 = 13.75
Gray: (100 * 20) / 400 = 5
Bald: (100 * 30) / 400 = 7.5
SE:
Brunette: (100 * 125) / 400 = 31.25
Brown: (100 * 75) / 400 = 18.75
Blond: (100 * 95) / 400 = 23.75
Red: (100 * 55) / 400 = 13.75
Gray: (100 * 20) / 400 = 5
Bald: (100 * 30) / 400 = 7.5
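Using the formula: X² = Σ((Observed - Expected)^2 / Expected)
Summing the cell contributions by region gives NW = 7.44, SW = 13.27, NE = 16.65, and SE = 9.68.
The calculated chi-square statistic is: 47.04
g. With 15 degrees of freedom and a chi-square statistic of 47.04, the p-value is less than 0.001. It represents the probability of observing a statistic at least this large if the distribution of promotions across hair colors were the same in every region.
h. Since the calculated statistic (47.04) exceeds the critical value (24.996), and equivalently the p-value is below the significance level of α = 0.05, we reject the null hypothesis and conclude that the distribution of promotions across hair colors differs for at least one region.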
i. I will investigate the promotion procedures further and take appropriate action to address any potential discrimination or unfair practices. This may include retraining regional VPs, setting stronger criteria, or performing more audits.
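As a check on part f, the statistic can be recomputed directly from the promotion table in the problem statement. The sketch below is a minimal pure-Python version of the chi-square test of independence; the function and variable names are illustrative, and the critical value is the table value quoted in part e.

```python
# Observed promotions: one row per hair color, columns are NW, SW, NE, SE.
observed = {
    "Brunette": [30, 30, 40, 25],
    "Brown":    [20, 20, 20, 15],
    "Blond":    [10, 40, 20, 25],
    "Red":      [10, 20, 10, 15],
    "Gray":     [5, 0, 10, 5],
    "Bald":     [5, 10, 0, 15],
}

def chi_square_independence(table):
    """Return (statistic, degrees of freedom) for a two-way table of counts."""
    rows = list(table.values())
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for r, row in enumerate(rows):
        for c, obs in enumerate(row):
            # Expected count under independence: row total * column total / n.
            expected = row_totals[r] * col_totals[c] / grand_total
            statistic += (obs - expected) ** 2 / expected
    df = (len(rows) - 1) * (len(col_totals) - 1)
    return statistic, df

chi2, df = chi_square_independence(observed)
CRITICAL_05_DF15 = 24.996  # chi-square table value for alpha = 0.05, df = 15
print(f"chi-square = {chi2:.2f}, df = {df}, reject H0: {chi2 > CRITICAL_05_DF15}")
```

Working from row and column totals rather than hard-coded expected counts keeps the sketch reusable for any table of counts.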
2/
a. The null hypothesis (H0): The distribution of promotions across hair colors follows the demographic distribution of the sales force.
b. The alternative hypothesis (H1): The distribution of promotions across hair colors does not follow the demographic distribution of the sales force.
c. The Type I error in this test would be rejecting the null hypothesis when it is actually true, meaning concluding that the distribution of promotions differs from the demographic distribution when, in fact, it does not. The implication of this error for ACME NHT Inc. would be taking unnecessary actions or making changes to the promotion process, which could lead to additional costs and potential legal issues.
d. The Type II error in this test would be failing to reject the null hypothesis when it is false, meaning concluding that the distribution of promotions follows the demographic distribution when, in reality, it does not. The implication of this error for ACME NHT Inc. would be failing to address any potential discrimination or unfair promotion practices, which could lead to legal issues, employee dissatisfaction, and potential fines or penalties.
e.
Brunette   Brown   Blond   Red    Gray   Bald
120        80      100     48     20     32
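f. The appropriate test statistic is the chi-square statistic for a goodness-of-fit test. With six hair-color categories, the degrees of freedom are 6 - 1 = 5, not the 15 used in question 1.
α = 0.10 and df = 5
The critical value is 9.236
g. The statistic is not the same as in question 1. Here the observed counts are the total promotions by hair color (125, 75, 95, 55, 20, 30), compared with the expected counts from part e:
X² = 5²/120 + (-5)²/80 + (-5)²/100 + 7²/48 + 0²/20 + (-2)²/32 = 1.92
h. With 5 degrees of freedom and a chi-square statistic of 1.92, the p-value is approximately 0.86 (using a chi-square distribution table or statistical software). It represents the probability of seeing promotion totals at least this far from the expected counts if promotions really follow the sales-force demographics.
i. Since the p-value of 0.86 is greater than the significance level of α = 0.10 (equivalently, 1.92 is below the critical value of 9.236), we fail to reject the null hypothesis. We do not have sufficient evidence to conclude that the distribution of promotions does not follow the demographic distribution of the sales force.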
j. As the Executive Vice President of Sales, while I may not have strong statistical evidence of discrimination in the promotion process, it is still advisable to address the ACLU's concerns and ensure fair promotion practices going forward. This could involve reviewing the promotion guidelines, providing additional training to the regional VPs, and implementing regular audits to monitor the distribution of promotions across different demographics.
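The goodness-of-fit calculation for question 2 can be sketched the same way. The observed totals below are the row sums of the promotion table in the problem statement, and the shares are the percentages supplied by the HR Director; the variable names are illustrative only.

```python
# Total promotions by hair color (row sums of the regional promotion table).
observed_totals = {
    "Brunette": 125, "Brown": 75, "Blond": 95,
    "Red": 55, "Gray": 20, "Bald": 30,
}

# Share of the total sales force by hair color, as given in the problem.
workforce_share = {
    "Brunette": 0.30, "Brown": 0.20, "Blond": 0.25,
    "Red": 0.12, "Gray": 0.05, "Bald": 0.08,
}

total_promotions = sum(observed_totals.values())

# Expected promotions if H0 is true: total promotions * workforce share.
expected_totals = {
    color: total_promotions * share for color, share in workforce_share.items()
}

gof_chi2 = sum(
    (observed_totals[color] - expected_totals[color]) ** 2 / expected_totals[color]
    for color in observed_totals
)
gof_df = len(observed_totals) - 1

CRITICAL_10_DF5 = 9.236  # chi-square table value for alpha = 0.10, df = 5
print(f"chi-square = {gof_chi2:.2f}, df = {gof_df}, reject H0: {gof_chi2 > CRITICAL_10_DF5}")
```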
3/
a. The null hypothesis (H0): There is no difference in the average quarterly sales across the four regions.
b. The alternative hypothesis (H1): There is a difference in the average quarterly sales for at least one region.
c. Type I error: Rejecting H0 when it is true. Implications: The company might wrongly conclude that there are differences in sales averages between regions, potentially leading to unnecessary strategic changes and resource redistribution.
d. Type II error: Failing to reject H0 when it is false. Implications: ACME NHT Inc. might overlook actual regional discrepancies in sales performance, missing the opportunity to address underperforming areas or to capitalize on successful strategies.
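e. The appropriate test statistic is the F statistic from a one-way ANOVA, since the test compares means across four groups (chi-square tests apply to counts, not means).
α = 0.05 and df = (4 - 1, 16 - 4) = (3, 12)
The critical value is 3.49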
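For part f, a one-way ANOVA F statistic can be computed from the quarterly sales table in the problem statement. The sketch below assumes that one-way ANOVA is the intended test and treats the four quarters as replicate observations for each region; the function name is illustrative, and the critical value is the standard F table value for (3, 12) degrees of freedom at α = 0.05.

```python
# Quarterly sales (millions of dollars) by region, Q1 through Q4.
sales = {
    "NW": [5, 7, 3, 6],
    "SW": [7, 6, 5, 4],
    "NE": [8, 7, 9, 6],
    "SE": [4, 6, 5, 3],
}

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [x for group in groups for x in group]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(group) / len(group) for group in groups]
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(
        len(group) * (m - grand_mean) ** 2 for group, m in zip(groups, group_means)
    )
    # Within-group sum of squares: spread of observations around their group mean.
    ss_within = sum(
        (x - m) ** 2 for group, m in zip(groups, group_means) for x in group
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

f_stat, df_between, df_within = one_way_anova(list(sales.values()))
CRITICAL_F_05 = 3.49  # F table value for alpha = 0.05 with (3, 12) df
print(f"F = {f_stat:.2f}, df = ({df_between}, {df_within}), reject H0: {f_stat > CRITICAL_F_05}")
```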
4/
a. The null hypothesis (H0) for this test is: There is no difference in the remaining hit points (HP) of Mew and Mewtwo after battling with the same Pokemon.
b. The alternative hypothesis (H1): Mewtwo has higher remaining hit points than Mew after battling with the same Pokemon.
c. Since we are comparing the means of two paired samples (Mew and Mewtwo battling against the same Pokemon), the appropriate test to use is the paired t-test.
d. The Type I error in this test would be rejecting the null hypothesis when it is true, meaning concluding that Mewtwo is tougher (has higher remaining HP) when, in fact, there is no difference between Mew and Mewtwo. The implication of this error would be to unnecessarily focus the efforts on defeating Mewtwo, potentially leading to wasted resources and time.
e. The Type II error in this test would be failing to reject the null hypothesis when it is false, meaning concluding that there is no difference between Mew and Mewtwo when, in fact, Mewtwo is tougher. The implication of this error would be to underestimate Mewtwo's strength and fail to prepare adequately for battling it, potentially leading to a more challenging and prolonged quest.
f. For a paired t-test with α = 0.05 (95% confidence level) and 24 degrees of freedom (since there are 25 pairs of observations), the alternative hypothesis is one-sided, so the rejection region lies in one tail. Defining each difference as Mewtwo HP minus Mew HP, the rejection region is t > 1.711. If the calculated test statistic exceeds 1.711, we reject the null hypothesis.
g. To calculate the test statistic, we need to find the differences between the remaining HP of Mewtwo and Mew for each pair of observations, calculate the mean and standard deviation of these differences, and then compute the t-statistic using the formula:
t = (mean of differences) / (standard deviation of differences / sqrt(n))
Calculating the differences:
377 - 402 = -25
354 - 305 = 49
... (continue for all 25 pairs)
Sum of differences = 381
Mean of differences = 381 / 25 = 15.24
Standard deviation of differences = 40.16
n = 25
t = 15.24 / (40.16 / sqrt(25)) = 1.90
Since the calculated t-statistic (1.90) falls in the rejection region (t > 1.711), we reject the null hypothesis. The data support the claim that Mewtwo has higher remaining hit points than Mew after battling with the same Pokemon.
h. The p-value is the probability of obtaining a test statistic at least as extreme as the calculated value under the assumption that the null hypothesis is true. For a one-tailed test with 24 degrees of freedom, a t-statistic of 1.90 lies between the table values 1.711 (p = 0.05) and 2.064 (p = 0.025), so the p-value is between 0.025 and 0.05, roughly 0.035 (using a t-distribution table or statistical software).
i. Based on the conclusion of rejecting the null hypothesis, you should treat Mewtwo as the tougher opponent and prepare for it accordingly, while still not underestimating Mew.
j. The statistical analysis indicates that Mewtwo is the tougher opponent in terms of remaining hit points, but that does not settle which one to like better. Your preference between Mew and Mewtwo may depend on other factors, such as their overall abilities, appearance, or personal preference.
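The paired t-statistic can be recomputed from the full HP table in the problem statement. The sketch below uses only the standard library, takes each difference as Mewtwo minus Mew, and compares against the one-tailed table value for 24 degrees of freedom; the variable names are illustrative.

```python
import math
from statistics import mean, stdev

# Remaining HP after each of the 25 battles, in the order listed in the table.
mew = [402, 305, 348, 350, 320, 340, 396, 311, 367, 309, 391, 385, 390,
       313, 396, 394, 329, 375, 364, 340, 365, 395, 365, 396, 381]
mewtwo = [377, 354, 343, 341, 347, 401, 406, 364, 381, 402, 387, 347, 349,
          402, 403, 389, 395, 356, 340, 400, 405, 402, 377, 341, 399]

# Paired differences: positive values mean Mewtwo kept more HP than Mew.
diffs = [b - a for a, b in zip(mew, mewtwo)]

n = len(diffs)
mean_diff = mean(diffs)
sd_diff = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
t_stat = mean_diff / (sd_diff / math.sqrt(n))

CRITICAL_T_05_DF24 = 1.711  # one-tailed t table value for alpha = 0.05, df = 24
print(f"mean diff = {mean_diff:.2f}, sd = {sd_diff:.2f}, t = {t_stat:.3f}")
print(f"reject H0 (one-tailed): {t_stat > CRITICAL_T_05_DF24}")
```

Pairing by attacker removes the variation between attacking Pokemon, which is why the test works on the differences rather than on the two columns separately.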
5/
The sample is likely drawn from a normal population distribution.
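One informal way to back up a claim of normality is the empirical rule: for normal data, roughly 68% of values fall within one standard deviation of the mean and roughly 95% within two. The sketch below applies that check to the 100 sample values. It assumes that each stray single digit in the extracted data table is the final decimal of the value before it, so the list is a reconstruction rather than a verified copy of the data, and the check is a rough screen, not a formal test.

```python
from statistics import mean, stdev

# Sample values, reconstructed under the assumption stated above.
sample = [
    14.79911, 28.23858, 22.7928, 24.51667, 22.50702,
    35.50089, 21.75057, 28.08752, 23.47746, 17.71201,
    26.95663, 28.87379, 30.59481, 30.82701, 24.77064,
    23.74223, 25.52936, 29.04185, 36.87479, 30.37362,
    27.93906, 21.90754, 26.42687, 19.92638, 22.93284,
    24.13246, 20.7254, 24.26107, 18.39259, 27.64909,
    34.6292, 18.9683, 13.42521, 26.82254, 31.90741,
    24.89723, 27.22881, 19.47299, 29.76344, 32.58799,
    16.85994, 29.5746, 27.5072, 25.65619, 15.6708,
    27.36716, 27.79298, 26.68962, 22.06142, 23.07119,
    31.60902, 26.03893, 32.98606, 28.54943, 20.34101,
    33.44006, 26.55777, 21.08564, 23.87313, 23.20798,
    29.21436, 25.27635, 28.70976, 29.39177, 22.21248,
    19.70519, 23.34337, 25.02365, 16.32654, 25.40277,
    22.36094, 32.23156, 22.65091, 20.90823, 33.97346,
    18.69098, 19.22807, 23.61038, 30.16207, 21.59024,
    28.39246, 31.3317, 26.16754, 27.09202, 20.13793,
    25.91088, 20.53656, 17.31394, 21.42621, 31.07193,
    24.64381, 18.06844, 26.29683, 29.72477, 24.00317,
    25.15, 29.95899, 21.25815, 24.38912, 25.78333,
]

sample_mean = mean(sample)
sample_sd = stdev(sample)

def share_within(k):
    """Fraction of the sample within k standard deviations of the mean."""
    return sum(abs(x - sample_mean) <= k * sample_sd for x in sample) / len(sample)

within_1sd = share_within(1)
within_2sd = share_within(2)
print(f"n = {len(sample)}, mean = {sample_mean:.2f}, sd = {sample_sd:.2f}")
print(f"within 1 sd: {within_1sd:.0%} (normal: about 68%)")
print(f"within 2 sd: {within_2sd:.0%} (normal: about 95%)")
```

A formal answer to "how confident are you" would still need a goodness-of-fit or normality test with a p-value.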
6/
[Figure: Average Monthly Price of Homes Sold in Delta County, CO 1995-2012 (x-axis: 1995 to 2012; y-axis: 0 to 250000)]
6b/
H0: In 2010 and 2011, the market had returned to 2006 levels.
H1: In 2010 and 2011, the market had NOT returned to 2006 levels.
Based on the p-value, we reject the null hypothesis and conclude that the market had NOT returned to 2006 levels.
6c/
Year    Average of s_sale_price
1995    75946.46119
1996    77478.28514
1997    90950.79681
1998    93326.7446
1999    94487.65132
2000    102757.5113
2001    113275.8656
2002    120729.6247
2003    129151.901
2004    138907.4851
2005    153896.0876
2006    171954.1795
2007    188623.725
[Figure: Average Monthly Price of Homes Sold in Delta County, CO 1995-2012 (x-axis: 1995 to 2012; y-axis: 0 to 200000)]
Given the average sale prices and the graph, the market increased from 1995 to 2007: the yearly average rose every year, from about 75,946 in 1995 to about 188,624 in 2007.