STAT4610Project3

docx

School

University Of Denver

Course

4610

Subject

Statistics

Date

Jun 12, 2024

Type

docx

Pages

13

Uploaded by Deacon_Rose_Jackal10

STAT4610 Project 3 Statistical Tests

USE THE FOLLOWING SCENARIO AND DATA SET FOR PROBLEMS 1 AND 2.

As the Executive Vice President of Sales for ACME Nose Hair Trimmers Inc., your sales force is divided into four regions: NW, SW, NE, and SE. Your CFO and HR Director have determined that you can promote 400 of your sales force in the following year, and you allocate those promotions to your Regional VPs of Sales as follows: NW gets 80, SW gets 120, and NE and SE get 100 each. (This allocation is based on the relative sizes of the regional sales forces.) Your employees identify themselves by hair color, and when the promotions have been completed, your HR Director presents you with a demographic breakdown of regional promotions by hair color as follows:

                  REGION
HAIR COLOR    NW    SW    NE    SE
Brunette      30    30    40    25
Brown         20    20    20    15
Blond         10    40    20    25
Red           10    20    10    15
Gray           5     0    10     5
Bald           5    10     0    15

1. You receive a number of complaints from your employees that this year’s promotions were not assigned fairly (in that some VPs favored different hair colors), so you decide to determine if the distribution of promotions differed by region. You conduct a hypothesis test to this effect.
a. State your null hypothesis for this test.
b. State your alternative hypothesis for this test.
c. Describe the Type I error for this test and what its implications are for ACME NHT Inc.
d. Describe the Type II error for this test and what its implications are for ACME NHT Inc.
e. What is the critical value of your test statistic if you are willing to reject your null hypothesis at the α = .05 level of significance? (Ensure you identify what type of statistic it is.)
f. What is the calculated value of your test statistic? (Show as much work as necessary to ensure partial credit if you are not confident of your answer.)
g. What is the p-value for this test? What does it represent?
h. What do you conclude about your hypothesis test, and why?
i. What action will you take as the Executive Vice President of Sales?

2. Because of further employee complaints, the ACLU becomes involved in a discrimination lawsuit against ACME NHT Inc., accusing you of discriminatory promotion practices within your sales force. You immediately ask your HR Director for the distribution of your total sales force by hair color (expressed as percentages). She provides you with the following data:

Brunette    Brown    Blond    Red    Gray    Bald
30%         20%      25%      12%    5%      8%

You decide to conduct a hypothesis test to determine if your total promotions follow the distribution of the demographics of your sales force.
a. State your null hypothesis.
b. State your alternative hypothesis.
c. Describe the Type I error for this test and what its implications are for ACME NHT Inc.
d. Describe the Type II error for this test and what its implications are for ACME NHT Inc.
e. Fill in the following table with your expected number of promotions for each hair color if H0 is true. (Recall you had a total of 400 promotions.)

Brunette    Brown    Blond    Red    Gray    Bald


f. What is the critical value of your test statistic if you are willing to reject your null hypothesis at the α = .10 level of significance? (Ensure you identify what type of statistic it is.)
g. What is the calculated value of your test statistic? (Show as much work as necessary to ensure partial credit if you are not confident of your answer.)
h. What is the p-value for this test? What does it represent?
i. What do you conclude about your hypothesis test, and why?
j. What action will you take as the Executive Vice President of Sales?

3. The ACME NHT Inc. Executive Vice President of Marketing is considering a national advertising campaign, and asks you if the averages of your quarterly sales differ significantly by region. You already have available your quarterly sales totals for each region (in millions of dollars). They are as follows:

           REGION
       NW    SW    NE    SE
Q1      5     7     8     4
Q2      7     6     7     6
Q3      3     5     9     5
Q4      6     4     6     3

You decide to test if there is a statistical difference in your quarterly regional sales averages.
a. State your null hypothesis for this test.
b. State your alternative hypothesis for this test.
c. Describe the Type I error for this test and what its implications are for ACME NHT Inc.
d. Describe the Type II error for this test and what its implications are for ACME NHT Inc.
e. What is the critical value of your test statistic if you are willing to reject your null hypothesis at the α = .05 level of significance? (Ensure you identify what type of statistic it is.)
f. What is the calculated value of your test statistic? (Show as much work as necessary to ensure partial credit if you are not confident of your answer.)
g. What is the p-value for this test? What does it represent?
h. What do you conclude about your hypothesis test, and why?
i. What action will you take as the Executive Vice President of Sales?
j. Which two regions are most likely different from each other in sales performance?

4. In your attempt to “catch them all” you hunt the prized Pokemon, Mew and Mewtwo. As you battle them with your top 25 Pokemon, you lose every battle. However, after each battle, you notice that the hit points (HP) of Mew and Mewtwo have been reduced. The remaining HPs of Mew and Mewtwo are shown below after battling your respective Pokemon. (A higher HP remaining indicates more success in battle.) You suspect that Mewtwo is the tougher opponent, and you want to test this hypothesis at the 95% level of confidence.

Attacker      Mew    Mewtwo
Bulbasaur     402    377
Ivysaur       305    354
Venusaur      348    343
Charmander    350    341
Charmeleon    320    347
Charizard     340    401
Squirtle      396    406
Wartortle     311    364
Blastoise     367    381
Caterpie      309    402
Metapod       391    387
Butterfree    385    347
Weedle        390    349
Kakuna        313    402
Beedrill      396    403
Pidgey        394    389
Pidgeotto     329    395
Pidgeot       375    356
Rattata       364    340
Raticate      340    400
Spearow       365    405
Fearow        395    402
Ekans         365    377
Arbok         396    341
Pikachu       381    399

Assume that before each battle, Mew and Mewtwo started with the same number of HPs.
a. State the null hypotheses for this test.
b. State the alternative hypotheses for this test.
c. What type of test will you use to test this hypothesis?
d. Describe the Type I Error for this test, and explain its implication for you as you continue your quest.
e. Describe the Type II Error for this test, and explain its implication for you as you continue your quest.
f. What is the rejection region for this hypothesis test?
g. Calculate the test statistic for this hypothesis test. What is your conclusion?
h. If you reject the null hypothesis for this test, what is the probability that you have committed a Type I Error? (i.e., find the p-value for this test.)
i. What action will you take as you continue your quest based on your conclusion?
j. Which do you like better: Mew or Mewtwo?

5. You conduct an experiment where you collect the data in the table below. What population distribution does this sample come from? How confident are you of your conclusion?

14.79911    28.23858    22.7928     24.51667    22.50702
35.50089    21.75057    28.08752    23.47746    17.71201
26.95663    28.87379    30.59481    30.82701    24.77064
23.74223    25.52936    29.04185    36.87479    30.37362
27.93906    21.90754    26.42687    19.92638    22.93284
24.13246    20.7254     24.26107    18.39259    27.64909
34.6292     18.9683     13.42521    26.82254    31.90741
24.89723    27.22881    19.47299    29.76344    32.58799
16.85994    29.5746     27.5072     25.65619    15.6708
27.36716    27.79298    26.68962    22.06142    23.07119
31.60902    26.03893    32.98606    28.54943    20.34101
33.44006    26.55777    21.08564    23.87313    23.20798
29.21436    25.27635    28.70976    29.39177    22.21248
19.70519    23.34337    25.02365    16.32654    25.40277
22.36094    32.23156    22.65091    20.90823    33.97346
18.69098    19.22807    23.61038    30.16207    21.59024
28.39246    31.3317     26.16754    27.09202    20.13793
25.91088    20.53656    17.31394    21.42621    31.07193
24.64381    18.06844    26.29683    29.72477    24.00317
25.15       29.95899    21.25815    24.38912    25.78333

6. Going back to the home sales data for Delta County, CO (see your Project 1 work) you want to see when the market peaked and where it seemed to return to its previous levels. You consider the average sale price for the County for different years.
a. Can you determine if the market peaked in 2007 or 2008? (Is there a difference between the average prices in these two years?)
b. Can we conclude that in 2010 and 2011 the market had returned to 2006 levels?
c. Did the market increase from 1995 to 2007? Are you sure?

7. For this problem, use the data file STAT4610Projecdt3ResidentialData2009. This file contains data on 75% of the homes sold in Colorado in 2009 (Sales2009 tab) and data on the listing real estate agents for those homes (Listing Broker Data tab). We are interested in investigating the differences in the “Big 5” Realtor Associations in Colorado (identified under the “Board” field):
Aurora Association of Realtors
Boulder County
Denver Metro Association of Realtors (Not Denver Metro Comm)
Douglas/Elbert Realtor Association
South Metro Denver Realtor Association
The data we are interested in is “Close Price,” i.e., the actual price the homes sold for. Unfortunately, the Sales tab does not have the Realtor Association included, so the two files must be merged through a common key. (Fortunately, there is one!) The List Agent MLSID is a unique identifier for the listing agents, and it is included with the Sales data. Filter the close price data by Association and answer the following questions.
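The merge described above amounts to a lookup from List Agent MLSID to Board, followed by grouping Close Price by Board. A minimal Python sketch of that idea, assuming each tab has been loaded as a list of row dictionaries; the sample rows and the helper name close_prices_by_board are made up for illustration, and only the column names come from the problem statement:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows standing in for the two tabs. Only the column names
# "List Agent MLSID", "Board", and "Close Price" come from the assignment.
brokers = [
    {"List Agent MLSID": "A100", "Board": "Boulder County"},
    {"List Agent MLSID": "A200", "Board": "Aurora Association of Realtors"},
]
sales = [
    {"List Agent MLSID": "A100", "Close Price": 410000.0},
    {"List Agent MLSID": "A100", "Close Price": 390000.0},
    {"List Agent MLSID": "A200", "Close Price": 180000.0},
    {"List Agent MLSID": "Z999", "Close Price": 250000.0},  # agent missing from broker tab
]


def close_prices_by_board(sales_rows, broker_rows):
    """Join sales to brokers on List Agent MLSID and group Close Price by Board."""
    board_of = {b["List Agent MLSID"]: b["Board"] for b in broker_rows}
    grouped = defaultdict(list)
    for row in sales_rows:
        board = board_of.get(row["List Agent MLSID"])
        if board is not None:  # skip sales whose agent has no broker record
            grouped[board].append(row["Close Price"])
    return dict(grouped)


prices = close_prices_by_board(sales, brokers)
averages = {board: mean(values) for board, values in prices.items()}
```

The per-Association price lists in prices are the inputs the later comparisons need; sales with no matching broker record are dropped here, which is a choice worth checking against the real file.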
a. What are the average closing prices for each Association in 2009?
b. Are any of the averages different from the rest? (Justify your answer with an appropriate test and a p-value.)
c. Is there a difference between Boulder and South Metro? (Justify your answer with an appropriate test and a p-value.)
d. Is there a difference between Aurora and Douglas/Elbert? (Justify your answer with an appropriate test and a p-value.)
e. Is there a difference between South Metro and Denver Metro? (Justify your answer with an appropriate test and a p-value.)
f. Is there a difference between Boulder and Douglas/Elbert? (Justify your answer with an appropriate test and a p-value.)

1/
a. The null hypothesis (H0): The distribution of promotions across hair colors is the same for all regions.
b. The alternative hypothesis (H1): The distribution of promotions across hair colors differs for at least one region.
c. The Type I error in this test would be rejecting the null hypothesis when it is actually true, meaning concluding that the distribution of promotions differs across regions when, in fact, it does not. The implication of this error for ACME NHT Inc. would be taking unnecessary actions or making changes to the promotion process, which could lead to additional costs and potential legal issues.
d. The Type II error in this test would be failing to reject the null hypothesis when it is false, meaning concluding that the distribution of promotions is the same across regions when, in reality, it differs. The implication of this error for ACME NHT Inc. would be failing to address any potential discrimination or unfair promotion practices, which could lead to legal issues, employee dissatisfaction, and potential fines or penalties.
e. The appropriate test statistic is the chi-square statistic (test of independence). α = 0.05 and df = (6-1)*(4-1) = 15. The critical value is 24.996.
f.
Calculated Test Statistic

Each expected count is (region total * hair color total) / 400, e.g. NW Brunette: (80 * 125) / 400 = 25. The hair color totals are Brunette 125, Brown 75, Blond 95, Red 55, Gray 20, and Bald 30.

Expected      NW      SW      NE       SE
Brunette      25      37.5    31.25    31.25
Brown         15      22.5    18.75    18.75
Blond         19      28.5    23.75    23.75
Red           11      16.5    13.75    13.75
Gray           4       6       5        5
Bald           6       9       7.5      7.5

Using the formula: X² = Σ((Observed - Expected)^2 / Expected), the regional contributions are NW 7.44, SW 13.27, NE 16.65, and SE 9.68.
The calculated chi-square statistic is: 47.04
g. With 15 degrees of freedom and a chi-square statistic of 47.04, the p-value is less than 0.001 (the critical value at the 0.001 level is 37.697). It is the probability of obtaining a statistic at least this large if the distribution of promotions were the same in every region.
h. Since the p-value is below the significance level of α = 0.05 (equivalently, 47.04 exceeds the critical value of 24.996), we reject the null hypothesis and conclude that the distribution of promotions across hair colors differs for at least one region.
i. Investigate the promotion procedures further and take appropriate action to address any potential discrimination or unfair practices. This may include retraining regional VPs, setting stronger criteria, or performing more audits.

2/
a. The null hypothesis (H0): The distribution of promotions across hair colors follows the demographic distribution of the sales force.
b. The alternative hypothesis (H1): The distribution of promotions across hair colors does not follow the demographic distribution of the sales force.
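The statistic X² = Σ((Observed - Expected)^2 / Expected) used in 1/ applies equally to the goodness-of-fit test set up by the hypotheses in 2/; only the expected counts and the degrees of freedom change. A minimal Python sketch for recomputing both statistics from the data in the problem statement, offered as a check on the hand calculation; the function and variable names are illustrative:

```python
def expected_counts(observed):
    """Expected cell counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over paired observed and expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Rows: Brunette, Brown, Blond, Red, Gray, Bald. Columns: NW, SW, NE, SE.
promotions = [
    [30, 30, 40, 25],
    [20, 20, 20, 15],
    [10, 40, 20, 25],
    [10, 20, 10, 15],
    [5, 0, 10, 5],
    [5, 10, 0, 15],
]

# Problem 1: test of independence on the full 6 x 4 table.
expected = expected_counts(promotions)
independence_stat = chi_square(
    [o for row in promotions for o in row],
    [e for row in expected for e in row],
)
independence_df = (len(promotions) - 1) * (len(promotions[0]) - 1)

# Problem 2: goodness of fit of total promotions to the sales force percentages.
shares = [0.30, 0.20, 0.25, 0.12, 0.05, 0.08]
observed_totals = [sum(row) for row in promotions]
gof_expected = [sum(observed_totals) * s for s in shares]
gof_stat = chi_square(observed_totals, gof_expected)
gof_df = len(shares) - 1
```

Each statistic is then compared with the chi-square critical value for its own degrees of freedom.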
c. The Type I error in this test would be rejecting the null hypothesis when it is actually true, meaning concluding that the distribution of promotions differs from the demographic distribution when, in fact, it does not. The implication of this error for ACME NHT Inc. would be taking unnecessary actions or making changes to the promotion process, which could lead to additional costs and potential legal issues.
d. The Type II error in this test would be failing to reject the null hypothesis when it is false, meaning concluding that the distribution of promotions follows the demographic distribution when, in reality, it does not. The implication of this error for ACME NHT Inc. would be failing to address any potential discrimination or unfair promotion practices, which could lead to legal issues, employee dissatisfaction, and potential fines or penalties.
e.

Brunette    Brown    Blond    Red    Gray    Bald
120         80       100      48     20      32

f. The appropriate test statistic is the chi-square statistic (goodness-of-fit test). The degrees of freedom are not the same as in question 1: with 6 hair color categories, df = 6 - 1 = 5. At α = 0.10 the critical value is 9.236.
g. The statistic is not the same as in question 1. Using the observed promotion totals (125, 75, 95, 55, 20, 30) and the expected counts from part e:
X² = (125 - 120)^2/120 + (75 - 80)^2/80 + (95 - 100)^2/100 + (55 - 48)^2/48 + (20 - 20)^2/20 + (30 - 32)^2/32 = 0.208 + 0.313 + 0.250 + 1.021 + 0 + 0.125 = 1.92
h. With 5 degrees of freedom and a chi-square statistic of 1.92, the p-value is approximately 0.86 (using a chi-square distribution table or statistical software).
i. Since the p-value of 0.86 is greater than the significance level of α = 0.10 (equivalently, 1.92 is below the critical value of 9.236), we fail to reject the null hypothesis. We do not have sufficient evidence to conclude that the distribution of promotions does not follow the demographic distribution of the sales force.
j. As the Executive Vice President of Sales, while I may not have strong statistical evidence of discrimination in the promotion process, it is still advisable to address the ACLU's concerns and ensure fair promotion practices going forward.
This could involve reviewing the promotion guidelines, providing additional training to the regional VPs, and implementing regular audits to monitor the distribution of promotions across different demographics.

3/
a. The null hypothesis (H0): There is no difference in the average quarterly sales across the four regions.
b. The alternative hypothesis (H1): There is a difference in the average quarterly sales for at least one region.
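Hypotheses of this form, comparing the means of more than two groups, are commonly tested with a one-way ANOVA. A minimal Python sketch of the F statistic for the quarterly sales table, assuming the four quarters are treated as independent replicates within each region (a blocked design by quarter would be a different analysis); the function name one_way_anova_f is illustrative:

```python
from statistics import mean

# Quarterly sales (millions of dollars) from the problem statement, Q1..Q4 per region.
sales = {
    "NW": [5, 7, 3, 6],
    "SW": [7, 6, 5, 4],
    "NE": [8, 7, 9, 6],
    "SE": [4, 6, 5, 3],
}


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    values = [x for g in groups for x in g]
    grand_mean = mean(values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


f_stat, df_between, df_within = one_way_anova_f(list(sales.values()))
```

The resulting F is compared with the F critical value for (df_between, df_within) degrees of freedom at the chosen α.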
c. Type I error: Rejecting H0 when it is true. Implications: The company might wrongly conclude that there are differences in sales averages between regions, potentially leading to unnecessary strategic changes and resource redistribution.
d. Type II error: Failing to reject H0 when it is false. Implications: ACME NHT Inc. might overlook actual regional discrepancies in sales performance, missing the opportunity to address underperforming areas or to capitalize on successful strategies.
e. Because the test compares the means of four groups, the appropriate test is a one-way ANOVA and the test statistic is the F statistic. α = 0.05 with df = (4 - 1, 16 - 4) = (3, 12). The critical value is 3.49.

4/
a. The null hypothesis (H0) for this test is: There is no difference in the remaining hit points (HP) of Mew and Mewtwo after battling with the same Pokemon.
b. The alternative hypothesis (H1): Mewtwo has higher remaining hit points than Mew after battling with the same Pokemon.
c. Since we are comparing the means of two paired samples (Mew and Mewtwo battling against the same Pokemon), the appropriate test to use is the paired t-test.
d. The Type I error in this test would be rejecting the null hypothesis when it is true, meaning concluding that Mewtwo is tougher (has higher remaining HP) when, in fact, there is no difference between Mew and Mewtwo. The implication of this error would be to unnecessarily focus the efforts on defeating Mewtwo, potentially leading to wasted resources and time.
e. The Type II error in this test would be failing to reject the null hypothesis when it is false, meaning concluding that there is no difference between Mew and Mewtwo when, in fact, Mewtwo is tougher. The implication of this error would be to underestimate Mewtwo's strength and fail to prepare adequately for battling it, potentially leading to a more challenging and prolonged quest.
f. For a paired t-test with α = 0.05 (95% confidence level) and 24 degrees of freedom (25 pairs of observations minus 1), the alternative hypothesis is one-sided, so with the differences taken as Mewtwo - Mew the rejection region is t > 1.711.
If the calculated test statistic falls in the rejection region, we reject the null hypothesis.
g. To calculate the test statistic, we need to find the differences between the remaining HP of Mew and Mewtwo for each pair of observations, calculate the mean and standard deviation of these differences, and then compute the t-statistic using the formula:
t = (mean of differences) / (standard deviation of differences / sqrt(n))
Calculating the differences (Mewtwo - Mew):
377 - 402 = -25
354 - 305 = 49
... (continue for all 25 pairs)
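The remaining differences need not be worked by hand. A minimal Python sketch that forms all 25 paired differences (Mewtwo minus Mew) from the table in the problem statement and evaluates the t formula above; the variable names are illustrative:

```python
from math import sqrt
from statistics import mean, stdev

# Remaining HP after each battle, in the attacker order of the problem table
# (Bulbasaur first, Pikachu last).
mew = [402, 305, 348, 350, 320, 340, 396, 311, 367, 309, 391, 385, 390,
       313, 396, 394, 329, 375, 364, 340, 365, 395, 365, 396, 381]
mewtwo = [377, 354, 343, 341, 347, 401, 406, 364, 381, 402, 387, 347, 349,
          402, 403, 389, 395, 356, 340, 400, 405, 402, 377, 341, 399]

# Paired differences, Mewtwo - Mew, one per attacker.
diffs = [b - a for a, b in zip(mew, mewtwo)]
n = len(diffs)

mean_diff = mean(diffs)
sd_diff = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
t_stat = mean_diff / (sd_diff / sqrt(n))
```

Because the differences are taken as Mewtwo - Mew, a large positive t_stat is what would support the claim that Mewtwo keeps more HP.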
Mean of differences = 15.24 (the 25 differences sum to 381)
Standard deviation of differences = 40.16
n = 25
t = 15.24 / (40.16 / sqrt(25)) = 1.90
Since the calculated t-statistic (1.90) exceeds the critical value of 1.711, it falls in the rejection region and we reject the null hypothesis. The data provide evidence that Mewtwo has higher remaining hit points than Mew after battling with the same Pokemon.
h. If we reject the null hypothesis, the probability of committing a Type I error (p-value) is the probability of obtaining a test statistic as extreme or more extreme than the calculated value under the assumption that the null hypothesis is true. For a one-tailed test with 24 degrees of freedom, the p-value corresponding to a t-statistic of 1.90 lies between 0.025 and 0.05, approximately 0.035 (using a t-distribution table or statistical software).
i. Based on the conclusion of rejecting the null hypothesis, you should treat Mewtwo as the tougher opponent and prepare for it accordingly, for example by saving your strongest Pokemon for that battle, while still not underestimating Mew.
j. The statistical analysis suggests Mewtwo is the tougher of the two in terms of remaining hit points, but toughness alone does not settle which one to like better. Your preference between Mew and Mewtwo may depend on other factors, such as their overall abilities, appearance, or personal preference.

5/
It is likely a normal population distribution.

6/
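For part a, one way to test whether the 2007 and 2008 average prices differ is Welch's two-sample t statistic. A minimal Python sketch, assuming independent samples of individual sale prices for each year; the prices below are made-up placeholders rather than the Delta County data, and welch_t is an illustrative helper name:

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and approximate degrees of freedom for two independent samples."""
    # Squared standard error contributed by each sample.
    se2_a = variance(sample_a) / len(sample_a)
    se2_b = variance(sample_b) / len(sample_b)
    t_stat = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(sample_a) - 1) + se2_b ** 2 / (len(sample_b) - 1)
    )
    return t_stat, df


# Hypothetical sale prices, for illustration only.
prices_2007 = [185000, 192000, 178500, 201000, 188000]
prices_2008 = [182000, 175000, 190500, 169000, 186000]

t_stat, df = welch_t(prices_2007, prices_2008)
```

With the real sale records in place of the placeholders, |t_stat| is compared with the two-tailed t critical value for df degrees of freedom; the same helper could be reused for the 2006 versus 2010 and 2011 comparison in part b.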
[Figure: Average Monthly Price of Homes Sold in Delta County, CO 1995-2012. Horizontal axis: month and year, 1995 to 2012; vertical axis: price, 0 to 250000.]

6b/
H0: In 2010 and 2011 the market had returned to 2006 levels.
H1: In 2010 and 2011 the market had NOT returned to 2006 levels.
Based on the p-value, we reject the null hypothesis and conclude that the market had NOT returned to 2006 levels.

6c/
Year    Average of s_sale_price
1995    75946.46119
1996    77478.28514
1997    90950.79681
1998    93326.7446
1999    94487.65132
2000    102757.5113
2001    113275.8656
2002    120729.6247
2003    129151.901
2004    138907.4851
2005    153896.0876
2006    171954.1795
2007    188623.725

[Figure: Average Monthly Price of Homes Sold in Delta County, CO 1995-2012. Horizontal axis: year, 1995 to 2012; vertical axis: price, 0 to 200000.]

The yearly average sale price rises in every year from 1995 (75946.46) to 2007 (188623.73), and the graph shows the same upward trend, so the market increased from 1995 to 2007.
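The year-over-year claim can be checked directly against the table of averages. A minimal Python sketch using the yearly values listed in 6c; it only compares averages, so it does not by itself answer "Are you sure?", which would need a test on the underlying sale records:

```python
# Yearly average sale prices copied from the 6c table.
yearly_average = {
    1995: 75946.46119, 1996: 77478.28514, 1997: 90950.79681,
    1998: 93326.7446, 1999: 94487.65132, 2000: 102757.5113,
    2001: 113275.8656, 2002: 120729.6247, 2003: 129151.901,
    2004: 138907.4851, 2005: 153896.0876, 2006: 171954.1795,
    2007: 188623.725,
}

years = sorted(yearly_average)

# True only if every year's average exceeds the previous year's.
rose_every_year = all(
    yearly_average[prev] < yearly_average[curr]
    for prev, curr in zip(years, years[1:])
)

# Overall percentage change from the first to the last year in the table.
total_change_pct = (yearly_average[years[-1]] / yearly_average[years[0]] - 1) * 100
```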