Day 11 Daily 2-Factor Anova

School

Rochester Institute of Technology

Course

146

Subject

Statistics

Date

Apr 3, 2024

Class Day/Time:
Name: Jack Di Lorenzo

STAT 146 Daily 11 Two-factor ANOVA

Problem 1. The Iams Company sells premium dog and cat foods (dry and canned) in 70 countries. Iams Co. makes dry dog and cat food at plants in Lewisburg, Ohio (Plant A); Aurora, Nebraska (Plant B); Henderson, North Carolina (Plant C); and Coevorden, the Netherlands (Plant D). Iams Co. brand dry dog foods come in five formulas. One of the ingredients is of particular importance: crude fat. To discover if there is a difference in the average percent of crude fat among the four formulas and among the production sites, sample data were obtained. These data are in the file Dog_Food.mtw.

C1  Plant    Plant A, Plant B, Plant C, Plant D
C2  Fat      Average percent of crude fat
C3  Formula  Formula 1, Formula 2, Formula 3, Formula 4

Part I. Get to know the data. Explore the relationships between the response variable and the two factors by completing the following:

1. State the response variable.
Average percent of crude fat

2. Name the factors and state the number of levels.
The two factors are "Plant" (four levels: Plant A, Plant B, Plant C, Plant D) and "Formula" (four levels: Formula 1, Formula 2, Formula 3, Formula 4).

3. Obtain summary statistics for the average percent of crude fat by Plant and also summary statistics by Formula.

Intro to Statistics II Minitab for Day 11 Page 1 of 8
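The summary statistics by factor requested in step 3 can also be sketched outside Minitab. The following is a minimal Python sketch, not the worksheet's Minitab procedure. The rows are invented for illustration and are NOT the values in Dog_Food.mtw, only two plants and two formulas are shown, and the helper name `means_by` is hypothetical. It computes the mean for each level of each factor, ranks the plants by mean, and computes the plant-by-formula cell means that an interaction plot displays.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (plant, formula, fat) rows, invented for illustration only.
# They are NOT the values stored in Dog_Food.mtw.
rows = [
    ("A", 1, 12.0), ("A", 1, 12.5), ("A", 2, 11.0), ("A", 2, 11.5),
    ("B", 1, 13.0), ("B", 1, 13.5), ("B", 2, 12.0), ("B", 2, 12.5),
]

def means_by(rows, key):
    """Group the fat values by key(row) and return the mean of each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row[2])
    return {level: mean(values) for level, values in groups.items()}

# Means for each level of one factor, averaging over the other factor.
plant_means = means_by(rows, lambda r: r[0])
formula_means = means_by(rows, lambda r: r[1])

# Cell means: one mean per plant/formula combination (what an interaction plot shows).
cell_means = means_by(rows, lambda r: (r[0], r[1]))

# Levels ordered from lowest to highest mean.
ranked_plants = sorted(plant_means, key=plant_means.get)

print(plant_means)
print(formula_means)
print(ranked_plants)
```

With real data, the same three dictionaries would correspond to the by-Plant summary, the by-Formula summary, and the points plotted on the interaction plot.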
4. Rank the means for both factors.
For Plants: Plant A < Plant C < Plant D < Plant B
For Formulas: Formula 4 < Formula 2 < Formula 1 < Formula 3

5. Produce a Full Interaction Plot and paste it here.

6. Using either graph on the full interaction plot, answer the following:

A. Which Plant and which Formula appear to have the HIGHEST percent of crude fat?
Plant: Plant B seems to have the highest fat content across the formulas
Formula: Formula 3 seems to have the highest fat content across the plants

B. Which Plant and which Formula appear to have the LOWEST percent of crude fat?
Plant: Plant D generally appears to have the lowest fat content
Formula: Formula 4 has the lowest fat content

C. Does there appear to be an interaction between the Plant and the Dog Food Formula?
The lines in the plot are clearly not parallel and actually cross each other, which suggests that the effect of one factor on the fat content depends on the level of the other factor. Visually, then, there appears to be an interaction between the Plant and the Formula in their effect on the percent of crude fat.

Part II. Further explore the relationships between the response variable and the two factors by completing a test for significant interaction and, if necessary, a test of main effects. Be sure to answer the following questions:

7. Fit the General Linear Model; don't forget to click on Model… and select both factors and ADD the interaction to the model as well. Paste the Analysis of Variance (ANOVA) output here:

8. Test for significant interaction (use the shortened test process).
H0: There is NOT a significant interaction between 'formula' and 'plant'.
Ha: There is a significant interaction between 'formula' and 'plant'.
P-value = 0.532
The sample data DOES NOT provide sufficient evidence to say that there is a significant interaction between 'formula' and 'plant'. Because the p-value is well above 0.05, the apparent interaction in the sample is consistent with random sampling variation rather than a systematic effect.

9. If the interaction is not significant, then proceed to test for main effects. Include the shortened test process for each factor.

For the 'formula':
H0: μ formula 1 = μ formula 2 = μ formula 3 = μ formula 4
Ha: At least one μ formula differs
Given that the p-value for 'formula' is reported as 0.000 (that is, less than 0.001), which is well below the alpha level of 0.05, the sample data DOES provide sufficient evidence to say that at least one mean fat content differs among the different formulas.

For the 'plant':
H0: μ plant A = μ plant B = μ plant C = μ plant D
Ha: At least one μ plant differs
Given that the p-value for 'plant' is 0.017, which is also smaller than the alpha level of 0.05, the sample data DOES provide sufficient evidence to say that at least one mean fat content differs among the different plants.

10. For each main effect test, if there is a significant difference, produce the Tukey's comparisons output and then describe what the differences are. Which Plant(s) and which Formula(s) MINIMIZE the percent of crude fat in Iams Co. dog food?
Formula 4: It had the lowest mean fat content among the formulas.
Plant D: It generally appears to have the lowest fat content among the plants, as indicated by the interaction plot.
Therefore, the combination of Plant D and Formula 4 would be expected to minimize the percent of crude fat in Iams Co. dog food based on the provided data.

Problem 2. In a small-scale experimental study, a market researcher wished to employ the analysis of variance model to determine whether or not moisture content and sweetness of a product affect the degree of brand liking. These data are in the file BrandLiking.mtw.

C1  BrandLiking      Degree of brand liking
C2  MoistureContent  Moisture content (levels: 4, 6, 8 and 10)
C3  Sweetness        Sweetness levels (2 and 4)

These data are consistent with the results presented in Kutner, Nachtsheim, Neter and Li, Applied Linear Statistical Models, McGraw-Hill, 2005.

Part I. Get to know the data. Explore the relationships between the response variable and the two factors by completing the following:

1. State the response variable.
BrandLiking

2. Name the factors and state the number of levels.
MoistureContent: Four levels (4, 6, 8, and 10).
Sweetness: Two levels (2 and 4).

3. Obtain summary statistics for the average Brand Liking by Sweetness and also summary statistics by Moisture Content.
4. Rank the means for both factors.
Moisture Content: 4 < 6 < 8 < 10
Sweetness: 2 < 4

5. Produce a Full Interaction Plot and paste it here.

6. Using either graph on the full interaction plot, answer the following:

D. Which Moisture Content and which Sweetness level appear to have the HIGHEST Brand Liking? (Is this the same information you found with the descriptive statistics in parts 3 and 4?)
Moisture Content: Level 10
Sweetness: Level 4

E. Which Moisture Content and which Sweetness level appear to have the LOWEST Brand Liking? (Is this the same information you found with the descriptive statistics in parts 3 and 4?)
Moisture Content: Level 4
Sweetness: Level 2

F. Does there appear to be an interaction between the Moisture Content and the Sweetness Level?
There appears to be an interaction between Moisture Content and Sweetness level, meaning the effect of Moisture Content on Brand Liking is different at different levels of Sweetness.

---------- We will not continue with the ANOVA test… go on to practice the multiple choice problems below ----------

In the following multiple-choice questions, select the best answer. When you are done, check the solutions at the bottom of the page. These questions will not be graded by the grader.

1. Analysis of variance is a statistical method of comparing the ________ of several populations.
a. standard deviations
b. variances
c. means
d. proportions
e. none of the above

2. The ______ sum of squares measures the variability of the observed values around their respective treatment means.
a. treatment
b. error
c. interaction
d. total

3. The ________ sum of squares measures the variability of the sample treatment means around the overall mean.
a. treatment
b. error
c. interaction
d. total

4. You obtained a significant test statistic when comparing three treatments in a one-way ANOVA. In words, how would you interpret the alternative hypothesis Ha?
a. All three treatments have different effects on the mean response.
b. Exactly two of the three treatments have the same effect on the mean response.
c. At least two treatments are different from each other in terms of their effect on the mean response.
d. All of the above.
e. None of the above

5. What would happen if, instead of using an ANOVA to compare 10 groups, you performed multiple t-tests?
a. Nothing, there is no difference between using an ANOVA and using a t-test.
b. Nothing serious, except that making multiple comparisons with a t-test requires more computation than doing a single ANOVA.
c. Sir Ronald Fisher would be turning over in his grave; he put all that work into developing ANOVA, and you use multiple t-tests.
d. Making multiple comparisons with a t-test increases the probability of making a Type I error.

6. What is the function of a post-test in ANOVA?
a. Determine if any statistically significant group differences have occurred.
b. Describe those groups that have reliable differences between group means and determine those that are statistically the same.
c. Set the critical value for the F test (or chi-square).

7. Assuming no bias, the total variation in a response variable is due to error (unexplained variation) plus differences due to treatments (explained/known variation). If known variation is large compared to unexplained variation, which of the following conclusions is the best?
a. There is no evidence for a difference in response due to treatments.
b. There is evidence for a difference in response due to treatments.
c. The treatments are not comparable.
d. The cause of the response is due to something other than treatments.

8. An investigator randomly assigns 30 college students into three equal-size study groups (early-morning, afternoon, late-night) to determine if the period of the day at which people study has an effect on their retention. The students live in a controlled environment for one week; on the third day the experimental treatment is administered (study of predetermined material). On the seventh day the investigator tests for retention. In computing his ANOVA table, he sees that his MS within groups (MSE) is larger than his MS between groups (MST). What does this result indicate?
a. An error in the calculations was made.
b. There was more than the expected amount of variability between groups.
c. There was more variability between subjects within the same group than there was between groups.
d. There should have been additional controls in the experiment.
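The sums of squares named in questions 2, 3, 7, and 8 can be made concrete with a small one-way ANOVA computation. This is a minimal sketch using only the Python standard library. The retention scores are invented for illustration (three students per group, not the ten per group described in question 8), and the function name `one_way_anova` is hypothetical. It splits the total sum of squares into a treatment part and an error part and forms F = MST / MSE.

```python
from statistics import mean

# Hypothetical retention scores, invented for illustration only.
groups = {
    "early-morning": [4.0, 6.0, 8.0],
    "afternoon": [5.0, 7.0, 9.0],
    "late-night": [2.0, 5.0, 8.0],
}

def one_way_anova(groups):
    """Return the one-way ANOVA sums of squares, mean squares, and F."""
    all_values = [y for ys in groups.values() for y in ys]
    grand_mean = mean(all_values)
    k = len(groups)        # number of treatments
    n = len(all_values)    # total number of observations

    # Treatment SS: variability of the group means around the overall mean.
    ss_treatment = sum(len(ys) * (mean(ys) - grand_mean) ** 2
                       for ys in groups.values())
    # Error SS: variability of observations around their own group mean.
    ss_error = sum((y - mean(ys)) ** 2
                   for ys in groups.values() for y in ys)
    # Total SS: variability of all observations around the overall mean.
    ss_total = sum((y - grand_mean) ** 2 for y in all_values)

    mst = ss_treatment / (k - 1)
    mse = ss_error / (n - k)
    return {"ss_treatment": ss_treatment, "ss_error": ss_error,
            "ss_total": ss_total, "MST": mst, "MSE": mse, "F": mst / mse}

result = one_way_anova(groups)
print(result)
```

In this made-up data set the within-group spread dominates, so MSE exceeds MST and F falls below 1, which is the situation described in question 8.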
ANSWERS TO MULTIPLE CHOICE QUESTIONS

1. ANSWER: C (ANOVA is a test about means.)

2. ANSWER: B (When you study the variability of data points around the means of their groups, that is the unexplained variability, and it comes from the error row.)

3. ANSWER: A (When you study the variability of the treatment means compared to the overall mean of all the data, that is the explained/known variability, and it comes from the row labeled factor/treatment.)

4. ANSWER: C (The only conclusion you may make is that at least one mean differs from another mean. Another way of saying this is that at least two means differ from one another. This is an example of a significant ANOVA.)

5. ANSWER: D (Each individual t-test on a pair uses alpha = .05, which is the probability of a Type I error for that one test. With 10 groups there are 45 possible pairs, and the chance of making at least one Type I error somewhere across all of those tests grows far beyond .05. If the tests were independent, it would be 1 − (1 − .05)^45 ≈ .90.)

6. ANSWER: B (We run the post-hoc Tukey comparisons in order to figure out which pairs are statistically different and which are statistically the same.)

7. ANSWER: B (If the numerator (explained variability) is much larger than the denominator (unexplained variability) in calculating the F test statistic, then you get a big F and a small P-value, so you find that at least one mean differs. We would call this a significant ANOVA.)

8. ANSWER: C (If the MSE is larger than the MST, then the denominator of the F test statistic is larger than the numerator. This means that the variability within each group is bigger than the variability between/across the groups. By the way, this leads to a small F (less than 1) and a large P-value. We will say that this is NOT a significant ANOVA.)
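The reasoning behind answer 5 can be checked numerically. This is a minimal sketch under the simplifying assumption that the pairwise tests are independent, which is not strictly true when the pairs share groups, so the result is an approximation. The function name `familywise_error_rate` is hypothetical. It counts the pairwise comparisons among the groups and computes the probability of at least one Type I error across all of them.

```python
from math import comb

def familywise_error_rate(num_groups, alpha=0.05):
    """Number of pairwise tests and P(at least one Type I error).

    Assumes the tests are independent, which is only an approximation
    for pairwise comparisons that share groups.
    """
    num_tests = comb(num_groups, 2)   # number of distinct pairs
    # P(no Type I error in any test) = (1 - alpha) ** num_tests
    return num_tests, 1 - (1 - alpha) ** num_tests

num_tests, fwer = familywise_error_rate(10)
print(num_tests, round(fwer, 3))  # 45 0.901
```

A single ANOVA F test avoids this inflation by testing all group means at once at the chosen alpha level.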