Day 11 Daily 2-Factor Anova
School: Rochester Institute of Technology
Course: 146
Subject: Statistics
Date: Apr 3, 2024
Type: docx
Pages: 8
Uploaded by AdmiralSeaLion3703
Class Day/Time:
Name: Jack Di Lorenzo
STAT 146
Daily 11
Two-factor ANOVA
Problem 1.
The Iams Company sells premium dog and cat foods (dry and canned) in 70 countries. Iams Co. makes dry dog and cat food at plants in Lewisburg, Ohio (Plant A); Aurora, Nebraska (Plant B); Henderson, North Carolina (Plant C); and Coevorden, the Netherlands (Plant D). Iams Co. brand dry dog foods come in five formulas. One of the ingredients is of particular importance: crude fat. To discover if there is a difference in the average percent of crude fat among the four formulas and among the production sites, the sample data were obtained. These data are in the file Dog_Food.mtw.
Column    Name       Description
C1        Plant      Plant A, Plant B, Plant C, Plant D
C2        Fat        Average percent of crude fat
C3        Formula    Formula 1, Formula 2, Formula 3, Formula 4
Part I. Get to know the data. Explore the relationships between the response variable and the two factors by completing the following:
1. State the response variable.
Average percent of crude fat (Fat)
2. Name the factors and state the number of levels.
There are two factors, Plant and Formula, each with four levels: Plant (Plant A, Plant B, Plant C, Plant D) and Formula (Formula 1, Formula 2, Formula 3, Formula 4).
3. Obtain summary statistics for the average percent of crude fat by Plant and also summary statistics by Formula.
Intro to Statistics II
Minitab for Day 11
Page 1 of 8
4. Rank the means for both factors.
For Plants:
Plant A < Plant C < Plant D < Plant B
For Formulas:
Formula 4 < Formula 2 < Formula 1 < Formula 3
5. Produce a Full Interaction Plot and paste it here.
6. Using either graph on the full interaction plot, answer the following:
A. Which Plant and which Formula appear to have the HIGHEST percent of crude fat?
Plant: Plant B seems to have the highest fat content across the formulas.
Formula: Formula 3 seems to have the highest fat content across the plants.
B. Which Plant and which Formula appear to have the LOWEST percent of crude fat?
Plant: Plant D generally appears to have the lowest fat content.
Formula: Formula 4 has the lowest fat content.
C. Does there appear to be an interaction between the Plant and the Dog Food Formula?
The lines in the plot are clearly not parallel and actually cross each other, which suggests that the effect of one factor on the fat content depends on the level of the other factor. Based on the plot alone, there appears to be an interaction between the Plant and the Formula regarding their effect on the percent of crude fat.
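The Part I steps (summary statistics by factor, ranking the means, and reading an interaction plot) can be sketched outside Minitab as well. The following is a minimal Python sketch under stated assumptions: the numbers are made up for illustration and are NOT the Dog_Food.mtw data, only two plants and two formulas are used, and the helper names (`cell_means`, `marginal_means`, `ranked`) are hypothetical.

```python
from statistics import mean

# Hypothetical crude-fat percentages (NOT the Dog_Food.mtw data).
# Keys are (plant, formula); values are replicate measurements.
fat = {
    ("A", "F1"): [12.0, 12.4],
    ("A", "F2"): [11.0, 11.2],
    ("B", "F1"): [13.0, 13.4],
    ("B", "F2"): [12.6, 12.8],
}

def cell_means(data):
    """Mean response for every plant/formula combination."""
    return {cell: mean(values) for cell, values in data.items()}

def marginal_means(data, position):
    """Mean response at each level of one factor (0 = plant, 1 = formula)."""
    groups = {}
    for cell, values in data.items():
        groups.setdefault(cell[position], []).extend(values)
    return {level: mean(values) for level, values in groups.items()}

def ranked(means):
    """Levels ordered from lowest mean to highest mean."""
    return sorted(means, key=means.get)

cells = cell_means(fat)
plant_means = marginal_means(fat, 0)
formula_means = marginal_means(fat, 1)

# What an interaction plot shows: the gap between the two plant lines
# at each formula. Equal gaps mean parallel lines (no apparent interaction);
# unequal gaps mean the lines are not parallel.
gap_by_formula = {f: cells[("B", f)] - cells[("A", f)] for f in ("F1", "F2")}

print("Plants, lowest to highest:", ranked(plant_means))
print("Formulas, lowest to highest:", ranked(formula_means))
print("Plant B minus Plant A at each formula:", gap_by_formula)
```

In this made-up example the gap between the plants changes from one formula to the other, so the two lines would not be parallel. Whether such a pattern is statistically significant is a separate question that the plot cannot answer by itself.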
Part II. Further explore the relationships between the response variable and the two factors by completing a test for significant interaction and, if necessary, a test of main effects. Be sure to answer the following questions:
7. Fit the General Linear Model; don’t forget to click on Model… and select both factors and ADD the interaction to the model as well. Paste the Analysis of Variance (ANOVA) output here:
8. Test for significant interaction (use the shortened test process).
H0: There is no interaction between Formula and Plant.
Ha: There is an interaction between Formula and Plant.
P-value = 0.532
Because the p-value (0.532) is greater than the alpha level of 0.05, the sample data DOES NOT provide sufficient evidence to say that there is a significant interaction between Formula and Plant. The interaction pattern seen in the sample is consistent with random sampling variation.
9. If the interaction is not significant, then proceed to test for main effects. Include the shortened test process for each factor.
For the 'formula':
H0: μ Formula 1 = μ Formula 2 = μ Formula 3 = μ Formula 4
Ha: At least one μ Formula differs
Given that the p-value for Formula is reported as 0.000 (that is, less than 0.001), which is smaller than the alpha level of 0.05, the sample data DOES provide sufficient evidence to say that at least one mean fat content differs among the different formulas.
For the 'plant':
H0: μ Plant A = μ Plant B = μ Plant C = μ Plant D
Ha: At least one μ Plant differs
Given that the p-value for plant is 0.017, which is also smaller than the alpha level of 0.05, the sample data DOES provide sufficient evidence to say that at least one mean fat content differs among the different plants.
10. For each main effect test, if there is a significant difference, produce the Tukey’s comparisons output and then describe what the differences are. Which Plant(s) and which Formula(s) MINIMIZE the percent of crude fat in Iams Co. dog food?
Formula 4: It had the lowest mean fat content among the formulas.
Plant D: It generally appears to have the lowest fat content among the plants, as indicated by the interaction plot.
Therefore, the combination of Plant D and Formula 4 would be expected to minimize the percent of crude fat in Iams Co. dog food based on the provided data.
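The arithmetic behind the ANOVA table used in Part II (interaction first, then main effects) can be illustrated with a short Python sketch. It assumes a balanced design (the same number of replicates in every cell) and uses made-up numbers, NOT the Dog_Food.mtw data; the function name `two_way_anova` and the generic labels "A", "B" and "A*B" are hypothetical. It is not a reproduction of Minitab's General Linear Model routine, only the textbook sums-of-squares decomposition.

```python
from statistics import mean

def two_way_anova(data):
    """Balanced two-factor ANOVA with interaction.

    data maps (level_of_A, level_of_B) -> list of replicates.
    Returns {source: {"SS", "df", "MS", "F"}} for A, B, A*B and Error.
    """
    a_levels = sorted({a for a, _ in data})
    b_levels = sorted({b for _, b in data})
    n = len(next(iter(data.values())))  # replicates per cell
    if n < 2 or any(len(v) != n for v in data.values()):
        raise ValueError("need a balanced design with at least 2 replicates per cell")
    if len(data) != len(a_levels) * len(b_levels):
        raise ValueError("every combination of levels must be present")
    a, b = len(a_levels), len(b_levels)

    grand = mean(y for v in data.values() for y in v)
    a_mean = {i: mean(y for j in b_levels for y in data[(i, j)]) for i in a_levels}
    b_mean = {j: mean(y for i in a_levels for y in data[(i, j)]) for j in b_levels}
    cell = {k: mean(v) for k, v in data.items()}

    # Sums of squares for each source of variation.
    ss_a = b * n * sum((a_mean[i] - grand) ** 2 for i in a_levels)
    ss_b = a * n * sum((b_mean[j] - grand) ** 2 for j in b_levels)
    ss_ab = n * sum(
        (cell[(i, j)] - a_mean[i] - b_mean[j] + grand) ** 2
        for i in a_levels for j in b_levels
    )
    ss_e = sum((y - cell[k]) ** 2 for k, v in data.items() for y in v)

    df_e = a * b * (n - 1)
    mse = ss_e / df_e
    table = {}
    for name, ss, df in (("A", ss_a, a - 1), ("B", ss_b, b - 1),
                         ("A*B", ss_ab, (a - 1) * (b - 1))):
        ms = ss / df
        table[name] = {"SS": ss, "df": df, "MS": ms, "F": ms / mse}
    table["Error"] = {"SS": ss_e, "df": df_e, "MS": mse, "F": None}
    return table

# Hypothetical data: factor A = plant, factor B = formula (made-up values).
fat = {
    ("A", "F1"): [12.0, 12.4], ("A", "F2"): [11.0, 11.2],
    ("B", "F1"): [13.0, 13.4], ("B", "F2"): [12.6, 12.8],
}
table = two_way_anova(fat)
for source, row in table.items():
    print(source, row)
```

Each F statistic is the mean square for that source divided by the mean square error. Turning an F statistic into a p-value requires the F distribution, which the Python standard library does not provide; SciPy's `scipy.stats.f.sf` is one option. The interaction row is read first, and the main-effect rows are interpreted only if the interaction is not significant.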
Problem 2.
In a small-scale experimental study, a market researcher wished to employ the analysis of variance model to determine whether or not moisture content and sweetness of a product affect the degree of brand liking. These data are in the file BrandLiking.mtw.
Column    Name               Description
C1        BrandLiking        Degree of brand liking
C2        MoistureContent    Moisture content (levels: 4, 6, 8 and 10)
C3        Sweetness          Sweetness levels (2 and 4)
These data are consistent with the results presented in Kutner, Nachtsheim, Neter and Li, Applied Linear Statistical Models, McGraw-Hill, 2005.
Part I. Get to know the data. Explore the relationships between the response variable and the two factors by completing the following:
1. State the response variable.
BrandLiking
2. Name the factors and state the number of levels.
MoistureContent: Four levels (4, 6, 8, and 10).
Sweetness: Two levels (2 and 4).
3. Obtain summary statistics for the average Brand Liking by Sweetness and also summary statistics by Moisture Content.
4. Rank the means for both factors.
Moisture Content:
4 < 6 < 8 < 10
Sweetness:
2 < 4
5. Produce a Full Interaction Plot and paste it here.
6. Using either graph on the full interaction plot, answer the following:
D. Which Moisture Content and which Sweetness level appear to have the HIGHEST Brand Liking? (Is this the same information you found with the descriptive statistics in parts 3 and 4?)
Moisture Content: Level 10
Sweetness: Level 4
E. Which Moisture Content and which Sweetness level appear to have the LOWEST Brand Liking? (Is this the same information you found with the descriptive statistics in parts 3 and 4?)
Moisture Content: Level 4
Sweetness: Level 2
F. Does there appear to be an interaction between the Moisture Content and the Sweetness Level?
There appears to be an interaction between Moisture Content and Sweetness level, meaning the effect of Moisture Content on Brand Liking is different at different levels of Sweetness.
----------We will not continue with the ANOVA test…go on to practice the multiple choice problems below-----
In the following multiple-choice questions, select the best answer. When you are done, check the solutions at the bottom of the page. These questions will not be graded by the grader.
1. Analysis of variance is a statistical method of comparing the ________ of several populations.
a. standard deviations
b. variances
c. means
d. proportions
e. none of the above
2. The ________ sum of squares measures the variability of the observed values around their respective treatment means.
a. treatment
b. error
c. interaction
d. total
3. The ________ sum of squares measures the variability of the sample treatment means around the overall mean.
a. treatment
b. error
c. interaction
d. total
4. You obtained a significant test statistic when comparing three treatments in a one-way ANOVA. In words, how would you interpret the alternative hypothesis Ha?
a. All three treatments have different effects on the mean response.
b. Exactly two of the three treatments have the same effect on the mean response.
c. At least two treatments are different from each other in terms of their effect on the mean response.
d. All of the above.
e. None of the above.
5. What would happen if, instead of using an ANOVA to compare 10 groups, you performed multiple t-tests?
a. Nothing, there is no difference between using an ANOVA and using a t-test.
b. Nothing serious, except that making multiple comparisons with a t-test requires more computation than doing a single ANOVA.
c. Sir Ronald Fisher would be turning over in his grave; he put all that work into developing ANOVA, and you use multiple t-tests.
d. Making multiple comparisons with a t-test increases the probability of making a Type I error.
6. What is the function of a post-test in ANOVA?
a. Determine if any statistically significant group differences have occurred.
b. Describe those groups that have reliable differences between group means and determine those that are statistically the same.
c. Set the critical value for the F test (or chi-square).
7. Assuming no bias, the total variation in a response variable is due to error (unexplained variation) plus differences due to treatments (explained/known variation). If known variation is large compared to unexplained variation, which of the following conclusions is the best?
a. There is no evidence for a difference in response due to treatments.
b. There is evidence for a difference in response due to treatments.
c. The treatments are not comparable.
d. The cause of the response is due to something other than treatments.
8. An investigator randomly assigns 30 college students into three equal size study groups (early-morning, afternoon, late-night) to determine if the period of the day at which people study has an effect on their retention. The students live in a controlled environment for one week; on the third day the experimental treatment is administered (study of predetermined material). On the seventh day the investigator tests for retention. In computing his ANOVA table, he sees that his MS within groups (MSE) is larger than his MS between groups (MST). What does this result indicate?
a. An error in the calculations was made.
b. There was more than the expected amount of variability between groups.
c. There was more variability between subjects within the same group than there was between groups.
d. There should have been additional controls in the experiment.
ANSWERS TO MULTIPLE CHOICE QUESTIONS
1. ANSWER: C (ANOVA is a test about means.)
2. ANSWER: B (When you study the variability of data points around the means of the group, that is the unexplained variability and comes from the error row.)
3. ANSWER: A (When you study the variability of the treatment means compared to the overall mean of all the data, then that is called the explained/known variability and comes from the row labeled factor/treatment.)
4. ANSWER: C (The only conclusion you may make is that at least one mean differs from another mean. Another way of saying this is that at least two means differ from one another. This is an example of a significant ANOVA.)
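5. ANSWER: D (Each individual t-test on a pair of groups would use alpha = .05, which is the probability of a Type I error for that one test. Comparing 10 groups takes 45 pairwise tests, and the chance of making at least one Type I error somewhere among them grows with every test added: if the tests were independent it would be 1 − (0.95)^45, or about 0.90, far above .05.)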
6. ANSWER: B (We run the Post-hoc Tukey Comparisons in order to figure out which pairs are statistically different and which are statistically the same.)
7. ANSWER: B (If the numerator (explained variability) is larger than the denominator (unexplained variability) in calculating the F test statistic, then you get a big F (and a small P-value) so you find that at least one mean differs. We would call this a significant ANOVA.)
8. ANSWER: C (If the MSE is larger than the MST, then the denominator of the F test statistic is larger than the numerator. This means that the variability within each group is bigger than the variability between/across the groups. By the way, this leads to a small F (less than 1) and a large P-value. We will say that this is NOT a significant ANOVA.)
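Two of the ideas in the answers above lend themselves to a quick numerical check: how the chance of at least one Type I error grows when many pairwise t-tests replace a single ANOVA (question 5), and how the F statistic behaves when MSE exceeds MST (question 8). The Python sketch below is illustrative only. It assumes the pairwise tests are independent, which is a simplification, and the retention scores are made up; the function names `familywise_error_rate` and `one_way_f` are hypothetical.

```python
from math import comb
from statistics import mean

def familywise_error_rate(k_groups, alpha=0.05):
    """Number of pairwise tests among k groups and the chance of at least
    one Type I error, ASSUMING the tests are independent."""
    m = comb(k_groups, 2)
    return m, 1 - (1 - alpha) ** m

def one_way_f(groups):
    """Return (MST, MSE, F) for a one-way ANOVA on a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = mean(y for g in groups for y in g)
    # Between-group (treatment) and within-group (error) sums of squares.
    sst = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    sse = sum((y - mean(g)) ** 2 for g in groups for y in g)
    mst = sst / (k - 1)
    mse = sse / (n_total - k)
    return mst, mse, mst / mse

# Question 5: 10 groups compared two at a time.
m, fwer = familywise_error_rate(10)
print(f"{m} pairwise tests, family-wise error rate about {fwer:.2f}")

# Question 8: made-up retention scores where the spread inside each
# group is much larger than the spread between the group means.
scores = [[70, 80, 90], [72, 82, 92], [71, 81, 91]]
mst, mse, f = one_way_f(scores)
print(f"MST = {mst:.2f}, MSE = {mse:.2f}, F = {f:.2f}")
```

With these made-up scores the group means are nearly identical while the scores inside each group vary widely, so MST is far smaller than MSE and F falls below 1, which is the situation described in answer 8.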