Bonus Lab

School

Athabasca University, Athabasca *

Course

215

Subject

Statistics

Date

Apr 3, 2024

Type

pdf

Pages

4

Uploaded by NeoPindani

STAT151 University Studies Lakeland College

ONE-WAY ANALYSIS OF VARIANCE

The One-Way ANOVA procedure in StatCrunch performs a one-way analysis of variance for a quantitative dependent variable by a single independent variable (factor). The one-way analysis of variance F-test is used to test the hypothesis that several population means are equal. The test is valid if the assumptions of independence (within and across samples), normality, and equal population standard deviations are satisfied. The normality assumption is likely to be violated if the graphical methods show extreme skew for the response variable. As a rule of thumb, the F-test works quite well as long as the largest group standard deviation is no more than twice the smallest group standard deviation.

In this assignment you will compare the mean lengths of cuckoo eggs that were found in the nests of various host birds. In particular, you will use analysis of variance to see whether the size of the egg varies depending on the host bird. It is strongly recommended that you get familiar with the Lab 5 Instructions before you start working on the assignment.

Cuckoo Birds

Cuckoos lay their eggs in the nests of other (host) birds. The eggs are then adopted and hatched by the host birds. Usually one egg is laid in each nest, but some cuckoos may lay two eggs or even more. As soon as the young cuckoo hatches, it pushes out many of its foster parents' eggs or young, which tend to be smaller than the cuckoo. As a consequence, the host birds raise only the young cuckoos and lose their own clutch. It is hypothesized that each female cuckoo specializes in one species of host and produces eggs that are very similar to the true eggs of that species. This behaviour is supposed to reduce the risk of the foster parents recognizing the foreign egg and ejecting it.
In this lab assignment you will use one-factor analysis of variance to determine whether there is a relationship between the size of a cuckoo's egg and the type of nest where the egg was laid. More precisely, you will determine whether significant differences exist among the mean lengths of cuckoo eggs laid in various host nests.

The data set used in the lab assignment contains the lengths of cuckoo eggs that were found in the nests of various host birds. The data come from the text by L.H.C. Tippett, The Methods of Statistics, 4th Edition, John Wiley and Sons, Inc., 1952, p. 176. The following host birds are represented in the data set: the Meadow Pipit, Tree Pipit, Hedge Sparrow, Robin, Pied Wagtail, and Wren. All data are lengths in millimeters.

For the purpose of this assignment, assume that the data were obtained by a scientist who wandered around the countryside looking in birds' nests for cuckoo eggs and found a total of 120 cuckoo eggs in nests of various host species. The lengths of the eggs were measured with digital calipers by another scientist not directly involved in the study.

The data are available in the StatCrunch file lab5data.xlsx. The data are not to be printed in your lab report submission. The following is the description of the two variables in the data file:

Column  Variable Name  Description of Variable
1       HOST           Host species: Meadow Pipit, Tree Pipit, Hedge Sparrow, Robin, Pied Wagtail, or Wren
2       LENGTH         Egg length (in millimeters)

Answer the following questions using the data:
1. First you will examine the data collection process and study design.
   (a) Can the 120 eggs collected be considered a random sample from the population of all cuckoo eggs in this particular geographic area? Can the results of the study be generalized to the population of cuckoo eggs in the area where the eggs were collected? How could the way the eggs were collected affect the outcome of the study?
   (b) Can you think of any potential measurement bias in the way the egg lengths were determined?
   (c) What is the response variable? What is the factor in the study? What are the levels of the factor?
   (d) Previous studies have shown that some hosts are able to eject cuckoo eggs, probably because cuckoo eggs tend to be larger than their own eggs. How could this affect the outcome of the study?

2. Now you will display the relationship between the type of nest and egg length with boxplots.
   (a) Obtain side-by-side boxplots of length for the six hosts. Check the “Use fences to identify outliers” option. Paste the side-by-side boxplots into your report. Make sure that the boxplots have properly labelled axes and a title.
   (b) Do the boxplots in part (a) indicate any differences in the centers and spreads among the six groups? Is there any evidence of extreme skewness in any of the six distributions? Are there any outliers?

3. In this question you will compare the mean lengths with a means plot and obtain confidence intervals for each group.
   (a) Construct the means plot to see whether the mean egg lengths for the host species are all the same. Paste the plot into your report. What do you conclude from your plot?
   (b) Find a 95% confidence interval for the mean length for each host species and paste the related output into your report. Do the confidence intervals indicate any differences among the six population means? Explain briefly.

4. Now you will describe the data with the Summary Stats feature in the Stat menu. Moreover, you will use standard deviations to check the equal standard deviation assumption for the data.
   (a) Obtain the summary statistics for each of the six host species and paste the summaries into your report.
   (b) Compare the means, medians, standard deviations and interquartile ranges of the six distributions. Are there any extreme outliers in any of the six groups that may affect the validity of the F-test? Extreme outliers are defined as observations more than three interquartile ranges below the first quartile or more than three interquartile ranges above the third quartile.

5. Now you will check whether the assumptions of normality and equal standard deviations are violated for the data.
   (a) Obtain the normality plots of length for the six hosts. Do not paste the plots into your report. Do the plots indicate that the normal distribution assumption is seriously violated for any of the six distributions?
   (b) As the sample sizes are not equal, the ratio of the largest to the smallest standard deviation has to be obtained to verify the assumption of equal standard deviations for the six populations. Do the sample standard deviations satisfy the rule of thumb for safe use of the ANOVA test?
6. Is there any evidence that the mean length of cuckoo eggs laid in the six nest types differs? Answer the question by running the one-way ANOVA test in StatCrunch. In particular:
   (a) State the null and alternative hypotheses in terms of the population parameters of interest that correspond to the question asked.
   (b) Paste the ANOVA output into your report. Report the sum of squares due to treatments (the between-groups sum of squares), the sum of squares due to error (the within-groups sum of squares) and the total sum of squares. What is the pooled estimate of the variance? What is the value of the F statistic, the distribution of the test statistic under the null hypothesis, and the p-value of the test? State your conclusions.
   (c) By hand, demonstrate how to obtain the value of the F statistic from the sums of squares reported in the ANOVA output in part (b).
   (d) Are the assumptions of the F-test satisfied in this case? Refer to the discussion in Questions 1-3 to answer the question.
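The hand calculation asked for in Question 6(c), together with the standard deviation rule of thumb from Question 5(b), can be sketched in Python as follows. This is an illustration only and not part of the original lab, which uses StatCrunch: the three groups A, B and C and their values are made up (they are not the cuckoo data), and the helper names sd_ratio and one_way_anova are hypothetical.

```python
from statistics import mean, stdev

# Made-up lengths (mm) for three hypothetical groups; NOT the lab data.
groups = {
    "A": [22.0, 23.0, 24.0],
    "B": [21.0, 22.0, 23.0],
    "C": [24.0, 25.0, 26.0],
}


def sd_ratio(groups):
    """Ratio of the largest to the smallest sample standard deviation."""
    sds = [stdev(values) for values in groups.values()]
    return max(sds) / min(sds)


def one_way_anova(groups):
    """Build the one-way ANOVA table quantities from raw group data."""
    all_values = [x for values in groups.values() for x in values]
    n_total = len(all_values)          # N: total number of observations
    k = len(groups)                    # k: number of groups (factor levels)
    grand_mean = mean(all_values)

    # Between-groups (treatment) sum of squares.
    ss_between = sum(
        len(values) * (mean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    # Within-groups (error) sum of squares.
    ss_within = 0.0
    for values in groups.values():
        group_mean = mean(values)
        ss_within += sum((x - group_mean) ** 2 for x in values)

    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within  # pooled estimate of the variance

    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "ss_total": ss_between + ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "pooled_variance": ms_within,
        "f_statistic": ms_between / ms_within,
    }


if __name__ == "__main__":
    print("largest SD / smallest SD:", sd_ratio(groups))
    for name, value in one_way_anova(groups).items():
        print(name, value)
```

The sketch stops at the F statistic. Under the null hypothesis the statistic follows an F distribution with k - 1 and N - k degrees of freedom, so the p-value still has to be read from software output such as the StatCrunch ANOVA table.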
Lab 5 Assignment: Marking Schema

Item                                                                Points  Score
Proper Appearance (file and header)                                     10
Question 1
  a) Random sample                                                       2
     Method of data collection and study outcome                         2
  b) Measurement bias                                                    2
  c) Response, factor levels                                             3
  d) Effect of egg ejection                                              2
Question 2
  a) Side by side boxplots                                               4
  b) Differences in center and spread                                    2
     Extreme skewness                                                    2
     Outliers                                                            2
Question 3
  a) Means plot                                                          4
     Conclusions                                                         2
  b) 95% Confidence Intervals                                            4
     Conclusions                                                         2
Question 4
  a) Summaries                                                           3
  b) Comparison of means, std. deviations, and interquartile ranges      4
     Extreme Outliers                                                    2
Question 5
  a) Normality plot comments                                             3
  b) Rule for equal std. deviations assumption                           3
Question 6
  a) Null and alternative hypotheses                                     3
  b) ANOVA output                                                        3
     Sum of squares for treatments                                       1
     Sum of squares for error                                            1
     Total sum of squares                                                1
     Pooled estimate of the variance                                     1
     Value of the F-statistic                                            1
     Distribution of the F statistic                                     1
     P-value                                                             1
     Conclusion in plain language                                        2
  c) Calculating the value of F statistic by hand                        4
  d) Assumptions                                                         3
TOTAL                                                                   80