STAT3032_005_HW1_2022F_Solution

School

University of Notre Dame

Course

3032

Subject

Statistics

Date

Feb 20, 2024

STAT 3032 Regression and Correlated Data
Homework 1 (Solution)

Please show your work on each problem for full credit. A correct answer unsupported by the necessary explanation, R code, or output will receive very little, if any, credit. Your work needs to be organized in a reasonably neat and coherent way and submitted as a pdf or doc file on Canvas. Please do not share this handout outside the class.

Problem 1

A large-scale study in the 1980s found that the mean adult Bald Eagle wingspan was about 2.06 meters. Researchers want to evaluate this figure using a new sample of adult Bald Eagles in a hypothesis test at the 5% significance level. Let μ be the population mean of the Bald Eagle wingspan measured in meters. The hypotheses are

$H_0: \mu = 2.06$
$H_A: \mu > 2.06$

The researchers choose a one-sided test because they suspect that the wingspan is slowly getting larger over the generations. After measuring the wingspan of 64 adult Bald Eagles, the following statistics were computed: the sample mean is 2.05 m and the sample variance is 0.0049. Please answer the following questions. You may assume that the wingspan of the adult Bald Eagle follows a Normal distribution.

(a) Compute the value of the test statistic based on the sample of 64 adult Bald Eagles. Please show your work. 2 points

Solution: The test statistic is

$$t = \frac{\bar{X} - 2.06}{S / \sqrt{n}},$$

where $\bar{X}$ is the sample mean, $S$ is the square root of the sample variance (i.e., the sample standard deviation), and $n$ is the sample size. With $\bar{X} = 2.05$, $S = \sqrt{0.0049} = 0.07$, and $n = 64$:

$$t = \frac{2.05 - 2.06}{0.07 / \sqrt{64}} = \frac{-0.01}{0.00875} \approx -1.143.$$

(b) Assuming that μ = 2.06 (which means that the null hypothesis is true), the test statistic follows a t distribution. What are the degrees of freedom of the t distribution? There is no need to explain.
1 point

Solution: n - 1 = 64 - 1 = 63 degrees of freedom.

(c) Is the p-value greater or less than 0.5 (one half)? Please explain. You are not allowed to compute the value of the p-value. 2 points

Solution: Because the alternative hypothesis is "greater than", the p-value is the area under the t distribution curve to the right of the observed test statistic, -1.143. The t distribution is symmetric about 0, so the area to the right of 0 is exactly 0.5. Since -1.143 < 0, the area to the right of -1.143 is more than 0.5.

(d) Based on Part (c), do you reject the null hypothesis? Please explain. 2 points

Solution: Since the p-value is greater than 0.5, it is also greater than the significance level 0.05. We do not reject the null hypothesis.

(e) Based on this sample of 64 adult Bald Eagles, construct a 90% (two-sided) confidence interval for μ. Please show your work. The following information may be useful to you: let T follow the t distribution with 63 degrees of freedom; then P(T < 1.67) = 0.95 and P(T < 1.30) = 0.90. 2 points

Solution: A 90% two-sided interval leaves 5% in each tail, so the critical value is 1.67 (the 0.95 quantile).

Lower bound = $\bar{X} - 1.67 \times S / \sqrt{n} = 2.05 - 1.67 \times 0.00875 \approx 2.035$
Upper bound = $\bar{X} + 1.67 \times S / \sqrt{n} = 2.05 + 1.67 \times 0.00875 \approx 2.065$
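The arithmetic in Parts (a), (c), and (e) can be checked in R. The following is a minimal sketch, not part of the original solution; the variable names (`xbar`, `s2`, `mu0`, `t_crit`, and so on) are chosen here for illustration, and the critical value 1.67 is the one supplied in the problem statement rather than one computed with `qt()`.

```r
# Summary statistics given in Problem 1
xbar <- 2.05    # sample mean (m)
s2   <- 0.0049  # sample variance
n    <- 64      # sample size
mu0  <- 2.06    # mean under the null hypothesis

# Standard error of the mean: S / sqrt(n)
se <- sqrt(s2) / sqrt(n)

# Part (a): one-sample t statistic
t_stat <- (xbar - mu0) / se

# Part (c): upper-tail area under the t distribution with n - 1 df.
# The homework asks you NOT to compute this; it is here only as a check
# that the area to the right of a negative statistic exceeds 0.5.
p_upper <- pt(t_stat, df = n - 1, lower.tail = FALSE)

# Part (e): 90% two-sided interval using the critical value from the handout
t_crit <- 1.67
ci <- xbar + c(-1, 1) * t_crit * se

print(round(t_stat, 3))
print(p_upper > 0.5)
print(round(ci, 3))
```

Replacing `t_crit` with `qt(0.95, df = n - 1)` would use the exact quantile instead of the rounded value from the handout; the interval agrees to three decimal places either way.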
The 90% confidence interval for μ is (2.035, 2.065).

(f) Would correcting for the finite population size of bald eagles make the confidence intervals wider or narrower? With about 320,000 bald eagles, is it worth correcting? Please explain. 1 point

Solution: A finite population correction makes the confidence interval narrower, because it multiplies the standard error by $\sqrt{(N - n)/(N - 1)}$, which is less than 1. However, the sample of 64 is only about 0.02% of roughly 320,000 eagles, far less than 5% of the population, so the factor is about 0.9999 and the change in interval width is extremely small. It is likely not worth correcting.

Problem 2

We will use the iris dataset that is already stored in R. Run the following code to open the help file of the iris dataset.

?iris

(a) Read the help file of the iris dataset and explore the data. How many variables are included in the dataset? Please list the variable names exactly as they appear in the dataset. 1 point

Solution: 5 variables are included. They are Sepal.Length, Sepal.Width, Petal.Length, Petal.Width, and Species.

(b) We want to compare the two lengths: petal length and sepal length. Draw the histogram of the petal length and the histogram of the sepal length. How do these two histograms differ? (Reminder: show R code and output) 2 points

Solution:
> hist(iris$Petal.Length, xlab = "Petal Length", main = "Histogram of Petal Lengths")
> hist(iris$Sepal.Length, xlab = "Sepal Length", main = "Histogram of Sepal Lengths")
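One way to put numbers behind the visual comparison of the two histograms is to look at the range of each length within each species. This is a rough sketch added for illustration, not part of the original solution; the names `petal_ranges`, `sepal_ranges`, and `n_in_gap` are made up here, and grouping by `Species` is only one possible explanation for the shape of the petal histogram.

```r
# Range (min, max) of each length within each species
petal_ranges <- tapply(iris$Petal.Length, iris$Species, range)
sepal_ranges <- tapply(iris$Sepal.Length, iris$Species, range)

print(petal_ranges)
print(sepal_ranges)

# Number of flowers with a petal length strictly between 2 cm and 3 cm,
# i.e. inside the gap that separates the two clusters in the histogram
n_in_gap <- sum(iris$Petal.Length > 2 & iris$Petal.Length < 3)
print(n_in_gap)
```

No flower has a petal length between 2 cm and 3 cm, which is what produces two separate clusters in the petal histogram, while the sepal length ranges of the three species overlap.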
(This is an open-ended question; there are other potential answers.) We can see that the petal lengths are bimodal, possibly indicating two groups, whereas the sepal lengths are approximately bell-shaped.

(c) We are curious about how an iris flower's petal measurements (petal width and petal length) relate to each other. Draw a scatterplot of the petal width and the petal length of the iris flowers. Place the petal width on the horizontal axis and the petal length on the vertical axis. Do you think we can fit a linear regression model to these two variables? Why or why not? (Reminder: show R code and output) 2 points

Solution:
> plot(Petal.Length ~ Petal.Width, data = iris)
Fitting a linear regression may be inappropriate because there are two clear groups, which appear to have slightly different relationships between petal length and petal width.

OR: Petal length and petal width have a roughly linear relationship, even though there is a gap in the petal width. We could fit a linear regression model to the data.

(d) Regardless of your answer in Part (c), let's fit a linear regression model that uses the petal width to predict the petal length. What is the equation of the fitted model? Please pay attention to the notation. (Reminder: show R code and output) 2 points

Solution:
> mod = lm(Petal.Length ~ Petal.Width, data = iris)
> summary(mod)

Call:
lm(formula = Petal.Length ~ Petal.Width, data = iris)

Residuals:
     Min       1Q   Median       3Q      Max
-1.33542 -0.30347 -0.02955  0.25776  1.39453

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  1.08356    0.07297   14.85   <2e-16 ***
Petal.Width  2.22994    0.05140   43.39   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.4782 on 148 degrees of freedom
Multiple R-squared:  0.9271,  Adjusted R-squared:  0.9266
F-statistic:  1882 on 1 and 148 DF,  p-value: < 2.2e-16

Using the output above, the equation of the fitted model is

$$\widehat{\text{Petal.Length}} = 1.08 + 2.23 \times \text{Petal.Width}$$

Rubric: -0.5 pt for not having the proper notation (no hat on top of Y). Since we are just learning to write the equations of the models, I (Georgia) want to be strict with the notation in HW1.

(e) Does the slope in the fitted model of Part (d) have a meaningful interpretation in context? If so, interpret the slope in context. If not, explain why it does not have a meaningful interpretation in context. 1.5 points

Solution: The slope does have a meaningful interpretation: for every one-centimeter increase in petal width, the petal length increases by an estimated 2.23 cm.

OR: The petal length increases by 2.23 cm on average.

OR: A one-centimeter increase in the petal width is associated with a 2.23 cm increase in petal length on average.

(f) Does the intercept in the fitted model of Part (d) have a meaningful interpretation in context? If so, interpret the intercept in context. If not, explain why it does not have a meaningful interpretation in context. 1.5 points

Solution: The intercept does have a meaningful interpretation in context: when the petal width is zero, the estimated petal length is 1.08 cm. (I'm no botanist, so perhaps it's impossible to have no petals with a sepal on an iris.)
It would also be acceptable to say that the intercept has no meaningful interpretation here, since it is hard to imagine a petal with zero width.
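The slope and intercept interpretations in Parts (e) and (f) can be checked directly against the fitted model. The sketch below is an illustration rather than part of the graded solution; `b`, `pred0`, `step`, and `width_range` are names introduced here. It also shows why the intercept is debatable: a petal width of zero lies outside the widths observed in the data, so the value at zero is an extrapolation.

```r
# Refit the model from Part (d)
mod <- lm(Petal.Length ~ Petal.Width, data = iris)
b <- coef(mod)
print(round(b, 2))

# Intercept: predicted petal length when the petal width is 0 cm
pred0 <- predict(mod, newdata = data.frame(Petal.Width = 0))

# Slope: change in predicted petal length for a 1 cm increase in petal width
step <- diff(predict(mod, newdata = data.frame(Petal.Width = c(1, 2))))

# Observed range of petal widths; 0 is below the smallest observed value
width_range <- range(iris$Petal.Width)

print(unname(pred0))
print(unname(step))
print(width_range)
```

The predicted value at zero width equals the intercept and the one-centimeter step equals the slope, matching the rounded equation in Part (d).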