H11

School

Temple University

Course

2521

Subject

Statistics

Date

Jan 9, 2024

Type

docx

Uploaded by DeanStar27858

a. Use mydata = data.frame(advertisement = x, sales = y) in R. Create a data frame named “mydata” with two variables. Store the predictor vector x in a variable called “advertisement”, and store the response vector y in a variable called “sales”.

> n = 100
> x = 5 + rnorm(n)
> e = rnorm(n)
> y = 1 + 2*x + e
> mydata = data.frame(advertisement = x, sales = y)
> head(mydata)
  advertisement     sales
1      3.649284  6.897327
2      2.951805  7.419058
3      4.702271 10.028348
4      4.587121  9.537453
5      3.790129  8.584801
6      3.790236  9.923865

b. Plot sales vs. advertisement. What is the trend in this plot? Hint: you can use plot(mydata) or plot(mydata$advertisement, mydata$sales).

> plot(mydata$advertisement, mydata$sales, main="Trend of sales vs. advertisement")
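Because rnorm() draws fresh random numbers on every run, rerunning the commands in part a will not reproduce the exact values printed there. A minimal sketch, assuming an arbitrary seed (the value 123 is not from the original assignment), shows how the simulation could be made repeatable before plotting:

```r
# Assumption: the seed value 123 is arbitrary and not part of the original homework.
set.seed(123)

n <- 100
x <- 5 + rnorm(n)      # predictor: advertisement, centered near 5
e <- rnorm(n)          # noise term with standard deviation 1
y <- 1 + 2 * x + e     # response: true intercept 1, true slope 2

mydata <- data.frame(advertisement = x, sales = y)

# Scatterplot of the response against the predictor
plot(mydata$advertisement, mydata$sales,
     xlab = "advertisement", ylab = "sales",
     main = "Trend of sales vs. advertisement")
```

Since the data are generated with a true slope of 2 and unit-variance noise, the points should rise roughly along a straight line, which is the trend part b asks about.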

c. Use the lm() function in R to fit a linear regression with sales as the response and advertisement as the predictor. Store the output in “myfit”.

> myfit <- lm(sales ~ advertisement, data=mydata)
> myfit

Call:
lm(formula = sales ~ advertisement, data = mydata)

Coefficients:
  (Intercept)  advertisement
        1.522          1.911

d. Use summary(myfit) in R to get the summary statistics for the linear regression.

> summary(myfit)

Call:
lm(formula = sales ~ advertisement, data = mydata)

Residuals:
     Min       1Q   Median       3Q      Max
-2.46228 -0.69433 -0.08517  0.70894  2.33665

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)     1.5220     0.5767   2.639  0.00967 **
advertisement   1.9113     0.1142  16.742  < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.02 on 98 degrees of freedom
Multiple R-squared: 0.7409, Adjusted R-squared: 0.7383

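F-statistic: 280.3 on 1 and 98 DF, p-value: < 2.2e-16

e. Use myfit$coefficients to get the regression coefficients. Write out the fitted regression line $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$, and explain the meanings of the estimated regression coefficients. Furthermore, use abline() to add this fitted regression line to the scatterplot in part b.

From the coefficients in part c, the fitted line is $\hat{y} = 1.522 + 1.911x$. The slope says that each additional unit of advertisement is associated with an estimated increase of about 1.911 units in mean sales; the intercept, 1.522, is the estimated mean sales when advertisement is 0.

f. Use cor(x,y) to get the sample correlation between x and y. Find the square of this correlation. What is the relation between this squared correlation and the coefficient of determination $R^2$? Hint: get the $R^2$ value from the summary statistics in part d.

> r = cor(x, y)
> r^2
[1] 0.7409345

The squared sample correlation, 0.7409, equals the Multiple R-squared value reported in part d. In simple linear regression, the coefficient of determination is the square of the sample correlation between x and y.

g. Does advertisement have an effect on sales? Set up a formal hypothesis test, find the test statistic, and report your conclusion based on the p-value. Hint: get the test statistic and the p-value from the summary statistics in part d.

The hypotheses are $H_0: \beta_1 = 0$ versus $H_a: \beta_1 \neq 0$. From part d, the test statistic is t = 16.742 on 98 degrees of freedom, with p-value < 2e-16. Because the p-value is far below any conventional significance level, we reject $H_0$ and conclude that advertisement has a statistically significant effect on sales.

h. Does advertisement have a positive effect on sales? Set up a formal hypothesis test, find the test statistic, and report your conclusion based on the p-value. Hint: the alternative hypothesis should be $H_a: \beta_1 > 0$.

The hypotheses are $H_0: \beta_1 = 0$ versus $H_a: \beta_1 > 0$. The test statistic is again t = 16.742. Because the estimated slope is positive, the one-sided p-value is half of the two-sided p-value from part d, so it is below 1e-16 and effectively 0. We reject $H_0$ and conclude that advertisement has a significant positive effect on sales.

i. Use anova(myfit) in R. Fill in the blanks (marked by “∗”) in the ANOVA table for the regression of sales on advertisement.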

Source       df   Sum of Squares   Mean Squares   F Statistic
Regression   ∗    SSR = ∗          MSR = ∗        F = ∗
Error        ∗    SSE = ∗          MSE = ∗
Total        ∗    SST = ∗

What are the degrees of freedom for the F statistic? What about the corresponding p-value? What are the null and alternative hypotheses for this F test?

> anova(myfit)
Analysis of Variance Table

Response: sales
              Df Sum Sq Mean Sq F value    Pr(>F)
advertisement  1 291.37  291.37  280.28 < 2.2e-16 ***
Residuals     98 101.88    1.04
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
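
The blanks in the ANOVA table can be filled in from the anova(myfit) output above. As a rough check, the sketch below recomputes the table entries from the rounded values printed in that output (291.37, 101.88, and the degrees of freedom 1 and 98). The variable names are illustrative, and small discrepancies from R's own output come from rounding.

```r
# Values copied from the printed anova(myfit) output (rounded to two decimals)
ssr    <- 291.37   # regression sum of squares
sse    <- 101.88   # error (residual) sum of squares
df_reg <- 1        # regression degrees of freedom
df_err <- 98       # error degrees of freedom (n - 2)

sst    <- ssr + sse          # total sum of squares
df_tot <- df_reg + df_err    # total degrees of freedom (n - 1)
msr    <- ssr / df_reg       # mean square for regression
mse    <- sse / df_err       # mean square for error
f_stat <- msr / mse          # F statistic on (1, 98) degrees of freedom

# Upper-tail p-value of the F test of H0: beta1 = 0 against Ha: beta1 != 0
p_value <- pf(f_stat, df_reg, df_err, lower.tail = FALSE)

# Coefficient of determination recovered from the sums of squares
r_squared <- ssr / sst

print(c(SST = sst, MSE = mse, F = f_stat, R2 = r_squared))
```

The recomputed F statistic should agree with the reported 280.28 up to rounding, and SSR/SST should agree with the Multiple R-squared of 0.7409 from part d. In simple linear regression the F statistic is also the square of the slope's t statistic (16.742 squared is about 280.3), so this F test and the two-sided t test in part g examine the same hypotheses.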