STAT3220FA23Unit2ClassworkSolutions

School

University of Virginia

Course

3220

Subject

Statistics

Date

Apr 3, 2024

Uploaded by AgentTeamDeer43

STAT 3220 Unit 2 Classwork Solutions

Your name

1. Which insect repellents protect best against mosquitoes? Consumer Reports (June 2000) tested 14 products that all claim to be effective mosquito repellents. Each product was classified as either lotion/cream or aerosol/spray. The cost of the product (in dollars) was divided by the amount of repellent needed to cover exposed areas of the skin (about 1/3 ounce) to obtain a cost-per-use value. Effectiveness was measured as the maximum number of hours of protection (in half-hour increments) provided when human testers exposed their arms to 200 mosquitoes. The data from the report are stored in the object ‘repellent’.

a) Consider the relationship between the type of repellent and the maximum hours of protection. Create a visualization of this relationship and comment on it.

    #Write code for appropriate visualization
    boxplot(Hours~Type,data=repellent)

[Figure: side-by-side boxplots of Hours (vertical axis, 0 to 20) by Type (Aerosol/Spray, Lotion/Cream)]

We do not see a strong relationship here because the centers of the two boxplots are very close.

b) Suppose you want to use repellent type to model the maximum number of hours of protection (y). Create the appropriate number of dummy variables for repellent type and write the model. Use “Aerosol/Spray” as the base level.

$E(\text{Hours}) = \beta_0 + \beta_1\,\text{TypeLotion}$, where TypeLotion = 1 if Type = Lotion/Cream and 0 if Type = Aerosol/Spray.

c) Fit a model to the data that includes the repellent type and cost. Include just the regression output.

    #Fit the model with Type and Cost as predictors
    repc<-lm(Hours~Type+Cost,data=repellent)
    summary(repc)

    Call:
    lm(formula = Hours ~ Type + Cost, data = repellent)

    Residuals:
        Min      1Q  Median      3Q     Max
    -6.3283 -1.2664 -0.1718  2.4606  4.6804

    Coefficients:
                     Estimate Std. Error t value Pr(>|t|)
    (Intercept)         2.252      1.517   1.485 0.165700
    TypeLotion/Cream   -2.391      1.854  -1.290 0.223635
    Cost                6.830      1.175   5.812 0.000117 ***
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 3.425 on 11 degrees of freedom
    Multiple R-squared:  0.7586, Adjusted R-squared:  0.7147
    F-statistic: 17.29 on 2 and 11 DF,  p-value: 0.0004025

d) Test whether cost is a useful predictor of maximum number of hours (y). Use α = 0.10. Include the hypotheses, test statistic, p-value, and conclusion in context.

• Hypotheses: $H_0: \beta_2 = 0$ vs $H_a: \beta_2 \neq 0$, where $\beta_2$ is the coefficient of Cost in the model from part c
• Test Statistic: t = 5.812
• P-value: 0.000117
• Because the p-value is smaller than α = 0.10, we reject the null hypothesis. There is enough evidence to conclude that cost is a useful predictor of hours.
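The p-value in part d can be reproduced from the t statistic and the residual degrees of freedom alone. The following is a minimal sketch in R; the variable names (t_stat, df_resid, alpha, p_value, reject_h0) are illustrative and do not appear in the original solutions, and the numbers are copied from the regression output in part c.

```r
# Values copied from the summary(repc) output in part c
t_stat   <- 5.812  # t value reported for Cost
df_resid <- 11     # residual degrees of freedom (14 products - 3 coefficients)
alpha    <- 0.10   # significance level stated in part d

# Two-sided p-value for H0: beta2 = 0 vs Ha: beta2 != 0
p_value <- 2 * pt(abs(t_stat), df = df_resid, lower.tail = FALSE)

# Decision rule: reject H0 when the p-value is below alpha
reject_h0 <- p_value < alpha

print(p_value)
print(reject_h0)
```

The printed p-value should agree with the Pr(>|t|) entry for Cost up to rounding, since the t value used here is itself rounded to three decimals.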


2. Earnings of Mexican Street Vendors. Detailed interviews were conducted with over 1,000 street vendors in the city of Puebla, Mexico, in order to study the factors influencing vendors’ incomes. Vendors were defined as individuals working in the street and included vendors with carts and stands on wheels and excluded beggars, drug dealers, and prostitutes. The researchers collected data on gender, age, hours worked per day, annual earnings, and education level. We have a sample of 15 vendors from this list and we will examine the following:

• Annual Earnings (response, y)
• Age (explanatory, x1)
• Hours Worked Per Day (explanatory, x2)

a) Consider the interaction model $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2$. Use the street vendor data to fit the model.

    #Fit the model
    vena<-lm(Earnings~Age*Hours,data=streentven)
    summary(vena)

    Call:
    lm(formula = Earnings ~ Age * Hours, data = streentven)

    Residuals:
       Min     1Q Median     3Q    Max
    -936.5 -281.3 -117.6  255.6  787.4

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) 1041.894   1303.593   0.799    0.441
    Age          -13.238     29.234  -0.453    0.659
    Hours        103.306    162.014   0.638    0.537
    Age:Hours      3.621      3.840   0.943    0.366

    Residual standard error: 550.3 on 11 degrees of freedom
    Multiple R-squared:  0.6135, Adjusted R-squared:  0.5081
    F-statistic:  5.82 on 3 and 11 DF,  p-value: 0.0124

b) What is the estimated slope relating annual earnings (y) to age (x1) when number of hours worked (x2) is 10? Interpret the result.

The estimated slope for age is (-13.238 + 3.621 × Hours). Therefore, for a one-year increase in age, expected earnings increase by -13.238 + 3.621 × 10 = $22.972 when hours is held constant at 10.

c) What is the estimated slope relating annual earnings (y) to hours worked (x2) when age (x1) is 40? Interpret the result.

The estimated slope for hours is (103.306 + 3.621 × Age). Therefore, for a one-hour increase in hours worked per day, expected earnings increase by 103.306 + 3.621 × 40 = $248.15 when age is held constant at 40.

d) Test whether the interaction between age and hours is a useful predictor of earnings (y). Use α = 0.10. Include the hypotheses, test statistic, p-value, and conclusion in context.

• Hypotheses: $H_0: \beta_3 = 0$ vs $H_a: \beta_3 \neq 0$
• Test Statistic: t = 0.943
• P-value: 0.366
• Because the p-value is larger than α = 0.10, we fail to reject the null hypothesis. There is not enough evidence to conclude that the interaction between age and hours is a useful predictor of earnings.
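The conditional slopes in parts b and c follow from differentiating the interaction model: the slope for age is β1 + β3 × Hours and the slope for hours is β2 + β3 × Age. A minimal sketch in R, using the rounded estimates from the output in part a; the object names (b, slope_age_at_10_hours, slope_hours_at_age_40) are illustrative and not from the original solutions.

```r
# Rounded coefficient estimates copied from summary(vena) in part a
b <- c(Age = -13.238, Hours = 103.306, AgeHours = 3.621)

# Slope of earnings with respect to Age when Hours = 10: b1 + b3 * Hours
slope_age_at_10_hours <- unname(b["Age"] + b["AgeHours"] * 10)

# Slope of earnings with respect to Hours when Age = 40: b2 + b3 * Age
slope_hours_at_age_40 <- unname(b["Hours"] + b["AgeHours"] * 40)

print(slope_age_at_10_hours)  # 22.972
print(slope_hours_at_age_40)  # 248.146
```

With the fitted object available, the same quantities could be computed from coef(vena) instead of the typed-in values, which would avoid rounding.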

3. During periods of high electricity demand, especially during the hot summer months, the power output from a gas turbine engine can drop dramatically. One way to counter this drop in power is by cooling the inlet air to the gas turbine. An increasingly popular cooling method uses high pressure inlet fogging. The performance of a sample of 67 gas turbines augmented with high pressure inlet fogging was investigated in the Journal of Engineering for Gas Turbines and Power (January 2005). One measure of performance is heat rate (kilojoules per kilowatt per hour). The data for the 67 gas turbines are saved in the gasturbine file, which contains the following variables:

• Engine: traditional, advanced, and aeroderivative
• Shafts: 1, 2, 3
• RPM: cycle speed (revolutions per minute)
• CPRatio: cycle pressure ratio
• Inlet_temp: inlet temperature (°C)
• Exh_temp: exhaust gas temperature (°C)
• Airflow: air mass flow rate (kilograms per second)
• Power: horsepower (Hp units)
• Heatrate (response): kilojoules per kilowatt per hour

A complete second-order model for heat rate (y) as a function of cycle speed and cycle pressure ratio is written as

$y = \beta_0 + \beta_1\,\text{RPM} + \beta_2\,\text{CPRatio} + \beta_3\,\text{RPM}\times\text{CPRatio} + \beta_4\,\text{RPM}^2 + \beta_5\,\text{CPRatio}^2 + \varepsilon$

a) Now consider the qualitative predictor, engine type, at three levels (traditional, advanced, and aeroderivative). Add the dummy variables such that X1 = {1 if traditional, 0 otherwise} and X2 = {1 if advanced, 0 otherwise} to the model above and write the model. Fit the model. Include your regression output and write your prediction equation.

    #Fit the model
    recodeENGINE<-relevel(factor(gasturbine$ENGINE),ref="Aeroderiv")
    gasa<-lm(HEATRATE~RPM*CPRATIO+I(RPM^2)+I(CPRATIO^2)+recodeENGINE,data=gasturbine)
    summary(gasa)

    Call:
    lm(formula = HEATRATE ~ RPM * CPRATIO + I(RPM^2) + I(CPRATIO^2) +
        recodeENGINE, data = gasturbine)


    Residuals:
        Min      1Q  Median      3Q     Max
    -1242.2  -313.8   -97.7   292.2  1863.4

    Coefficients:
                              Estimate Std. Error t value Pr(>|t|)
    (Intercept)              1.441e+04  1.313e+03  10.977 6.97e-16 ***
    RPM                      1.212e-01  1.145e-01   1.059  0.29387
    CPRATIO                 -4.211e+02  1.240e+02  -3.395  0.00123 **
    I(RPM^2)                -2.277e-07  2.016e-06  -0.113  0.91042
    I(CPRATIO^2)             7.111e+00  2.547e+00   2.791  0.00706 **
    recodeENGINEAdvanced    -1.088e+02  3.284e+02  -0.331  0.74161
    recodeENGINETraditional  2.358e+02  2.899e+02   0.813  0.41930
    RPM:CPRATIO              9.100e-04  6.045e-03   0.151  0.88086
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 558.3 on 59 degrees of freedom
    Multiple R-squared:  0.8905, Adjusted R-squared:  0.8775
    F-statistic: 68.53 on 7 and 59 DF,  p-value: < 2.2e-16

    round(coef(gasa),5)

                (Intercept)                     RPM                 CPRATIO
                14413.84263                 0.12122              -421.12001
                   I(RPM^2)            I(CPRATIO^2)    recodeENGINEAdvanced
                    0.00000                 7.11101              -108.80654
    recodeENGINETraditional             RPM:CPRATIO
                  235.77999                 0.00091

• New Model: $\text{heatrate} = \beta_0 + \beta_1\,\text{RPM} + \beta_2\,\text{CPRatio} + \beta_3\,\text{RPM}\times\text{CPRatio} + \beta_4\,\text{RPM}^2 + \beta_5\,\text{CPRatio}^2 + \beta_6 X_1 + \beta_7 X_2 + \varepsilon$
• Prediction Equation: $\widehat{\text{heatrate}} = 14413.84 + 0.121\,\text{RPM} - 421.12\,\text{CPRatio} + 0.0009\,\text{RPM}\times\text{CPRatio} - 2.277\times 10^{-7}\,\text{RPM}^2 + 7.11\,\text{CPRatio}^2 + 235.78\,X_1 - 108.81\,X_2$
• where X1 = {1 if traditional, 0 otherwise} and X2 = {1 if advanced, 0 otherwise}
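The dummy variables X1 and X2 are created automatically by R once the base level is set with relevel(). The sketch below illustrates this on a small made-up factor; the vector engine and the matrix X are hypothetical and are not the gasturbine data, and only the level labels are taken from the output above.

```r
# Hypothetical toy factor using the three engine labels seen in the output
engine <- factor(c("Traditional", "Advanced", "Aeroderiv", "Advanced"))

# Make "Aeroderiv" the base level, as in the solution code
engine <- relevel(engine, ref = "Aeroderiv")

# Design matrix: one intercept column plus one 0/1 dummy per non-base level
X <- model.matrix(~ engine)

print(levels(engine))
print(X)
```

Each row of X has a 1 in the column for its own engine type and zeros elsewhere, and rows for the base level have zeros in both dummy columns. This is why the fitted model reports coefficients for Advanced and Traditional only, each measured relative to the aeroderivative engines.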

b) Conduct the test to determine if engine type is important for predicting heat rate.

    #Fit the model
    redgas<-lm(HEATRATE~RPM*CPRATIO+I(RPM^2)+I(CPRATIO^2),data=gasturbine)
    summary(redgas)

    Call:
    lm(formula = HEATRATE ~ RPM * CPRATIO + I(RPM^2) + I(CPRATIO^2),
        data = gasturbine)

    Residuals:
         Min       1Q   Median       3Q      Max
    -1196.10  -281.46   -34.99   302.94  1896.08

    Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
    (Intercept)   1.558e+04  1.143e+03  13.635  < 2e-16 ***
    RPM           7.823e-02  1.104e-01   0.708  0.48144
    CPRATIO      -5.231e+02  1.034e+02  -5.061 4.11e-06 ***
    I(RPM^2)     -1.806e-07  1.969e-06  -0.092  0.92724
    I(CPRATIO^2)  8.840e+00  2.163e+00   4.087  0.00013 ***
    RPM:CPRATIO   4.452e-03  5.582e-03   0.798  0.42821
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 563.5 on 61 degrees of freedom
    Multiple R-squared:  0.8846, Adjusted R-squared:  0.8752
    F-statistic: 93.55 on 5 and 61 DF,  p-value: < 2.2e-16

    anova(redgas,gasa)

    Analysis of Variance Table

    Model 1: HEATRATE ~ RPM * CPRATIO + I(RPM^2) + I(CPRATIO^2)
    Model 2: HEATRATE ~ RPM * CPRATIO + I(RPM^2) + I(CPRATIO^2) + recodeENGINE
      Res.Df      RSS Df Sum of Sq      F Pr(>F)
    1     61 19370350
    2     59 18389213  2    981137 1.5739 0.2158

• Hypotheses: $H_0: \beta_6 = \beta_7 = 0$ vs $H_a:$ at least one of $\beta_6, \beta_7 \neq 0$
• Test Statistic: F = 1.57
• P-value: 0.2158
• With a p-value this large, we fail to reject the null hypothesis. There is not enough evidence to conclude that engine type is a useful predictor of heat rate in this model.

c) Do you recommend the original model or the model from part b? Why?

We would prefer the model without engine type (the original complete second-order model) because engine type was not significant, so that predictor should be removed. However, we might want to continue evaluating the model to make it more parsimonious.
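The nested F statistic reported by anova() can be rebuilt by hand from the two residual sums of squares, which makes explicit what the test in part b compares. A minimal sketch in R; the variable names are illustrative, and the numbers are copied from the ANOVA table above.

```r
# Values copied from the anova(redgas, gasa) table
rss_reduced  <- 19370350  # RSS of the model without engine type
rss_complete <- 18389213  # RSS of the model with engine type
df_reduced   <- 61        # residual df, reduced model
df_complete  <- 59        # residual df, complete model

# Number of coefficients tested (beta6 and beta7)
df_num <- df_reduced - df_complete

# F = (drop in RSS per tested coefficient) / (MSE of the complete model)
f_stat <- ((rss_reduced - rss_complete) / df_num) / (rss_complete / df_complete)

# Upper-tail p-value from the F distribution with (2, 59) degrees of freedom
p_value <- pf(f_stat, df1 = df_num, df2 = df_complete, lower.tail = FALSE)

print(round(f_stat, 4))
print(round(p_value, 4))
```

The results should match the F and Pr(>F) columns of the table up to rounding.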
