Problem Set 5 (With instructions) (1)

School

University of Hawaii

Course

310

Subject

Statistics

Date

Jan 9, 2024

Pages

6

Uploaded by LieutenantStarlingMaster218

Problem Set 5 (With instructions), Chapter 12, FALL 2023

Find your problem set via Laulima -> Assignments. LIST ALL YOUR GROUP MEMBERS, FIRST AND LAST NAME, FOR FULL CREDIT. Type and label the questions with their answers in a Word document, and copy and 'Paste special' any graphs or tables you have generated in Excel. Do not submit your Excel file! Each member must upload a PDF FORMAT version to the Assignment Folder / PS 5. Thank you!

Work in a group on the following:

File: Southwest Airlines_Fall23.xls

1. Obtain the equation to predict Revenue (Y) by using Flights as an independent (X) variable and answer the following questions. Use the data for Flights and Revenues in the Data1 tab. (4 pts)
Use PHStat - Regression - Simple Linear Regression as shown below to answer the questions. Use the Coefficient table in the “COMPUTE” tab.

A) What is the equation that predicts revenue?
Predicted Revenue = 0.4576 + 0.0161 × Flights, with revenue in millions of dollars. The value 5.2183 is the standard error of the estimate (see part D), not a term of the prediction equation.

B) What is the value of R²? Interpret its meaning.
R² = 0.8442. About 84.42% of the variation in revenue is explained by the variation in the number of flights.

C) How do you find the correlation coefficient for revenue and Flights? Interpret its meaning.
In simple linear regression the correlation coefficient is the square root of R², taking the sign of the slope: r = +√0.8442 ≈ 0.919. This indicates a strong positive linear relationship between flights and revenue.
The residual data can also be used to check the model. You can compute the skewness and kurtosis of the residuals, view the residual plot, and add a trend line to identify issues. For this model, kurtosis is 0.3583366 and skewness is -0.2694433. Both fall between -1 and 1, indicating no problem. The residual plot did not show any smile or frown pattern either.

D) What is the standard error in the regression stats section? Interpret its meaning.
The standard error of the estimate is 5.2183. It measures how far actual revenue typically falls from the revenue predicted by the equation, here about $5.2183 million.
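The quantities asked for in parts A to D can also be reproduced outside PHStat. The sketch below is only an illustration under stated assumptions: the `flights` and `revenue` lists are made-up numbers (the Data1 tab is not reproduced here), and the function name `simple_linear_regression` is hypothetical. It shows how the intercept, slope, R Square, correlation coefficient, and standard error of the estimate relate to each other.

```python
import math

# Hypothetical illustration data, NOT the Southwest Airlines data set.
flights = [1.0, 2.0, 3.0, 4.0, 5.0]   # X (made-up values)
revenue = [2.0, 4.0, 5.0, 4.0, 5.0]   # Y in $M (made-up values)


def simple_linear_regression(x, y):
    """Least-squares fit of y = b0 + b1*x plus the usual summary statistics."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)                        # spread of X
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)                        # total variation in Y

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Unexplained variation: squared gaps between actual and predicted Y.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

    r_square = 1.0 - sse / sst
    # In simple regression, r is the square root of R Square with the slope's sign.
    r = math.copysign(math.sqrt(r_square), slope)
    # Standard error of the estimate uses n - 2 degrees of freedom.
    std_error = math.sqrt(sse / (n - 2))
    return {
        "intercept": intercept,
        "slope": slope,
        "r_square": r_square,
        "r": r,
        "std_error": std_error,
    }


fit = simple_linear_regression(flights, revenue)
for name, value in fit.items():
    print(f"{name}: {value:.4f}")

# The same square-root step applied to the R Square reported in part B.
r_from_reported_r_square = math.sqrt(0.8442)
print(f"r implied by R Square = 0.8442: {r_from_reported_r_square:.4f}")
```

With the real Flights and Revenue columns in place of the made-up lists, the printed values should correspond to the Coefficient and Regression Statistics tables in the COMPUTE tab.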
Use the Regression Statistics table in the “COMPUTE” tab to read the value. Interpret its practical meaning using the units in the problem for parts B, C and D. Revenue is in millions of dollars.

2. Check the four (4) L.I.N.E. assumptions for making inferences in regression. You need 2 plots here as well as skewness/kurtosis and the Durbin-Watson statistic. For each assumption state whether you feel the assumption is OK or not OK and state why you decided the way you did. (4 pts)
Kurtosis: 0.3583366
Skewness: -0.2694433
Durbin-Watson: 2.0753
The Durbin-Watson statistic always lies between 0 and 4, and values near 2 indicate no autocorrelation. At 2.0753 it is close to 2, so the residuals appear independent and this assumption is OK. The skewness and kurtosis both fall between -1 and 1, so the residuals are approximately normal and this assumption is OK.
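The normality and independence checks above rest on three statistics computed from the residuals. The sketch below shows one way to compute them, assuming the sample skewness and kurtosis formulas used by Excel's SKEW and KURT functions and the standard Durbin-Watson ratio. The residual values are made up for illustration, and the helper names (`excel_skew`, `excel_kurt`, `durbin_watson`, `within_plus_minus_one`) are hypothetical.

```python
import math

# Hypothetical residuals in time order, NOT the values from the Residuals sheet.
residuals = [-0.8, 0.6, 1.0, -0.6, -0.2]


def excel_skew(values):
    """Sample skewness with the bias adjustment used by Excel's SKEW."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return n / ((n - 1) * (n - 2)) * sum(((v - mean) / s) ** 3 for v in values)


def excel_kurt(values):
    """Sample excess kurtosis with the adjustment used by Excel's KURT."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    fourth = sum(((v - mean) / s) ** 4 for v in values)
    lead = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    return lead * fourth - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))


def durbin_watson(values):
    """Sum of squared successive differences divided by sum of squared residuals."""
    num = sum((values[t] - values[t - 1]) ** 2 for t in range(1, len(values)))
    den = sum(v ** 2 for v in values)
    return num / den


def within_plus_minus_one(value):
    """Rule of thumb used in this problem set for skewness and kurtosis."""
    return -1.0 <= value <= 1.0


print(f"skewness: {excel_skew(residuals):.4f}")
print(f"kurtosis: {excel_kurt(residuals):.4f}")
print(f"Durbin-Watson: {durbin_watson(residuals):.4f}")

# Applying the rule of thumb to the values reported in the answer above.
print(within_plus_minus_one(-0.2694433), within_plus_minus_one(0.3583366))
```

A Durbin-Watson value near 0 points to positive autocorrelation and a value near 4 to negative autocorrelation, which is why a result close to 2 is read as independent residuals.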
Use the 1 page sheet handout on regression or Excel files in your Laulima support files folder, lecture notes, or the text to help you interpret whether or not these 4 assumptions are satisfied. Include Excel results in your answers!
a. Residuals against Flights (X) to check linearity. This plot is given in the Residual Plot sheet.
b. Residuals against Predicted Y to check constant variability. Do you have constant variability? The data is in the Residuals sheet. Copy and paste the Predicted Y and the Residuals values next to each other with the Predicted Values series first. Highlight both of those data series and use Insert - Scatter Plot.
c. Skewness and kurtosis to check normality of residuals; report conclusion.
d. The Durbin-Watson statistic to check independence of residuals. Are the residuals independent? Read the D-W value in the “DurbinWatson” sheet.
e. Given your answers above, report on the validity and reliability of making predictions using this regression model.
H0: B1 = 0
H1: B1 ≠ 0 (the slope is not equal to zero)
It was hypothesized that an increase in flights would not increase revenue. There is significant evidence to reject the null hypothesis: Significance F is 0.000000000238511, which is below the 0.05 level of significance. I am 95% confident that the slope is not equal to zero. Based on R² = 0.8442, about 84% of the variation in revenue is explained by the variation in flights. The coefficients indicate that with 0 flights revenue would be about $457,600 (0.4576 $M), and that each additional flight increases revenue by about $16,100 (0.0161 $M).
To validate this model I checked the assumptions of Linearity, Independence, Normality, and Equal variance.
Linearity: the assumption is valid. The residual plot shows no parabolic pattern (smile or frown).
Independence: the Durbin-Watson statistic was 2.0753, which is above 1.3 and close to 2. The assumption of independence is met.
Normality: kurtosis was 0.3583366 and skewness was -0.2694433. Both are within the normal range between -1 and 1.
Equal variance: I checked the residuals plotted against the predicted values. There was no wedge-shaped pattern, indicating that the variability is constant.
The overall error of the model (standard error of the estimate) was 5.2183.

3. Answer the following inference questions. (4 pts) You can use the results in the COMPUTE tab.
a. Is there a significant linear relationship between Revenue and Flights? Use a level of significance of .05. Make sure to include hypotheses and a statement on how confident you are in H1.
H0: B1 = 0
H1: B1 ≠ 0
The null hypothesis states that there is no linear relationship between revenue and flights. Since the p-value (Significance F) of 0.000000000238511 is below the 0.05 level of significance, there is enough evidence to reject the null hypothesis. I am more than 99.99% confident that there is a linear relationship between revenue and flights.
b. State and interpret the 95% interval estimate for the population slope. Make sure you include the units (using revenue ($M) and number of flights).
The 95% interval for the intercept can be set aside because it does not accurately represent the data. For the Flights coefficient, the lower 95% limit is 0.0131 and the upper 95% limit is 0.0192. We are 95% confident that each additional flight increases revenue by between 0.0131 $M and 0.0192 $M (about $13,100 to $19,200). By inputting a 95% confidence level and 321,000 flights, we found that the revenue will be between 0.0178 $M per flight and 0.0325 $M per flight.
c. State and interpret the estimate for the population intercept. Make sure you include the units using revenue ($M). Discuss if the intercept is significant and relevant/valid as a predictor.
Using the equation from question 1, the model estimates revenue of about 0.4576 $M at 0 flights. However, this is not relevant as a predictor because the data do not include any period in which the number of flights equals 0.
d. Using your results in (1), (2) and (a) above, explain how good this model is for predicting revenue. If there are problems, explain what might be done to address them.
The results from (1) and (2) show that the residuals are approximately normally distributed and independent, so the L.I.N.E. assumptions are satisfied. Together with the very low p-value in (a), which lets us reject the null hypothesis, this indicates that the model is a reasonable tool for predicting revenue from flights.

4. Obtain and interpret a 95% interval estimate for Revenue in Q1 2021 if Flights = 321,000. (1 pts)
Use the CIE and PI sheet. The 95% prediction interval is given at the bottom of the CIE and PI tab under “For Individual Response Y”. Write a one sentence conclusion interpreting your interval. Make sure you include revenue and Flights and the correct units of the problem (Revenue is in millions of dollars). Include Excel results.
We are 95% confident that revenue in a quarter with 321,000 flights will fall between $2,751.79 million and $4,469.67 million.
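The inference steps in questions 3 and 4 (the test on the slope, the interval for the slope, and the prediction interval for an individual response) follow standard simple-regression formulas. The sketch below illustrates them under stated assumptions: the data are made-up values rather than the Southwest Airlines file, the critical value `T_CRIT` is the two-tailed 95% t value for 3 degrees of freedom (n - 2 with the five made-up points), and `x_new` is a hypothetical X value chosen for the example.

```python
import math

# Hypothetical illustration data, NOT the Southwest Airlines data set.
flights = [1.0, 2.0, 3.0, 4.0, 5.0]
revenue = [2.0, 4.0, 5.0, 4.0, 5.0]
T_CRIT = 3.182446   # two-tailed 95% t critical value for n - 2 = 3 df (assumed n = 5)

n = len(flights)
x_bar = sum(flights) / n
y_bar = sum(revenue) / n
sxx = sum((x - x_bar) ** 2 for x in flights)
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(flights, revenue)) / sxx
intercept = y_bar - slope * x_bar
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(flights, revenue))
std_error = math.sqrt(sse / (n - 2))          # standard error of the estimate

# Test of H0: B1 = 0. In simple regression the F statistic equals t squared.
slope_se = std_error / math.sqrt(sxx)
t_stat = slope / slope_se
f_stat = t_stat ** 2

# 95% confidence interval for the population slope.
slope_ci = (slope - T_CRIT * slope_se, slope + T_CRIT * slope_se)

# 95% prediction interval for an individual response at a hypothetical X value.
x_new = 4.0
y_hat = intercept + slope * x_new
pred_se = std_error * math.sqrt(1 + 1 / n + (x_new - x_bar) ** 2 / sxx)
pred_interval = (y_hat - T_CRIT * pred_se, y_hat + T_CRIT * pred_se)

print(f"t = {t_stat:.4f}, F = {f_stat:.4f}")
print(f"95% CI for slope: {slope_ci[0]:.4f} to {slope_ci[1]:.4f}")
print(f"95% PI at x = {x_new}: {pred_interval[0]:.4f} to {pred_interval[1]:.4f}")

# The slope interval reported in 3b does not contain 0, which agrees with
# rejecting H0 at the 0.05 level.
reported_slope_ci = (0.0131, 0.0192)
reported_excludes_zero = not (reported_slope_ci[0] <= 0 <= reported_slope_ci[1])
print(reported_excludes_zero)
```

The prediction interval is wider than a confidence interval for the mean response because of the extra 1 under the square root, which accounts for the variability of a single observation around the regression line.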