final-solution

pdf

School

University of California, Irvine

Course

120A

Subject

Statistics

Date

Jan 9, 2024

Type

pdf

Pages

3

BANA 201A Statistics for Data Science Summer 2023/So
Solution for Final Examination

Question 1

a) Regression equation:
   Revenue = 159.167 + 3.438*t + 70.313*Q1 - 6.458*Q2 + 33.438*Q3

b) Relative to quarterly revenue in Q4:
   Quarterly revenue in Q1 is 70.313 (million $) higher.
   Quarterly revenue in Q2 is 6.458 (million $) lower.
   Quarterly revenue in Q3 is 33.438 (million $) higher.
   In other words, Q1 has the highest quarterly revenue and Q2 has the lowest quarterly revenue in terms of seasonal impact.

c) After accounting for the seasonal effects, quarterly revenue goes up by 3.438 (million $) per quarter.

d) Period    Forecast
   2024Q1    274.2 (million $)
   2024Q2    200.8 (million $)
   2024Q3    244.2 (million $)
   2024Q4    214.2 (million $)

Question 2

a) Forecast for period 11 = 213.26 gallons (with MAE = 15.5, MAPE = 0.08, MSE = 395.5)

b) Forecast for period 11 = 205.34 gallons (with MAE = 19.0, MAPE = 0.10, MSE = 505.8)

c) Using α = 0.6 appears to give better historical performance than α = 0.3: it scores better on all three accuracy measures (MAE, MAPE, MSE). Of course, good historical performance does not guarantee a better future forecast. (Note that sales appear to be going up steadily. A higher alpha might track the sales increase better, while a small alpha might react too slowly to the changing trend.)

Question 3

a) Costs = -2728.2 + 0.047*sales + 11.947*orders

b) R-square = 0.876, i.e., sales and orders received explain about 87.6% of the variation in warehouse distribution costs in the historical data.

c) The slope b1 = 0.047 suggests an additional distribution cost of $0.047 for each additional dollar of sales when the number of orders received remains the same.
   The slope b2 = 11.947 suggests an additional distribution cost of $11.947 for each additional order received when total sales remain the same.

d) Predicted distribution costs = -2728.2 + 0.047*(400,000) + 11.947*(4,500) = $69,878

e) 95% C.I. for this average value of distribution cost = (66,420, 73,337) from SPSS.
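As a quick check on the Question 1 d) forecasts, the fitted equation from Question 1 a) can be evaluated directly. This is only a minimal sketch: the solution does not state the time index of each forecast period, so the code assumes 2024Q1 corresponds to t = 13 (a value inferred here because it reproduces the reported figures). The function and variable names are illustrative, not part of the original solution.

```python
# Coefficients copied from the Question 1 a) regression equation (million $).
INTERCEPT = 159.167
TREND = 3.438
# Q4 is the baseline quarter, so its seasonal dummy coefficient is 0.
SEASONAL = {"Q1": 70.313, "Q2": -6.458, "Q3": 33.438, "Q4": 0.0}


def forecast_revenue(t, quarter):
    """Evaluate the fitted trend-plus-seasonal equation for period t."""
    return INTERCEPT + TREND * t + SEASONAL[quarter]


# ASSUMPTION: 2024Q1 is period t = 13 (inferred, not stated in the solution).
FIRST_T_2024 = 13

forecasts = {
    f"2024{quarter}": round(forecast_revenue(FIRST_T_2024 + i, quarter), 1)
    for i, quarter in enumerate(["Q1", "Q2", "Q3", "Q4"])
}

for period, value in forecasts.items():
    print(period, value, "(million $)")
```

Under that assumption the four values round to the forecasts listed in Question 1 d).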
Question 4

a) Regression equation with GPA and Major as independent variables:
   Salary = 18804 + 10982*GPA - 6809*Finance - 7433*Management
   Regression equation for a Management major:
   Salary = 18804 + 10982*GPA - 6809*(0) - 7433*(1) = 11371 + 10982*GPA

b) Using SPSS, for a student majoring in Accounting with GPA = 3.18:
   Point estimate = $53,725
   95% C.I. for individual value = ($47,759, $59,691)

Question 5

a) Let p_i denote the proportion of toy buyers who are familiar with eToys and feel that the toys are overpriced (1 = Atlantic, 2 = Pacific, 3 = Gulf Coast, 4 = Central). Use the chi-square test to test the following hypotheses:
   Ho: p1 = p2 = p3 = p4
   Ha: Not all p_i are the same

b) Expected frequencies

   Region        Toys Priced Competitively    Toys Overpriced
   Atlantic      500(.2)(.24) = 24            500(.8)(.24) = 96
   Pacific       500(.2)(.22) = 22            500(.8)(.22) = 88
   Gulf Coast    500(.2)(.26) = 26            500(.8)(.26) = 104
   Central       500(.2)(.28) = 28            500(.8)(.28) = 112

   χ² = (30 − 24)²/24 + (30 − 22)²/22 + ⋯ + (120 − 112)²/112 = 10.1
   p-value = Prob(χ² distribution with 3 d.f. > 10.1) = somewhere between 0.01 and 0.025
   (Note: EXCEL = 1-CHISQ.DIST(10.1, 3, TRUE) = 0.018)

c) Therefore, we can reject Ho at α = 0.05 and conclude, at the 0.05 significance level, that the proportion of toy buyers who feel that the toys are overpriced differs across regions.
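The Question 5 b) arithmetic can be sketched in a few lines: each expected frequency is the total sample size times a column share times a regional share, and the p-value is the upper-tail probability of a chi-square distribution with 3 degrees of freedom. The snippet below is illustrative only. It uses the shares that appear in the expected-frequency formulas, takes the test statistic 10.1 as given (the full observed table is not reproduced in the solution), and relies on a closed-form tail formula that holds only for 3 degrees of freedom so that no external library is needed. All names are hypothetical.

```python
import math

N = 500  # total sample size used in the expected-frequency formulas
COLUMN_SHARES = {"Priced Competitively": 0.2, "Overpriced": 0.8}
REGION_SHARES = {"Atlantic": 0.24, "Pacific": 0.22, "Gulf Coast": 0.26, "Central": 0.28}


def expected_frequencies():
    """Expected count per cell = N * column share * region share."""
    return {
        region: {col: N * c_share * r_share for col, c_share in COLUMN_SHARES.items()}
        for region, r_share in REGION_SHARES.items()
    }


def chi2_upper_tail_3df(x):
    """P(X > x) for a chi-square variable with exactly 3 d.f.

    Closed form for 3 d.f. only:
    erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2).
    """
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


expected = expected_frequencies()
for region, cells in expected.items():
    print(region, {col: round(value, 1) for col, value in cells.items()})

CHI2_STAT = 10.1  # test statistic reported in the solution
p_value = chi2_upper_tail_3df(CHI2_STAT)
print("p-value:", round(p_value, 3))
print("reject Ho at alpha = 0.05:", p_value < 0.05)
```

For other degrees of freedom a general routine (for example a statistics library's chi-square survival function) would be needed instead of this special-case formula.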
Question 6

a) We can use the ANOVA test to evaluate the above assertion. ANOVA requires the following three assumptions for this problem (you need to be specific about exactly what “population” means here):
   1. The 3 samples need to be collected independently.
   2. The learning time of each system must follow a normal distribution.
   3. The normal distributions of the learning time of each system must have the same variance.

b) Let μj denote the average learning time of system j. Set
   Ho: μ1 = μ2 = μ3
   Ha: Not all the μj's are the same

c) From EXCEL/SPSS ANOVA output:
   F-stat = 2.964
   p-value = 0.069

d) Since p-value > 0.05, we cannot reject Ho. Therefore, the result does not provide sufficient evidence of a difference in the mean training time among the 3 systems at the 0.05 level of significance.