E02 V1 Solutions-2

School

Babson College *

Course

AQM2000

Subject

Statistics

Date

Feb 20, 2024

Uploaded by MegaComputer13204

E02 V1 Solutions

1. A study was conducted to determine whether there is a difference in the mean salaries of male and female employees in a company. A random sample of male and female employees was selected, and their salaries were recorded in “Salary.mwx”. Assume that the population variances are equal.

a. Can you use a critical value approach to determine whether there is evidence of a difference in the mean salaries of male and female employees at a 25% level of significance?

b. If we would like to avoid Type I error, should we make the significance level bigger or smaller? Why?

Part a.
H0: μM = μF
H1: μM ≠ μF
One group is from the male population and the other is from the female population, so this is an independent two-sample t test, with each sample in its own column.
Since the test statistic 2.71 > 1.173 (the critical value), we reject H0 at the 25% level.

Part b.
If we want to minimize Type I error, we should reject H0 less often. A smaller significance level gives a smaller rejection region, so we should make the significance level smaller.

2. A college is considering whether a new teaching method will improve the exam scores of its students. To evaluate the effectiveness of the new method, the college randomly selects 20 students and measures their exam scores before and after the new method is introduced, to find out whether the after scores are higher. Data is in "Teaching.mwx".
a. Please set up H0 and H1.

b. Please use the p-value approach to decide whether to reject H0 at the 5% level.

Part a.
muD = muAfter - muBefore
H0: muD <= 0
H1: muD > 0

Part b.
Since p-value = 0.01 < 0.05, we reject H0 at the 5% level.
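The p-value decision in question 2 can be sketched in code. This is a minimal illustration, not the Minitab procedure the solution relies on: the scores below are hypothetical (they are not the contents of "Teaching.mwx"), and the helper names `paired_t_statistic`, `t_pdf`, and `upper_tail_p_value` are made up for this example. The upper-tail p-value is obtained by numerically integrating the Student's t density with Simpson's rule.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(before, after):
    """Return (t, df) for the differences d = after - before, for H0: muD <= 0."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def upper_tail_p_value(t, df, steps=2000):
    """P(T > t), using Simpson's rule on [0, |t|] and the symmetry of the density."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 < T < |t|)
    return 0.5 - area if t >= 0 else 0.5 + area


# Hypothetical scores, NOT the data in Teaching.mwx.
before = [70, 65, 80, 72, 68, 75, 77, 60]
after = [74, 66, 83, 71, 73, 78, 80, 64]

t_stat, df = paired_t_statistic(before, after)
p_value = upper_tail_p_value(t_stat, df)
reject_h0 = p_value < 0.05  # reject H0 when the p-value is below alpha
print(f"t = {t_stat:.3f}, df = {df}, p = {p_value:.4f}, reject H0: {reject_h0}")
```

The test is one-sided because H1 is muD > 0, so only the upper tail counts toward the p-value.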
3. Suppose you are working with a dataset of monthly sales figures for a particular product over the past 24 months. The dataset is called "SalesData.mwx" and contains two columns: "Month" and "SalesAmount".

a. Please use an appropriate test to determine how many lags to use if we plan to run an autoregressive analysis.

b. Please use the lag order determined in part (a) to fit an autoregressive model on the SalesData and forecast the sales amount for month = 25.

Part a. Please use an appropriate test to determine how many lags to use if we plan to run an autoregressive analysis.

Part b.
SalesAmount(Month25) = 7359 + 0.481*SalesAmount(Month24) = 7359 + 0.481*15000 = 14574

4. A shipping company charges different rates based on the weight of a package. The pricing follows these rules:
For packages weighing up to 1 pound: $5
For packages weighing between 1 and 5 pounds: $1.5 per pound above 1
For packages weighing between 5 and 10 pounds: $2 per pound above 5
For packages weighing more than 10 pounds: $2.5 per pound above 10

a. Write out the piecewise function.

b. Please graph this piecewise function using the “Cost” tab in “E02 Excel Start” file.

c. If the cost of shipping is $32, what is the weight of the package?

Part a.
x = weight of a package
If 0 <= x <= 1, P(x) = 5
If 1 < x <= 5, P(x) = 5 + 1.5*(x-1)
If 5 < x <= 10, P(x) = 5 + 1.5*(5-1) + 2*(x-5)
If 10 < x, P(x) = 5 + 1.5*(5-1) + 2*(10-5) + 2.5*(x-10)

Part b. Please graph this piecewise function using the “Cost” tab in “E02 Excel Start” file.

Part c.
Since $32 > $21 (the cost at x = 10), the package weighs more than 10 pounds:
32 = 5 + 1.5*(5-1) + 2*(10-5) + 2.5*(x-10)
x = 14.4
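The piecewise function in question 4 and the back-solve for a $32 charge can be checked with a short sketch. The function names `shipping_cost` and `weight_for_cost` are illustrative and do not come from the assignment. The constants 11 and 21 are the costs at the breakpoints x = 5 and x = 10, that is, 5 + 1.5*(5-1) and 5 + 1.5*(5-1) + 2*(10-5).

```python
def shipping_cost(x):
    """Cost P(x) in dollars for a package weighing x pounds (x >= 0)."""
    if x < 0:
        raise ValueError("weight must be non-negative")
    if x <= 1:
        return 5.0
    if x <= 5:
        return 5 + 1.5 * (x - 1)
    if x <= 10:
        return 11 + 2 * (x - 5)
    return 21 + 2.5 * (x - 10)


def weight_for_cost(cost):
    """Invert P(x) on its increasing pieces; the flat $5 piece has no unique inverse."""
    if cost <= 5:
        raise ValueError("a $5 charge matches every weight in [0, 1]")
    if cost <= 11:
        return 1 + (cost - 5) / 1.5
    if cost <= 21:
        return 5 + (cost - 11) / 2
    return 10 + (cost - 21) / 2.5


print(shipping_cost(10))    # cost at the last breakpoint
print(weight_for_cost(32))  # weight that produces a $32 charge
```

Comparing the target cost against the breakpoint costs (5, 11, 21) picks the piece to invert, which is the same step as noting that $32 > $21 in Part c.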
5. A company wants to investigate the relationship between advertising spending and sales revenue. After collecting a random sample of 15 months, they obtained the regression equation:
Sales Revenue = 50000 + 4.5*Advertising Spending

a. The results of the regression gave SSR = 262500 and SST = 337500. Calculate the values of R^2 and Syx.

b. If we add the size of the sales team into our model and R^2 goes to 80%, does that mean the size of the sales team makes the model perform better? Why?

Part a.
SSE = SST - SSR = 337500 - 262500 = 75000
R-Sq = SSR/SST = 262500/337500 = 77.78%
Syx = sqrt(SSE/DFE) = sqrt(75000/(15-1-1)) = 75.96

Part b.
No. R^2 never decreases when a new predictor is added, so the increase alone does not show that the model performs better. We should look at adjusted R^2.

6. A movie critic rates movies based on their plot, acting, special effects, and overall impression. 48 movies from different genres were rated in the data “Movies”. Each row represents a different movie, with the ratings for plot, acting, special effects, and overall impression. The last column indicates the genre of the movie, which can be Action, Drama, Romantic Comedy, Science Fiction, or Comedy.

a. At the 0.15 level of significance, is there enough evidence to claim a significant linear relationship between overall impression (Y) and acting (X)?

b. If we add “Genre” into the model in part a (overall impression (Y) and acting (X)) and use “Comedy” as the reference level, which genre has the lowest “overall impression” assuming the same acting rating?

Part a.
H0: β1 = 0 versus H1: β1 ≠ 0
Since P-value = 0.000 < 0.15 = α, we reject the null and conclude that there is sufficient evidence of a linear relationship between overall impression and acting score.

Part b.
With the acting score held the same, Science Fiction has the lowest “overall impression” since it has the lowest intercept (-0.44).

7. Based on “Movies.mwx”, please test a multiple regression model to predict overall impression against Plot, Acting, Special Effects.

a. Is the overall model significant at 1% level? Please clearly state the hypotheses, test, and conclusion.
b. Generate the 99% prediction interval estimate for the rating when plot = 20, acting = 22, and special effects = 21. Round to two decimal places. Please interpret the prediction interval in context.

Part a.
H0: betaP = betaA = betaSE = 0
H1: At least one beta != 0
Using the overall F test, since p-value = 0.000 < 0.01, we reject H0 at the 1% level. There is evidence that at least one independent variable affects Y at the 1% level.

Part b.
The output for the prediction and confidence intervals gives the following.
99% PI: We are 99% confident that the actual overall impression rating of a single movie with plot = 20, acting = 22, and special effects = 21 is between 16.34 and 23.75.
8. Based on “Movies.mwx”, please test a multiple regression model using overall impression as the response against Plot, Acting, and Special Effects.

a. Is there any collinearity among Plot, Acting, and Special Effects? Which one should we remove?

b. Run the best-subsets model building process using overall impression as the dependent variable and the results from part a as potential independent variables. Please list the top three models for each size, if any.

c. Please find the best model for size 3 (Vars = 2) for describing the data based on the outputs of part b. Explain why.

Part a.
No collinearity. No predictors will be removed from the model since all VIFs are < 5.

Part b and c.
The highlighted model is the best descriptive model of Vars = 2 (Size = 3) since it has the highest adjusted R-Sq and lowest S.

9. The cost function for producing x units of a certain product is given by:
C(x) = 2x^2 + 4x + 2000, where 25 <= x <= 65

a. Please find the average cost function C_bar(x).

b. Please graph C_bar(x) based on the given “AvgCost” tab in “E02 V1 Start.xlsx”.

c. Please use Excel solver to find the x value that can produce the minimum average cost.

Part a.
C_bar(x) = (2x^2 + 4x + 2000)/x

Part b.

Part c.
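As a cross-check on the Solver step in question 9, the minimum of C_bar(x) on 25 <= x <= 65 can be located numerically. This is a sketch using golden-section search, which is a swapped-in technique and not necessarily the method Excel Solver applies; the function names are illustrative. It treats x as continuous, which is an assumption. Calculus gives the same point: C_bar(x) = 2x + 4 + 2000/x has derivative 2 - 2000/x^2, which is zero at x = sqrt(1000), about 31.62, inside the allowed range.

```python
import math


def average_cost(x):
    """C_bar(x) = C(x) / x for C(x) = 2x^2 + 4x + 2000."""
    return (2 * x**2 + 4 * x + 2000) / x


def golden_section_minimize(f, lo, hi, tol=1e-8):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            # The minimum lies in [a, d]; reuse c as the new upper probe.
            b, d = d, c
            c = b - inv_phi * (b - a)
        else:
            # The minimum lies in [c, b]; reuse d as the new lower probe.
            a, c = c, d
            d = a + inv_phi * (b - a)
    return (a + b) / 2


x_star = golden_section_minimize(average_cost, 25, 65)
print(f"x = {x_star:.2f}, minimum average cost = {average_cost(x_star):.2f}")
```

If production must be a whole number of units, comparing `average_cost(31)` and `average_cost(32)` would settle which integer is cheaper.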