W09 Tutorial Exercise (1)

School

Simon Fraser University

Course

445

Subject

Industrial Engineering

Date

Dec 6, 2023

Pages

4

Uploaded by huynhnhut28102000

W09 Tutorial Exercise

As you work through the tutorial in class, insert your responses to the following questions in this Word document. Please then submit one copy per team through Canvas.

Name__________________________________ Student Number___________________
Name__________________________________ Student Number___________________
-------------------------------------------------------------------------------------------------------------------------------
Note that the tutorial in the text does not have any questions. Answer the following questions instead.

Q1) Copy and paste your importance plot of WesForestAllv (the random forest model with all variables) as well as the estimation result of WesStep (the stepwise regression starting from all variables). Compare the top 15 important variables identified by the two algorithms. Specifically, are there any top important variables in WesForestAllv but not in WesStep? What would be the reasons why they are not included in the final stepwise logistic regression? (4 marks)
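DEPT1, CNDN_PCT, ENG_PCT, HH_45PER, and OWN_PCT are among the top important variables in WesForestAllv but are not in WesStep. Stepwise regression removes them because of multicollinearity: each is strongly correlated with variables that remain in the model, so it adds little explanatory power of its own.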
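
The multicollinearity argument can be checked numerically. The following is a minimal sketch, not part of the tutorial: it uses synthetic data and hypothetical variable names (`pct_a`, `pct_b`, `unrelated`), and it assumes the two-predictor case, where the variance inflation factor reduces to 1 / (1 - r^2). A common rule of thumb treats a VIF above roughly 5 to 10 as a sign that one of the predictors is redundant.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def vif_two_predictors(xs, ys):
    """VIF when only two predictors are present: 1 / (1 - r^2)."""
    r = pearson(xs, ys)
    return 1.0 / (1.0 - r * r)


random.seed(0)

# Synthetic stand-ins (hypothetical, not the Wesbrook data):
# pct_b is pct_a plus a little noise, so the two are nearly duplicates.
pct_a = [random.gauss(50, 10) for _ in range(500)]
pct_b = [a + random.gauss(0, 2) for a in pct_a]
unrelated = [random.gauss(0, 1) for _ in range(500)]

r_ab = pearson(pct_a, pct_b)
vif_ab = vif_two_predictors(pct_a, pct_b)
vif_au = vif_two_predictors(pct_a, unrelated)

print(f"correlation(pct_a, pct_b) = {r_ab:.3f}")
print(f"VIF for the correlated pair = {vif_ab:.1f}")
print(f"VIF for the unrelated pair  = {vif_au:.2f}")
```

A high VIF for a pair means either variable can stand in for the other, which is why a stepwise procedure tends to keep only one of them. A random forest has no such penalty, so both can still rank highly in its importance plot.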
Q2) If you are to make a presentation to a manager in the UBC development office, who believes the two metrics, “the years since an alum’s first degree” and “the years since the alum’s latest degree”, are important predictors to a donor’s reaching the Wesbrook level, and you have to use a logistic regression model, how would you explain the effects of these two metrics in the model? (2 marks)

The two variables YRFDGR and YRLDGR have a correlation of 0.98, which means they are highly related. This is likely because, for many donors, the first degree is also the latest degree; that is, they hold only one degree. Therefore, logistic regression would not identify both YRFDGR and YRLDGR as highly significant. Instead, it will drop one of the two variables and keep the other, which will then be extremely significant. Because the two metrics carry nearly the same information, the retained variable effectively represents the effect of both.

Q3) Create a plot of means for the target variable Wesbrook against each of the following three predictor variables:
DWEL_VAL
AVE_INC
SD_INC
Compare these plots of means to their corresponding partial dependence plots on pages 23 and 24. Are they similar? Which types of plots do you think would work better in a presentation to non-technical managers? Why? (4 marks)
Yes, they are similar: both types of plot show a similar relationship between these wealth proxies and the probability of being a Wesbrook donor. I would show the PDPs to the manager because, although the AVE_INC and SD_INC curves are concave and show diminishing marginal returns, they reveal much more detail than the roughly linear plots of means.
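
To make the comparison concrete, the numbers behind a plot of means can be sketched as follows. This is an illustrative sketch only: the data are synthetic, the names `income` and `donor` are hypothetical stand-ins for a wealth proxy and the Wesbrook target, and it assumes the numeric predictor is first cut into equal-sized groups, which is one way (not necessarily the tutorial's) to obtain a plot of means for a continuous variable.

```python
import math
import random


def binned_means(x, y, n_groups=5):
    """Sort by x, split into n_groups equal-sized groups, and return
    a list of (mean of x, mean of y) pairs, one per group."""
    pairs = sorted(zip(x, y))
    size = len(pairs) // n_groups
    result = []
    for g in range(n_groups):
        start = g * size
        # The last group absorbs any leftover observations.
        end = (g + 1) * size if g < n_groups - 1 else len(pairs)
        chunk = pairs[start:end]
        mean_x = sum(p[0] for p in chunk) / len(chunk)
        mean_y = sum(p[1] for p in chunk) / len(chunk)
        result.append((mean_x, mean_y))
    return result


random.seed(1)

# Synthetic data (hypothetical): donation probability rises with income
# along a logistic curve, so the true relationship is nonlinear.
income = [random.uniform(0, 100) for _ in range(2000)]
donor = [
    1 if random.random() < 1 / (1 + math.exp(-(v - 50) / 10)) else 0
    for v in income
]

means = binned_means(income, donor, n_groups=5)
for mean_income, donor_rate in means:
    print(f"mean income {mean_income:6.1f} -> donor rate {donor_rate:.2f}")
```

With only a handful of groups, the plotted points are joined by straight segments, which is why a plot of means can look more linear than the underlying relationship. A partial dependence plot is evaluated at many more points and so shows the curvature.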