School
Simon Fraser University
Course
445
Subject
Industrial Engineering
Date
Dec 6, 2023
Uploaded by huynhnhut28102000
W09 Tutorial Exercise
As you work through the tutorial in class, insert your responses to the following questions in this Word
document. Then submit one copy per team through Canvas.
Name__________________________________ Student Number___________________
Name__________________________________ Student Number___________________
-------------------------------------------------------------------------------------------------------------------------------
Note that the tutorial in the text does not have any questions. Answer the following questions instead.
Q1) Copy and paste your importance plot of WesForestAllv (the random forest model with all variables)
as well as the estimation result of WesStep (the stepwise regression starting from all variables). Compare
the top 15 important variables identified by the two algorithms. Specifically, are there any top important
variables in WesForestAllv but not in WesStep? What would be the reasons why they are not included in
the final stepwise logistic regression?
(4 marks)
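DEPT1, CNDN_PCT, ENG_PCT, HH_45PER, and OWN_PCT rank among the top important variables in
WesForestAllv but do not appear in WesStep. The most likely reason is multicollinearity: each is strongly
correlated with predictors that the stepwise procedure retains, so it adds little additional explanatory
power and is removed.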
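The comparison asked for in Q1 can be sketched as a simple list difference. This is a minimal sketch, not the tutorial's own code: the function name and both input lists are hypothetical placeholders (only the five dropped variable names come from the answer above), and the real lists would be read off the importance plot and the WesStep output.

```python
def only_in_first(first, second):
    """Return names that appear in `first` but not in `second`, keeping the order of `first`."""
    second_set = set(second)
    return [name for name in first if name not in second_set]

# Placeholder inputs for illustration only, NOT the actual tutorial output.
forest_top = ["DEPT1", "AVE_INC", "CNDN_PCT", "ENG_PCT", "DWEL_VAL", "HH_45PER", "OWN_PCT"]
step_kept = ["AVE_INC", "DWEL_VAL"]

only_in_forest = only_in_first(forest_top, step_kept)
print(only_in_forest)
```

Variables that land in this difference are the candidates to check for strong correlation with the predictors that stepwise selection kept.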
Q2) If you are to make a presentation to a manager in the UBC development office, who believes the two
metrics, “the years since an alum’s first degree” and “the years since the alum’s latest degree”, are
important predictors to a donor’s reaching the Wesbrook level, and you have to use a logistic regression
model, how would you explain the effects of these two metrics in the model?
(2 marks)
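The two variables YRFDGR and YRLDGR have a correlation of 0.98, which means they are very highly correlated.
This is likely because, for many donors, the first degree is also the latest degree; that is, they hold only
one degree. Because the two metrics carry almost the same information, a logistic regression cannot
identify both YRFDGR and YRLDGR as highly significant. Instead, the model will typically drop one of the
two variables and keep the other, which will then appear extremely significant. The retained coefficient
should therefore be presented to the manager as the shared effect of time since graduation, not as
evidence that the dropped metric is unimportant.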
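To make the effect of a 0.98 correlation concrete, the variance inflation factor (VIF) for two collinear predictors is 1 / (1 - r^2). The sketch below is an illustration only: the function name is hypothetical, and the formula is the standard linear-regression VIF, used here as an approximation of what happens to the standard errors in a logistic regression.

```python
import math

def vif_two_predictors(r: float) -> float:
    """VIF for a predictor whose only collinear partner has correlation r with it."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must be strictly between -1 and 1")
    return 1.0 / (1.0 - r * r)

r = 0.98  # correlation between YRFDGR and YRLDGR reported in the answer above
vif = vif_two_predictors(r)
se_inflation = math.sqrt(vif)  # standard errors grow by the square root of the VIF

print(f"VIF = {vif:.1f}; standard errors inflated about {se_inflation:.1f}x")
```

With both variables in the model, each coefficient's standard error is roughly five times larger than it would be alone, which is why neither looks significant until one is removed.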
Q3) Create a plot of means for the target variable Wesbrook against each of the following three predictor
variables:
DWEL_VAL
AVE_INC
SD_INC
Compare these plots of means to their corresponding partial dependence plots on page 23 & 24. Are
they similar? Which types of plots do you think would work better in a presentation to non-technical
managers? Why?
(4 marks)
Yes, they are similar: both types of plot show a similar relationship between these wealth proxies and the
probability of being a Wesbrook donor.
I would show the PDPs to the manager because, although the AVE_INC and SD_INC curves are concave and show
diminishing marginal returns, they reveal much more detail than the roughly linear plots of means.
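For reference, a plot of means reduces to grouping observations by the predictor and averaging the 0/1 target within each group. The sketch below assumes equal-count bins and uses made-up toy values, not the Wesbrook data; the function name and the variable names `ave_inc` and `wesbrook` are hypothetical.

```python
from statistics import mean

def binned_means(x, y, n_bins):
    """Sort by x, split into n_bins equal-count groups, and return
    (mean of x, mean of y) for each non-empty group."""
    pairs = sorted(zip(x, y))
    n = len(pairs)
    result = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if chunk:
            result.append((mean(p[0] for p in chunk), mean(p[1] for p in chunk)))
    return result

# Toy values for illustration only, NOT the tutorial data set.
ave_inc = [20, 30, 40, 50, 60, 70, 80, 90]
wesbrook = [0, 0, 0, 1, 0, 1, 1, 1]

points = binned_means(ave_inc, wesbrook, n_bins=4)
for x_mean, donor_rate in points:
    print(x_mean, donor_rate)
```

Each printed pair is one point on the plot of means: the average predictor value in the bin and the share of Wesbrook donors in that bin. Unlike a partial dependence plot, these averages do not hold the other predictors fixed.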