Chapter 3 Assignment

School

University of Louisville

Course

615

Subject

Statistics

Date

Apr 3, 2024

Uploaded by cecilal

3.1 Linear Regression Models

Practice

1. Building a Linear Model to Predict the Amount of Donations

   In this practice, you use the PVA data set to build and refine a linear regression model. This model attempts to predict the amount of donation given to the PVA. These practices provide practice for using SAS Visual Statistics to do exploratory modeling.

   a. Practice setup: Create the initial linear regression model.
      1) Launch Visual Analytics or start a new report.
      2) Select the PVA data source and open it.
      3) Start a linear regression, disable auto-refresh, and assign the variable roles as follows:
         • Response: Target Gift Amount (one total)
         • Continuous effects: All variables except two that begin with Target Gift (21 total)
         • Classification effects: Gender, Home Owner (two total)
      4) Enable auto-refresh to run the model.
   b. How many observations were used to build the linear regression model?
   c. What is the R-square and the adjusted R-square of the model?
   d. Modify variable measures. In order to change the classification of a variable, it cannot be assigned to an object. Remove Status Category Star All Months from the model before you modify its classification.
      1) In the Roles pane, remove Status Category Star All Months from the continuous effects.
      2) In the Data pane, change Status Category Star All Months from a measure to a category.
      3) Assign both Status Category 96NK and Status Category Star All Months as classification effects.
      4) What is the adjusted R-square of this new model?
      5) According to the Fit Summary pane, what variables would not contribute to this model if you use a 5% significance criterion?
   e. In the Options pane, select the Informative missingness check box.
      1) How many observations were used to build the model?
      2) What is the adjusted R-square of the model now?
      3) Why are there still unused observations when Informative missingness was selected?

Copyright © 2020, SAS Institute Inc., Cary, North Carolina, USA. ALL RIGHTS RESERVED.
Lesson 3 Models with Continuous Targets

      4) How many dummy variables does the Informative missingness option add to the model? Hint: These variables have a suffix of _miss.
   f. In the Options pane, set the variable selection method to Backward. Change the selection criterion to Significance level. Accept the default percentage value of .01. Does variable selection exclude any input variables?
   g. Domain experts hypothesize an interaction between gift frequency and gift amount. Explore this hypothesis using the Gift Count All Months and Gift Amount Last variables. (Hint: Restore the linear regression to access the Data pane and build the interaction term.)
      1) Is the interaction significant?
      2) What is the adjusted R-square of the model with the interaction term?
   h. Continue to improve the model by cleaning the data. Create a filter on the Gender variable that removes the U or Undefined category from the model.
      1) How many observations were used to build the model?
      2) What is the adjusted R-square of the model with the “clean” data?
   i. Save the report as Practice 3.

End of Practices
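Two ideas from Practice 1 can be sketched outside the tool: the adjusted R-square, and the `_miss` indicator variables that Informative missingness adds. The Python below is only an illustration, not SAS code. It assumes that a missing continuous value is replaced by the observed mean and flagged with a 0/1 indicator, which is one common way to implement the idea. The function names, column names, and sample rows are hypothetical.

```python
def add_missing_indicators(rows, columns):
    """Return copies of rows with mean-imputed inputs and <col>_miss flags.

    Assumption: missing values are stored as None and imputed with the
    mean of the observed values. Columns without missing values get no flag.
    """
    out = [dict(r) for r in rows]
    for col in columns:
        observed = [r[col] for r in rows if r[col] is not None]
        if not observed or len(observed) == len(rows):
            continue  # nothing to impute from, or nothing missing
        mean = sum(observed) / len(observed)
        for r in out:
            missing = r[col] is None
            r[col + "_miss"] = 1 if missing else 0
            if missing:
                r[col] = mean
    return out


def adjusted_r_square(r_square, n_obs, n_params):
    """Adjusted R-square; n_params excludes the intercept."""
    return 1 - (1 - r_square) * (n_obs - 1) / (n_obs - n_params - 1)


# Hypothetical rows, not taken from the PVA data set.
rows = [
    {"age": 40, "income": None},
    {"age": None, "income": 50},
    {"age": 60, "income": 70},
]
filled = add_missing_indicators(rows, ["age", "income"])
print(filled[0])
print(adjusted_r_square(0.5, n_obs=101, n_params=10))
```

In this sketch only the inputs are imputed. A row whose response value is missing would still have to be dropped, because there is nothing to fit against.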
Practice

2. Building a GLM Model to Predict the Amount of Donations

   In this practice, you use the PVA data set to build a GLM model. The target variable is transformed to be more consistent with the restrictions that regulate the analysis.

   a. Open Practice 3. (It was saved in Practice 1.) Examine the results of the linear model.
      1) Does there seem to be a problem with model bias?
      2) Do the residuals seem to satisfy the assumption of constant variance?
   b. Build an appropriate GLM model for the continuous target under the assumption that it follows a log-normal distribution. Duplicate the linear regression model on a new page as a GLM. Select the Informative missingness check box. Use a Backward elimination method with the Significance level selection criterion and default significance level of .01. Set the link function to Log.
   c. Does the GLM model solve the problems that you found in the linear model, if any?
   d. What main effects were removed from this model during the backward variable selection process? (Do not include the _miss variables.)
   e. Save the changes as Practice 4.

3. Building a GAM Model to Predict the Amount of Donations

   In this practice, you use the PVA data set to build a GAM model. The target variable is transformed to be more consistent with the restrictions that regulate the analysis. To try to improve on the previously built GLM model, a single spline effect is added.

   a. Open Practice 4, which was saved in the previous practice. Examine the model fit statistics. What is the AIC of the generalized linear model?
   b. Build an appropriate GAM model for the continuous target under the assumption that it follows a log-normal distribution. Duplicate the generalized linear model on a new page as a GAM. Set the link function to Log.
   c. Create a one-dimensional spline effect. Use the Gift Amount Average All Months variable.
   d. Add the new spline effect to the GAM model.
      1) Is the spline effect significant to the model at the 5% level?
      2) What is the AIC of the generalized additive model?
      3) How does the assessment plot of the GAM compare to the plot of the GLM?
      4) How many knots were used in the creation of the spline?
   e. Save the changes as Practice 5.
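Practices 2 and 3 compare the GLM and the GAM by AIC. As a rough guide to reading that statistic, the sketch below uses the textbook definition, AIC = 2k - 2 ln(L), where k is the number of estimated parameters and L is the maximized likelihood. The software may report AIC with different constants, so only the ordering between models is meant to carry over. The log-likelihoods and parameter counts are made-up placeholders, not results from the PVA data.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical (log-likelihood, parameter count) pairs for illustration only.
models = {
    "glm": (-1250.0, 12),
    "gam": (-1235.0, 16),
}

scores = {name: aic(ll, k) for name, (ll, k) in models.items()}
best = min(scores, key=scores.get)

for name, score in scores.items():
    print(name, score)
print("lower AIC:", best)
```

With these placeholder numbers, the extra parameters are outweighed by the gain in log-likelihood, so the second model has the lower AIC. With a smaller gain the penalty term would win instead.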
3.3 Model Validation

4. Performing Model Validation on a Linear Regression Model

   In this practice, you use the PVA data set to perform model validation on a linear regression model. This model attempts to predict the amount of the donations given to the PVA, as you did before.

   a. Open Practice 3 (saved previously). Examine the results of the linear model. Duplicate the linear regression on a new page.
   b. Which variables were removed from the model according to the Selection Summary details table?
   c. Create a new partition that contains a 50% training sample.
   d. Modify the model to take advantage of the partitioned data.
      1) Which variables are not significant to the model?
      2) What is the Training ASE and the Validation ASE?
   e. You do not need to save this report.

End of Practices
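The partition step and the two ASE values in Practice 4 can be illustrated with a small Python sketch. It assumes a random 50/50 split and a one-predictor least-squares fit, and it takes ASE to be the sum of squared errors divided by the number of observations in the partition. The data are synthetic and the function names are hypothetical. None of this reproduces how the software partitions or fits.

```python
import random


def split_partition(rows, train_fraction=0.5, seed=42):
    """Shuffle a copy of rows and split it into training and validation."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_simple_ols(pairs):
    """Least-squares intercept and slope for (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def average_squared_error(pairs, intercept, slope):
    """ASE: mean of squared differences between actual and predicted y."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in pairs) / len(pairs)


# Synthetic data: y = 2 + 0.5 x plus noise. Not the PVA data set.
noise = random.Random(0)
data = [(float(x), 2.0 + 0.5 * x + noise.gauss(0, 1)) for x in range(100)]

train, valid = split_partition(data)
intercept, slope = fit_simple_ols(train)  # fit on the training half only
train_ase = average_squared_error(train, intercept, slope)
valid_ase = average_squared_error(valid, intercept, slope)
print("Training ASE:", train_ase)
print("Validation ASE:", valid_ase)
```

The model is fit on the training rows only, so the validation ASE estimates how the model performs on data it has not seen. A validation ASE much larger than the training ASE is a common sign of overfitting.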