Homework 4

docx

School

University of Texas, Dallas *


Course

6337

Subject

Statistics

Date

Feb 20, 2024

Type

docx

Pages

12

Uploaded by BailiffNeutron16521

Homework 4

This homework has 3 multi-part questions. You are required to use SAS to answer the questions. Your submission on eLearning must include a PDF/Word report that follows the sample report instructions. You should also upload your SAS code.

The SAS dataset HeinzHunts has data on grocery store purchases of Hunts and Heinz ketchup. Each observation corresponds to one purchase occasion (of one of these brands) and consists of the following variables:

1. PriceHeinz: Price of Heinz
2. PriceHunts: Price of Hunts
3. DisplHeinz: = 1 if Heinz had a store display, = 0 if Heinz did not have a store display
4. DisplHunts: = 1 if Hunts had a store display, = 0 if Hunts did not have a store display
5. FeatureHeinz: = 1 if Heinz had a store feature, = 0 if Heinz did not have a store feature
6. FeatureHunts: = 1 if Hunts had a store feature, = 0 if Hunts did not have a store feature

Question 1

Apply the following steps and provide a screenshot of the output in your report.

1. Create a variable LogPriceRatio = log(PriceHeinz/PriceHunts).
2. Randomly select 80% of the data set as the training sample and the remaining 20% as the test sample.

TEST Data:

TRAINING Data:

3. Estimate a logit probability model for the probability that Heinz is purchased, using LogPriceRatio, DisplHeinz, FeatureHeinz, DisplHunts, FeatureHunts as the explanatory variables. Include interaction terms between display and feature for a particular brand (e.g., DisplHeinz * FeatureHeinz).
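The data preparation in steps 1 and 2 can be sketched outside SAS as a quick check of the logic. The Python below is only an illustration, not the SAS code the assignment requires: the price values, the seed, and the helper names `add_log_price_ratio` and `split_train_test` are all hypothetical.

```python
import math
import random


def add_log_price_ratio(rows):
    """Return copies of the rows with LogPriceRatio = ln(PriceHeinz / PriceHunts)."""
    return [
        dict(row, LogPriceRatio=math.log(row["PriceHeinz"] / row["PriceHunts"]))
        for row in rows
    ]


def split_train_test(rows, train_fraction=0.8, seed=2024):
    """Shuffle a copy of the rows and split it into (training, test) samples."""
    rng = random.Random(seed)      # fixed seed (arbitrary) so the split is reproducible
    shuffled = rows[:]             # leave the caller's list untouched
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# HYPOTHETICAL purchase occasions; these prices are made up for illustration.
purchases = [{"PriceHeinz": 1.00 + 0.05 * i, "PriceHunts": 1.20} for i in range(10)]

prepared = add_log_price_ratio(purchases)
train, test = split_train_test(prepared)
print(len(train), "training rows,", len(test), "test rows")
```

A negative LogPriceRatio means Heinz is cheaper than Hunts on that purchase occasion, and a value of 0 means the two prices are equal.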
4. Interpret the results. Which promotional methods (feature / display) are effective for Hunts? For Heinz? How would you interpret the results for the interaction effects?

Based on the results above, the coefficients on DisplHeinz and FeatHeinz are both positive and significant at the 10% level, so each raises the probability that Heinz is purchased. Therefore, both feature and display are effective for Heinz. Similarly, the coefficients on DisplHunts and FeatHunts are negative and significant at the 10% level: they lower the probability of a Heinz purchase, which is equivalent to raising the probability of a Hunts purchase. So feature and display are also effective for Hunts. The coefficient of the interaction term is negative, meaning that the effect of the combined promotional methods is less than the sum of the individual effects. However, the p-value of the interaction terms is higher than 0.1, which means the interaction effect is not significant.

5. Based on the estimated model, and using the logit probability formula, calculate the change in the predicted probability that Heinz is purchased if LogPriceRatio changes from 0.5 to 0.6 and Heinz does not use a feature or display, while Hunts uses a feature and a display. Recall that in the logit model:

$$\Pr(Y = 1) = \frac{e^{\beta X}}{1 + e^{\beta X}}$$

where Y is the outcome variable, X are the predictor variables, and β are the estimated model coefficients.
6. The estimated model is to be used for targeting customers for Hunts coupons to build loyalty for the brand. Coupons are to be sent to customers who are likely to buy Hunts, and not to customers who are likely to buy Heinz. Therefore, the coupons should be sent to customers whose predicted probability of buying Heinz is below a certain threshold level that needs to be determined based on the costs of misclassifications (incorrectly sending / not sending a coupon).

The following information about the costs of incorrect classification is available: The cost of incorrectly sending a coupon to a customer who would have bought Heinz is $1 per customer, and the cost of incorrectly failing to send a coupon to a customer who would have bought Hunts is $0.25 per customer. Based on these costs, what is the optimal threshold probability level that should be used with the estimated model to decide which consumers should receive coupons?

(HINT:

Step 1: Using the appropriate SAS command, create an ROC table for the test data from the estimated model. The ROC table provides the number of false positive and false negative classifications for each possible probability threshold.

Step 2: Using the cost information, calculate the total cost of misclassification for each probability threshold.

Total Cost = # of False Positives * False Positive Cost + # of False Negatives * False Negative Cost

Think carefully as to what is false positive and negative in this context.

Step 3: Choose the probability threshold that leads to the lowest total cost.)
In this context, a false positive is incorrectly sending a coupon to a customer who would have bought Heinz, and a false negative is incorrectly failing to send a coupon to a customer who would have bought Hunts. The lowest total cost is 15, and it is reached at a probability threshold of 0.7860743617.
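The three hint steps amount to a small cost minimisation, sketched below. The misclassification counts are HYPOTHETICAL and do not come from the report's ROC table; only the two unit costs ($1 and $0.25) are taken from the question. Which column of a software-generated ROC table corresponds to "coupon wrongly sent" depends on which outcome the model treats as the event, so the counts are named by what they mean rather than as false positives or false negatives.

```python
# Unit costs from the question.
COST_SENT_TO_HEINZ_BUYER = 1.00        # coupon wrongly sent to a Heinz buyer
COST_WITHHELD_FROM_HUNTS_BUYER = 0.25  # coupon wrongly withheld from a Hunts buyer

# HYPOTHETICAL rows: (threshold on P(Heinz), wrongly_sent, wrongly_withheld).
# A coupon goes to customers whose predicted P(Heinz) is below the threshold,
# so raising the threshold sends more coupons: wrongly_sent rises and
# wrongly_withheld falls.
roc_rows = [
    (0.50, 2, 60),
    (0.60, 5, 44),
    (0.70, 8, 24),
    (0.80, 16, 12),
    (0.90, 30, 4),
]


def total_cost(wrongly_sent, wrongly_withheld):
    """Total misclassification cost for one probability threshold."""
    return (
        wrongly_sent * COST_SENT_TO_HEINZ_BUYER
        + wrongly_withheld * COST_WITHHELD_FROM_HUNTS_BUYER
    )


# Step 2: cost for every threshold. Step 3: pick the cheapest one.
costs = [(threshold, total_cost(sent, withheld)) for threshold, sent, withheld in roc_rows]
best_threshold, best_cost = min(costs, key=lambda pair: pair[1])

for threshold, cost in costs:
    print(f"threshold={threshold:.2f}  total cost={cost:.2f}")
print(f"lowest cost {best_cost:.2f} at threshold {best_threshold:.2f}")
```

Because wrongly sending a coupon costs four times as much as wrongly withholding one, the cheapest threshold generally tolerates several withheld coupons in exchange for each avoided wrong mailing.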