Assignment_No1
School
University of Alberta
Course
342
Subject
Industrial Engineering
Date
Feb 20, 2024
Pages
2
ECE 447: Data Analysis and Machine Learning for Engineers
Assignment A1
Due Thursday, February 8th, 2024, 11:55 PM
Two possible submission forms: document (answers “on paper”) or Jupyter notebook file
(The assignment is worth 100 pts, which is 5% of the final mark)
A1-1.
How would you define the terms restriction bias and preference bias? What is the difference between them? 10 pts
A1-2.
The table below shows socioeconomic data for a selection of countries for the year 2009, using the following features:
• COUNTRY: The name of the country
• LIFE EXPECTANCY: The average life expectancy (in years)
• INFANT MORTALITY: The infant mortality rate (per 1,000 live births)
• EDUCATION: Spending per primary student as a percentage of GDP
• HEALTH: Health spending as a percentage of GDP
• HEALTH USD: Health spending per person converted into US dollars
Calculate the correlation between LIFE EXPECTANCY and all other features. Discuss the relationships and comment on the obtained results. 20 pts
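For reference, the correlation asked for here is usually taken to be the Pearson correlation coefficient. The sketch below computes it in plain Python. The function name `pearson` and the four-country toy values are hypothetical and are not taken from the assignment's table; they only illustrate the calculation.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance numerator)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (variance numerators)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy values, NOT the data from the assignment's table
life_expectancy = [70.0, 75.0, 80.0, 82.0]
infant_mortality = [30.0, 15.0, 5.0, 3.0]

r = pearson(life_expectancy, infant_mortality)
print(f"r(LIFE EXPECTANCY, INFANT MORTALITY) = {r:.3f}")
```

A value near -1 or +1 indicates a strong linear relationship, and a value near 0 indicates a weak one. In a Jupyter notebook the same quantity can be obtained for all feature pairs at once with `pandas.DataFrame.corr()`, assuming the table has been loaded into a DataFrame.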
A1-3.
A marketing company working for a charity has developed two different models that predict the likelihood that donors will respond to a mailshot asking them to make a special extra donation. The prediction scores generated for a test set for these two models are shown in the table below.
a) Using a classification threshold of 0.6, and assuming that true is the positive target level (value), construct a confusion matrix for each of the models. Use this threshold for all questions below. 15 pts
b) Calculate the simple accuracy and average class accuracy (using an arithmetic mean) for each model. 10 pts
c) Based on the average class accuracy measures, which model appears to perform best at this task? 10 pts
d) Generate a cumulative gain chart for each model. 10 pts
e) The charity for which the model is being built typically has only enough money to send a mailshot to the top 20% of its contact list. Based on the cumulative gain chart generated in the previous part, would you recommend Model 1 or Model 2 for the charity? 10 pts
f) Generate ROC curves for both models (use a few threshold values). 15 pts
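The quantities named in parts a) to f) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: a score greater than or equal to the threshold is treated as a positive prediction, average class accuracy is taken as the arithmetic mean of the per-class recalls, and the ten-instance test set is hypothetical and is not the assignment's table. All function names are illustrative.

```python
def confusion_matrix(targets, scores, threshold=0.6):
    """Return (TP, FN, FP, TN) with True as the positive level.

    Assumption: score >= threshold is predicted positive.
    """
    tp = fn = fp = tn = 0
    for target, score in zip(targets, scores):
        predicted = score >= threshold
        if target and predicted:
            tp += 1
        elif target:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn


def simple_accuracy(tp, fn, fp, tn):
    """Fraction of all instances classified correctly."""
    return (tp + tn) / (tp + fn + fp + tn)


def average_class_accuracy(tp, fn, fp, tn):
    """Arithmetic mean of the recall of the positive and negative classes."""
    return (tp / (tp + fn) + tn / (tn + fp)) / 2


def cumulative_gain(targets, scores):
    """Share of all positives found in the top k instances, for k = 1..n."""
    ranked = sorted(zip(scores, targets), key=lambda pair: pair[0], reverse=True)
    total_positives = sum(1 for _, target in ranked if target)
    gains, found = [], 0
    for _, target in ranked:
        found += 1 if target else 0
        gains.append(found / total_positives)
    return gains


def roc_points(targets, scores, thresholds):
    """Return one (false positive rate, true positive rate) pair per threshold."""
    points = []
    for threshold in thresholds:
        tp, fn, fp, tn = confusion_matrix(targets, scores, threshold)
        points.append((fp / (fp + tn), tp / (tp + fn)))
    return points


# Hypothetical toy test set, NOT the data from the assignment's table
targets = [True, True, False, True, False, False, True, False, False, False]
scores = [0.9, 0.8, 0.7, 0.65, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]

cm = confusion_matrix(targets, scores, threshold=0.6)
accuracy = simple_accuracy(*cm)
class_accuracy = average_class_accuracy(*cm)
gains = cumulative_gain(targets, scores)
top_20_percent = len(targets) // 5          # number of instances in the top 20%
gain_at_20 = gains[top_20_percent - 1]      # share of positives captured there
roc = roc_points(targets, scores, thresholds=[0.25, 0.6, 0.85])

print("TP, FN, FP, TN:", cm)
print("simple accuracy:", accuracy)
print("average class accuracy:", class_accuracy)
print("gain at top 20%:", gain_at_20)
print("ROC points (FPR, TPR):", roc)
```

Plotting `gains` against the fraction of instances contacted gives the cumulative gain chart, and plotting the `roc` pairs (together with the corner points (0, 0) and (1, 1)) gives an approximate ROC curve. Running the same functions once per model allows the two models to be compared at the 20% cut-off.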