1.
• Mission: Write Python3 code to do binary classification.
• Data set: The Horse Colic dataset. Use horse-colic.data as the training set and horse-colic.test as the test set. The available documentation is analyzed to assess the most appropriate treatment, and missing information is properly identified. Read the dataset description for the dataset information. The goal is to predict attribute 23 ("what happened to the horse?") using attributes 1, 2 and 4 to 22 as predictors. We are only concerned with whether a horse died, not how it died, so you have to treat this as a binary problem (after grouping "euthanized" with "died"). This task has 2 fewer examples because the class variable is missing for two examples. In accordance with the documentation, attributes 3 and 28 are not used because they do not provide useful information. Attributes 25, 26 and 27 ("type of lesion?") are also discarded because they represent alternative class variables. Please note that the counts of missing values are calculated on the complete dataset.
• Approaches:
- Classifier (required): k-nearest neighbors. Please use the scikit-learn class sklearn.neighbors.KNeighborsClassifier.
- Imputation (required): k-nearest neighbors. Please use the scikit-learn class sklearn.impute.KNNImputer.
- Other data pre-processing or feature engineering methods (optional): note that the types of attributes include continuous, discrete, and categorical. You can apply any technique you prefer.
• Performance metric: Accuracy classification score. Please use the scikit-learn function sklearn.metrics.accuracy_score.
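The data-preparation part of the task (select attributes 1, 2 and 4 to 22, drop rows with a missing class, and merge "euthanized" into "died") could be sketched as follows. This is only a sketch under assumptions that the question does not state: the files are assumed to be whitespace-separated with 28 columns and "?" marking missing values, and attribute 23 is assumed to be coded 1 = lived, 2 = died, 3 = euthanized. The helper name load_horse_colic is hypothetical. Check the dataset description before relying on any of this.

```python
import io

import pandas as pd

# Attributes 1, 2 and 4..22 (1-based numbering, as in the task statement).
PREDICTOR_ATTRS = [1, 2] + list(range(4, 23))
TARGET_ATTR = 23


def load_horse_colic(source):
    """Return (X, y) from a path or file-like object (hypothetical helper).

    Assumes whitespace-separated columns and '?' for missing values.
    """
    df = pd.read_csv(source, sep=r"\s+", header=None, na_values="?")
    df.columns = range(1, df.shape[1] + 1)  # relabel columns as 1-based attributes

    # Rows with no recorded outcome cannot be used for training or scoring.
    df = df.dropna(subset=[TARGET_ATTR])

    # Predictors keep their NaNs; imputation is a later step.
    X = df[PREDICTOR_ATTRS].to_numpy(dtype=float)

    # Assumed coding: 1 = lived, 2 = died, 3 = euthanized.
    # Binary target: 0 = lived, 1 = died or euthanized.
    y = (df[TARGET_ATTR] != 1).astype(int).to_numpy()
    return X, y
```

With the real files in the working directory, the call would look like `X_train, y_train = load_horse_colic("horse-colic.data")` and likewise for `horse-colic.test`.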
Question
Write Python3 code to do binary classification. The question is attached
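One possible shape for the modelling step is a scikit-learn pipeline that chains KNNImputer, an optional scaler, and KNeighborsClassifier, then scores predictions with accuracy_score. The sketch below assumes the predictors are already numeric arrays with NaN for missing values; the function names build_model and evaluate, the choice of StandardScaler, and the default of 5 neighbors are illustrative assumptions, not requirements of the task. It also treats categorical attributes as plain numbers, which is a simplification.

```python
from sklearn.impute import KNNImputer
from sklearn.metrics import accuracy_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def build_model(n_impute_neighbors=5, n_class_neighbors=5):
    """Imputation -> scaling -> k-NN classifier (hypothetical helper)."""
    return make_pipeline(
        # Fill each missing value from the nearest rows (NaN-aware distance).
        KNNImputer(n_neighbors=n_impute_neighbors),
        # Optional step: k-NN is distance based, so features on very
        # different scales would otherwise dominate the distance.
        StandardScaler(),
        KNeighborsClassifier(n_neighbors=n_class_neighbors),
    )


def evaluate(model, X_train, y_train, X_test, y_test):
    """Fit on the training split only, then report test accuracy."""
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    return accuracy_score(y_test, y_pred)
```

Because the imputer and scaler sit inside the pipeline, they are fitted on the training data only and then applied unchanged to the test data, which avoids leaking test-set statistics into training. The neighbor counts are natural candidates for tuning, for example with cross-validation on the training set.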