We use the Breast Cancer Wisconsin dataset from the UCI Machine Learning Repository:
http://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+%28Diagnostic%29
Data File: breast-cancer-wisconsin.data (class: 2 for benign, 4 for malignant)
Data Metafile: breast-cancer-wisconsin.names

Please implement this algorithm for logistic regression (i.e., to minimize the cross-entropy loss as discussed in class), and run it over the Breast Cancer Wisconsin dataset. Randomly sample 80% of the instances to train a classifier, then test it on the remaining 20%. Ten such random data splits should be performed, and the average over these 10 trials is used to estimate the generalization performance.

You are expected to do the implementation all by yourself so that you gain a better understanding of the method.

Please submit: (1) your source code (or Jupyter notebook file) that the TA should be able to (compile and) run, and the pre-processed dataset, if any; (2) a report covering a program checklist, how you accomplished the project, and the results of your classification.
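One way the task above might be approached is sketched below. It is a minimal sketch under stated assumptions, not the algorithm from class: it assumes plain batch gradient descent on the mean cross-entropy loss, drops rows containing a missing value (marked "?" in the data file), and standardizes features using training-set statistics. The function names (`load_wisconsin`, `train_logreg`, `predict`, `evaluate`) and the hyperparameters `lr` and `n_iter` are hypothetical choices made for illustration.

```python
import numpy as np

def load_wisconsin(lines):
    """Parse rows of breast-cancer-wisconsin.data: id, 9 features, class (2 or 4)."""
    rows = [ln.strip().split(",") for ln in lines if ln.strip()]
    # Assumption: rows with a missing value (marked "?") are simply dropped.
    rows = [r for r in rows if "?" not in r]
    data = np.array([[float(v) for v in r] for r in rows])
    X = data[:, 1:-1]                     # drop the sample id column
    y = (data[:, -1] == 4).astype(float)  # 1 = malignant, 0 = benign
    return X, y

def sigmoid(z):
    # Clip to avoid overflow in exp for very large |z|.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -500, 500)))

def add_bias(X):
    return np.hstack([np.ones((X.shape[0], 1)), X])

def train_logreg(X, y, lr=0.1, n_iter=2000):
    """Batch gradient descent on the mean cross-entropy loss (assumed optimizer)."""
    Xb = add_bias(X)
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = sigmoid(Xb @ w)
        grad = Xb.T @ (p - y) / len(y)    # gradient of the mean cross-entropy
        w -= lr * grad
    return w

def predict(X, w):
    return (sigmoid(add_bias(X) @ w) >= 0.5).astype(float)

def evaluate(X, y, n_trials=10, train_frac=0.8, seed=0):
    """Average test accuracy over n_trials random train/test splits."""
    rng = np.random.default_rng(seed)
    n_train = int(train_frac * len(y))
    accs = []
    for _ in range(n_trials):
        idx = rng.permutation(len(y))
        tr, te = idx[:n_train], idx[n_train:]
        # Standardize with training statistics only, to avoid test leakage.
        mu, sd = X[tr].mean(axis=0), X[tr].std(axis=0) + 1e-12
        w = train_logreg((X[tr] - mu) / sd, y[tr])
        accs.append(float(np.mean(predict((X[te] - mu) / sd, w) == y[te])))
    return float(np.mean(accs)), accs

# Usage, assuming the data file sits in the working directory:
#   with open("breast-cancer-wisconsin.data") as f:
#       X, y = load_wisconsin(f)
#   mean_acc, accs = evaluate(X, y)
```

Dropping incomplete rows is only one option; imputing the missing values would be another, and the report should state whichever choice is made.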
Hint: you can use sklearn’s LogisticRegression to verify if you get the same accuracy.
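The check suggested in the hint might look like the sketch below. It assumes `X` and `y` are NumPy arrays already prepared from the data file (features and 0/1 labels); the helper name `sklearn_baseline` is hypothetical. Note that sklearn's `LogisticRegression` applies L2 regularization by default, so its accuracy may differ slightly from an unregularized implementation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def sklearn_baseline(X, y, n_trials=10, test_size=0.2):
    """Mean test accuracy of sklearn's LogisticRegression over random 80/20 splits."""
    accs = []
    for seed in range(n_trials):
        # A different random_state per trial gives a different random split.
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=test_size, random_state=seed
        )
        clf = LogisticRegression(max_iter=1000)  # default L2 penalty, C=1.0
        clf.fit(X_tr, y_tr)
        accs.append(clf.score(X_te, y_te))       # fraction of correct predictions
    return float(np.mean(accs))
```

Comparing this number with the average accuracy of your own implementation over the same number of splits gives a rough sanity check; exact agreement is not expected because the splits and the regularization differ.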