School: Georgia Institute of Technology
Course: 6242
Subject: Industrial Engineering
Date: Dec 6, 2023
Uploaded by ramanasaketh
###############################################################################
##                                                                           ##
##  IMPORTANT NOTE: All accuracies must be reported with two decimal places  ##
##  in the range of [0.00, 1.00], e.g. 0.78 and not 78, 78.00, 78%, etc.     ##
##                                                                           ##
###############################################################################
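The reporting rule above can be sketched as a small helper. This is only an illustration: the function names `accuracy` and `format_accuracy` and the sample labels are hypothetical and are not part of the assignment.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels, in [0.0, 1.0]."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def format_accuracy(value):
    """Render an accuracy as a two-decimal fraction such as '0.78'."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("accuracy must be a fraction in [0.00, 1.00]")
    return f"{value:.2f}"


# Hypothetical labels and predictions, for illustration only.
labels = [1, 0, 1, 1, 0, 1, 0, 0, 1]
preds = [1, 0, 1, 0, 0, 1, 1, 0, 1]
print(format_accuracy(accuracy(labels, preds)))  # 7 of 9 correct -> 0.78
```

Rejecting values outside [0, 1] catches the common slip of passing a percentage such as 78 instead of 0.78.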
**********************************************
Q 3.1
Linear Regression - Training Accuracy: 0.64
Linear Regression - Testing Accuracy: 0.64
Random Forest - Training Accuracy: 1.00
Random Forest - Testing Accuracy: 0.89
SVM - Training Accuracy: 0.71
SVM - Testing Accuracy: 0.57
**********************************************
Q 3.2 Hyperparameter Tuning
Random Forest - n_estimators values tested (at least 3): 10, 20, 30
Random Forest - max_depth values tested (at least 3): 10, 20, 30
Random Forest - Best combination of parameter values - n_estimators: 30
Random Forest - Best combination of parameter values - max_depth: 20
Random Forest - Testing Accuracy before tuning (default parameters): 0.89
Random Forest - Testing Accuracy after tuning: 0.92
SVM - Kernel Values tested: rbf, linear
SVM - C values tested (at least 3): 0.001, 0.01, 0.1, 1
SVM - Best combination of parameter values - Kernel: rbf
SVM - Best combination of parameter values - C: 1
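A minimal sketch of how the Random Forest search above could be run with scikit-learn's `GridSearchCV`. The grid values come from this section; the synthetic data from `make_classification`, the 70/30 split, `cv=5`, and `random_state=0` are assumptions made for illustration, so the printed numbers will not match the report.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data (assumption): the report's dataset is not shown.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Baseline with default hyperparameters.
baseline = RandomForestClassifier(random_state=0).fit(X_train, y_train)
acc_before = accuracy_score(y_test, baseline.predict(X_test))

# Grid taken from the values listed in Q 3.2; cv=5 is an assumption.
param_grid = {"n_estimators": [10, 20, 30], "max_depth": [10, 20, 30]}
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=5)
search.fit(X_train, y_train)

# After fitting, predict() uses the best estimator refit on all training data.
acc_after = accuracy_score(y_test, search.predict(X_test))

print("best params:", search.best_params_)
print(f"before tuning: {acc_before:.2f}, after tuning: {acc_after:.2f}")
```

The same pattern applies to the SVM search by swapping in `SVC()` and a grid over `kernel` and `C`.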
*********************************************
Q 3.3
SVM - Highest mean testing/cross-validated accuracy (best score): 0.71
SVM - Mean train score: 0.71
SVM - Mean fit time: 2.31
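The three SVM figures above are the kind of values exposed by `GridSearchCV.cv_results_`. The sketch below shows one way to read them for the best parameter combination. It assumes synthetic data and `cv=5`, and it assumes the reported means refer to the best combination rather than an average over the whole grid. Note that `mean_train_score` is only recorded when `return_train_score=True` is passed.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic stand-in data (assumption): the report's dataset is not shown.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Grid taken from the SVM values listed in Q 3.2; cv=5 is an assumption.
param_grid = {"kernel": ["rbf", "linear"], "C": [0.001, 0.01, 0.1, 1]}
search = GridSearchCV(SVC(), param_grid, cv=5, return_train_score=True)
search.fit(X, y)

# best_index_ is the row of cv_results_ with the highest mean test score.
best = search.best_index_
results = search.cv_results_

print(f"best score: {search.best_score_:.2f}")
print(f"mean train score: {results['mean_train_score'][best]:.2f}")
print(f"mean fit time (seconds): {results['mean_fit_time'][best]:.2f}")
```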
*********************************************
Q 3.4 Feature Importance - WITH THE MODEL TRAINED IN Q 3.1
Random Forest - Most important feature (e.g. X5): X7
Random Forest - Least important feature (e.g. X1): X9
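One way to obtain such a ranking is the `feature_importances_` attribute of a fitted `RandomForestClassifier`. The sketch below uses synthetic data, and the mapping of column positions to labels X1..Xn (1-based) is an assumption; the report does not say how its features are indexed.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data (assumption): the report's dataset is not shown.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

# Impurity-based importances; they are normalized to sum to 1.
importances = model.feature_importances_

# Column indices sorted from most to least important.
order = np.argsort(importances)[::-1]

# Assumption: features are labelled X1..Xn by 1-based column position.
names = [f"X{i + 1}" for i in range(X.shape[1])]
print("most important:", names[order[0]])
print("least important:", names[order[-1]])
```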
*********************************************
Q 3.5
Best Classifier and why (in at most 50 words):
Random Forest is the best classifier. It has the highest accuracy on the test set and the second-lowest fit time among all the classifiers. It is also convenient on larger datasets, since the data does not have to be preprocessed.
*********************************************
Q 3.6 Principal Component Analysis
PCA - Percentage of variance explained by each of the selected components (enter the entire array as [0.12, …, 0.012]): [0.37621847, 0.34132531, 0.15698888, 0.06358059, 0.03440553, 0.00924896, 0.00648665, 0.00427297, 0.00357153, 0.0019226]
PCA - Singular values corresponding to each of the selected components (enter the entire array as [0.09, …, 0.037]): [235.01143877, 223.84798625, 151.81105308, 96.61203088, 71.06947312, 36.84812871, 30.85880931, 25.04573858, 22.89793107, 16.80016161]
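The two PCA arrays can be cross-checked against each other. In scikit-learn's `PCA`, `explained_variance_ratio_` is proportional to the square of `singular_values_`, so normalizing each array by its first component should give the same numbers. The sketch below performs that check on the values reported in Q 3.6, using only the standard library. It is a consistency check, not the code that produced the results.

```python
# Values copied from the Q 3.6 entries.
explained_variance_ratio = [
    0.37621847, 0.34132531, 0.15698888, 0.06358059, 0.03440553,
    0.00924896, 0.00648665, 0.00427297, 0.00357153, 0.0019226,
]
singular_values = [
    235.01143877, 223.84798625, 151.81105308, 96.61203088, 71.06947312,
    36.84812871, 30.85880931, 25.04573858, 22.89793107, 16.80016161,
]

# Each ratio is proportional to the squared singular value, so dividing
# by the first component should give the same number from either array.
from_ratios = [r / explained_variance_ratio[0] for r in explained_variance_ratio]
from_singular = [(s / singular_values[0]) ** 2 for s in singular_values]

max_gap = max(abs(a - b) for a, b in zip(from_ratios, from_singular))
total = sum(explained_variance_ratio)

print(f"largest disagreement: {max_gap:.2e}")
print(f"variance captured by the 10 components: {total:.4f}")
```

A total just below 1 indicates that the ten selected components account for nearly, but not quite, all of the variance in the data.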
*********************************************