report

txt

School

Georgia Institute Of Technology *

*We aren’t endorsed by this school

Course

6242

Subject

Industrial Engineering

Date

Dec 6, 2023

Type

txt

Pages

Uploaded by ramanasaketh

############################################################################### ## ## ## IMPORTANT NOTE: All accuracies must be reported with two decimal places ## ## in the range of [0.00, 1.00], e.g. 0.78 and not 78, 78.00, 78%, etc. ## ## ## ############################################################################### ********************************************** Q 3.1 Linear Regression - Training Accuracy: 0.64 Linear Regression - Testing Accuracy: 0.64 Random Forest - Training Accuracy: 1.00 Random Forest - Testing Accuracy: 0.89 SVM - Training Accuracy: 0.71 SVM - Testing Accuracy: 0.57 ********************************************** Q 3.2 Hyperparameter Tuning Random Forest - n_estimators values tested (at least 3): 10,20,30 Random Forest - max_depth values tested (at least 3): 10,20,30 Random Forest - Best combination of parameter values - n_estimators: 30 Random Forest - Best combination of parameter values - max_depth: 20 Random Forest - Testing Accuracy before tuning (default parameters): 0.89 Random Forest - Testing Accuracy after tuning: 0.92 SVM - Kernel Values tested: rbf, linear SVM - C values tested (at Least 3): 0.001, 0.01, 0.1, 1 SVM - Best combination of parameter values - Kernel: rbf SVM - Best combination of parameter values - C: 1 ********************************************* Q 3.3 SVM - Highest mean testing/cross-validated accuracy (best score): 0.71 SVM - Mean train score: 0.71 SVM Mean fit time: 2.31 ********************************************* Q 3.4 Feature Importance - WITH THE MODEL TRAINED IN Q 3.1 Random Forest - Most important feature (e.g. X5): X7 Random Forest - Least important feature (e.g. X1): X9 ********************************************* Q 3.5 Best Classifier and why (in at most 50 words): Random Forest is the best classifier

It has the highest accuracy on test set and the second lowest fit time among all the classifiers. It's very convenient on larger dataset since we don't have to preprocess the data. Q 3.6 Principal Component Analysis "PCA - Percentage of variance explained by each of the selected components (enter the entire array as [0.12, …, 0.012])": [0.37621847 0.34132531 0.15698888 0.06358059 0.03440553 0.00924896 0.00648665 0.00427297 0.00357153 0.0019226 ] "PCA - Singular values corresponding to each of the selected components (enter the entire array as [0.09, …, 0.037])": [235.01143877 223.84798625 151.81105308 96.61203088 71.06947312 36.84812871 30.85880931 25.04573858 22.89793107 16.80016161] *********************************************

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help

report

Related Documents