School

Massachusetts Institute of Technology

Course

CTL.

Subject

Industrial Engineering

Date

Jun 27, 2024

Uploaded by MinisterFlower14402

Part 2

3. QUESTION (6 points)

We will now treat your cluster assignments as labels for supervised learning. Fit a logistic regression model to the original data (not principal components), with your clustering as the target labels. Since the data is high-dimensional, make sure to regularize your model using your choice of L1, L2, or elastic net, and separate the data into training and validation sets or use cross-validation to select your model. Report your choice of regularization parameter and validation performance.

ANSWER: For the logistic regression model using the cluster labels as targets, I opted for L1 regularization (Lasso) because it performs feature selection by driving the coefficients of less important features to zero. The regularization parameter was selected by cross-validation. The best-performing model had a regularization parameter of 0.1 and achieved a validation accuracy of 0.926, indicating that the assigned cluster can be predicted reliably from the gene expression profiles.
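The procedure described in the answer could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code used for the reported numbers: it assumes scikit-learn, and it substitutes synthetic data from `make_classification` for the gene expression matrix and cluster labels, which are not part of this document. The grid of candidate values, the 75/25 split, and the fold count are likewise illustrative choices. Note that scikit-learn parameterizes the penalty by `C`, the inverse of the regularization strength; the answer does not say which convention its value of 0.1 follows.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegressionCV
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# ASSUMPTION: synthetic stand-in for the original (non-PCA) data matrix X
# and the cluster assignments y used as target labels.
X, y = make_classification(
    n_samples=200, n_features=100, n_informative=10,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)

# Hold out a validation set that plays no part in model selection.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Candidate values of C (inverse regularization strength); illustrative grid.
candidate_Cs = np.logspace(-2, 0, 6)

# Standardize features, then fit L1-penalized logistic regression,
# choosing C by 5-fold cross-validation on the training set.
model = make_pipeline(
    StandardScaler(),
    LogisticRegressionCV(
        Cs=candidate_Cs, cv=5, penalty="l1", solver="saga",
        max_iter=2000, random_state=0,
    ),
)
model.fit(X_train, y_train)

clf = model[-1]
best_C = float(np.atleast_1d(clf.C_)[0])       # selected regularization parameter
val_accuracy = model.score(X_val, y_val)       # accuracy on the held-out set
# Features with a nonzero coefficient for at least one class.
n_selected = int(np.sum(np.any(clf.coef_ != 0, axis=0)))

print(f"best C: {best_C:.4g}")
print(f"validation accuracy: {val_accuracy:.3f}")
print(f"features kept by L1: {n_selected} of {X.shape[1]}")
```

Counting the nonzero coefficients, as in the last step, is one way to check the feature selection effect that motivated the choice of L1. Argument names for selecting the penalty may differ between scikit-learn versions.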