cancer dataset. To make the task a bit harder, we'll add some noninformative noise features to the data. We expect the feature selection to be able to identify the features that are noninformative and remove them:

In[39]:

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.datasets import load_breast_cancer
    from sklearn.feature_selection import SelectPercentile
    from sklearn.model_selection import train_test_split

    cancer = load_breast_cancer()

    # get deterministic random numbers
    rng = np.random.RandomState(42)
    noise = rng.normal(size=(len(cancer.data), 50))
    # add noise features to the data
    # the first 30 features are from the dataset, the next 50 are noise
    X_w_noise = np.hstack([cancer.data, noise])

    X_train, X_test, y_train, y_test = train_test_split(
        X_w_noise, cancer.target, random_state=0, test_size=.5)
    # use f_classif (the default) and SelectPercentile to select 50% of features
    select = SelectPercentile(percentile=50)
    select.fit(X_train, y_train)
    # transform training set
    X_train_selected = select.transform(X_train)

    print("X_train.shape: {}".format(X_train.shape))
    print("X_train_selected.shape: {}".format(X_train_selected.shape))

Out[39]:

    X_train.shape: (284, 80)
    X_train_selected.shape: (284, 40)

As you can see, the number of features was reduced from 80 to 40 (50 percent of the original number of features). We can find out which features have been selected using the get_support method, which returns a Boolean mask with one entry per feature, True where the feature was kept (visualized in Figure 4-9):

In[40]:

    mask = select.get_support()
    print(mask)
    # visualize the mask -- black is True, white is False
    # each column of the plot corresponds to one feature
    plt.matshow(mask.reshape(1, -1), cmap='gray_r')
    plt.xlabel("Feature index")

Out[40]:

    [ True True True True True True True True True False True False
    True True True True True True False False True True True True
    True True True True True True False False False True False True
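The mask can also be summarized numerically instead of being read by eye. The following is a minimal sketch, not part of the original text, that rebuilds the same noisy dataset and counts how many of the selected features come from the original 30 columns and how many from the 50 noise columns. It assumes the column layout described above (original features first, noise after); the names `n_original_kept` and `n_noise_kept` are introduced here for illustration only. It also passes `f_classif` explicitly, which is the default scoring function of `SelectPercentile`.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectPercentile, f_classif
from sklearn.model_selection import train_test_split

# Rebuild the dataset exactly as in the text: 30 real features + 50 noise features
cancer = load_breast_cancer()
rng = np.random.RandomState(42)
noise = rng.normal(size=(len(cancer.data), 50))
X_w_noise = np.hstack([cancer.data, noise])

X_train, X_test, y_train, y_test = train_test_split(
    X_w_noise, cancer.target, random_state=0, test_size=.5)

# f_classif is the default score function; it is spelled out here for clarity
select = SelectPercentile(score_func=f_classif, percentile=50)
select.fit(X_train, y_train)
mask = select.get_support()

# Assumption: columns 0-29 are the original features, columns 30-79 are noise
n_original = cancer.data.shape[1]
n_original_kept = int(mask[:n_original].sum())
n_noise_kept = int(mask[n_original:].sum())

print("Original features kept: {} of {}".format(n_original_kept, n_original))
print("Noise features kept: {} of {}".format(n_noise_kept, noise.shape[1]))
```

If the selection works as hoped, most of the kept features should fall in the first group, although a univariate test can still let some noise columns through by chance.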