Take the first 1000 entries of the training portion of the Scikit-learn 20 Newsgroups dataset (in the original order) as the training set, and set aside the next 100 entries as the test set. Predict whether a post belongs to a political discussion group using a bag-of-words model (category includes 'talk.politics'). Provide the accuracy on the test set, the input shape of the network, and the predictions of the network for the last two entries in the test set as "accuracy on the test set, network input shape, network predictions for the last two entries in the test set as 'politics' or 'non-politics'".

 

Here's what I have so far: 

from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import CountVectorizer
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import Adam


categories = ['talk.politics.guns', 'talk.politics.mideast', 'talk.politics.misc']
newsgroups = fetch_20newsgroups(subset='train')

train_data = newsgroups.data[:1000]
test_data = newsgroups.data[1000:1100]

vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_data)
X_test = vectorizer.transform(test_data)

y_train = np.array(['talk.politics' in newsgroups.target_names[y] for y in newsgroups.target[:1000]], dtype=int)
y_test = np.array(['talk.politics' in newsgroups.target_names[y] for y in newsgroups.target[1000:1100]], dtype=int)

model = Sequential()
input_shape = X_train.shape[1]

model.add(Dense(128, input_shape=(input_shape,), activation='relu'))
model.add(Dense(64, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(3, activation='softmax'))

model.compile(loss='binary_crossentropy', optimizer=Adam(), metrics=['accuracy'])


model.fit(X_train, y_train, epochs=5, batch_size=128, validation_data=(X_test, y_test))

# Evaluate the model
accuracy = model.evaluate(X_test, y_test, verbose=0)[1]
predictions = model.predict(X_test[-2:])

However, this code does not work. What am I doing wrong?
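
A few things in the code above look like likely causes, so here is a hedged sketch of one way to fix it, not a definitive answer. The labels are a single 0/1 value per post, but the last layer is `Dense(3, activation='softmax')`; with `binary_crossentropy` the output would normally be one sigmoid unit so that it matches the label shape. `CountVectorizer` returns SciPy sparse matrices, which Keras may not accept directly depending on the version, so the sketch converts them with `.toarray()`. The `categories` list is never used and can be dropped. Finally, `model.predict` returns probabilities, so they still need to be mapped to 'politics' / 'non-politics'. The helper names (`politics_labels`, `probs_to_labels`, `build_model`, `run_experiment`) and the 0.5 threshold are my own assumptions, not part of the original task.

```python
import numpy as np


def politics_labels(target_names, targets):
    """Return 1 where the newsgroup name contains 'talk.politics', else 0."""
    return np.array(["talk.politics" in target_names[t] for t in targets], dtype=int)


def probs_to_labels(probs, threshold=0.5):
    """Map sigmoid outputs of shape (n, 1) or (n,) to text labels.

    The 0.5 threshold is an assumption; the task does not specify one.
    """
    flat = np.asarray(probs).ravel()
    return ["politics" if p >= threshold else "non-politics" for p in flat]


def build_model(n_features):
    # Keras is imported here so the helpers above work without it installed.
    from keras.models import Sequential
    from keras.layers import Dense, Dropout, Input

    model = Sequential()
    model.add(Input(shape=(n_features,)))
    model.add(Dense(128, activation="relu"))
    model.add(Dense(64, activation="relu"))
    model.add(Dense(32, activation="relu"))
    model.add(Dropout(0.5))
    # One sigmoid unit: a single probability per post, matching the 0/1 labels.
    model.add(Dense(1, activation="sigmoid"))
    model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
    return model


def run_experiment():
    from sklearn.datasets import fetch_20newsgroups
    from sklearn.feature_extraction.text import CountVectorizer

    newsgroups = fetch_20newsgroups(subset="train")
    train_text = newsgroups.data[:1000]
    test_text = newsgroups.data[1000:1100]

    vectorizer = CountVectorizer()
    # Convert the sparse bag-of-words matrices to dense float arrays for Keras.
    X_train = vectorizer.fit_transform(train_text).toarray().astype("float32")
    X_test = vectorizer.transform(test_text).toarray().astype("float32")

    y_train = politics_labels(newsgroups.target_names, newsgroups.target[:1000])
    y_test = politics_labels(newsgroups.target_names, newsgroups.target[1000:1100])

    model = build_model(X_train.shape[1])
    model.fit(X_train, y_train, epochs=5, batch_size=128,
              validation_data=(X_test, y_test))

    accuracy = model.evaluate(X_test, y_test, verbose=0)[1]
    last_two = probs_to_labels(model.predict(X_test[-2:]))
    # The network input shape is (vocabulary size,), fixed by the fitted vectorizer.
    return accuracy, (X_train.shape[1],), last_two
```

Calling `run_experiment()` downloads the dataset on first use and returns the three values the task asks for. The accuracy and the two predictions will vary between runs unless random seeds are fixed, so no specific numbers are given here.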
