School: Birla Institute of Technology, Mesra
Course: AI
Subject: Computer Science
Date: Nov 24, 2024
Type: docx
Pages: 2
Uploaded by: UltraCrown1625
Apologies for the persistent issues. It appears that there are compatibility problems between the XGBoost version and the scikit-learn version in use. To resolve this, we can modify the code so that it no longer relies on scikit-learn's `RandomizedSearchCV`. Instead, we can perform the random search manually and use XGBoost's built-in early stopping. Here's an updated version of the code:

```python
import numpy as np
import xgboost as xgb
from tqdm import tqdm
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Define the parameter grid for the random search
param_grid = {
    'subsample': [0.8, 0.9, 1],
    'colsample_bytree': [0.6, 0.7, 0.8],
    'max_depth': [2, 3, 4],
    'min_child_weight': [10, 15, 20],
    'learning_rate': [0.01, 0.05, 0.1],
    'n_estimators': [500, 1000, 2000],
}

# Split the data into training and validation sets
train_x, val_x, train_y, val_y = train_test_split(X, y, test_size=0.2, random_state=42)

best_models = []  # model from each trial
best_scores = []  # its validation accuracy

# Perform the hyperparameter search over 10 random trials
for _ in tqdm(range(10)):
    # Randomly sample one value per hyperparameter from the grid
    params = {name: values[np.random.choice(len(values))]
              for name, values in param_grid.items()}

    # Create an XGBoost classifier with the sampled hyperparameters;
    # early stopping halts boosting once the validation logloss has not
    # improved for 25 rounds
    xgb_model = xgb.XGBClassifier(objective='binary:logistic', n_jobs=-1,
                                  eval_metric='logloss',
                                  early_stopping_rounds=25, **params)

    # Fit the model, monitoring the validation set for early stopping
    # (with multiple eval sets, XGBoost uses the last one for stopping)
    xgb_model.fit(train_x, train_y,
                  eval_set=[(train_x, train_y), (val_x, val_y)],
                  verbose=False)

    # Score the early-stopped model on the held-out validation set
    predictions = xgb_model.predict(val_x)
    best_models.append(xgb_model)
    best_scores.append(accuracy_score(val_y, predictions))

# Select the trial with the highest validation accuracy
best_model_index = int(np.argmax(best_scores))
best_model = best_models[best_model_index]
best_params = best_model.get_params()
best_score = best_scores[best_model_index]

print("Best Model Parameters:", best_params)
print("Best Score:", best_score)
```

In this updated code, we perform the hyperparameter search manually by iterating over the desired number of trials (10 in this example). For each trial, we randomly sample a set of hyperparameters from the defined grid, fit an XGBoost model with early stopping on the validation set, and record its validation accuracy. Finally, we select the model with the highest accuracy. Note that `eval_metric` and `early_stopping_rounds` are passed to the `XGBClassifier` constructor rather than to `fit`: that is the API expected by XGBoost 1.6 and later, and passing them to `fit` raises an error on XGBoost 2.x. Also note that the selected model is evaluated only on data it was not trained on; refitting it on the full dataset and then scoring it on the validation split, as the earlier version did, would leak validation data into training. Make sure to replace `X` and `y` with your actual feature matrix and target variable.