Ensemble Asg

School

Utah Valley University *

Course

4130

Subject

Industrial Engineering

Date

Jan 9, 2024

Type

docx

Pages

2

Uploaded by MagistrateScorpion10499

Dongdi Zhao

Ensemble Analysis Report

Introduction

This analysis predicts the FY04Giving amount from a regression dataset. The dataset describes individuals by gender, class, year, marital status, major, next degree, and historical giving amounts from 2000 to 2003. The target variable, FY04Giving, is the amount given in fiscal year 2004.

Data Preprocessing

Handling Missing Values: The provided dataset contains no missing values, so no imputation was needed.

Encoding Categorical Variables: The categorical variables (gender, class, marital status, major, and next degree) were one-hot encoded for model compatibility.

Feature Scaling: Numerical features were scaled so that they contribute on a comparable scale to the model.

Exploratory Data Analysis (EDA)

The distribution of the target variable and the relationships between features were analyzed. Outliers were identified, and their potential impact on the model was considered.

Model Selection

Ensemble methods were chosen to maximize predictive power.

Random Forest (RF): The RF model averages many decision trees and is known for robustness against overfitting.
Training RMSE: $3,176.62
Testing RMSE: $5,578.57

XGBoost (Gradient Boosting): XGBoost is an efficient gradient boosting algorithm.
Training RMSE: $2,952.44
Testing RMSE: $5,811.93

Stacking: Combined predictions from RF and XGBoost.
Stacking RMSE: $4,927.28

Results and Discussion

Random Forest (RF): Predicts FY04Giving with an RMSE of $5,578.57 on the testing set.

XGBoost: Achieves an RMSE of $5,811.93 on the testing set.

Stacking: Combining the RF and XGBoost predictions results in an RMSE of $4,927.28.

Stacking yields the lowest reported RMSE, about $651 below the RF testing RMSE and about $885 below the XGBoost testing RMSE, which indicates that combining the two base models improves the prediction of FY04Giving. For both base models the testing RMSE is well above the training RMSE, which suggests some overfitting.

Conclusion

The ensemble approach, particularly stacking RF and XGBoost, proves effective in predicting FY04Giving. Further optimization and feature engineering may enhance model accuracy. Regression is a suitable choice because FY04Giving is an actual monetary value. The provided models can assist in understanding and forecasting donation amounts for fiscal year 2004 based on historical data.
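The report does not show how the RMSE was computed or how the RF and XGBoost predictions were combined. The sketch below is one minimal way to illustrate both ideas using only the Python standard library. It assumes a linear meta-learner without an intercept, which may differ from what the author used. The function names (rmse, fit_linear_blend) and the toy numbers are hypothetical and are not taken from the FY04Giving data.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target (dollars here)."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def fit_linear_blend(pred_a, pred_b, y_true):
    """Least-squares weights (w_a, w_b) so that w_a*pred_a + w_b*pred_b fits y_true.

    Solves the 2x2 normal equations directly; no intercept term is fitted.
    """
    saa = sum(a * a for a in pred_a)
    sbb = sum(b * b for b in pred_b)
    sab = sum(a * b for a, b in zip(pred_a, pred_b))
    say = sum(a * y for a, y in zip(pred_a, y_true))
    sby = sum(b * y for b, y in zip(pred_b, y_true))
    det = saa * sbb - sab * sab
    if det == 0:
        raise ValueError("base predictions are collinear; weights are not unique")
    w_a = (say * sbb - sby * sab) / det
    w_b = (sby * saa - say * sab) / det
    return w_a, w_b


# Toy, made-up values standing in for held-out targets and base-model predictions.
y_holdout = [100.0, 0.0, 250.0, 50.0, 500.0]
rf_pred = [120.0, 10.0, 200.0, 80.0, 430.0]
xgb_pred = [90.0, 30.0, 280.0, 20.0, 560.0]

w_rf, w_xgb = fit_linear_blend(rf_pred, xgb_pred, y_holdout)
blend_pred = [w_rf * a + w_xgb * b for a, b in zip(rf_pred, xgb_pred)]

print("RF RMSE:     ", round(rmse(y_holdout, rf_pred), 2))
print("XGBoost RMSE:", round(rmse(y_holdout, xgb_pred), 2))
print("Blend RMSE:  ", round(rmse(y_holdout, blend_pred), 2))
```

On the data used to fit the weights, the blend can never score worse than either base model, because giving one model a weight of 1 and the other a weight of 0 is always an available solution. That comparison is optimistic; in practice the meta-learner is usually fitted on out-of-fold predictions and evaluated on a separate test set.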