Crowd Funding

Date

Jan 9, 2024

CROWD FUNDING - KICKSTARTER DATASET
By: SWETHA LENKALA
Data Set Description

Field             Type     Description
Project Id        int64    Project ID
name              object   Project name
main_category     object   Project category; Kickstarter lists 15 categories for projects
category          object   Project subcategory; Kickstarter lists 52 subcategories for projects
currency          object   Project currency
deadline          object   Project deadline
goal              float64  Project goal in USD
launched          object   Project launch date
Pledged           float64  Pledged amount
state             object   Project status; five statuses for historical projects: failed, successful, canceled, live, suspended
backers           int64    Number of backers
country           object   Country where the project was launched
usd_pledged_real  float64  Amount pledged in USD
usd_goal_real     float64  Goal amount in USD
Data Preprocessing

  • Removed rows with missing (NA) values from the dataset
  • Calculated the campaign duration between the launch date and the deadline
  • Filtered out projects with “live” status in the state column
  • Subset the data to 50,000 rows
  • Encoded the state column as binary: successful as 1, and every other project status as failed (0)
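The preprocessing steps above can be sketched in pandas as follows. This is only an illustration under stated assumptions, not the code used for the slides: the tiny `raw` sample is made up, the helper name `preprocess` and the new columns `duration_days` and `success` are hypothetical, and the slides do not say how the 50,000 rows were chosen, so a random sample is assumed here.

```python
import pandas as pd

# Made-up sample rows that reuse column names from the field table.
raw = pd.DataFrame({
    "name": ["A", "B", None, "D", "E"],
    "state": ["successful", "failed", "canceled", "live", "suspended"],
    "launched": ["2017-01-01 10:00:00", "2017-02-01 00:00:00",
                 "2017-03-01 00:00:00", "2017-04-01 00:00:00",
                 "2017-05-01 00:00:00"],
    "deadline": ["2017-01-31", "2017-03-03", "2017-03-31",
                 "2017-05-01", "2017-06-15"],
})


def preprocess(df, max_rows=50_000, seed=0):
    """Apply the five preprocessing steps in the order listed on the slide."""
    # 1. Remove rows that contain missing values.
    out = df.dropna().copy()

    # 2. Duration between launch and deadline, in whole days.
    out["launched"] = pd.to_datetime(out["launched"])
    out["deadline"] = pd.to_datetime(out["deadline"])
    out["duration_days"] = (out["deadline"] - out["launched"]).dt.days

    # 3. Drop projects that are still live (their outcome is unknown).
    out = out[out["state"] != "live"].copy()

    # 4. Subset to at most max_rows rows (random sampling is an assumption).
    if len(out) > max_rows:
        out = out.sample(n=max_rows, random_state=seed)

    # 5. Binary target: successful -> 1, every other status -> 0.
    out["success"] = (out["state"] == "successful").astype(int)
    return out.reset_index(drop=True)


clean = preprocess(raw)
print(clean[["state", "duration_days", "success"]])
```

On the real data the same function would be called on the loaded CSV instead of `raw`; the sample here has fewer than 50,000 rows, so the subsetting step is a no-op.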
EDA
Logistic Regression with one-hot encoding
Logistic regression
Naïve Bayes
Random Forest
Neural Networks
Comparison & Conclusion

  • Naïve Bayes has the lowest accuracy, likely because of its assumption that the features are independent.
  • Random Forest, being an ensemble learning method, provides robust performance.
  • Neural Networks have the highest accuracy of all the models, likely because of their ability to capture complex relationships in the data.

Model                Accuracy
Naïve Bayes          72%
Logistic Regression  98.3%
Random Forest        98%
Neural Networks      99.4%
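A comparison like this one could be produced with scikit-learn roughly as sketched below. Everything here is an assumption made for illustration: the data is synthetic (not the Kickstarter dataset), the labelling rule, the feature subset, the `duration_days` column, and all hyperparameters are hypothetical, and the accuracies it prints will not match the figures reported above.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in data; NOT the Kickstarter dataset.
rng = np.random.default_rng(0)
n = 600
X = pd.DataFrame({
    "main_category": rng.choice(["Games", "Music", "Technology"], size=n),
    "usd_goal_real": rng.lognormal(mean=8.0, sigma=1.0, size=n),
    "backers": rng.poisson(lam=40, size=n),
    "duration_days": rng.integers(10, 61, size=n),
})
# Hypothetical labelling rule so the models have a signal to learn.
y = (X["backers"] * 50 > X["usd_goal_real"]).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)


def make_preprocessor():
    """One-hot encode the categorical column and scale the numeric ones."""
    return ColumnTransformer(
        [
            ("onehot", OneHotEncoder(handle_unknown="ignore"), ["main_category"]),
            ("scale", StandardScaler(),
             ["usd_goal_real", "backers", "duration_days"]),
        ],
        sparse_threshold=0.0,  # force dense output, which GaussianNB requires
    )


models = {
    "Naive Bayes": GaussianNB(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Neural Network": MLPClassifier(hidden_layer_sizes=(16,), max_iter=500,
                                    random_state=0),
}

# Fit every model on the same split and record held-out accuracy.
accuracies = {}
for name, model in models.items():
    pipeline = make_pipeline(make_preprocessor(), model)
    pipeline.fit(X_train, y_train)
    accuracies[name] = pipeline.score(X_test, y_test)

for name, acc in accuracies.items():
    print(f"{name}: {acc:.1%}")
```

Using one shared train/test split and one shared preprocessing recipe keeps the accuracy figures comparable across models.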
Thank you