Data-Driven Decision Making in a Tech-Driven World In an era dominated by data, businesses across industries are leveraging machine learning to drive strategic decisions. Imagine a mid-sized real estate firm, Urban Nest Realty, struggling to accurately price properties. Previously, they relied on traditional appraisal methods, which often led to over- or under-pricing homes, affecting sales and revenue. With fierce competition in the housing market, UrbanNest Realty decided to embrace machine learning to enhance their pricing strategy. Their data science team collected extensive real estate data, including factors like location, square footage, number of bedrooms, and nearby amenities. By applying Linear Regression and Decision Tree Regressor models, they were able to predict home prices with greater accuracy. The insights helped them optimize property listings, leading to faster sales and increased customer satisfaction. At the same time, a major retail chain, ShopEase, was facing difficulties in understanding customer behavior. They struggled to personalize promotions effectively, leading to wasted marketing expenditures. To solve this, ShopEase implemented k-Means Clustering to segment their customers based on annual income and spending patterns. The results allowed them to target high-value customers with tailored promotions, significantly boosting sales and customer engagement. Meanwhile, in the financial sector, SafeBank was dealing with a rising number of fraudulent credit card transactions. Their traditional rule-based fraud detection system was unable to keep up with evolving fraudulent tactics. By training Logistic Regression and Random Forest Classifier models, SafeBank improved fraud detection rates while reducing false alarms. This balance ensured legitimate transactions were not disrupted, maintaining a smooth customer experience. In the world of e-commerce, ReviewMaster faced challenges in analyzing vast amounts of customer feedback. Sorting through thousands of product reviews manually was inefficient. They adopted Sentiment Analysis using Naïve Bayes and Logistic Regression models to classify reviews as positive or negative. This helped them identify trending customer concerns, adjust product offerings, and improve customer satisfaction. These real-world scenarios showcase how machine learning can transform industries, making data-driven decision-making more efficient and impactful. The following case studies will allow you to explore similar challenges, implement solutions, and gain hands-on experience with real datasets.

icon
Related questions
Question

please answer everything correctly , include comments etc and show me outputs as well, make sure eevrything is done well ,the image is the first page before the questions, note: this module is machine learning 700

Question One       
Predicting House Prices using Real Estate Data (25 Marks) 
25 Marks 
A real estate company, UrbanNest Realty, wants to improve its property pricing strategy. 
Traditional valuation methods often lead to mispricing, affecting revenue and customer 
satisfaction. The company aims to use machine learning to predict house prices more 
accurately. 
Dataset: California Housing Prices from sklearn.datasets 
(a) Load the dataset and perform basic exploratory data analysis (EDA) by displaying summary 
statistics, handling missing values, and visualizing key features. (6 Marks) 
(b) Train a Linear Regression and a Decision Tree Regressor to predict house prices. Evaluate 
the models using Mean Absolute Error (MAE) and R² Score. (12 Marks) 
(c) Interpret the model performances and discuss how feature selection or additional 
preprocessing could improve accuracy. (7 Marks) 
Question Two       
Customer Segmentation using Mall Customers Dataset (20 Marks) 
25 Marks 
A large retail chain, ShopEase, is struggling with ineffective marketing campaigns. The 
company wants to use machine learning to segment its customer base and deliver targeted 
promotions that improve engagement and sales. 
Dataset: Mall Customers Dataset (available from Kaggle or UCI repository) 
(a) Load the dataset and perform basic exploratory data analysis (EDA), including missing 
values handling and visualizing spending patterns. (6 Marks) 
(b) Apply k-Means Clustering to segment customers based on Annual Income and Spending 
Score. Determine the optimal number of clusters using the Elbow Method. (12 Marks) 
(c) Visualize and interpret the resulting clusters. Discuss how a retail business could use this 
information for marketing strategies. (7 Marks) 
Question Three       
Credit Card Fraud Detection using Machine Learning (25 Marks) 
25 Marks 
Scenario: A major financial institution, SafeBank, is facing increased credit card fraud cases. 
Their rule-based system is failing to detect modern fraudulent techniques. The bank wants to 
implement a machine learning model to detect fraudulent transactions while minimizing false 
positives. 
Dataset: Credit Card Fraud Detection Dataset (available on Kaggle) 
(a) Load the dataset and perform exploratory data analysis (EDA) to understand fraud and 
non-fraud transactions. Use data balancing techniques if necessary. (6 Marks) 
(b) Train a Logistic Regression and a Random Forest Classifier to detect fraudulent 
transactions. Compare their performance using Precision, Recall, and F1-Score. (12 Marks) 
(c) Discuss the ethical considerations of deploying such a fraud detection system, including 
issues of false positives and customer experience. (7 Marks) 
Question Four       
Sentiment Analysis on Product Reviews (25 Marks) 
25 Marks 
Scenario: An e-commerce platform, ReviewMaster, is overwhelmed by the volume of 
customer reviews. The company wants to automate the sentiment analysis process to identify 
common complaints and improve product recommendations. 
Dataset: Amazon Product Reviews Dataset (available on Kaggle) 
(a) Load the dataset and preprocess the text data (cleaning, tokenization, stopword removal, 
and vectorization). (6 Marks) 
(b) Train a Naïve Bayes and a Logistic Regression model to classify reviews as positive or 
negative. Evaluate using Accuracy, Precision, and Recall. (12 Marks) 
(c) Discuss how sentiment analysis could help e-commerce businesses improve customer 
satisfaction and product recommendations. (7 Marks)

Data-Driven Decision Making in a Tech-Driven World
In an era dominated by data, businesses across industries are leveraging machine learning to
drive strategic decisions. Imagine a mid-sized real estate firm, Urban Nest Realty, struggling to
accurately price properties. Previously, they relied on traditional appraisal methods, which
often led to over- or under-pricing homes, affecting sales and revenue. With fierce
competition in the housing market, UrbanNest Realty decided to embrace machine learning
to enhance their pricing strategy.
Their data science team collected extensive real estate data, including factors like location,
square footage, number of bedrooms, and nearby amenities. By applying Linear Regression
and Decision Tree Regressor models, they were able to predict home prices with greater
accuracy. The insights helped them optimize property listings, leading to faster sales and
increased customer satisfaction.
At the same time, a major retail chain, ShopEase, was facing difficulties in understanding
customer behavior. They struggled to personalize promotions effectively, leading to wasted
marketing expenditures. To solve this, ShopEase implemented k-Means Clustering to segment
their customers based on annual income and spending patterns. The results allowed them to
target high-value customers with tailored promotions, significantly boosting sales and
customer engagement.
Meanwhile, in the financial sector, SafeBank was dealing with a rising number of fraudulent
credit card transactions. Their traditional rule-based fraud detection system was unable to
keep up with evolving fraudulent tactics. By training Logistic Regression and Random Forest
Classifier models, SafeBank improved fraud detection rates while reducing false alarms. This
balance ensured legitimate transactions were not disrupted, maintaining a smooth customer
experience.
In the world of e-commerce, ReviewMaster faced challenges in analyzing vast amounts of
customer feedback. Sorting through thousands of product reviews manually was inefficient.
They adopted Sentiment Analysis using Naïve Bayes and Logistic Regression models to classify
reviews as positive or negative. This helped them identify trending customer concerns, adjust
product offerings, and improve customer satisfaction.
These real-world scenarios showcase how machine learning can transform industries, making
data-driven decision-making more efficient and impactful. The following case studies will
allow you to explore similar challenges, implement solutions, and gain hands-on experience
with real datasets.
Transcribed Image Text:Data-Driven Decision Making in a Tech-Driven World In an era dominated by data, businesses across industries are leveraging machine learning to drive strategic decisions. Imagine a mid-sized real estate firm, Urban Nest Realty, struggling to accurately price properties. Previously, they relied on traditional appraisal methods, which often led to over- or under-pricing homes, affecting sales and revenue. With fierce competition in the housing market, UrbanNest Realty decided to embrace machine learning to enhance their pricing strategy. Their data science team collected extensive real estate data, including factors like location, square footage, number of bedrooms, and nearby amenities. By applying Linear Regression and Decision Tree Regressor models, they were able to predict home prices with greater accuracy. The insights helped them optimize property listings, leading to faster sales and increased customer satisfaction. At the same time, a major retail chain, ShopEase, was facing difficulties in understanding customer behavior. They struggled to personalize promotions effectively, leading to wasted marketing expenditures. To solve this, ShopEase implemented k-Means Clustering to segment their customers based on annual income and spending patterns. The results allowed them to target high-value customers with tailored promotions, significantly boosting sales and customer engagement. Meanwhile, in the financial sector, SafeBank was dealing with a rising number of fraudulent credit card transactions. Their traditional rule-based fraud detection system was unable to keep up with evolving fraudulent tactics. By training Logistic Regression and Random Forest Classifier models, SafeBank improved fraud detection rates while reducing false alarms. This balance ensured legitimate transactions were not disrupted, maintaining a smooth customer experience. In the world of e-commerce, ReviewMaster faced challenges in analyzing vast amounts of customer feedback. Sorting through thousands of product reviews manually was inefficient. They adopted Sentiment Analysis using Naïve Bayes and Logistic Regression models to classify reviews as positive or negative. This helped them identify trending customer concerns, adjust product offerings, and improve customer satisfaction. These real-world scenarios showcase how machine learning can transform industries, making data-driven decision-making more efficient and impactful. The following case studies will allow you to explore similar challenges, implement solutions, and gain hands-on experience with real datasets.
Expert Solution
steps

Step by step

Solved in 2 steps

Blurred answer