this module is machine learning 700 , please answer everything correctly thank you Data-Driven Decision Making in a Tech-Driven World In an era dominated by data, businesses across industries are leveraging machine learning to drive strategic decisions. Imagine a mid-sized real estate firm, UrbanNest Realty, struggling to accurately price properties. Previously, they relied on traditional appraisal methods, which often led to over- or under-pricing homes, affecting sales and revenue. With fierce competition in the housing market, UrbanNest Realty decided to embrace machine learning to enhance their pricing strategy. Their data science team collected extensive real estate data, including factors like location, square footage, number of bedrooms, and nearby amenities. By applying Linear Regression and Decision Tree Regressor models, they were able to predict home prices with greater accuracy. The insights helped them optimize property listings, leading to faster sales and increased customer satisfaction. At the same time, a major retail chain, ShopEase, was facing difficulties in understanding customer behavior. They struggled to personalize promotions effectively, leading to wasted marketing expenditures. To solve this, ShopEase implemented k-Means Clustering to segment their customers based on annual income and spending patterns. The results allowed them to target high-value customers with tailored promotions, significantly boosting sales and customer engagement. Meanwhile, in the financial sector, SafeBank was dealing with a rising number of fraudulent credit card transactions. Their traditional rule-based fraud detection system was unable to keep up with evolving fraudulent tactics. By training Logistic Regression and Random Forest Classifier models, SafeBank improved fraud detection rates while reducing false alarms. This balance ensured legitimate transactions were not disrupted, maintaining a smooth customer experience. In the world of e-commerce, ReviewMaster faced challenges in analyzing vast amounts of customer feedback. Sorting through thousands of product reviews manually was inefficient. They adopted Sentiment Analysis using Naïve Bayes and Logistic Regression models to classify reviews as positive or negative. This helped them identify trending customer concerns, adjust product offerings, and improve customer satisfaction. These real-world scenarios showcase how machine learning can transform industries, making data-driven decision-making more efficient and impactful. The following case studies will allow you to explore similar challenges, implement solutions, and gain hands-on experience with real datasets. Question One Predicting House Prices using Real Estate Data (25 Marks) 25 Marks A real estate company, UrbanNest Realty, wants to improve its property pricing strategy. Traditional valuation methods often lead to mispricing, affecting revenue and customer satisfaction. The company aims to use machine learning to predict house prices more accurately. Dataset: California Housing Prices from sklearn.datasets (a) Load the dataset and perform basic exploratory data analysis (EDA) by displaying summary statistics, handling missing values, and visualizing key features. (6 Marks) (b) Train a Linear Regression and a Decision Tree Regressor to predict house prices. Evaluate the models using Mean Absolute Error (MAE) and R² Score. (12 Marks) (c) Interpret the model performances and discuss how feature selection or additional preprocessing could improve accuracy. (7 Marks) Question Two Customer Segmentation using Mall Customers Dataset (20 Marks) 25 Marks A large retail chain, ShopEase, is struggling with ineffective marketing campaigns. The company wants to use machine learning to segment its customer base and deliver targeted promotions that improve engagement and sales. Dataset: Mall Customers Dataset (available from Kaggle or UCI repository) (a) Load the dataset and perform basic exploratory data analysis (EDA), including missing values handling and visualizing spending patterns. (6 Marks) (b) Apply k-Means Clustering to segment customers based on Annual Income and Spending Score. Determine the optimal number of clusters using the Elbow Method. (12 Marks) (c) Visualize and interpret the resulting clusters. Discuss how a retail business could use this information for marketing strategies. (7 Marks)
this module is machine learning 700 , please answer everything correctly thank you
Data-Driven Decision Making in a Tech-Driven World
In an era dominated by data, businesses across industries are leveraging machine learning to
drive strategic decisions. Imagine a mid-sized real estate firm, UrbanNest Realty, struggling to
accurately price properties. Previously, they relied on traditional appraisal methods, which
often led to over- or under-pricing homes, affecting sales and revenue. With fierce
competition in the housing market, UrbanNest Realty decided to embrace machine learning
to enhance their pricing strategy.
Their data science team collected extensive real estate data, including factors like location,
square footage, number of bedrooms, and nearby amenities. By applying Linear Regression
and Decision Tree Regressor models, they were able to predict home prices with greater
accuracy. The insights helped them optimize property listings, leading to faster sales and
increased customer satisfaction.
At the same time, a major retail chain, ShopEase, was facing difficulties in understanding
customer behavior. They struggled to personalize promotions effectively, leading to wasted
marketing expenditures. To solve this, ShopEase implemented k-Means Clustering to segment
their customers based on annual income and spending patterns. The results allowed them to
target high-value customers with tailored promotions, significantly boosting sales and
customer engagement.
Meanwhile, in the financial sector, SafeBank was dealing with a rising number of fraudulent
credit card transactions. Their traditional rule-based fraud detection system was unable to
keep up with evolving fraudulent tactics. By training Logistic Regression and Random Forest
Classifier models, SafeBank improved fraud detection rates while reducing false alarms. This
balance ensured legitimate transactions were not disrupted, maintaining a smooth customer
experience.
In the world of e-commerce, ReviewMaster faced challenges in analyzing vast amounts of
customer feedback. Sorting through thousands of product reviews manually was inefficient.
They adopted Sentiment Analysis using Naïve Bayes and Logistic Regression models to classify
reviews as positive or negative. This helped them identify trending customer concerns, adjust
product offerings, and improve customer satisfaction.
These real-world scenarios showcase how machine learning can transform industries, making
data-driven decision-making more efficient and impactful. The following case studies will
allow you to explore similar challenges, implement solutions, and gain hands-on experience
with real datasets.
Question One
Predicting House Prices using Real Estate Data (25 Marks)
25 Marks
A real estate company, UrbanNest Realty, wants to improve its property pricing strategy.
Traditional valuation methods often lead to mispricing, affecting revenue and customer
satisfaction. The company aims to use machine learning to predict house prices more
accurately.
Dataset: California Housing Prices from sklearn.datasets
(a) Load the dataset and perform basic exploratory data analysis (EDA) by displaying summary
statistics, handling missing values, and visualizing key features. (6 Marks)
(b) Train a Linear Regression and a Decision Tree Regressor to predict house prices. Evaluate
the models using Mean Absolute Error (MAE) and R² Score. (12 Marks)
(c) Interpret the model performances and discuss how feature selection or additional
preprocessing could improve accuracy. (7 Marks)
Question Two
Customer Segmentation using Mall Customers Dataset (20 Marks)
25 Marks
A large retail chain, ShopEase, is struggling with ineffective marketing campaigns. The
company wants to use machine learning to segment its customer base and deliver targeted
promotions that improve engagement and sales.
Dataset: Mall Customers Dataset (available from Kaggle or UCI repository)
(a) Load the dataset and perform basic exploratory data analysis (EDA), including missing
values handling and visualizing spending patterns. (6 Marks)
(b) Apply k-Means Clustering to segment customers based on Annual Income and Spending
Score. Determine the optimal number of clusters using the Elbow Method. (12 Marks)
(c) Visualize and interpret the resulting clusters. Discuss how a retail business could use this
information for marketing strategies. (7 Marks)

Step by step
Solved in 2 steps
