IE6400_Quiz21_Day23
School: Northeastern University
Course: 6400
Subject: Marketing
Date: Feb 20, 2024
Uploaded by ColonelStraw13148
Product-Specific Customer Sentiment Analysis
Data Preparation: Load and examine the dataset, focusing on the structure relevant to each product.
import pandas as pd
import random
import faker

def generate_customer_comments(num_comments=500):
    # Initialize Faker for generating fake data
    fake = faker.Faker()
    # Define sample product names and IDs
    product_names = ['AlphaPhone', 'BetaBook Pro', 'GammaPad', 'Delta Earbuds', 'Epsilon Charger']
    product_ids = range(1001, 1006)
    # Sample comments
    comments = [
        "Great product, very satisfied!",
        "Could be better.",
        "Not what I expected.",
        "Fantastic quality, will buy again!",
        "Poor customer service.",
        "Excellent value for money.",
        "Disappointed with the purchase.",
        "Exceeded my expectations.",
        "Shipping was slow.",
        "Absolutely love this product!"
    ]
    # Generate the dataset
    data = []
    for _ in range(num_comments):
        customer_id = fake.random_number(digits=5)
        first_name = fake.first_name()
        last_name = fake.last_name()
        product_name = random.choice(product_names)
        product_id = random.choice(product_ids)
        comment = random.choice(comments)
        data.append([customer_id, first_name, last_name, product_name, product_id, comment])
    # Create DataFrame
    df = pd.DataFrame(data, columns=['CustomerID', 'FirstName', 'LastName', 'ProductName', 'ProductID', 'Comment'])
    return df

# Generate the dataset
df_comments = generate_customer_comments()
df_comments.head()
   CustomerID    FirstName LastName      ProductName  ProductID                          Comment
0       17767        Megan   Brooks    Delta Earbuds       1002  Disappointed with the purchase.
1       71559  Christopher    Scott     BetaBook Pro       1002    Absolutely love this product!
2       56682      Stephen   Conner         GammaPad       1001    Absolutely love this product!
3       55352       Leslie    Smith       AlphaPhone       1002       Excellent value for money.
4       60923      Katrina     King  Epsilon Charger       1003        Exceeded my expectations.
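In the generator above, the product name and the product ID are drawn independently, which is why the sample rows show ID 1002 next to three different products. If a consistent name-to-ID relationship is wanted, one option is to sample a (name, ID) pair in a single draw. The sketch below is a minimal illustration using only the standard library; the catalog mapping (IDs 1001 to 1005 assigned in list order) and the names PRODUCT_CATALOG and sample_products are assumptions, not part of the original notebook.

```python
import random

# Assumed mapping: IDs 1001-1005 assigned to the products in list order.
# The original notebook does not define which ID belongs to which product.
PRODUCT_CATALOG = {
    'AlphaPhone': 1001,
    'BetaBook Pro': 1002,
    'GammaPad': 1003,
    'Delta Earbuds': 1004,
    'Epsilon Charger': 1005,
}


def sample_products(num_rows, seed=None):
    """Return (product_name, product_id) pairs that always agree with the catalog."""
    rng = random.Random(seed)  # local generator, so a fixed seed gives repeatable rows
    catalog_items = list(PRODUCT_CATALOG.items())
    return [rng.choice(catalog_items) for _ in range(num_rows)]


# Draw a few pairs; each name is always paired with the same ID.
for name, product_id in sample_products(5, seed=42):
    print(name, product_id)
```

Inside generate_customer_comments, the two separate random.choice calls could be replaced by one draw of this kind. Passing a seed also makes the generated rows reproducible between runs.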
Sentiment Classification: Develop a model to classify comments into sentiments (positive, neutral, negative) for each product.
from textblob import TextBlob

def get_sentiment(text):
    analysis = TextBlob(text)
    if analysis.sentiment.polarity > 0:
        return 'positive'
    elif analysis.sentiment.polarity == 0:
        return 'neutral'
    else:
        return 'negative'

# Apply sentiment classification to the 'Comment' column
df_comments['Sentiment'] = df_comments['Comment'].apply(get_sentiment)
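The rule above labels a comment neutral only when its polarity is exactly 0, so a score such as 0.02 counts as positive. A common alternative is to treat a small band around zero as neutral. The sketch below shows that idea on a bare polarity score, independent of TextBlob; the function name classify_polarity and the default band of 0.05 are hypothetical choices for illustration, not values from the original.

```python
def classify_polarity(polarity, neutral_band=0.05):
    """Map a polarity score in [-1, 1] to a sentiment label.

    neutral_band is a hypothetical tolerance: scores whose absolute
    value is at most neutral_band are labelled 'neutral'.
    """
    if polarity > neutral_band:
        return 'positive'
    if polarity < -neutral_band:
        return 'negative'
    return 'neutral'


# Illustrative scores only; these are not taken from the dataset.
for score in (0.8, 0.02, 0.0, -0.03, -0.5):
    print(score, classify_polarity(score))
```

With neutral_band=0 this reduces to the exact-zero rule used in get_sentiment.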
import matplotlib.pyplot as plt
import seaborn as sns

# Create a bar plot for sentiment distribution by product
plt.figure(figsize=(12, 6))
ax = sns.countplot(x='ProductName', hue='Sentiment', data=df_comments)
plt.title('Sentiment Distribution by Product')
# Add count labels to each bar
for p in ax.patches:
    ax.annotate(f'{p.get_height()}', (p.get_x() + p.get_width() / 2., p.get_height()),
                ha='center', va='center', xytext=(0, 10), textcoords='offset points')
plt.show()
Analysis and Insights: Interpret the results, compare sentiments across different products.
# Print each product polarity count
product_sentiment_counts = df_comments.groupby(['ProductName', 'Sentiment']).size().unstack(fill_value=0)
print(product_sentiment_counts)
Sentiment        negative  neutral  positive
ProductName
AlphaPhone             45        9        45
BetaBook Pro           34        8        41
Delta Earbuds          52       13        43
Epsilon Charger        34       11        48
GammaPad               46       11        60
Interpretation:
GammaPad has the highest count of positive sentiments (60), followed by Epsilon Charger (48) and AlphaPhone (45).
Delta Earbuds has the highest count of negative sentiments (52) of any product.
AlphaPhone is evenly split between negative and positive sentiments (45 each).
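The products received different numbers of comments (from 83 for BetaBook Pro to 117 for GammaPad), so raw counts are not directly comparable. A minimal sketch of converting the counts to per-product shares follows; the counts are copied from the table above, and the helper name sentiment_shares is hypothetical.

```python
# Counts copied from the product_sentiment_counts table.
counts = {
    'AlphaPhone':      {'negative': 45, 'neutral': 9,  'positive': 45},
    'BetaBook Pro':    {'negative': 34, 'neutral': 8,  'positive': 41},
    'Delta Earbuds':   {'negative': 52, 'neutral': 13, 'positive': 43},
    'Epsilon Charger': {'negative': 34, 'neutral': 11, 'positive': 48},
    'GammaPad':        {'negative': 46, 'neutral': 11, 'positive': 60},
}


def sentiment_shares(counts_by_product):
    """Divide each product's counts by that product's total number of comments."""
    shares = {}
    for product, row in counts_by_product.items():
        total = sum(row.values())
        shares[product] = {label: n / total for label, n in row.items()}
    return shares


shares = sentiment_shares(counts)
for product, row in shares.items():
    print(f"{product:<16} positive={row['positive']:.1%}  negative={row['negative']:.1%}")
```

On these counts, Epsilon Charger's positive share (48 of 93) is marginally above GammaPad's (60 of 117), even though GammaPad has more positive comments in absolute terms. On the DataFrame itself, the same normalization can be written as product_sentiment_counts.div(product_sentiment_counts.sum(axis=1), axis=0).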
Visualization: Create visual representations to illustrate sentiment distribution for each product.
# Function to get sentiment score
def get_sentiment_score(comment):
    analysis = TextBlob(comment)
    return analysis.sentiment.polarity

# Apply sentiment scoring to the 'Comment' column
df_comments['SentimentScore'] = df_comments['Comment'].apply(get_sentiment_score)
avg_sentiment_per_product = df_comments.groupby('ProductName')['SentimentScore'].mean().reset_index()

# Create a bar plot for average sentiment score per product
plt.figure(figsize=(10, 6))
sns.barplot(x='ProductName', y='SentimentScore', data=avg_sentiment_per_product)
plt.title('Average Sentiment Score per Product')
plt.ylabel('Average Sentiment Score')
plt.xlabel('Product Name')
plt.show()
[Figure: bar chart "Average Sentiment Score per Product". x-axis: Product Name (AlphaPhone, BetaBook Pro, Delta Earbuds, Epsilon Charger, GammaPad); y-axis: Average Sentiment Score.]
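A per-product average does not show how much the scores behind it vary, so small gaps between bars can be hard to judge. One way to add that context is to report the standard error of the mean alongside each average. The sketch below assumes a plain list of polarity scores per product; the function name mean_with_standard_error and the example scores are hypothetical and are not taken from the dataset.

```python
import math
import statistics


def mean_with_standard_error(scores):
    """Return (mean, standard error of the mean) for a list of polarity scores."""
    if len(scores) < 2:
        raise ValueError('need at least two scores to estimate spread')
    mean = statistics.mean(scores)
    # Sample standard deviation divided by sqrt(n) gives the standard error.
    std_err = statistics.stdev(scores) / math.sqrt(len(scores))
    return mean, std_err


# Hypothetical polarity scores for two products, for illustration only.
example_scores = {
    'ProductA': [0.8, 0.0, -0.5, 0.5, 1.0, -0.2],
    'ProductB': [0.3, 0.1, -0.4, 0.6, 0.0, 0.2],
}

for product, scores in example_scores.items():
    mean, std_err = mean_with_standard_error(scores)
    print(f'{product}: mean={mean:.3f} +/- {std_err:.3f}')
```

A related option within seaborn is to pass the unaggregated df_comments to sns.barplot instead of the pre-averaged frame, in which case seaborn computes the mean itself and draws its default error bars.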