Question 20.1
Describe analytics models that could be used to help the company monetize their data: How could the
company use these data sets to generate value, and what analytics models might they need to do it?
There are lots of good answers, and I want you to think about two types – at least one of your answers
should be based on just one data set, the one they’ve collected internally on customer browsing patterns
on the web site; and at least one of your other answers should be based on combining more than one of
the data sets.
Think about the problem and your approach.
Then talk about it with other learners, and share and
combine your ideas.
And then, put your approaches up on the discussion forum, and give feedback and
suggestions to each other.
You can use the {given, use, to} format to guide the discussions: Given {data}, use {model} to {result}.
Given Data Set #3, we can classify and profile customers to identify their spending potential. So, given the list of products purchased in the past, the price of each product, the date of purchase, and the ship-to address, we can use K-means clustering to group customers into nine categories based on how frequently they spend and how expensive the products they buy are. This will be the first model.
By identifying what type of spender each customer is and their associated spending potential, the company can target its recommendations to the right price points for each customer. If a customer is a high spender, the company can recommend more expensive products to them and hopefully generate more revenue.
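As a rough sketch of this first model (assuming Data Set #3 can be exported as a table with hypothetical columns customer_id, purchase_date, and price, and using purchase count as a simple stand-in for frequency), the clustering could look something like this in Python:

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

# Hypothetical export of Data Set #3: one row per purchase.
purchases = pd.read_csv("purchases.csv", parse_dates=["purchase_date"])

# One row per customer: how often they buy and how expensive their items are.
# (Purchase count stands in for frequency; the date-gap measure described
# below is a refinement.)
features = purchases.groupby("customer_id").agg(
    n_purchases=("price", "size"),
    avg_price=("price", "mean"),
)

# Standardize so both features are on comparable scales, then form nine
# clusters (roughly low/medium/high frequency x low/medium/high price level).
X = StandardScaler().fit_transform(features)
features["segment"] = KMeans(n_clusters=9, n_init=10, random_state=0).fit_predict(X)
```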
The date of purchase can be used to determine whether the customer shops frequently. One way is to take the differences between consecutive purchase dates and average them. If the customer is a frequent shopper, the company can refresh their recommendations much more often. Since the customer will be shown more products, they may be encouraged to spend more if a certain product catches their attention.
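A small sketch of that frequency measure, under the same hypothetical column names (a smaller average gap means a more frequent shopper):

```python
import pandas as pd

purchases = pd.read_csv("purchases.csv", parse_dates=["purchase_date"])  # hypothetical file

# Average number of days between each customer's consecutive purchases.
# NaN means the customer has only one purchase on record.
avg_gap_days = (
    purchases.sort_values("purchase_date")
             .groupby("customer_id")["purchase_date"]
             .apply(lambda dates: dates.diff().dt.days.mean())
)
```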
The ship-to address should also be taken into consideration: even though the purchases are on one account, there may be multiple users behind it, and their spending habits shouldn't be confused with each other.
One way to evaluate how expensive a product is would be to compare its price to the prices of similar products. The company can then use the product-price clusters to determine each customer's spending potential. If a customer buys a lot of expensive products, they're probably a high spender.
So, for the next model: given the list of products purchased in the past, the price of each product, and the interquartile range of prices for similar products, we'll use K-means clustering to cluster product prices and determine whether each product is a less expensive, average, or more expensive item.
The interquartile range for each product type could be used to exclude outliers. Meanwhile, the price of the product at the time of sale helps take other factors, like sales and discounts, into consideration. Even if a product's normal price point falls in the average cluster, a discount that puts it in the less expensive cluster means the product would be considered less expensive for that purchase.
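A sketch of this product-price model, assuming a hypothetical products table with category and sale_price columns, and using the common 1.5 × IQR fence to drop outliers within each product type:

```python
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical export of the product data: product_id, category, sale_price.
products = pd.read_csv("products.csv")

def price_tiers(group: pd.DataFrame) -> pd.DataFrame:
    """Cluster one category's sale-time prices into three tiers:
    0 = less expensive, 1 = average, 2 = more expensive."""
    q1, q3 = group["sale_price"].quantile([0.25, 0.75])
    iqr = q3 - q1
    # Drop outliers outside the 1.5*IQR fence before clustering.
    kept = group[group["sale_price"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)].copy()
    if len(kept) < 3:                      # too few products to form three tiers
        kept["tier"] = 1
        return kept
    km = KMeans(n_clusters=3, n_init=10, random_state=0)
    labels = km.fit_predict(kept[["sale_price"]])
    # Relabel so the tier number increases with the cluster's price centroid.
    rank = km.cluster_centers_.ravel().argsort().argsort()
    kept["tier"] = rank[labels]
    return kept

tiered = products.groupby("category", group_keys=False).apply(price_tiers)
```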
Next, using both Data Set #1 and Data Set #3, we'll match customers across the data sets in order to combine them. So, for the next model: given the similar customer fields across the data sets, we'll use logistic regression to determine whether a customer in one data set is the same person as a customer in the other.
The city part of the ship-to address can be used as the customer's current city. If multiple ship-to addresses map to more than one city, the challenge is deciding which address represents the customer. For services like Amazon, there is often a name associated with each address; if that's the case with this company, the address tied to the account holder's name can be used as the home address. If not, the most frequently used address can be considered the home address.
The similarity of these text fields can be evaluated using Levenshtein distance, which counts how many edits are needed to turn one string into the other; edits include deletions, insertions, and substitutions of characters. Logistic regression can then be used to estimate the likelihood that a Data Set #1 customer is the same person as a Data Set #3 customer.
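As a sketch of how the matching model could be trained (the field names and the small hand-labeled set of candidate pairs below are invented for illustration), Levenshtein distances on the name and city fields become the features for the logistic regression:

```python
from sklearn.linear_model import LogisticRegression

def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, and
    substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

# Hypothetical hand-labeled candidate pairs: (name1, name2, city1, city2, is_same).
train_pairs = [
    ("Jane Doe", "Jane Doe", "Atlanta", "Atlanta", 1),
    ("Jane Doe", "J. Doe", "Atlanta", "Atlanta", 1),
    ("Jane Doe", "John Smith", "Atlanta", "Decatur", 0),
    ("A. Kumar", "Anita Kumar", "Austin", "Austin", 1),
    ("A. Kumar", "Bob Lee", "Austin", "Boston", 0),
    ("Maria Silva", "M. Silva", "Miami", "Miami", 1),
    ("Maria Silva", "Chris Wong", "Miami", "Seattle", 0),
    ("Tom Reed", "Tina Reyes", "Denver", "Dallas", 0),
]

# Features per pair: edit distance between names and between cities.
X = [[levenshtein(n1.lower(), n2.lower()), levenshtein(c1.lower(), c2.lower())]
     for n1, n2, c1, c2, _ in train_pairs]
y = [label for *_, label in train_pairs]

model = LogisticRegression().fit(X, y)
# Probability that a new Data Set #1 / Data Set #3 pair is the same customer.
print(model.predict_proba([[levenshtein("jane doe", "jane d."),
                            levenshtein("atlanta", "atlanta")]])[:, 1])
```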
Interests and the list of purchased products, though not the same thing, can be compared as well. They aren't the same because someone can purchase lots of toilet paper without being interested in toilet paper; on the other hand, someone who regularly purchases a lot of groceries may well be interested in cooking.
Product types can be mapped to the different interests contained in Data Set #1. A purchased product that ties to an interest listed for the customer in Data Set #1 can be used to help support that the two records belong to the same person, like the cooking example above, or pet products if they're interested in pets and/or animals. However, the absence of a purchased product despite a listed interest doesn't indicate that the two are different people.
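A sketch of that interest-overlap signal (the product-type-to-interest mapping here is invented for illustration); it could be added as one more feature in the matching model above:

```python
# Hypothetical mapping from product types (Data Set #3) to interests (Data Set #1).
PRODUCT_TO_INTEREST = {
    "cookware": "cooking",
    "groceries": "cooking",
    "dog food": "pets",
    "cat litter": "pets",
}

def interest_overlap(purchased_types, interests):
    """1 if any purchased product type maps to a listed interest, else 0.
    Note the asymmetry: overlap is (weak) evidence for a match, but a lack of
    overlap is not treated as evidence against one."""
    mapped = {PRODUCT_TO_INTEREST.get(t) for t in purchased_types}
    return int(bool(mapped & set(interests)))

# Example: appended alongside the Levenshtein features for a candidate pair.
print(interest_overlap(["groceries", "paper towels"], ["cooking", "travel"]))  # 1
```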
Using both Data Set #1 and Data Set #3, we can generate better and more diverse product
suggestions.
So, for the next model: given the list of products purchased in the past (Data Set #3), interests (Data Set #1), the customer profiles from the first model, and the results of the customer-matching model, we use optimization to determine the best set of product suggestions to present to each customer, generating better and more diverse suggestions and proactively predicting what other items the customer may be interested in.
For the next model: given the list of products purchased in the past by each customer (Data Set #3), we'll use pairwise association mining to determine product-purchasing relationships.
The lists of products for every customer can be used to perform pairwise association mining: if a
customer purchases an item, then does the customer also purchase another item?
A popular example is cereal and milk: since the two are often bought together, this relationship can influence where the items are placed (in a brick-and-mortar store) and encourage cross-marketing on price (e.g., mark one down and upsell the other).
These relationships allow the company to predict what other products the customer may be interested in. This is a more proactive suggestion, as opposed to a reactive suggestion based on past products and searches. These relationships can also help increase revenue because sometimes people don't know what they want until it's in front of them.
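A minimal sketch of the pairwise mining itself, over made-up baskets; for each pair of items we report the rule's support (share of customers buying both) and confidence (how often buyers of one item also buy the other):

```python
from itertools import combinations
from collections import Counter

# Hypothetical baskets: the set of product types each customer has purchased.
baskets = [
    {"cereal", "milk", "bananas"},
    {"cereal", "milk"},
    {"milk", "bread"},
    {"cereal", "milk", "coffee"},
    {"bread", "coffee"},
]

item_count = Counter()
pair_count = Counter()
for basket in baskets:
    item_count.update(basket)
    pair_count.update(frozenset(p) for p in combinations(basket, 2))

n = len(baskets)
for pair, both in pair_count.items():
    for a, b in [tuple(pair), tuple(pair)[::-1]]:
        support = both / n                   # fraction of customers buying both items
        confidence = both / item_count[a]    # P(also buys b | buys a)
        if support >= 0.4 and confidence >= 0.6:
            print(f"{a} -> {b}: support={support:.2f}, confidence={confidence:.2f}")
```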
For the last model: given the list of interests for each customer (Data Set #1), we can use pairwise association mining to determine relationships between interests and purchased products. The results of the last two models can then be used to determine what other products a customer would be interested in, based on both what they previously purchased and what they are interested in. A potential constraint for the optimization model is that the suggested product(s) must come from one of the relationships found by the pairwise association mining. The model based on interests is particularly useful if the customer has a sparse purchasing history, and it can be used to encourage more spending through more tailored and targeted suggestions.
However, interests can only be used to predict relevant products for customers in Data Set #3 who have been matched to a record in Data Set #1. Then, using the customer profiles from the first model, the price points of the products can be taken into consideration when deciding which products (less expensive, average, or more expensive) to suggest. This not only encourages more purchases but also maximizes the potential revenue from each purchase, which would be the objective statement for the optimization model.
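As one way to write that objective down (the solver library PuLP, the candidate products, and the numbers below are assumptions for illustration, not part of the original data), the suggestion step can be posed as a small binary optimization: maximize the expected revenue of the displayed suggestions, restricted to products surfaced by the association rules and to price tiers that fit the customer's profile:

```python
import pulp

# Candidates already filtered to products linked to this customer's purchases
# or interests by the pairwise association rules; values are expected revenue.
candidates = {"coffee": 12.0, "milk": 4.0, "espresso machine": 90.0, "mugs": 15.0}
tier = {"coffee": 1, "milk": 0, "espresso machine": 2, "mugs": 1}  # 0/1/2 price tier
customer_tier = 2        # from the customer-profile model (a high spender)
max_suggestions = 3      # number of recommendation slots on the page

prob = pulp.LpProblem("product_suggestions", pulp.LpMaximize)
pick = {p: pulp.LpVariable(f"pick_{i}", cat="Binary") for i, p in enumerate(candidates)}

# Objective: total expected revenue of the chosen suggestions.
prob += pulp.lpSum(candidates[p] * pick[p] for p in candidates)
# Limited number of slots.
prob += pulp.lpSum(pick.values()) <= max_suggestions
# One way to tailor price points: don't show items above the customer's tier.
for p in candidates:
    if tier[p] > customer_tier:
        prob += pick[p] == 0

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print([p for p in candidates if pick[p].value() == 1])
```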