Question 20.1
Describe analytics models that could be used to help the company monetize their data: How could the
company use these data sets to generate value, and what analytics models might they need to do it?
There are lots of good answers, and I want you to think about two types – at least one of your answers
should be based on just one data set, the one they’ve collected internally on customer browsing patterns
on the web site; and at least one of your other answers should be based on combining more than one of
the data sets.
Think about the problem and your approach.
Then talk about it with other learners, and share and
combine your ideas.
And then, put your approaches up on the discussion forum, and give feedback and
suggestions to each other.
You can use the {given, use, to} format to guide the discussions: Given {data}, use {model} to {result}.
Given Data Set #3, we can classify and profile customers to identify their spending potential. So, given the list of products purchased in the past, the price of each product, the date of purchase, and the ship-to address, we can use K-means clustering to group customers into nine categories based on how frequently they spend and how expensive the products they buy are. This will be the first model.
By identifying what type of spender each customer is and their associated spending potential, the company can target its recommendations to the right price points for each customer. If a customer is a high spender, the company can recommend more expensive products to them and hopefully generate more revenue.
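As a rough sketch of this first model (assuming Data Set #3 can be exported as a table with hypothetical columns customer_id, purchase_date, and price, and using purchase count as a simple stand-in for frequency), the clustering could look something like this in Python:

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

# Hypothetical export of Data Set #3: one row per purchase.
purchases = pd.read_csv("purchases.csv", parse_dates=["purchase_date"])

# One row per customer: how often they buy and how expensive their items are.
# (Purchase count stands in for frequency; the date-gap measure described
# below is a refinement.)
features = purchases.groupby("customer_id").agg(
    n_purchases=("price", "size"),
    avg_price=("price", "mean"),
)

# Standardize so both features are on comparable scales, then form nine
# clusters (roughly low/medium/high frequency x low/medium/high price level).
X = StandardScaler().fit_transform(features)
features["segment"] = KMeans(n_clusters=9, n_init=10, random_state=0).fit_predict(X)
```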
The date of purchase can be used to determine whether the customer shops frequently. One way is to take the differences between consecutive purchase dates and average them. If the customer is a frequent shopper, the company can refresh their recommendations much more often. Since the customer will be shown more products, they may be encouraged to spend more if a certain product catches their attention.
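A small sketch of that frequency measure, under the same hypothetical column names (a smaller average gap means a more frequent shopper):

```python
import pandas as pd

purchases = pd.read_csv("purchases.csv", parse_dates=["purchase_date"])  # hypothetical file

# Average number of days between each customer's consecutive purchases.
# NaN means the customer has only one purchase on record.
avg_gap_days = (
    purchases.sort_values("purchase_date")
             .groupby("customer_id")["purchase_date"]
             .apply(lambda dates: dates.diff().dt.days.mean())
)
```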
The ship-to address should also be taken into consideration: even though the purchases are on one account, there may be multiple users behind it, and their spending habits shouldn't be confused with each other.
One way to evaluate how expensive a product is would be to compare its price to the prices of similar products. The company can then use the product-price clusters to determine each customer's spending potential. If a customer buys a lot of expensive products, they're probably a high spender.
So, for the next model: given the list of products purchased in the past, the price of each product, and the interquartile range of prices for similar products, we'll use K-means clustering to cluster product prices and determine whether each product is a less expensive, average, or more expensive item.
The interquartile range for each product type could be used to exclude outliers. Meanwhile, the price of the product at the time of sale helps take other factors, like sales and discounts, into consideration. Even if a product's normal price point falls in the average cluster, a discount that puts it in the less expensive cluster means the product would be considered less expensive for that purchase.
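A sketch of this product-price model, assuming a hypothetical products table with category and sale_price columns, and using the common 1.5 × IQR fence to drop outliers within each product type:

```python
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical export of the product data: product_id, category, sale_price.
products = pd.read_csv("products.csv")

def price_tiers(group: pd.DataFrame) -> pd.DataFrame:
    """Cluster one category's sale-time prices into three tiers:
    0 = less expensive, 1 = average, 2 = more expensive."""
    q1, q3 = group["sale_price"].quantile([0.25, 0.75])
    iqr = q3 - q1
    # Drop outliers outside the 1.5*IQR fence before clustering.
    kept = group[group["sale_price"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)].copy()
    if len(kept) < 3:                      # too few products to form three tiers
        kept["tier"] = 1
        return kept
    km = KMeans(n_clusters=3, n_init=10, random_state=0)
    labels = km.fit_predict(kept[["sale_price"]])
    # Relabel so the tier number increases with the cluster's price centroid.
    rank = km.cluster_centers_.ravel().argsort().argsort()
    kept["tier"] = rank[labels]
    return kept

tiered = products.groupby("category", group_keys=False).apply(price_tiers)
```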
Next, using both Data Set #1 and Data Set #3, we'll match customers across the data sets in order to combine them. So, for the next model: given the similar customer fields across the data sets, we'll use logistic regression to determine whether a customer in one data set is the same person as a customer in the other.
The city part of the ship-to address can be used as the customer's current city. If multiple ship-to addresses map to more than one city, the challenge is deciding which address represents the customer. For services like Amazon, there is often a name associated with each address; if that's the case with this company, the address tied to the account holder's name can be used as the home address. If not, the most frequently used address can be considered the home address.
The similarity of these text fields can be evaluated using Levenshtein distance, which counts how many edits are needed to turn one string into the other; edits include deletions, insertions, and substitutions of characters. Logistic regression can then be used to estimate the likelihood that a Data Set #1 customer is the same person as a Data Set #3 customer.
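As a sketch of how the matching model could be trained (the field names and the small hand-labeled set of candidate pairs below are invented for illustration), Levenshtein distances on the name and city fields become the features for the logistic regression:

```python
from sklearn.linear_model import LogisticRegression

def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, and
    substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

# Hypothetical hand-labeled candidate pairs: (name1, name2, city1, city2, is_same).
train_pairs = [
    ("Jane Doe", "Jane Doe", "Atlanta", "Atlanta", 1),
    ("Jane Doe", "J. Doe", "Atlanta", "Atlanta", 1),
    ("Jane Doe", "John Smith", "Atlanta", "Decatur", 0),
    ("A. Kumar", "Anita Kumar", "Austin", "Austin", 1),
    ("A. Kumar", "Bob Lee", "Austin", "Boston", 0),
    ("Maria Silva", "M. Silva", "Miami", "Miami", 1),
    ("Maria Silva", "Chris Wong", "Miami", "Seattle", 0),
    ("Tom Reed", "Tina Reyes", "Denver", "Dallas", 0),
]

# Features per pair: edit distance between names and between cities.
X = [[levenshtein(n1.lower(), n2.lower()), levenshtein(c1.lower(), c2.lower())]
     for n1, n2, c1, c2, _ in train_pairs]
y = [label for *_, label in train_pairs]

model = LogisticRegression().fit(X, y)
# Probability that a new Data Set #1 / Data Set #3 pair is the same customer.
print(model.predict_proba([[levenshtein("jane doe", "jane d."),
                            levenshtein("atlanta", "atlanta")]])[:, 1])
```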
Interests and the list of purchased products, though not the same thing, can be compared as well. They aren't the same because someone can purchase lots of toilet paper without being interested in toilet paper; on the other hand, someone who regularly purchases a lot of groceries may well be interested in cooking.
Product types can be mapped to the different interests contained in Data Set #1. A purchased product that ties to an interest listed for the customer in Data Set #1 can be used to help support that the two records belong to the same person, like the cooking example above, or pet products if they're interested in pets and/or animals. However, the absence of a purchased product despite a listed interest doesn't indicate that the two are different people.
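A sketch of that interest-overlap signal (the product-type-to-interest mapping here is invented for illustration); it could be added as one more feature in the matching model above:

```python
# Hypothetical mapping from product types (Data Set #3) to interests (Data Set #1).
PRODUCT_TO_INTEREST = {
    "cookware": "cooking",
    "groceries": "cooking",
    "dog food": "pets",
    "cat litter": "pets",
}

def interest_overlap(purchased_types, interests):
    """1 if any purchased product type maps to a listed interest, else 0.
    Note the asymmetry: overlap is (weak) evidence for a match, but a lack of
    overlap is not treated as evidence against one."""
    mapped = {PRODUCT_TO_INTEREST.get(t) for t in purchased_types}
    return int(bool(mapped & set(interests)))

# Example: appended alongside the Levenshtein features for a candidate pair.
print(interest_overlap(["groceries", "paper towels"], ["cooking", "travel"]))  # 1
```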
Using both Data Set #1 and Data Set #3, we can generate better and more diverse product
suggestions.
So, for the next model: given the list of products purchased in the past (Data Set #3), interests (Data Set #1), the customer profiles from the first model, and the results of the customer-matching model, we use optimization to determine the best set of product suggestions to present to each customer, generating better and more diverse suggestions and proactively predicting what other items the customer may be interested in.
For the next model: given the list of products purchased in the past by each customer (Data Set #3), we'll use pairwise association mining to determine product-purchasing relationships.
The lists of products for every customer can be used to perform pairwise association mining: if a
customer purchases an item, then does the customer also purchase another item?
A popular example is cereal and milk: since the two are often bought together, this relationship can influence where the items are placed (in a brick-and-mortar store) and encourage cross-marketing on price (e.g., mark one down and upsell the other).
These relationships allow the company to predict what other products the customer may be interested in. This is a more proactive suggestion, as opposed to a reactive suggestion based on past products and searches. These relationships can also help increase revenue because sometimes people don't know what they want until it's in front of them.
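A minimal sketch of the pairwise mining itself, over made-up baskets; for each pair of items we report the rule's support (share of customers buying both) and confidence (how often buyers of one item also buy the other):

```python
from itertools import combinations
from collections import Counter

# Hypothetical baskets: the set of product types each customer has purchased.
baskets = [
    {"cereal", "milk", "bananas"},
    {"cereal", "milk"},
    {"milk", "bread"},
    {"cereal", "milk", "coffee"},
    {"bread", "coffee"},
]

item_count = Counter()
pair_count = Counter()
for basket in baskets:
    item_count.update(basket)
    pair_count.update(frozenset(p) for p in combinations(basket, 2))

n = len(baskets)
for pair, both in pair_count.items():
    for a, b in [tuple(pair), tuple(pair)[::-1]]:
        support = both / n                   # fraction of customers buying both items
        confidence = both / item_count[a]    # P(also buys b | buys a)
        if support >= 0.4 and confidence >= 0.6:
            print(f"{a} -> {b}: support={support:.2f}, confidence={confidence:.2f}")
```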
For the last model: given the list of interests for each customer (Data Set #1), we can use pairwise association mining to determine relationships between interests and purchased products. The results of the last two models can then be used to determine what other products a customer would be interested in, based on both what they previously purchased and what they are interested in. A potential constraint for the optimization model is that the suggested product(s) must come from one of the relationships found by the pairwise association mining. The model based on interests is particularly useful if the customer has a sparse purchasing history, and it can be used to encourage more spending through more tailored and targeted suggestions.
However, interests can only be used to predict relevant products for customers in Data Set #3 who have been matched to a record in Data Set #1. Then, using the customer profiles from the first model, the price points of the products can be taken into consideration when deciding which products (less expensive, average, or more expensive) to suggest. This not only encourages more purchases but also maximizes the potential revenue from each purchase, which would be the objective statement for the optimization model.
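As one way to write that objective down (the solver library PuLP, the candidate products, and the numbers below are assumptions for illustration, not part of the original data), the suggestion step can be posed as a small binary optimization: maximize the expected revenue of the displayed suggestions, restricted to products surfaced by the association rules and to price tiers that fit the customer's profile:

```python
import pulp

# Candidates already filtered to products linked to this customer's purchases
# or interests by the pairwise association rules; values are expected revenue.
candidates = {"coffee": 12.0, "milk": 4.0, "espresso machine": 90.0, "mugs": 15.0}
tier = {"coffee": 1, "milk": 0, "espresso machine": 2, "mugs": 1}  # 0/1/2 price tier
customer_tier = 2        # from the customer-profile model (a high spender)
max_suggestions = 3      # number of recommendation slots on the page

prob = pulp.LpProblem("product_suggestions", pulp.LpMaximize)
pick = {p: pulp.LpVariable(f"pick_{i}", cat="Binary") for i, p in enumerate(candidates)}

# Objective: total expected revenue of the chosen suggestions.
prob += pulp.lpSum(candidates[p] * pick[p] for p in candidates)
# Limited number of slots.
prob += pulp.lpSum(pick.values()) <= max_suggestions
# One way to tailor price points: don't show items above the customer's tier.
for p in candidates:
    if tier[p] > customer_tier:
        prob += pick[p] == 0

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print([p for p in candidates if pick[p].value() == 1])
```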