1. Amanda Boleyn, an entrepreneur who recently sold her start-up for a multi-million-dollar sum, is looking for alternate investments for her newfound fortune. She is considering an investment in wine, similar to how some people invest in rare coins and fine art. To educate herself on the properties of fine wine, she has collected data on 13 different characteristics of 178 wines. Amanda has applied k-means clustering to this data for k = 1, ... , 10 and generated the following plot of total sums of squared deviations. After analyzing this plot, Amanda generates summaries for k = 2, 3, and 4. Which value of k is the most appropriate to categorize these wines? Justify your choice with calculations.
Answer the following:
Do not round intermediate calculations. If required, round your answers to two decimal places.
k = 2
Cluster 1 to Cluster 2 Distance / Cluster 1 Average Distance =
Cluster 2 to Cluster 1 Distance / Cluster 2 Average Distance =
Average =
k = 3
Cluster 1 to Cluster 2 Distance / Cluster 1 Average Distance =
Cluster 2 to Cluster 1 Distance / Cluster 2 Average Distance =
Cluster 1 to Cluster 3 Distance / Cluster 1 Average Distance =
Cluster 3 to Cluster 1 Distance / Cluster 3 Average Distance =
Cluster 2 to Cluster 3 Distance / Cluster 2 Average Distance =
Cluster 3 to Cluster 2 Distance / Cluster 3 Average Distance =
Average =
k = 4
Cluster 1 to Cluster 2 Distance / Cluster 1 Average Distance =
Cluster 2 to Cluster 1 Distance / Cluster 2 Average Distance =
Cluster 1 to Cluster 3 Distance / Cluster 1 Average Distance =
Cluster 3 to Cluster 1 Distance / Cluster 3 Average Distance =
Cluster 1 to Cluster 4 Distance / Cluster 1 Average Distance =
Cluster 4 to Cluster 1 Distance / Cluster 4 Average Distance =
Cluster 2 to Cluster 3 Distance / Cluster 2 Average Distance =
Cluster 3 to Cluster 2 Distance / Cluster 3 Average Distance =
Cluster 2 to Cluster 4 Distance / Cluster 2 Average Distance =
Cluster 4 to Cluster 2 Distance / Cluster 4 Average Distance =
Cluster 3 to Cluster 4 Distance / Cluster 3 Average Distance =
Cluster 4 to Cluster 3 Distance / Cluster 4 Average Distance =
Average =
Based on the individual ratio values and the average ratio values for each value of k, it appears that k = ? is the best clustering.
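The ratios requested above can be sketched in a few lines of Python. This is a minimal illustration, not the solution: the cluster summaries are not reproduced in the question, so the distances below are hypothetical placeholders, and the helper names `separation_ratios` and `average_ratio` are made up for this example. Each ratio divides the distance between two cluster centroids by the average within-cluster distance of the first cluster. Larger ratios generally indicate clusters that are better separated relative to their own spread.

```python
from itertools import permutations


def separation_ratios(between, within_avg):
    """Return {(i, j): between[i][j] / within_avg[i]} for every ordered
    pair of distinct clusters, using 1-based cluster labels."""
    k = len(within_avg)
    return {
        (i + 1, j + 1): between[i][j] / within_avg[i]
        for i, j in permutations(range(k), 2)
    }


def average_ratio(ratios):
    """Arithmetic mean of all the individual ratios."""
    return sum(ratios.values()) / len(ratios)


# HYPOTHETICAL numbers for illustration only; they are NOT taken from
# Amanda's k-means summaries, which must be substituted here.
between = [[0.0, 6.0],
           [6.0, 0.0]]      # centroid-to-centroid distances
within_avg = [2.0, 3.0]     # average distance to centroid within each cluster

ratios = separation_ratios(between, within_avg)
for (i, j), r in ratios.items():
    print(f"Cluster {i} to Cluster {j} Distance / "
          f"Cluster {i} Average Distance = {r:.2f}")
print(f"Average = {average_ratio(ratios):.2f}")
```

Running the same two functions on the summaries for k = 2, 3, and 4 gives the 2, 6, and 12 ratios listed in the template, plus the average for each k, which can then be compared side by side.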