
1. Amanda Boleyn, an entrepreneur who recently sold her start-up for a multi-million-dollar sum, is looking for alternate investments for her newfound fortune. She is considering an investment in wine, similar to how some people invest in rare coins and fine art. To educate herself on the properties of fine wine, she has collected data on 13 different characteristics of 178 wines. Amanda has applied k-means clustering to this data for k = 1, ... , 10 and generated the following plot of total sums of squared deviations. After analyzing this plot, Amanda generates summaries for k = 2, 3, and 4. Which value of k is the most appropriate to categorize these wines? Justify your choice with calculations.

 

Answer the following:

Do not round intermediate calculations. If required, round your answers to two decimal places.
      
k = 2
      
Cluster 1 to Cluster 2 Distance / Cluster 1 Average Distance =    
Cluster 2 to Cluster 1 Distance / Cluster 2 Average Distance =    
Average = 

k = 3
Cluster 1 to Cluster 2 Distance / Cluster 1 Average Distance =    
Cluster 2 to Cluster 1 Distance / Cluster 2 Average Distance =    
Cluster 1 to Cluster 3 Distance / Cluster 1 Average Distance =    
Cluster 3 to Cluster 1 Distance / Cluster 3 Average Distance =    
Cluster 2 to Cluster 3 Distance / Cluster 2 Average Distance =    
Cluster 3 to Cluster 2 Distance / Cluster 3 Average Distance =    
Average = 

k = 4
Cluster 1 to Cluster 2 Distance / Cluster 1 Average Distance =    
Cluster 2 to Cluster 1 Distance / Cluster 2 Average Distance =    
Cluster 1 to Cluster 3 Distance / Cluster 1 Average Distance =    
Cluster 3 to Cluster 1 Distance / Cluster 3 Average Distance =    
Cluster 1 to Cluster 4 Distance / Cluster 1 Average Distance =    
Cluster 4 to Cluster 1 Distance / Cluster 4 Average Distance =    
Cluster 2 to Cluster 3 Distance / Cluster 2 Average Distance =    
Cluster 3 to Cluster 2 Distance / Cluster 3 Average Distance =    
Cluster 2 to Cluster 4 Distance / Cluster 2 Average Distance =    
Cluster 4 to Cluster 2 Distance / Cluster 4 Average Distance =    
Cluster 3 to Cluster 4 Distance / Cluster 3 Average Distance =    
Cluster 4 to Cluster 3 Distance / Cluster 4 Average Distance =    
Average = 
     

Based on the individual ratio values and the average ratio values for each value of k, it appears that 
k = ? is the best clustering.
      

[Figure: "Sum of WithinSS Over Number of Clusters". x-axis: Number of Clusters; y-axis: Sum of WithinSS (0 to 2000); plotted series: Sum(WithinSS) and Diff previous Sum(WithinSS).]
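For reference, the two series named in the plot can be written out as follows. This is the standard k-means definition of the total within-cluster sum of squares. Reading "Diff previous" as the drop from k − 1 to k clusters is an assumption, since the page does not define it. The symbols W, Δ, C_j, and μ_j are notation introduced here.

```latex
W(k) = \sum_{j=1}^{k} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^2,
\qquad
\Delta(k) = W(k-1) - W(k), \quad k \ge 2
```

Here C_j is the set of wines assigned to cluster j and μ_j is that cluster's centroid. An "elbow" in the plot is the value of k beyond which Δ(k) becomes small.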
k = 2

Within-Cluster Summary
             Size    Average Distance
Cluster 1      87    4.003
Cluster 2      91    4.260
Total         178    4.134

Inter-Cluster Distances
             Cluster 1    Cluster 2
Cluster 1    0            5.640
Cluster 2    5.640        0

k = 3

Within-Cluster Summary
             Size    Average Distance
Cluster 1      62    3.355
Cluster 2      65    3.999
Cluster 3      51    3.483
Total         178    3.627

Inter-Cluster Distances
             Cluster 1    Cluster 2    Cluster 3
Cluster 1    0            5.147        6.078
Cluster 2    5.147        0            5.432
Cluster 3    6.078        5.432        0

k = 4

Within-Cluster Summary
             Size    Average Distance
Cluster 1      56    3.024
Cluster 2      45    3.490
Cluster 3      49    3.426
Cluster 4      28    4.580
Total         178    3.498

Inter-Cluster Distances
             Cluster 1    Cluster 2    Cluster 3    Cluster 4
Cluster 1    0            5.255        6.070        4.853
Cluster 2    5.255        0            5.136        4.789
Cluster 3    6.070        5.136        0            6.074
Cluster 4    4.853        4.789        6.074        0
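The "Total" average distance in each summary appears to be the size-weighted mean of the per-cluster average distances. The sketch below checks that reading against the table values. The names `sizes`, `avg_dist`, `reported_total`, and `weighted_mean` are illustrative, and the interpretation of the Total row is an assumption that the numbers happen to support to within rounding.

```python
# Cluster sizes and average distances copied from the k = 2, 3, 4 summaries.
sizes = {
    2: [87, 91],
    3: [62, 65, 51],
    4: [56, 45, 49, 28],
}
avg_dist = {
    2: [4.003, 4.260],
    3: [3.355, 3.999, 3.483],
    4: [3.024, 3.490, 3.426, 4.580],
}
# "Total" row of each Within-Cluster Summary.
reported_total = {2: 4.134, 3: 3.627, 4: 3.498}


def weighted_mean(values, weights):
    """Mean of values, weighting each value by the matching cluster size."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


computed_total = {}
for k in sizes:
    computed_total[k] = weighted_mean(avg_dist[k], sizes[k])
    print(f"k={k}: n={sum(sizes[k])}, computed={computed_total[k]:.3f}, "
          f"reported={reported_total[k]:.3f}")
```

Each summary accounts for all 178 wines, and the computed totals agree with the reported ones to within about 0.001, which is consistent with the per-cluster averages having been rounded to three decimals.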