
Question
3. Use the training data of the Gauss2 example (Gauss2.train).
(a). Run the K-means algorithm with K = 2 clusters and make a plot identifying the
clusters. Also label the points with their known classes.
(b). Multiply all the x1 by 10. Run the K-means algorithm with K = 2 clusters and make
a plot identifying the clusters. Also label the points with their known classes.
(c). Compare the class and cluster groupings for both plots in (a) and (b). Explain any
differences between them.
(d). Use agglomerative hierarchical clustering (hclust, which ships in R's stats package
rather than MASS) to cluster the data, and divide them into 2 groups. You can use the
function cutree to divide the clusters into 2 groups. Report on how the predicted cluster
labels compare with the actual class labels for each of the three methods for measuring
distance between clusters, i.e. single linkage, complete linkage and average linkage.
(e). Plot the dendrograms for the three methods in (d). What does the shape of the
dendrograms suggest about the performance of each method?
(f). Top-down (or divisive) hierarchical clustering is another approach, in which the
algorithm begins with all data points together and seeks to separate them into subgroups.
This is implemented in R in the function diana which is part of the cluster library.
Compare the top-down clustering with bottom-up methods in the previous question.
Base the comparison on a classification of the points into 2 clusters.
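
The effect asked about in parts (a) to (c) can be illustrated with a small, language-neutral sketch. It is written in plain Python rather than R, and it does not use Gauss2.train: the eight points below are a made-up toy set (an assumption for illustration only) whose two classes differ in x2. The helper names lloyd, kmeans_2 and agreement are hypothetical, and trying every pair of points as starting centers is a simple stand-in for running K-means from several random starts.

```python
from itertools import combinations


def lloyd(points, centers, max_iter=100):
    """Run Lloyd's K-means iterations from the given starting centers."""
    centers = list(centers)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center
        # (squared Euclidean distance, ties go to the lower index).
        new_labels = [
            min(range(len(centers)),
                key=lambda k: sum((p - c) ** 2 for p, c in zip(pt, centers[k])))
            for pt in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for k in range(len(centers)):
            members = [pt for pt, lab in zip(points, labels) if lab == k]
            if members:  # keep the old center if a cluster empties
                centers[k] = tuple(sum(col) / len(members) for col in zip(*members))
    # Total within-cluster sum of squares for the final assignment.
    wss = sum(sum((p - c) ** 2 for p, c in zip(pt, centers[lab]))
              for pt, lab in zip(points, labels))
    return labels, wss


def kmeans_2(points):
    """K = 2: try every pair of distinct points as starts, keep the lowest WSS."""
    runs = (lloyd(points, [a, b]) for a, b in combinations(points, 2) if a != b)
    return min(runs, key=lambda run: run[1])


def agreement(classes, clusters):
    """Fraction of points whose cluster matches their class, up to relabeling."""
    same = sum(c == k for c, k in zip(classes, clusters))
    return max(same, len(classes) - same) / len(classes)


# Toy data (NOT Gauss2.train): the class is determined by x2 (0 versus 3),
# while x1 only adds within-class spread.
x1_values = [-1.0, -0.5, 0.5, 1.0]
points = [(x1, 0.0) for x1 in x1_values] + [(x1, 3.0) for x1 in x1_values]
classes = [0] * 4 + [1] * 4

# Part (b) of the question: multiply every x1 by 10.
scaled = [(10 * x1, x2) for x1, x2 in points]

labels_raw, wss_raw = kmeans_2(points)
labels_scaled, wss_scaled = kmeans_2(scaled)

print(agreement(classes, labels_raw))     # 1.0: clusters split on x2, like the classes
print(agreement(classes, labels_scaled))  # 0.5: clusters now split on the sign of x1
```

On this toy set the unscaled clusters coincide with the classes, while after scaling the clusters separate negative from positive x1 and agree with the classes no better than chance. K-means minimizes squared Euclidean distance, so inflating one coordinate lets it dominate the objective. Whether Gauss2.train behaves the same way depends on how its classes are laid out, which this sketch does not assume.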