The file ClassificationData.xlsx (screenshot attached below) contains the following information about the top 25 MBA programs: percentage of applicants accepted, percentage of accepted applicants who enroll, mean GMAT score of enrollees, mean undergraduate GPA of enrollees, annual cost of school (for state schools, this is the cost for out-of-state students), percentage of students who are minorities, percentage of students who are non-U.S. residents, and mean starting salary of graduates (in thousands of dollars). Use these data to divide the top 25 schools into 4 clusters, using for example the K-Means clustering algorithm, and interpret your clusters. The method is explained in our textbook: Section 8.8 in the Fourth and Fifth Editions or Section 14.3 in the Sixth Edition. More precisely, use Evolutionary Solver to find 4 schools to serve as cluster centers. Each remaining school is then assigned to the nearest cluster center, where nearest is defined in terms of the eight attributes. The objective is to minimize the sum of the distances from each school to its cluster center.
Hint: Your model will have four decision variables (changing cells) corresponding to the indexes of the four schools chosen as cluster centers.
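The objective described in the hint can be sketched outside Excel as follows. This is only an illustration under stated assumptions: the attributes are assumed to be already standardized, distance is taken to be Euclidean, and the function names `total_distance`, `assign`, and `best_centers` are hypothetical. Instead of Evolutionary Solver, the sketch simply tries every combination of center indexes, which is feasible here because choosing 4 of 25 schools gives only 12,650 combinations.

```python
import math
from itertools import combinations

def total_distance(z, centers):
    """Sum, over all schools, of the Euclidean distance to the nearest center.

    z is a list of standardized attribute rows; centers holds row indexes.
    """
    return sum(min(math.dist(row, z[c]) for c in centers) for row in z)

def assign(z, centers):
    """Return, for each school, the index of its nearest cluster center."""
    return [min(centers, key=lambda c: math.dist(row, z[c])) for row in z]

def best_centers(z, k):
    """Exhaustive search over all k-subsets of row indexes (not Evolutionary Solver)."""
    return min(combinations(range(len(z)), k),
               key=lambda cs: total_distance(z, cs))

# Toy data (hypothetical, not from ClassificationData.xlsx): two obvious groups.
toy = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0],
       [10.0, 0.0], [11.0, 0.0], [12.0, 0.0]]
centers = best_centers(toy, 2)
print(centers, total_distance(toy, centers), assign(toy, centers))
```

For the actual problem, `z` would hold the 25 standardized rows of eight attributes and `k` would be 4; the four returned indexes play the role of the four changing cells.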
In addition, note that you must first standardize each attribute by subtracting the attribute’s mean from each value and dividing the difference by the attribute’s standard deviation.
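The standardization step might look like the sketch below. The helper name `standardize` and the sample values are hypothetical, and the sample standard deviation (the counterpart of Excel's STDEV.S) is an assumption; the problem statement does not say whether the sample or population version is intended.

```python
from statistics import mean, stdev

def standardize(rows):
    """Return z-scores per column: (value - column mean) / column std dev."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # Assumption: sample standard deviation; swap in statistics.pstdev
    # if the population version is wanted.
    stds = [stdev(col) for col in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)]
            for row in rows]

# Hypothetical values for three schools and two attributes (GMAT, GPA).
sample = [[690.0, 3.4], [700.0, 3.5], [710.0, 3.6]]
z = standardize(sample)
print(z)
```

After this step every attribute has mean 0 and standard deviation 1, so a large-scale attribute such as annual cost no longer dominates the distance calculation.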