ISyE 4803 Machine Learning Homework #1
Jada Wilson

Concept Questions

1. The main difference between supervised and unsupervised learning is the need for labeled training data. Supervised machine learning relies on labeled input data x and output data y, whereas unsupervised learning processes unlabeled raw input data x alone.

   Method                   Benefit                                      Drawback
   Unsupervised learning    requires less manual data preparation        difficult to measure the accuracy of the model
   Supervised learning      accurate and easily interpretable results    requires extensive human knowledge and domain expertise for labeling

2. The k-means algorithm converges in a finite number of iterations because there are at most k^N ways to assign the N data points to k clusters. Whenever an iteration produces an assignment different from the previous one, the cost strictly decreases, so no assignment can ever repeat; once an iteration leaves the assignment unchanged, every subsequent iteration produces the same assignment and the algorithm stops. Since there are only finitely many possible assignments to move through, the algorithm must terminate in a finite number of iterations (see the code sketch following these concept questions).

3. Different initializations of k-means lead to different results because the algorithm finds a locally optimal solution, not a globally optimal one. k-means is a low-complexity heuristic: each run yields a particular solution, not necessarily a unique one. The initial choice of the number of clusters k also matters: if k is too small, the clusters will be too broad, and if it is too large, the clusters will be too specific. Either initialization changes the results of the algorithm and their usefulness.

4. The main difference between k-means and the generalized k-means algorithm is the choice of similarity measure. k-means uses the l2 (Euclidean) distance to measure the similarity of a data point to a cluster centroid, while generalized k-means can use any distance function that satisfies symmetry, positive separability (d(x, y) >= 0, with equality only when x = y), and the triangle inequality.
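To make the convergence argument in (2) concrete, here is a minimal sketch of Lloyd's iteration in Python with NumPy; the function name kmeans, its interface, and the random initialization are illustrative assumptions, not the assignment's actual code. The loop stops exactly when an iteration leaves the assignment unchanged, which is the termination condition described above.

    import numpy as np

    def kmeans(X, k, seed=0):
        # Minimal Lloyd's iteration with squared l2 distance.
        rng = np.random.default_rng(seed)
        # Initialize centroids by sampling k distinct data points.
        centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
        assignment = None
        while True:
            # Assignment step: r_ij = 1 for the nearest centroid, 0 otherwise.
            dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
            new_assignment = dists.argmin(axis=1)
            # If no point changed cluster, the next iteration would repeat this
            # one exactly, so the algorithm has converged. Every change strictly
            # lowers the cost and there are at most k^N assignments, so this
            # must happen after finitely many iterations.
            if assignment is not None and np.array_equal(assignment, new_assignment):
                return centroids, assignment
            assignment = new_assignment
            # Update step: move each centroid to the mean of its assigned points.
            for j in range(k):
                members = X[assignment == j]
                if len(members) > 0:
                    centroids[j] = members.mean(axis=0)

    # Example: cluster 200 random 2-D points into 3 groups.
    X = np.random.default_rng(1).random((200, 2))
    centroids, assignment = kmeans(X, k=3)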
5. The given graph consists of a triangle on nodes 1, 2, 3 and a single edge between nodes 4 and 5. Its degree matrix D, adjacency matrix A, and graph Laplacian L = D - A are

$$
D = \begin{bmatrix} 2 & 0 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 & 0 \\ 0 & 0 & 2 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}, \quad
A = \begin{bmatrix} 0 & 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 \end{bmatrix}, \quad
L = \begin{bmatrix} 2 & -1 & -1 & 0 & 0 \\ -1 & 2 & -1 & 0 & 0 \\ -1 & -1 & 2 & 0 & 0 \\ 0 & 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & -1 & 1 \end{bmatrix}.
$$

The eigenvectors associated with the zero eigenvalues are

$$
x_1 = \begin{bmatrix} 0.57735027 \\ 0.57735027 \\ 0.57735027 \\ 0 \\ 0 \end{bmatrix}, \quad
x_2 = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0.70710678 \\ 0.70710678 \end{bmatrix}.
$$

We can find the number of disconnected clusters in the graph using spectral clustering: sort the eigenvalues of L and count how many of them are zero, since each zero eigenvalue corresponds to one connected component. After applying the k-means algorithm to the zero-eigenvalue eigenvectors, the disconnected clusters are nodes [1, 2, 3] and [4, 5]. (A small numerical check appears below.)

Math of k-means clustering

The distortion function is

$$J = \sum_{i=1}^{m} \sum_{j=1}^{k} r_{ij} \, \lVert x_i - \mu_j \rVert^2.$$

1. Take the derivative of J with respect to $\mu_j$ and set it equal to zero.

a. Start by differentiating the squared norm; by the chain rule,

$$\frac{\partial}{\partial \mu_j} \lVert x_i - \mu_j \rVert^2 = -2 (x_i - \mu_j),$$

and plugging this back in gives

$$\frac{\partial J}{\partial \mu_j} = -2 \sum_{i=1}^{m} r_{ij} (x_i - \mu_j).$$

b. Solve for $\mu_j$:

$$-2 \sum_{i=1}^{m} r_{ij} (x_i - \mu_j) = 0 \;\Longrightarrow\; \sum_{i=1}^{m} r_{ij} x_i = \sum_{i=1}^{m} r_{ij} \, \mu_j \;\Longrightarrow\; \mu_j = \frac{\sum_{i=1}^{m} r_{ij} x_i}{\sum_{i=1}^{m} r_{ij}}.$$

This result shows that the centroid that minimizes the distortion function J for the j-th cluster is the mean of the points assigned to that cluster.

2. To minimize the distortion function J with respect to the assignments $r_{ij}$, we hold the centroids fixed. Because each $x_i$ must be assigned to exactly one cluster ($r_{ij} \in \{0, 1\}$ with $\sum_j r_{ij} = 1$) and J is linear in the $r_{ij}$, J is minimized by placing each point's entire weight on its smallest term $\lVert x_i - \mu_j \rVert^2$:
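As a sanity check on (5), here is a small NumPy sketch that builds the Laplacian above and counts its numerically zero eigenvalues; the tolerance value is an assumption, not part of the original solution, and the returned null-space basis may differ from x1 and x2 by sign or rotation.

    import numpy as np

    # Adjacency matrix of the graph: a triangle on nodes 1-3 plus the edge 4-5.
    A = np.array([[0, 1, 1, 0, 0],
                  [1, 0, 1, 0, 0],
                  [1, 1, 0, 0, 0],
                  [0, 0, 0, 0, 1],
                  [0, 0, 0, 1, 0]], dtype=float)
    D = np.diag(A.sum(axis=1))   # degree matrix
    L = D - A                    # graph Laplacian

    # L is symmetric, so eigh returns real eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(L)
    num_components = int(np.sum(np.abs(eigvals) < 1e-10))
    print(num_components)               # prints 2: two connected components
    print(np.round(eigvecs[:, :2], 8))  # a basis of the zero-eigenvalue eigenspace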
$$
r_{ij} = \begin{cases} 1 & \text{if } j = \arg\min_{k} \lVert x_i - \mu_k \rVert^2, \\ 0 & \text{otherwise.} \end{cases}
$$

Here, each data point is assigned to the cluster whose centroid is closest in terms of the squared Euclidean distance. This assignment minimizes the distortion function while keeping the centroids fixed.

Image compression using clustering

1. & 2.

## Image: footballImage
## K=2: Best Quality Metric: 1619.6563026266592
## K=2: Cluster assignments: [0 0 0 ... 0 0 0]
## K=2: Centroids: [[ 74.59407487 76.04869875 66.44037077]
##  [189.91441097 182.64826808 172.45584686]]
## K=2: Converged in 23 iterations.
## K=2: Elapsed time of best run: 0.75 seconds
## K=3: Best Quality Metric: 827.5573587495618
## K=3: Cluster assignments: [1 1 1 ... 0 0 0]
## K=3: Centroids: [[134.23584777 125.34568766 101.93988768]
##  [ 38.7890798 45.35163548 48.17869416]
##  [202.52108949 198.84707504 192.62337998]]
## K=3: Converged in 19 iterations.
## K=3: Elapsed time of best run: 0.83 seconds
## K=4: Best Quality Metric: 613.5712841776818
## K=4: Cluster assignments: [3 3 1 ... 3 3 3]
## K=4: Centroids: [[182.28367015 166.69982352 148.67129213]
##  [ 35.63656913 42.20839938 45.50039282]
##  [212.76599581 217.42816254 220.33402448]
##  [121.11760406 115.964445 93.0774777 ]]
## K=4: Converged in 27 iterations.
## K=4: Elapsed time of best run: 1.52 seconds
## K=5: Best Quality Metric: 480.82975055890756
## K=5: Cluster assignments: [1 1 1 ... 2 2 2]
## K=5: Centroids: [[213.40453832 218.33917992 221.68489333]
##  [ 59.1690565 73.45339834 82.8659067 ]
##  [132.30906889 123.07173898 95.11312959]
##  [ 28.60494856 28.9483428 26.02401895]
##  [184.31018342 169.637439 152.17032239]]
## K=5: Converged in 39 iterations.
## K=5: Elapsed time of best run: 2.73 seconds
## K=6: Best Quality Metric: 401.2996890657265
## K=6: Cluster assignments: [3 3 3 ... 0 0 0]
## K=6: Centroids: [[133.57036354 124.44775464 96.98863403]
##  [ 25.42655779 26.29012346 24.88762516]
##  [ 27.09671758 66.36594085 118.68319792]
##  [ 78.07040753 76.10805292 58.68560774]
##  [213.49163037 218.4630724 221.87079459]
##  [184.83996395 170.22343615 152.74774228]]
## K=6: Converged in 62 iterations.
## K=6: Elapsed time of best run: 5.77 seconds

## Image: hestainImage
## K=2: Best Quality Metric: 871.8897639740259
## K=2: Cluster assignments: [1 1 0 ... 0 0 0]
## K=2: Centroids: [[123.70097352 59.83881555 182.81042112]
##  [196.28561336 120.40485442 236.30471313]]
## K=2: Converged in 8 iterations.
## K=2: Elapsed time of best run: 0.07 seconds
## K=3: Best Quality Metric: 502.00299101619987
## K=3: Cluster assignments: [1 1 1 ... 0 0 0]
## K=3: Centroids: [[ 99.30062456 51.32998519 167.32728092]
##  [175.63603709 86.82524434 219.8836633 ]
##  [214.54481317 161.97243271 250.04651163]]
## K=3: Converged in 30 iterations.
## K=3: Elapsed time of best run: 0.33 seconds
## K=4: Best Quality Metric: 365.97480136916437
## K=4: Cluster assignments: [0 0 0 ... 3 3 3]
## K=4: Centroids: [[159.15037837 74.42126988 206.77552507]
##  [194.16105938 112.06979093 238.40464687]
##  [225.51215434 190.18109325 252.36745981]
##  [ 89.34823448 45.70777527 158.58149243]]
## K=4: Converged in 38 iterations.
## K=4: Elapsed time of best run: 0.51 seconds
## K=5: Best Quality Metric: 282.8821601685428
## K=5: Cluster assignments: [0 0 0 ... 2 2 3]
## K=5: Centroids: [[176.48284686 79.29114802 215.81480678]
##  [228.80826709 198.38648649 252.8009539 ]
##  [121.72374927 66.87267307 189.46320535]
##  [ 72.38655022 32.81170306 138.52087336]
##  [197.31035252 123.34270415 243.10932619]]
## K=5: Converged in 62 iterations.
## K=5: Elapsed time of best run: 1.06 seconds
## K=6: Best Quality Metric: 234.95851115951683
## K=6: Cluster assignments: [1 1 1 ... 0 0 5]
## K=6: Centroids: [[118.85054477 63.83285962 186.54192326]
##  [158.04147753 101.28277704 222.40934579]
##  [229.27655899 201.84453273 252.88374503]
##  [185.04153591 71.22374008 213.49363116]
##  [202.76877942 126.02340576 245.32529499]
##  [ 70.54262542 31.68313844 136.45152773]]
## K=6: Converged in 64 iterations.
## K=6: Elapsed time of best run: 1.21 seconds

## Image: yardSignImage
## K=2: Best Quality Metric: 4282.008785707137
## K=2: Cluster assignments: [0 0 1 ... 0 1 0]
## K=2: Centroids: [[162.42074932 101.45617498 151.26677883]
##  [ 79.85715878 250.73400698 111.0528271 ]]
## K=2: Converged in 8 iterations.
## K=2: Elapsed time of best run: 0.11 seconds
## K=3: Best Quality Metric: 2492.4379333915813
## K=3: Cluster assignments: [0 1 0 ... 1 0 2]
## K=3: Centroids: [[ 86.36640495 187.75645102 91.37979365]
##  [245.23417286 123.04008777 120.1830806 ]
##  [114.25157993 76.85285168 254.07376921]]
## K=3: Converged in 17 iterations.
## K=3: Elapsed time of best run: 0.30 seconds
## K=4: Best Quality Metric: 1584.8491759685367
## K=4: Cluster assignments: [2 0 3 ... 0 3 1]
## K=4: Centroids: [[247.60321789 122.86037633 121.53050859]
##  [114.1580508 76.80414342 254.01822012]
##  [102.63817352 106.80748858 66.33410959]
##  [ 77.23295343 254.39587984 112.73888693]]
## K=4: Converged in 24 iterations.
## K=4: Elapsed time of best run: 0.57 seconds
## K=5: Best Quality Metric: 1197.7439021968637
## K=5: Cluster assignments: [0 2 1 ... 3 1 4]
## K=5: Centroids: [[101.30964774 107.88089042 66.39571522]
##  [ 70.69476177 254.91667013 109.35168031]
##  [251.4567945 76.31544623 80.92989296]
##  [233.67956631 171.17771456 165.77644786]
##  [109.12039383 69.94262806 254.07560965]]
## K=5: Converged in 108 iterations.
## K=5: Elapsed time of best run: 2.92 seconds
## K=6: Best Quality Metric: 1009.4258486984667
## K=6: Cluster assignments: [2 1 4 ... 5 4 3]
## K=6: Centroids: [[ 55.56494523 66.92729024 50.46864091]
##  [252.04248131 75.02781771 81.16266532]
##  [133.37183688 137.0566748 77.94367375]
##  [109.23801797 69.97424696 254.19167083]
##  [ 70.14196268 254.98455566 109.85031515]
##  [236.630051 170.99013619 168.09527546]]
## K=6: Converged in 120 iterations.
## K=6: Elapsed time of best run: 3.49 seconds

3. One method to find the best K is to plot the within-cluster sum of squares (WCSS) for each value of K and look for the "elbow" point in the plot where the rate of decrease in the sum starts to slow down significantly. We choose the K that corresponds to this point as the optimal K; a sketch of this procedure follows the figures below.

[Figure: "WCSS vs. Number of Clusters for footballImage" (x-axis: Number of Clusters (K), y-axis: WCSS)]
[Figure: "WCSS vs. Number of Clusters for hestainImage" (x-axis: Number of Clusters (K), y-axis: WCSS)]
[Figure: "WCSS vs. Number of Clusters for yardSignImage" (x-axis: Number of Clusters (K), y-axis: WCSS)]
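For reference, here is a minimal sketch of how such an elbow plot could be produced. It reuses the kmeans helper sketched in the concept questions above, and the random stand-in for X is an assumption replacing the assignment's actual image-loading code:

    import numpy as np
    import matplotlib.pyplot as plt

    def wcss(X, centroids, assignment):
        # Within-cluster sum of squares: total squared distance from each
        # point to the centroid of its assigned cluster.
        return float(((X - centroids[assignment]) ** 2).sum())

    # Stand-in for the flattened (num_pixels, 3) RGB array of an image; in
    # the assignment this would come from the loaded footballImage.
    X = np.random.default_rng(0).random((1000, 3)) * 255

    ks = list(range(2, 7))
    scores = []
    for k in ks:
        centroids, assignment = kmeans(X, k)  # kmeans as sketched earlier
        scores.append(wcss(X, centroids, assignment))

    plt.plot(ks, scores, marker="o")
    plt.title("WCSS vs. Number of Clusters for footballImage")
    plt.xlabel("Number of Clusters (K)")
    plt.ylabel("WCSS")
    plt.show()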
For the football image, the best k is 3. For the hestain image, the best k is 4. For my image, the best k is 4.

MNIST Dataset clustering

1.
## Cluster 0 - Majority Label: 1, Purity Score: 0.6753364859949073
## Cluster 1 - Majority Label: 7, Purity Score: 0.42212257100149475
## Cluster 2 - Majority Label: 9, Purity Score: 0.3416689693285438
## Cluster 3 - Majority Label: 2, Purity Score: 0.9044228694714131
## Cluster 4 - Majority Label: 6, Purity Score: 0.8509731232622799
## Cluster 5 - Majority Label: 8, Purity Score: 0.43934890304317054
## Cluster 6 - Majority Label: 4, Purity Score: 0.40398818316100443
## Cluster 7 - Majority Label: 3, Purity Score: 0.49903735078937234
## Cluster 8 - Majority Label: 1, Purity Score: 0.5620162356050594
## Cluster 9 - Majority Label: 0, Purity Score: 0.9398994974874372

2.
## Cluster 0 - Majority Label: 1, Purity Score: 0.3884182305630027
## Cluster 1 - Majority Label: 7, Purity Score: 0.49723261032161553
## Cluster 2 - Majority Label: 1, Purity Score: 0.4667779126213592
## Cluster 3 - Majority Label: 3, Purity Score: 0.5308565786225623
## Cluster 4 - Majority Label: 9, Purity Score: 0.3229414231790857
## Cluster 5 - Majority Label: 6, Purity Score: 0.8200713835817762
## Cluster 6 - Majority Label: 0, Purity Score: 0.8040629095674967
## Cluster 7 - Majority Label: 0, Purity Score: 0.8922340061580568
## Cluster 8 - Majority Label: 2, Purity Score: 0.5766136576239476
## Cluster 9 - Majority Label: 4, Purity Score: 0.44356435643564357

We can compare which metric gives the better result by evaluating which purity score is higher for each cluster and its majority label. In the current runs, the first metric gives the better result, but these outcomes are affected by the initialization of the k cluster centroids. A sketch of the purity computation appears below.
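For completeness, here is a minimal sketch of how per-cluster purity scores like those above could be computed. The variable names assignment and labels are illustrative, and purity is taken here as the fraction of a cluster's points carrying its majority label, which matches how the scores above read; the assignment's exact definition may differ.

    import numpy as np

    def cluster_purities(assignment, labels, k):
        # For each cluster, report the majority label and the fraction of
        # the cluster's points carrying that label (the per-cluster purity).
        results = []
        for j in range(k):
            cluster_labels = labels[assignment == j]
            if len(cluster_labels) == 0:
                continue  # skip empty clusters
            values, counts = np.unique(cluster_labels, return_counts=True)
            majority = values[counts.argmax()]
            results.append((j, majority, counts.max() / len(cluster_labels)))
        return results

    # Hypothetical usage with MNIST cluster assignments and true digit labels:
    # for j, majority, purity in cluster_purities(assignment, labels, k=10):
    #     print(f"Cluster {j} - Majority Label: {majority}, Purity Score: {purity}")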