In the end, there are three kinds of points: core points, points that are within distance eps of core points (called boundary points), and noise. When the DBSCAN algorithm is run on a particular dataset multiple times, the clustering of the core points is always the same, and the same points will always be labeled as noise. However, a boundary point might be a neighbor to core samples of more than one cluster. Therefore, the cluster membership of boundary points depends on the order in which points are visited. Usually there are only a few boundary points, and this slight dependence on the order of points is not important.

Let's apply DBSCAN on the synthetic dataset we used to demonstrate agglomerative clustering. Like agglomerative clustering, DBSCAN does not allow predictions on new test data, so we will use the fit_predict method to perform clustering and return the cluster labels in one step:

In[65]:
    from sklearn.cluster import DBSCAN
    from sklearn.datasets import make_blobs

    # small synthetic dataset with 12 points
    X, y = make_blobs(random_state=0, n_samples=12)
    # DBSCAN with its default eps and min_samples
    dbscan = DBSCAN()
    clusters = dbscan.fit_predict(X)
    print("Cluster memberships:\n{}".format(clusters))

Out[65]:
    Cluster memberships:
    [-1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1]

As you can see, all data points were assigned the label -1, which stands for noise. This is a consequence of the default parameter settings for eps and min_samples, which are not tuned for small toy datasets.
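To make the three kinds of points concrete, the following is a minimal sketch that separates core, boundary, and noise points using the `core_sample_indices_` and `labels_` attributes of a fitted scikit-learn `DBSCAN`. The settings `eps=1.5` and `min_samples=3` and the shuffling seed are illustrative assumptions, not values prescribed by the text. The sketch also refits on a shuffled copy of the data to check the claim that the core and noise sets do not depend on the order of the points.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs

# Same toy data as in the text; eps and min_samples are illustrative choices.
X, y = make_blobs(random_state=0, n_samples=12)
dbscan = DBSCAN(eps=1.5, min_samples=3).fit(X)

labels = dbscan.labels_
# core_sample_indices_ holds the indices of the core points found by fit
is_core = np.zeros(len(X), dtype=bool)
is_core[dbscan.core_sample_indices_] = True
is_noise = labels == -1
# boundary points belong to a cluster but are not core points themselves
is_boundary = ~is_core & ~is_noise

print("core points:     {}".format(np.flatnonzero(is_core)))
print("boundary points: {}".format(np.flatnonzero(is_boundary)))
print("noise points:    {}".format(np.flatnonzero(is_noise)))

# Refit on a shuffled copy (seed 42 is arbitrary) and map the results
# back to the original point order.
perm = np.random.RandomState(42).permutation(len(X))
shuffled = DBSCAN(eps=1.5, min_samples=3).fit(X[perm])
core_shuffled = np.zeros(len(X), dtype=bool)
core_shuffled[perm[shuffled.core_sample_indices_]] = True
noise_shuffled = np.zeros(len(X), dtype=bool)
noise_shuffled[perm] = shuffled.labels_ == -1

print("same core points after shuffling:  {}".format(
    np.array_equal(is_core, core_shuffled)))
print("same noise points after shuffling: {}".format(
    np.array_equal(is_noise, noise_shuffled)))
```

Only the cluster assigned to a boundary point can change between the two runs. The integer cluster labels themselves may also be numbered differently, which is why the comparison looks at the core and noise masks instead of the raw labels.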
The cluster assignments for different values of min_samples and eps are shown below, and visualized in Figure 3-37:

In[66]:
    mglearn.plots.plot_dbscan()

Out[66]:
    min_samples: 2 eps: 1.000000  cluster: [-1  0  0 -1  0 -1  1  1  0  1 -1 -1]
    min_samples: 2 eps: 1.500000  cluster: [0 1 1 1 1 0 2 2 1 2 2 0]
    min_samples: 2 eps: 2.000000  cluster: [0 1 1 1 1 0 0 0 1 0 0 0]
    min_samples: 2 eps: 3.000000  cluster: [0 0 0 0 0 0 0 0 0 0 0 0]
    min_samples: 3 eps: 1.000000  cluster: [-1  0  0 -1  0 -1  1  1  0  1 -1 -1]
    min_samples: 3 eps: 1.500000  cluster: [0 1 1 1 1 0 2 2 1 2 2 0]
    min_samples: 3 eps: 2.000000  cluster: [0 1 1 1 1 0 0 0 1 0 0 0]
    min_samples: 3 eps: 3.000000  cluster: [0 0 0 0 0 0 0 0 0 0 0 0]
    min_samples: 5 eps: 1.000000  cluster: [-1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1]
    min_samples: 5 eps: 1.500000  cluster: [-1  0  0  0  0 -1 -1 -1  0 -1 -1 -1]
    min_samples: 5 eps: 2.000000  cluster: [-1  0  0  0  0 -1 -1 -1  0 -1 -1 -1]
    min_samples: 5 eps: 3.000000  cluster: [0 0 0 0 0 0 0 0 0 0 0 0]

188 | Chapter 3: Unsupervised Learning and Preprocessing
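If mglearn is not available, a sweep over the same parameter grid can be sketched with scikit-learn alone. This is a minimal sketch under the assumption that the data is the same 12-point `make_blobs` dataset used above. It is not the code behind `mglearn.plots.plot_dbscan`, it draws no figure, and the `results` dictionary is a name introduced here for illustration.

```python
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs

# Assumed to be the same toy dataset as in the text
X, y = make_blobs(random_state=0, n_samples=12)

# results maps (min_samples, eps) to the array of cluster labels
results = {}
for min_samples in [2, 3, 5]:
    for eps in [1.0, 1.5, 2.0, 3.0]:
        clusters = DBSCAN(min_samples=min_samples, eps=eps).fit_predict(X)
        results[(min_samples, eps)] = clusters
        # the label -1 marks noise, so it is not counted as a cluster
        n_clusters = len(set(clusters.tolist()) - {-1})
        n_noise = int((clusters == -1).sum())
        print("min_samples: {} eps: {:.1f}  clusters: {}  noise points: {}".format(
            min_samples, eps, n_clusters, n_noise))
```

Summarizing each setting by its number of clusters and noise points makes the trend easier to read than the raw label arrays: for a fixed min_samples, increasing eps can only turn noise points into cluster members, never the reverse.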