@@ -34,19 +34,21 @@ class OPTICS(ClusterMixin, BaseEstimator):
  """Estimate clustering structure from vector array.

  OPTICS (Ordering Points To Identify the Clustering Structure), closely
- related to DBSCAN, finds core sample of high density and expands clusters
- from them [1]_. Unlike DBSCAN, keeps cluster hierarchy for a variable
+ related to DBSCAN, finds core samples of high density and expands clusters
+ from them [1]_. Unlike DBSCAN, it keeps cluster hierarchy for a variable
  neighborhood radius. Better suited for usage on large datasets than the
- current sklearn implementation of DBSCAN.
+ current scikit-learn implementation of DBSCAN.

- Clusters are then extracted using a DBSCAN-like method
- (cluster_method = 'dbscan') or an automatic
+ Clusters are then extracted from the cluster-order using a
+ DBSCAN-like method (cluster_method = 'dbscan') or an automatic
  technique proposed in [1]_ (cluster_method = 'xi').

  This implementation deviates from the original OPTICS by first performing
- k-nearest-neighborhood searches on all points to identify core sizes, then
- computing only the distances to unprocessed points when constructing the
- cluster order. Note that we do not employ a heap to manage the expansion
+ k-nearest-neighborhood searches on all points to identify core sizes of
+ all points (instead of computing neighbors while looping through points).
+ Reachability distances to only unprocessed points are then computed, to
+ construct the cluster order, similar to the original OPTICS.
+ Note that we do not employ a heap to manage the expansion
  candidates, so the time complexity will be O(n^2).

  Read more in the :ref:`User Guide <optics>`.
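The procedure described in this hunk (core distances for every point first, then reachability updates for unprocessed points only, with a linear scan instead of a heap) can be sketched in plain Python. This is only an illustrative sketch, not the scikit-learn code: the function name `optics_order`, the brute-force neighbor search, and the convention that `min_samples` counts the point itself are assumptions made for the example.

```python
import math


def optics_order(points, min_samples=3, max_eps=math.inf):
    """Hypothetical helper: return (ordering, reachability, core_distances)."""
    n = len(points)

    # Step 1: core distance of every point, computed up front.
    # Assumption: the point itself counts as one of its min_samples neighbors.
    core = []
    for p in points:
        dists = sorted(math.dist(p, q) for q in points)  # dists[0] == 0.0 (p itself)
        d = dists[min_samples - 1]
        core.append(d if d <= max_eps else math.inf)

    reach = [math.inf] * n
    processed = [False] * n
    ordering = []

    # Step 2: build the cluster order. The next point is found with a linear
    # scan (no heap), so this loop performs O(n^2) work overall.
    for _ in range(n):
        current = min(
            (i for i in range(n) if not processed[i]),
            key=lambda i: (reach[i], i),  # ties broken by index
        )
        processed[current] = True
        ordering.append(current)
        if math.isinf(core[current]):
            continue  # not a core point: it cannot expand a cluster
        # Update reachability of unprocessed points only.
        for j in range(n):
            if processed[j]:
                continue
            d = math.dist(points[current], points[j])
            if d > max_eps:
                continue
            reach[j] = min(reach[j], max(core[current], d))

    return ordering, reach, core


# Two well-separated 2x2 squares of points.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
ordering, reach, core = optics_order(points, min_samples=3)
```

In the resulting reachability values, the large jump at the first point of the second square marks the boundary between the two dense groups.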
@@ -68,9 +70,9 @@ class OPTICS(ClusterMixin, BaseEstimator):

  metric : str or callable, default='minkowski'
  Metric to use for distance computation. Any metric from scikit-learn
- or scipy.spatial.distance can be used.
+ or :mod:`scipy.spatial.distance` can be used.

- If metric is a callable function, it is called on each
+ If `metric` is a callable function, it is called on each
  pair of instances (rows) and the resulting value recorded. The callable
  should take two arrays as input and return one value indicating the
  distance between them. This works for Scipy's metrics, but is less
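To illustrate the callable contract described in this hunk (two rows in, one distance value out, evaluated for each pair of rows), here is a small stdlib-only sketch. The names `manhattan` and `pairwise` are hypothetical helpers written for this example; they are not part of scikit-learn or SciPy.

```python
def manhattan(u, v):
    """Hypothetical metric: takes two rows, returns a single distance value."""
    return float(sum(abs(a - b) for a, b in zip(u, v)))


def pairwise(rows, metric):
    """Call the metric on each pair of rows and record the resulting value."""
    return [[metric(r, s) for s in rows] for r in rows]


rows = [[0.0, 0.0], [1.0, 2.0], [4.0, 6.0]]
D = pairwise(rows, manhattan)
```

A function with this shape is what the docstring means by a callable metric. Calling Python code once per pair is typically much slower than a built-in string metric.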
@@ -90,8 +92,7 @@ class OPTICS(ClusterMixin, BaseEstimator):
  'yule']

  Sparse matrices are only supported by scikit-learn metrics.
- See the documentation for scipy.spatial.distance for details on these
- metrics.
+ See :mod:`scipy.spatial.distance` for details on these metrics.

  .. note::
  `'kulsinski'` is deprecated from SciPy 1.9 and will be removed in SciPy 1.11.
@@ -105,9 +106,9 @@ class OPTICS(ClusterMixin, BaseEstimator):
  metric_params : dict, default=None
  Additional keyword arguments for the metric function.

- cluster_method : str, default='xi'
+ cluster_method : {'xi', 'dbscan'}, default='xi'
  The extraction method used to extract clusters using the calculated
- reachability and ordering. Possible values are "xi" and "dbscan".
+ reachability and ordering.

  eps : float, default=None
  The maximum distance between two samples for one to be considered as
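As a rough illustration of what a DBSCAN-like extraction from the calculated reachability and ordering involves, the sketch below walks the cluster order and starts a new cluster whenever reachability exceeds `eps` at a point whose core distance is within `eps`. The function name `extract_dbscan_like` and the input values are made up for the example; this is a simplified sketch, not the library's implementation.

```python
import math


def extract_dbscan_like(reachability, core_distances, ordering, eps):
    """Hypothetical helper: label points by scanning the cluster order once."""
    labels = [-1] * len(ordering)
    cluster_id = -1
    for i in ordering:
        far = reachability[i] > eps          # not reachable from the previous points
        is_core = core_distances[i] <= eps   # dense enough to seed a cluster
        if far and is_core:
            cluster_id += 1                  # a new cluster starts here
            labels[i] = cluster_id
        elif far:
            labels[i] = -1                   # noise
        else:
            labels[i] = cluster_id           # continues the current cluster
    return labels


# Made-up values: two dense groups of four points followed by one outlier.
reachability = [math.inf, 1.0, 1.0, 1.0, 12.7, 1.0, 1.0, 1.0, 20.0]
core_distances = [1.0] * 8 + [math.inf]
ordering = list(range(9))

labels_small_eps = extract_dbscan_like(reachability, core_distances, ordering, eps=2.0)
labels_large_eps = extract_dbscan_like(reachability, core_distances, ordering, eps=15.0)
```

With the smaller `eps` the two groups stay separate; with the larger one they merge, which shows how a single cluster order can yield different flat clusterings.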