Commit a2ceff3

DOC Minor updates to OPTICS docstring (scikit-learn#31363)
1 parent 67c72f9 commit a2ceff3

1 file changed: 15 additions, 14 deletions

sklearn/cluster/_optics.py

Lines changed: 15 additions & 14 deletions
@@ -34,19 +34,21 @@ class OPTICS(ClusterMixin, BaseEstimator):
  """Estimate clustering structure from vector array.

  OPTICS (Ordering Points To Identify the Clustering Structure), closely
- related to DBSCAN, finds core sample of high density and expands clusters
- from them [1]_. Unlike DBSCAN, keeps cluster hierarchy for a variable
+ related to DBSCAN, finds core samples of high density and expands clusters
+ from them [1]_. Unlike DBSCAN, it keeps cluster hierarchy for a variable
  neighborhood radius. Better suited for usage on large datasets than the
- current sklearn implementation of DBSCAN.
+ current scikit-learn implementation of DBSCAN.

- Clusters are then extracted using a DBSCAN-like method
- (cluster_method = 'dbscan') or an automatic
+ Clusters are then extracted from the cluster-order using a
+ DBSCAN-like method (cluster_method = 'dbscan') or an automatic
  technique proposed in [1]_ (cluster_method = 'xi').

  This implementation deviates from the original OPTICS by first performing
- k-nearest-neighborhood searches on all points to identify core sizes, then
- computing only the distances to unprocessed points when constructing the
- cluster order. Note that we do not employ a heap to manage the expansion
+ k-nearest-neighborhood searches on all points to identify core sizes of
+ all points (instead of computing neighbors while looping through points).
+ Reachability distances to only unprocessed points are then computed, to
+ construct the cluster order, similar to the original OPTICS.
+ Note that we do not employ a heap to manage the expansion
  candidates, so the time complexity will be O(n^2).

  Read more in the :ref:`User Guide <optics>`.
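
Note on the hunk above: the reworded paragraph describes a two-phase procedure (core distances for all points first, then reachability updates restricted to unprocessed points, with no heap). The following is a minimal pure-Python sketch of that idea, not the scikit-learn code. The function name `optics_order_sketch` is hypothetical, Euclidean distance is assumed, there is no `max_eps` cutoff, and the core distance is assumed to be the distance to the `min_samples`-th nearest neighbor counting the point itself.

```python
import math


def optics_order_sketch(points, min_samples):
    """Illustrative OPTICS-style ordering (hypothetical helper, not sklearn's)."""
    n = len(points)

    def dist(i, j):
        return math.dist(points[i], points[j])

    # Phase 1: core distance of every point, computed up front.
    # Assumption: the point itself counts as its own first neighbor.
    core = []
    for i in range(n):
        neighbor_dists = sorted(dist(i, j) for j in range(n))
        core.append(neighbor_dists[min_samples - 1])

    # Phase 2: build the cluster order.
    reach = [math.inf] * n
    processed = [False] * n
    ordering = []
    for _ in range(n):
        # Linear scan for the unprocessed point with the smallest
        # reachability (no heap), which makes the loop O(n^2) overall.
        current = min(
            (i for i in range(n) if not processed[i]),
            key=lambda i: reach[i],
        )
        processed[current] = True
        ordering.append(current)

        # Update reachability for unprocessed points only.
        for j in range(n):
            if not processed[j]:
                candidate = max(core[current], dist(current, j))
                if candidate < reach[j]:
                    reach[j] = candidate

    return ordering, reach, core
```

Calling `optics_order_sketch(points, min_samples=2)` on a list of coordinate tuples returns the visiting order, the reachability distance of each point, and the core distances. The linear scan in phase 2 is what the docstring's O(n^2) remark refers to; a heap-based variant would change the candidate selection step only.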
@@ -68,9 +70,9 @@ class OPTICS(ClusterMixin, BaseEstimator):

  metric : str or callable, default='minkowski'
  Metric to use for distance computation. Any metric from scikit-learn
- or scipy.spatial.distance can be used.
+ or :mod:`scipy.spatial.distance` can be used.

- If metric is a callable function, it is called on each
+ If `metric` is a callable function, it is called on each
  pair of instances (rows) and the resulting value recorded. The callable
  should take two arrays as input and return one value indicating the
  distance between them. This works for Scipy's metrics, but is less
@@ -90,8 +92,7 @@ class OPTICS(ClusterMixin, BaseEstimator):
  'yule']

  Sparse matrices are only supported by scikit-learn metrics.
- See the documentation for scipy.spatial.distance for details on these
- metrics.
+ See :mod:`scipy.spatial.distance` for details on these metrics.

  .. note::
  `'kulsinski'` is deprecated from SciPy 1.9 and will be removed in SciPy 1.11.
@@ -105,9 +106,9 @@ class OPTICS(ClusterMixin, BaseEstimator):
  metric_params : dict, default=None
  Additional keyword arguments for the metric function.

- cluster_method : str, default='xi'
+ cluster_method : {'xi', 'dbscan'}, default='xi'
  The extraction method used to extract clusters using the calculated
- reachability and ordering. Possible values are "xi" and "dbscan".
+ reachability and ordering.

  eps : float, default=None
  The maximum distance between two samples for one to be considered as
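
Note on the `cluster_method` and `eps` entries above: the docstring says clusters are extracted from the calculated reachability and ordering. Below is a rough pure-Python sketch of how a DBSCAN-like extraction can work for a fixed `eps`. It is an illustration under assumptions, not the library's implementation: the name `extract_dbscan_like` is hypothetical, and the inputs (ordering, reachability, core distances) are assumed to come from an earlier OPTICS-style pass.

```python
import math


def extract_dbscan_like(ordering, reachability, core_distances, eps):
    """Assign labels by walking the cluster order (hypothetical helper)."""
    labels = [-1] * len(ordering)
    cluster_id = -1
    for idx in ordering:
        far = reachability[idx] > eps        # not reachable from predecessors
        is_core = core_distances[idx] <= eps  # dense enough to seed a cluster
        if far and is_core:
            # A new cluster starts at this point.
            cluster_id += 1
            labels[idx] = cluster_id
        elif far:
            # Neither reachable nor a core point at this eps: noise.
            labels[idx] = -1
        else:
            # Reachable within eps: stays in the current cluster.
            labels[idx] = cluster_id
    return labels


# Example inputs, written by hand for illustration.
ordering = [0, 1, 2, 3, 4, 5, 6]
reachability = [math.inf, 1.0, 1.0, 13.5, 1.0, 1.0, 50.0]
core_distances = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 40.0]
labels = extract_dbscan_like(ordering, reachability, core_distances, eps=2.0)
```

With these inputs, the large jump in reachability at index 3 starts a second cluster, and index 6 is labeled noise (`-1`) because it is neither reachable within `eps` nor a core point. The `'xi'` method uses a different, steepness-based criterion and is not covered by this sketch.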
