k-DBCV

k-DBCV is an efficient Python implementation of the density-based cluster validation (DBCV) score proposed by Moulavi et al. (2014). The implementation uses a k-dimensional tree to calculate intercluster distances efficiently, which improves performance compared with previous implementations.
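To illustrate why a k-dimensional tree helps with intercluster distances, here is a minimal sketch using SciPy's `cKDTree`. It is not k-DBCV's internal code: the two synthetic clusters and all variable names are assumptions made for illustration, and it uses plain Euclidean distance, whereas DBCV itself is defined on density-aware (mutual reachability) distances.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.spatial.distance import cdist

# Two synthetic 2-D clusters (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
cluster_a = rng.normal(loc=(0.0, 0.0), scale=0.1, size=(200, 2))
cluster_b = rng.normal(loc=(5.0, 0.0), scale=0.1, size=(300, 2))

# Build a k-d tree on one cluster, then ask for the nearest neighbor in
# that cluster of every point in the other cluster.
tree_b = cKDTree(cluster_b)
distances, nearest_idx = tree_b.query(cluster_a, k=1)

# The smallest of those nearest-neighbor distances is the minimum
# Euclidean distance between the two clusters.
closest_a = int(np.argmin(distances))
min_distance = float(distances[closest_a])

# Brute-force reference: builds the full 200 x 300 distance matrix.
brute_force_min = float(cdist(cluster_a, cluster_b).min())

print(min_distance, brute_force_min)
```

The tree query avoids materializing the full pairwise distance matrix, which is where the savings come from as clusters grow.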

To use k-DBCV for selecting the parameters of commonly used density-based clustering algorithms (DBSCAN, HDBSCAN, OPTICS), we recommend our DBOpt library: https://github.com/Kaufman-Lab-Columbia/DBOpt

Getting Started

Dependencies

SciPy
NumPy

Installation

k-DBCV can be installed via pip:

pip install kDBCV

Usage

The examples below use the following additional libraries to generate the clustering scenarios that are scored:

scikit-learn
ClustSim

For visualization:

matplotlib

DBCV Score

Simple Scenario

The half-moons dataset generated with scikit-learn is scored below:

score = DBCV_score(X,labels)

Output: 0.5068928345037831

Scenario II

A larger dataset of clusters simulated with ClustSim is scored below:

score = DBCV_score(X,labels)

Output: 0.6171526846848352

Extracting Individual Cluster Scores

k-DBCV can also return a score for each individual cluster, computed without consideration for noise: Individual Cluster Score = (separation - sparseness) / max(separation, sparseness)
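As a worked example of the individual cluster score formula, the following is a minimal sketch in plain Python. The function name `individual_cluster_score` and the input values are hypothetical and are not part of the k-DBCV API; in practice the library computes separation and sparseness from the data.

```python
def individual_cluster_score(separation: float, sparseness: float) -> float:
    """Return (separation - sparseness) / max(separation, sparseness)."""
    denominator = max(separation, sparseness)
    if denominator == 0:
        # Both terms are zero, so the ratio is undefined.
        raise ValueError("separation and sparseness cannot both be zero")
    return (separation - sparseness) / denominator


# A cluster that is far from its neighbors relative to its internal
# sparseness scores close to +1.
well_separated = individual_cluster_score(separation=4.0, sparseness=1.0)

# A cluster whose internal sparseness exceeds its separation scores below 0.
poorly_separated = individual_cluster_score(separation=1.0, sparseness=4.0)

print(well_separated)    # 0.75
print(poorly_separated)  # -0.75
```

For non-negative inputs the score lies between -1 and 1, and the parentheses around the numerator matter: without them the expression would divide only the sparseness term.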

By default, ind_clust_scores is set to False.

score, ind_clust_score_array = DBCV_score(X,labels, ind_clust_scores = True)

Individual cluster scores are displayed by color below:

Memory cutoff

A memory cutoff prevents attempts to score clusters that would exceed available memory. Set the cutoff according to the machine being used; the default is a maximum of 25.0 GB. If the cutoff would be exceeded, the function returns a score of -1 along with an error message. To suppress these error messages, set batch_mode = True (the default is False).

score = DBCV_score(X,labels, memory_cutoff = 25.0)
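To get a feel for what a given cutoff allows, here is a rough back-of-the-envelope sketch. It assumes the dominant cost is a dense n x n matrix of 64-bit floats for the largest cluster and that 1 GB means 10^9 bytes; both are assumptions for illustration, and k-DBCV's own memory estimate may differ. The helper names are hypothetical.

```python
def estimated_matrix_gb(n_points: int, bytes_per_value: int = 8) -> float:
    """Memory in GB (10^9 bytes) of a dense n_points x n_points matrix."""
    return n_points * n_points * bytes_per_value / 1e9


def exceeds_cutoff(largest_cluster_size: int, memory_cutoff: float = 25.0) -> bool:
    """Return True if the assumed dense matrix would not fit under the cutoff."""
    return estimated_matrix_gb(largest_cluster_size) > memory_cutoff


# Under these assumptions, a 50,000-point cluster needs about 20 GB
# and a 60,000-point cluster needs about 28.8 GB.
for size in (50_000, 60_000):
    print(size, estimated_matrix_gb(size), exceeds_cutoff(size))
```

With the default cutoff of 25.0, the first cluster size would pass this rough check and the second would not.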

Relevant Citations

Density Based Cluster Validation

Moulavi, D., Jaskowiak, P. A., Campello, R. J. G. B., Zimek, A. & Sander, J. Density-based clustering validation. SIAM Int. Conf. Data Min. 2014, SDM 2014 2, 839–847 (2014)

k-DBCV implementation

Hammer, J. L., Devanny, A. J. & Kaufman, L. J. Density-based optimization for unbiased, reproducible clustering applied to single molecule localization microscopy. Preprint at https://www.biorxiv.org/content/10.1101/2024.11.01.621498v1 (2024)

License

k-DBCV is licensed under the MIT License. See the LICENSE file for more information.

Referencing

In addition to citing Moulavi et al., if you use this repository, please cite the following (currently a preprint):

Hammer, J. L., Devanny, A. J. & Kaufman, L. J. Density-based optimization for unbiased, reproducible clustering applied to single molecule localization microscopy. Preprint at https://www.biorxiv.org/content/10.1101/2024.11.01.621498v1 (2024)

Contact

kaufmangroup.rubylab@gmail.com
