A Python implementation of the Diffusion Maps algorithm for non-linear dimensionality reduction, designed to follow scikit-learn conventions.
Diffusion Maps model data as a graph and analyze the diffusion process (random walk) on this graph. This allows capturing the underlying geometry and connectivity of the data, embedding it into a lower-dimensional space where Euclidean distances correspond to "diffusion distances" in the original space.
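To make the idea concrete, here is a minimal NumPy sketch of the textbook construction with no alpha-normalization. It is an illustration under stated assumptions, not this package's implementation: the function name `diffusion_map_sketch` is hypothetical, and the package may scale or normalize eigenvectors differently.

```python
import numpy as np

def diffusion_map_sketch(X, n_components=2, sigma=1.0, steps=1):
    """Illustrative diffusion map (alpha = 0); hypothetical helper, not the package API."""
    # Pairwise squared Euclidean distances, clipped to avoid tiny negatives.
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    # RBF (Gaussian) affinities between all pairs of points.
    K = np.exp(-d2 / (2.0 * sigma**2))
    # Node degrees; P = D^{-1} K is the random-walk transition matrix.
    d = K.sum(axis=1)
    # Work with the symmetric matrix A = D^{-1/2} K D^{-1/2}, which shares
    # its eigenvalues with P and is numerically easier to decompose.
    A = K / np.sqrt(np.outer(d, d))
    eigvals, eigvecs = np.linalg.eigh(A)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Right eigenvectors of P are D^{-1/2} times the eigenvectors of A.
    psi = eigvecs / np.sqrt(d)[:, None]
    # Drop the trivial first pair (eigenvalue 1, constant eigenvector)
    # and weight each coordinate by its eigenvalue raised to `steps`.
    keep = slice(1, n_components + 1)
    return psi[:, keep] * eigvals[keep] ** steps

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(30, 3))
emb_1 = diffusion_map_sketch(X_demo, n_components=2, sigma=1.0, steps=1)
emb_2 = diffusion_map_sketch(X_demo, n_components=2, sigma=1.0, steps=2)
```

Because all non-trivial eigenvalues have magnitude at most 1, a larger `steps` shrinks every coordinate, with the fastest-decaying components fading first.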
- Implements the Diffusion Maps algorithm with alpha-normalization.
- Scikit-learn compatible API (`fit`, `transform`, `fit_transform`).
- Nyström extension for out-of-sample transformation.
- RBF (Gaussian) kernel for affinity calculation.
- Configurable diffusion time steps (`steps`).
You can install this package directly from GitHub using pip:
pip install git+https://github.com/sgh14/diffusion-maps-with-nystrom.git
Or, to install a specific version (e.g., v0.1.0 tag):
pip install git+https://github.com/sgh14/diffusion-maps-with-nystrom.git@v0.1.0
Here's a basic template of how to use the DiffusionMaps class:
from diffusionmaps import DiffusionMaps
# Prepare your data
X_train = ... # Your training data
# Initialize and fit the Diffusion Maps model
dm = DiffusionMaps(n_components=2, sigma=2.0, steps=1, alpha=0.5)
embeddings_train = dm.fit_transform(X_train)
# Use Nyström's method for embedding new data
X_new = ... # New data
embeddings_new = dm.transform(X_new)
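For readers curious about what an out-of-sample step like this involves, the following is a rough sketch of the standard Nyström extension for the simplest case (alpha = 0). The helpers `rbf_kernel`, `fit_sketch`, and `nystrom_extend` are hypothetical names introduced for illustration; they are assumptions and may not match how `DiffusionMaps.transform` is implemented internally.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    # Gaussian affinities between every row of A and every row of B.
    d2 = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def fit_sketch(X, n_components, sigma):
    # Eigenpairs of the random-walk matrix P = D^{-1} K (alpha = 0),
    # obtained through the symmetric matrix D^{-1/2} K D^{-1/2}.
    K = rbf_kernel(X, X, sigma)
    d = K.sum(axis=1)
    w, v = np.linalg.eigh(K / np.sqrt(np.outer(d, d)))
    idx = np.argsort(w)[::-1][1:n_components + 1]  # skip the trivial pair
    lam = w[idx]
    psi = v[:, idx] / np.sqrt(d)[:, None]
    return lam, psi

def nystrom_extend(X_new, X_train, lam, psi, sigma, steps=1):
    # Transition probabilities from each new point to the training points.
    K_new = rbf_kernel(X_new, X_train, sigma)
    P_new = K_new / K_new.sum(axis=1, keepdims=True)
    # Nystrom formula: psi_j(x) = (1 / lam_j) * sum_i p(x, x_i) * psi_j(x_i).
    psi_new = (P_new @ psi) / lam
    return psi_new * lam**steps

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(40, 3))
lam, psi = fit_sketch(X_demo, n_components=2, sigma=1.0)
emb_train = psi * lam  # training embedding with steps = 1
emb_check = nystrom_extend(X_demo[:5], X_demo, lam, psi, sigma=1.0)
```

A useful sanity check on this formulation is that extending a training point reproduces its training embedding, which is what `emb_check` is compared against.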
The main class is diffusionmaps.DiffusionMaps.
DiffusionMaps(n_components: int, sigma: float, steps: int = 1, alpha: float = 0.0)
- `n_components`: Target dimensionality.
- `sigma`: Scale parameter for the RBF kernel ($\exp(-\|x-y\|^2 / (2 \sigma^2))$). Controls locality.
- `steps`: Diffusion time (the exponent $t$ applied to the eigenvalues, $\lambda^t$). Default is 1.
- `alpha`: Kernel normalization parameter (0.0, 0.5, or 1.0 are common). Controls density influence.
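The sketch below shows how `sigma`, `alpha`, and `steps` typically enter the computation in the standard diffusion maps formulation. It assumes the usual alpha-normalization, where the kernel is divided by the product of point densities raised to `alpha` before row normalization. The function `transition_matrix` is a hypothetical helper, and the package's internals may differ in detail.

```python
import numpy as np

def transition_matrix(X, sigma, alpha):
    """Alpha-normalized random-walk matrix; hypothetical helper for illustration."""
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    # sigma sets the neighborhood scale of the RBF kernel.
    K = np.exp(-d2 / (2.0 * sigma**2))
    # q approximates the sampling density around each point.
    q = K.sum(axis=1)
    # alpha-normalization: K_alpha[i, j] = K[i, j] / (q[i] * q[j]) ** alpha.
    K_alpha = K / np.outer(q, q) ** alpha
    # Row-normalize so that each row is a probability distribution.
    return K_alpha / K_alpha.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 2))
P0 = transition_matrix(X_demo, sigma=1.0, alpha=0.0)  # density fully retained
P1 = transition_matrix(X_demo, sigma=1.0, alpha=1.0)  # density influence removed
# Taking t steps of the walk corresponds to P^t, whose eigenvalues are lambda^t.
P0_two_steps = np.linalg.matrix_power(P0, 2)
```

With `alpha=0.0` the walk is biased toward densely sampled regions, while `alpha=1.0` compensates for sampling density so the embedding reflects geometry more than density.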
See the class docstrings for detailed information on methods and attributes.
This project is licensed under the MIT License - see the LICENSE file for details.