Diffusion Maps with Nyström out-of-sample extension

License: MIT

A Python implementation of the Diffusion Maps algorithm for non-linear dimensionality reduction, designed to follow scikit-learn conventions.

Diffusion Maps model the data as a weighted graph and analyze a random walk (diffusion process) on that graph. This captures the underlying geometry and connectivity of the data and embeds it in a lower-dimensional space where Euclidean distances approximate "diffusion distances" in the original space.
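
To make the link between the random walk and the embedding concrete, here is a minimal NumPy sketch of the textbook construction. It is not this package's code: the helper name `diffusion_quantities`, the eigenvector scaling, and the weighting used in the diffusion distance are assumptions chosen for illustration. It checks that the diffusion distance computed directly from the t-step transition probabilities matches the Euclidean distance between the full set of diffusion coordinates.

```python
import numpy as np


def diffusion_quantities(X, sigma=1.0, alpha=0.0):
    """Hypothetical helper: transition matrix P, degrees d, eigenvalues, right eigenvectors."""
    # RBF affinities from pairwise squared Euclidean distances
    sq_norms = np.sum(X**2, axis=1)
    sq_dists = np.maximum(sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T, 0.0)
    K = np.exp(-sq_dists / (2.0 * sigma**2))

    # Alpha-normalization: divide out a power of the kernel density estimate q
    q = K.sum(axis=1)
    K_alpha = K / np.outer(q**alpha, q**alpha)

    # Row-normalize to get the random-walk transition matrix P
    d = K_alpha.sum(axis=1)
    P = K_alpha / d[:, None]

    # P is similar to the symmetric matrix A = D^{-1/2} K_alpha D^{-1/2},
    # so use a symmetric eigensolver and map the eigenvectors back.
    A = K_alpha / np.sqrt(np.outer(d, d))
    eigvals, eigvecs = np.linalg.eigh(A)
    order = np.argsort(eigvals)[::-1]        # descending eigenvalues
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    psi = eigvecs / np.sqrt(d)[:, None]      # right eigenvectors of P
    return P, d, eigvals, psi


rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))                 # placeholder data
steps = 2
P, d, eigvals, psi = diffusion_quantities(X, sigma=1.0, alpha=0.5)

# Full diffusion-map coordinates: drop the trivial constant eigenvector (index 0)
embedding = psi[:, 1:] * eigvals[1:] ** steps

# Diffusion distance between points 0 and 1, computed directly from the
# t-step transition probabilities, weighted by 1 / d (the unnormalized
# stationary measure of the walk)
P_t = np.linalg.matrix_power(P, steps)
diffusion_dist = np.sqrt(np.sum((P_t[0] - P_t[1]) ** 2 / d))

# Euclidean distance between the same two points in the embedding
embedding_dist = np.linalg.norm(embedding[0] - embedding[1])

print(diffusion_dist, embedding_dist)
```

The two printed numbers should agree up to floating-point error. Keeping only the first few coordinates, as `n_components` does, turns this equality into an approximation whose quality depends on how quickly the eigenvalues decay.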

Key Features

  • Implements the Diffusion Maps algorithm with alpha-normalization.
  • Scikit-learn compatible API (fit, transform, fit_transform).
  • Nyström extension for out-of-sample transformation.
  • RBF (Gaussian) kernel for affinity calculation.
  • Configurable diffusion time steps (steps).

Installation

You can install this package directly from GitHub using pip:

pip install git+https://github.com/sgh14/diffusion-maps-with-nystrom.git 

Or, to install a specific version (e.g., v0.1.0 tag):

pip install git+https://github.com/sgh14/diffusion-maps-with-nystrom.git@v0.1.0

Usage

Here's a basic template of how to use the DiffusionMaps class:

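import numpy as np
from diffusionmaps import DiffusionMaps

# Prepare your data: an array of shape (n_samples, n_features).
# Random placeholder data is used here; substitute your own.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 3))

# Initialize and fit the Diffusion Maps model
dm = DiffusionMaps(n_components=2, sigma=2.0, steps=1, alpha=0.5)
embeddings_train = dm.fit_transform(X_train)

# Embed new data with the Nyström extension
X_new = rng.normal(size=(10, 3))  # placeholder for unseen samples
embeddings_new = dm.transform(X_new)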
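
For readers curious what an out-of-sample `transform` can look like, the following is a minimal NumPy sketch of the standard Nyström extension for diffusion maps. It is an illustration under stated assumptions, not the package's implementation: `rbf_kernel`, `fit_diffusion_map`, `nystrom_transform`, and the dictionary layout are hypothetical names, and the normalization is the textbook one. Each new point gets transition probabilities to the training points, and its eigenvector value is the probability-weighted average of the training values divided by the eigenvalue.

```python
import numpy as np


def rbf_kernel(A, B, sigma):
    """Gaussian affinities exp(-||a - b||^2 / (2 sigma^2)) between rows of A and B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))


def fit_diffusion_map(X, n_components, sigma, alpha):
    """Hypothetical helper: eigenpairs of the random walk on the training data."""
    K = rbf_kernel(X, X, sigma)
    q = K.sum(axis=1)                               # kernel density estimate
    K_alpha = K / np.outer(q**alpha, q**alpha)      # alpha-normalization
    d = K_alpha.sum(axis=1)
    A = K_alpha / np.sqrt(np.outer(d, d))           # symmetric conjugate of P
    eigvals, eigvecs = np.linalg.eigh(A)
    keep = np.argsort(eigvals)[::-1][1:n_components + 1]  # skip trivial eigenpair
    return {
        "X": X, "q": q, "sigma": sigma, "alpha": alpha,
        "lam": eigvals[keep],
        "psi": eigvecs[:, keep] / np.sqrt(d)[:, None],    # right eigenvectors of P
    }


def nystrom_transform(model, X_new, steps=1):
    """Hypothetical helper: Nyström extension of the eigenvectors to X_new."""
    K_new = rbf_kernel(X_new, model["X"], model["sigma"])
    q_new = K_new.sum(axis=1)
    # Same alpha-normalization as in training, using the new points' densities
    K_new_alpha = K_new / np.outer(q_new ** model["alpha"], model["q"] ** model["alpha"])
    # Transition probabilities from each new point to the training points
    P_new = K_new_alpha / K_new_alpha.sum(axis=1, keepdims=True)
    # psi_l(x) = (1 / lambda_l) * sum_j p(x, x_j) * psi_l(x_j)
    psi_new = (P_new @ model["psi"]) / model["lam"]
    return psi_new * model["lam"] ** steps


rng = np.random.default_rng(1)
X_train = rng.normal(size=(40, 2))                  # placeholder training data
model = fit_diffusion_map(X_train, n_components=2, sigma=1.0, alpha=0.5)

# Sanity check: extending the training points reproduces their embedding
train_embedding = model["psi"] * model["lam"]       # steps = 1
reembedded = nystrom_transform(model, X_train, steps=1)
print(np.allclose(train_embedding, reembedded))

X_new = rng.normal(size=(5, 2))                     # placeholder unseen data
print(nystrom_transform(model, X_new, steps=1).shape)  # (5, 2)
```

In this sketch, a new point that lies far from every training point has near-zero affinities, so its row normalization becomes numerically unreliable; the extension is meaningful only for points reasonably close to the training data relative to `sigma`.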

API Overview

The main class is diffusionmaps.DiffusionMaps.

DiffusionMaps(n_components: int, sigma: float, steps: int = 1, alpha: float = 0.0)
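  • n_components: Number of embedding dimensions (diffusion coordinates) to keep.
  • sigma: Scale parameter of the RBF kernel, $\exp(-\|x-y\|^2 / (2 \sigma^2))$. Controls locality: smaller values connect only near neighbors.
  • steps: Diffusion time $t$; each eigenvalue enters the embedding as $\lambda^t$. Default is 1.
  • alpha: Kernel normalization exponent (0.0, 0.5, and 1.0 are common choices). Controls how strongly the sampling density influences the embedding: 0.0 applies no density correction, while 1.0 largely removes the density's influence. Default is 0.0.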
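
For orientation, these parameters enter the standard Diffusion Maps construction roughly as follows, writing $t$ for steps and $m$ for n_components. This is the textbook formulation, given here as an assumption about the general method; the package's exact normalization and eigenvector scaling are documented in the class docstrings and may differ.

```latex
\begin{align*}
K_{ij} &= \exp\!\left(-\frac{\|x_i - x_j\|^2}{2\sigma^2}\right)
  && \text{(RBF affinity, width } \sigma\text{)} \\
q_i &= \sum_j K_{ij}, \qquad
K^{(\alpha)}_{ij} = \frac{K_{ij}}{q_i^{\alpha}\, q_j^{\alpha}}
  && \text{(alpha-normalization)} \\
d_i &= \sum_j K^{(\alpha)}_{ij}, \qquad
P_{ij} = \frac{K^{(\alpha)}_{ij}}{d_i}
  && \text{(random-walk transition matrix)} \\
P\,\psi_l &= \lambda_l\,\psi_l, \qquad
1 = \lambda_0 \ge \lambda_1 \ge \dots
  && \text{(spectral decomposition)} \\
\Psi_t(x_i) &= \bigl(\lambda_1^{t}\,\psi_1(x_i), \dots, \lambda_m^{t}\,\psi_m(x_i)\bigr)
  && \text{(embedding)}
\end{align*}
```

The trivial pair $(\lambda_0, \psi_0)$ is constant across points and is therefore left out of the embedding.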

See the class docstrings for detailed information on methods and attributes.

License

This project is licensed under the MIT License - see the LICENSE file for details.
