Add package by opening the Julia package manager with "]" then type in REPL
add https://github.com/bastikusch/DiffusionMap.jl
This package is a concise implementation of the diffusion mapping method. It takes a given dataset in matrix form
data = rand(100,10)
and frames the diffusion map problem as follows.
dp = Diffusionmap(data; kernel, laplace_type, threshold)
As a kernel for computing the adjacency matrix the following ones can be used
Supported types of Laplacian matrices are
laplace_type | Description | Default |
---|---|---|
RegularLaplacian() | ✔️ | |
RowNormalizedLaplacian() | ❌ | |
SymmetricLaplacian() | ❌ | |
Adjacency() | ❌ | |
NormalizedAdjacency() | ❌ |
As a next neighbor threshold, any integer betwen 0 and n (n being the number of data rows) can be chosen (default is n). Determines the number of next neighbors to be kept in the adjacency matrix, which controls the amount of locality within the data set.
Having framed the diffusion map problem, one can perform the eigen decomposition of the Laplacian with the 'solve()'-method.
evals, evecs = solve(dm, eigensolver=FullEigen())
It returns the eigen values and the eigen vecors, both in vector format and sorted for their impact and the underlying embedding. This means that graph laplacians are sorted by their smallest real part and adjacency type laplacians (Adjacency() and NormalizedAdjacency()) by their largest real part. Both types have the constant eigen vector eigen value pair as their first vector entry, so analyses should keep in mind to start with vector entries 2, as this is the first relevant eigen vector, given that the local network is connected.
Possible eigen solvers contain full decomposition, as well as faster partial decompositions.
eigensolver | Description | Default |
---|---|---|
FullEigen() | Uses the method eigen(L) from the package LinearAlgebra.jl |
✔️ |
ArpackEigen(n_first) | Uses the method eigs(L) from the package Arpack.jl to get the n first eigenvectors |
❌ |
KrylovEigen(n_first) | Uses the method eigsolve(L) from the package KrylovKit.jl to get the n first eigenvectors |
❌ |
using StatsBase, Plots, DiffusionMap
# put any data matrix you want here
data = standardize(ZScoreTransform, rand(1000,20), dims=2)
# Frame diffusion problem
k = InverseDistanceKernel() # kernel
laplace = RowNormalizedLaplacian() # laplace_type
nn = 5 # next neighbors
dm = Diffusionmap(data, kernel = k, laplace_type = laplace, threshold=nn)
# Perform eigen decomposition
evals, evecs = solve(dm, eigensolver=FullEigen());
# Scatter plot eigen vectors, cloured by one arbitrary data column
scatter(evecs[2], evecs[3], marker_z=data[:,1], label="")