Code accompanying the research paper "Neurophysiologically Realistic Environment for Comparing Adaptive Deep Brain Stimulation Algorithms in Parkinson's Disease" by Kuzmina E., Kriukov D., Lebedev M., Dylov D., 2025, published in ???
This repository provides a neurophysiologically realistic environment for developing and benchmarking Reinforcement Learning (RL) algorithms for adaptive Deep Brain Stimulation (aDBS) in Parkinson’s Disease (PD). The framework is built around a computationally efficient Kuramoto model, simulating neural dynamics with three key feature groups:
- Bandwidth Features: Beta-band oscillations (12–35 Hz) and other frequency bands to model PD biomarkers (see the beta-power sketch after this list).
- Spatial Features: Neuron-electrode interactions, partial observability, and directional stimulation.
- Temporal Features: Neural/electrode drift, beta bursting, and non-stationary dynamics.
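As an illustration of the bandwidth features, the beta biomarker can be estimated with the standard Welch-PSD integration recipe. The helper below is not part of the repository API; `beta_band_power` and its arguments are assumptions made for this sketch.

```python
from scipy.integrate import trapezoid
from scipy.signal import welch

# Illustrative beta-band (12-35 Hz) power estimate from a simulated 1D signal.
# Not part of the repo API; a standard PSD-integration sketch.
def beta_band_power(signal, fs, band=(12.0, 35.0)):
    # Welch power spectral density, then integrate over the beta band
    freqs, psd = welch(signal, fs=fs, nperseg=min(len(signal), 1024))
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return trapezoid(psd[mask], freqs[mask])
```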
Pipeline for adaptive Deep Brain Stimulation, and the features introduced in our environment to create realistic neural activity.
The environment is designed for both neuroscientists (to bridge modeling with ML) and ML engineers (to tackle neurobiological challenges).
- Flexible Configuration: Supports three pre-defined environment regimes (`env0`, `env1`, `env2`) with increasing complexity.
- Realistic PD Dynamics: Simulates pathological oscillations, spatial coupling, and temporal drift.
- RL Integration: Wrapped as a gymnasium environment (`SpatialKuramoto`) for seamless RL training.
- Benchmarking: Includes baseline controllers (HF-DBS, PID) and tested RL algorithms.
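For reference, a PID baseline of the kind listed above can be written in a few lines. This is a minimal sketch, not the repository's implementation; the error signal (e.g., measured beta power minus a target level) and the gains are assumptions.

```python
# Minimal PID controller sketch (illustrative; the repo's baseline may differ).
class PIDController:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, error):
        # error: e.g., measured beta power minus the target level (assumed)
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```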
- Clone the repository:
  ```bash
  git clone https://github.com/NevVerVer/DBS-Gym
  ```
- Navigate to the repository directory:
  ```bash
  cd DBS-Gym
  ```
- Install the required packages:
  ```bash
  pip install -r requirements.txt
  ```
```python
from neurokuranto.model_v1 import SpatialKuramoto
from stable_baselines3 import PPO

# Initialize environment (params_dict_train is a config dict,
# e.g. loaded from environment/env_configs/)
env = SpatialKuramoto(params_dict=params_dict_train)

# Train RL agent
agent = PPO("MlpPolicy", env, n_steps=2048)
agent.learn(total_timesteps=int(2e6))
```
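After training, the agent can be rolled out with the standard gymnasium loop. A minimal sketch, assuming the environment follows the gymnasium five-tuple `step` API:

```python
# Roll out the trained policy for one episode (sketch)
obs, info = env.reset()
episode_reward = 0.0
terminated = truncated = False
while not (terminated or truncated):
    action, _ = agent.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)
    episode_reward += reward
print(f"Episode reward: {episode_reward:.2f}")
```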
Example of how conventional high-frequency DBS operates in the environment.
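An open-loop HF-DBS baseline simply applies a constant, maximal stimulus at every step. A sketch, assuming a continuous (Box) action space whose upper bound corresponds to maximal stimulation; the actual action semantics are set by the env config.

```python
# Open-loop HF-DBS sketch: constant maximal stimulation, no feedback
obs, info = env.reset(seed=0)
constant_action = env.action_space.high  # assumed: maximal stimulation
for _ in range(1000):
    obs, reward, terminated, truncated, info = env.step(constant_action)
    if terminated or truncated:
        obs, info = env.reset()
```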
- `env0`: Bandwidth features only (beta oscillations).
- `env1`: Adds spatial features (electrode placement, partial observability).
- `env2`: Includes temporal drift and non-stationary dynamics.
Configuration files are in `environment/env_configs/`. Modify `params_dict` to adjust features.
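A minimal sketch of adjusting features via `params_dict`; the key names below are hypothetical placeholders, the actual schema lives in `environment/env_configs/`.

```python
import copy

# Tweak a copy of the training config (key names are hypothetical)
params = copy.deepcopy(params_dict_train)
params["noise_std"] = 0.1   # hypothetical key: oscillator noise level
params["n_neurons"] = 500   # hypothetical key: network size
env = SpatialKuramoto(params_dict=params)
```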
Tested algorithms: agents were trained separately for each environment level and evaluated under perturbations (e.g., electrode drift).
To evaluate robustness, agents were tested under progressive disturbances. An example benchmarking task (see the sketch after this list):
- Electrode Encapsulation: Conductance reduced by 25% every 5 episodes.
- Neural Drift: Natural frequencies shifted by 1% per episode.
- Electrode Movement: Random spatial shifts every 7 episodes.
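A sketch of the first schedule above (electrode encapsulation); the `conductance` attribute is hypothetical and stands in for however the environment actually parameterizes electrode conductance.

```python
# Encapsulation schedule sketch: reduce conductance by 25% every 5 episodes
n_episodes = 20
for episode in range(n_episodes):
    if episode > 0 and episode % 5 == 0:
        env.conductance *= 0.75  # hypothetical attribute
    obs, info = env.reset()
    terminated = truncated = False
    while not (terminated or truncated):
        action, _ = agent.predict(obs, deterministic=True)
        obs, reward, terminated, truncated, info = env.step(action)
```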
Example results (see paper): SAC outperformed others in maintaining suppression under harsh drift.
- Implement custom reward functions (e.g., threshold-based efficiency; see the wrapper sketch after this list).
- Extend with additional features (e.g., multi-contact electrodes, charge-balanced pulses).
- Benchmark novel RL algorithms or hybrid controllers.
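A threshold-based efficiency reward can be prototyped with a gymnasium wrapper. A sketch, assuming hypothetical `info` keys `"beta_power"` and `"stim_energy"`; adapt these to the environment's actual info dict.

```python
import gymnasium as gym

class ThresholdEfficiencyReward(gym.Wrapper):
    """Reward beta suppression below a threshold, penalize stimulation energy."""

    def __init__(self, env, beta_threshold=0.5, energy_weight=0.1):
        super().__init__(env)
        self.beta_threshold = beta_threshold
        self.energy_weight = energy_weight

    def step(self, action):
        obs, _, terminated, truncated, info = self.env.step(action)
        suppressed = float(info["beta_power"] < self.beta_threshold)  # assumed key
        reward = suppressed - self.energy_weight * info["stim_energy"]  # assumed key
        return obs, reward, terminated, truncated, info
```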
The Kuramoto model is lightweight and modular — ideal for rapid experimentation!
If you use the code or the findings from our paper, please cite:
For any questions or clarifications, please reach out to: ekaterina.kuzmina@skoltech.ru
Thanks to @michelllepan and @vivekmyers for the CQL implementation.
We welcome contributions to this repository! If you've found errors in the code or experiments, please open an issue to discuss your ideas.
This project is licensed under the MIT License - see the LICENSE file for details.