Daniel Sage and Emilien Silly, Biomedical Imaging Group, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland. 11 July 2025.
This project investigates the use of Vision Transformers (ViTs) for single particle tracking in noisy microscopy data. The goal is to train models capable of predicting the diffusion coefficient (D) from synthetic fluorescence image sequences and performing accurate tracking on real experimental data, with the objective of surpassing traditional particle tracking methods in challenging conditions.
- ViT-based Regression: Vision Transformers are used to regress the diffusion coefficient directly from sequences of images showing Brownian motion. The model captures both spatial and temporal information using the Vision Transformer architecture.
- Synthetic Data Generation: Simulated datasets are generated by modeling particle movement as Brownian motion. Each simulation encodes parameters such as the diffusion coefficient, point spread function (PSF), pixel resolution, and frame rate.
- Real Data Tracking: The trained model is applied to real fluorescence microscopy recordings to detect and track particle positions over time and to infer physical properties such as diffusivity.
- Submission at SMLMS 2025: This project was submitted to SMLMS 2025 (Single Molecule Localization Microscopy Symposium) as a poster.
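The synthetic data generation step can be illustrated with a minimal sketch. It does not use the AnDi-challenge simulator or the project's image generator: it swaps in a plain NumPy random walk and a Gaussian approximation of the PSF with Poisson noise. The function names and all parameter values below are hypothetical and chosen only for illustration.

```python
import numpy as np


def simulate_brownian(n_frames, D, dt, rng):
    """Return (n_frames, 2) positions in micrometres, starting at the origin."""
    # For free 2D diffusion, each displacement component is Gaussian
    # with variance 2 * D * dt.
    sigma = np.sqrt(2.0 * D * dt)
    steps = rng.normal(0.0, sigma, size=(n_frames - 1, 2))
    return np.vstack([np.zeros((1, 2)), np.cumsum(steps, axis=0)])


def render_frames(positions_um, image_size, pixel_size_um, psf_sigma_px,
                  amplitude, background, rng):
    """Render one noisy image per position (Gaussian PSF is an assumption)."""
    yy, xx = np.mgrid[0:image_size, 0:image_size]
    centre = (image_size - 1) / 2.0
    frames = np.empty((len(positions_um), image_size, image_size))
    for i, (x_um, y_um) in enumerate(positions_um):
        # Convert physical coordinates to pixel coordinates.
        cx = centre + x_um / pixel_size_um
        cy = centre + y_um / pixel_size_um
        spot = amplitude * np.exp(
            -((xx - cx) ** 2 + (yy - cy) ** 2) / (2.0 * psf_sigma_px ** 2)
        )
        # Poisson noise models photon counting on top of a flat background.
        frames[i] = rng.poisson(spot + background)
    return frames


rng = np.random.default_rng(0)
traj = simulate_brownian(n_frames=30, D=0.1, dt=0.03, rng=rng)
frames = render_frames(traj, image_size=16, pixel_size_um=0.1,
                       psf_sigma_px=1.5, amplitude=100.0,
                       background=10.0, rng=rng)
print(frames.shape)
```

Each simulated sequence is paired with the value of D used to generate it, which is what makes supervised regression possible.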
- PyTorch-based implementation of Vision Transformers for regression tasks
- Modular code: the image encoding method, image size, model type, and supplementary inputs can be adapted to other scenarios
- Efficient pipeline for generating synthetic data using the AnDi-challenge trajectory simulator and a custom image generator
- Adapted pipeline for prediction on real data: tracking, patch extraction, and model prediction
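The patch extraction step of the real-data pipeline can be sketched as follows. This is not the code in helpersTracking.py. It assumes the track is already available as (row, column) pixel coordinates and that a fixed window centred on the mean tracked position is wanted, so that the particle motion remains visible inside the patch sequence. The function name `extract_patch_sequence` is hypothetical.

```python
import numpy as np


def extract_patch_sequence(stack, track_rc, patch_size):
    """Crop a fixed square window from every frame of a (T, H, W) stack.

    track_rc is a (T, 2) array of (row, col) positions in pixels.
    """
    half = patch_size // 2
    # Centre the window on the mean position of the track.
    r0, c0 = np.round(track_rc.mean(axis=0)).astype(int)
    # Reflect-pad the spatial axes so windows near the border stay full size.
    padded = np.pad(stack, ((0, 0), (half, half), (half, half)), mode="reflect")
    # After padding, original pixel (r, c) sits at (r + half, c + half),
    # so the window starting at (r0 - half, c0 - half) starts at (r0, c0).
    return padded[:, r0:r0 + patch_size, c0:c0 + patch_size]


stack = np.arange(5 * 20 * 20, dtype=float).reshape(5, 20, 20)
track = np.tile([[2.0, 17.0]], (5, 1))  # a particle close to the image border
patches = extract_patch_sequence(stack, track, patch_size=9)
print(patches.shape)
```

The resulting array has the same layout as the synthetic training sequences (frames, height, width), so it can be passed to a trained model after the same normalization used during training.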
├── Experiments
│ ├── Denoising
│ ├── Embeddings
│ ├── Framerate
│ ├── ImagesFeatures
│ ├── PSFNoise
│ └── mitochondria_simulation
│ └── old_version
│ └── validation_trajectories
│ ├── 20
│ ├── 30
│ ├── valTrajsInOrder.npy
│ └── valTrajsInOrderImFt.npy
├── ProjectReport.pdf
├── README.md
├── Real_data_example.ipynb
├── helpers
│ ├── helpersFeatures.py
│ ├── helpersGeneration.py
│ ├── helpersMSD.py
│ ├── helpersPlot.py
│ └── helpersTracking.py
├── models.py
├── outPoster
├── real-data
│ ├── 70_01_7.tif
│ └── 83_01_3.tif
├── tests
│ ├── RealDataTests.ipynb
│ ├── Simulator_tests
│ ├── models_tests
│ └── train_tests
The Experiments folder contains the model variations tested to estimate the limitations and gains relative to other classical and ML models. Each subfolder, named after one type of experiment (ABC below), contains a trainSettingsABC.py file with all the settings used in that experiment, a trainModelsABC.py file with the training code, and a .ipynb notebook for analysing the training run. A training run usually takes about 1 hour on a system with a GPU and saves its results in train_results_ABC.pth.
See, for example, the PSFNoise folder, which compares models trained on different noise levels and PSF sizes; its results are published in our SMLMS submission.
The validation_trajectories folder stores the trajectories used for computing the validation loss, which the models do not see during training.
The mitochondria_simulation folder contains tentative simulations intended to match confined diffusion inside mitochondrial cristae. This part of the project could not be completed due to lack of time.
The helpers folder contains functions used by the notebooks and training files, split into separate files by purpose.
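As an illustration of the kind of classical baseline the models are compared against, the diffusion coefficient can be estimated from a trajectory with a mean squared displacement (MSD) fit. The sketch below is not the content of helpersMSD.py. It assumes free diffusion in d dimensions, where MSD(τ) = 2·d·D·τ, and the function names and the choice of four lags are hypothetical.

```python
import numpy as np


def time_averaged_msd(positions, max_lag):
    """Return lags 1..max_lag (in frames) and the time-averaged MSD per lag."""
    lags = np.arange(1, max_lag + 1)
    values = np.array([
        np.mean(np.sum((positions[lag:] - positions[:-lag]) ** 2, axis=1))
        for lag in lags
    ])
    return lags, values


def estimate_diffusion_coefficient(positions, dt, max_lag=4):
    """Estimate D from the slope of a straight-line fit to the MSD curve."""
    n_dims = positions.shape[1]
    lags, values = time_averaged_msd(positions, max_lag)
    # The intercept absorbs a constant offset such as localization noise;
    # the slope equals 2 * n_dims * D for free diffusion.
    slope, _intercept = np.polyfit(lags * dt, values, 1)
    return slope / (2.0 * n_dims)


rng = np.random.default_rng(1)
D_true, dt = 0.5, 0.01
steps = rng.normal(0.0, np.sqrt(2.0 * D_true * dt), size=(50_000, 2))
positions = np.cumsum(steps, axis=0)
D_hat = estimate_diffusion_coefficient(positions, dt)
print(D_true, D_hat)
```

Only the first few lags are fitted because the MSD estimate becomes increasingly noisy at long lags, where fewer independent displacements are available.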
All PyTorch model implementations are in models.py. It contains a highly configurable Vision Transformer implementation: the embedding method, patch size, internal vector size, and number of outputs can be changed to match the target application. A standard ResNet is also implemented for comparison.
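To show how these configuration options fit together, here is a minimal sketch of a Vision Transformer regressor for image sequences. It is not the implementation in models.py: the class name `TinyViTRegressor`, its default values, and the choice of a strided convolution as the patch embedding are assumptions made for illustration.

```python
import torch
import torch.nn as nn


class TinyViTRegressor(nn.Module):
    """Hypothetical ViT that maps a (batch, frames, H, W) sequence to n_outputs."""

    def __init__(self, n_frames=8, image_size=16, patch_size=4,
                 dim=64, depth=2, heads=4, n_outputs=1):
        super().__init__()
        assert image_size % patch_size == 0, "image must divide into patches"
        n_tokens = n_frames * (image_size // patch_size) ** 2
        # Patch embedding: a strided convolution turns each patch into a token.
        self.embed = nn.Conv2d(1, dim, kernel_size=patch_size, stride=patch_size)
        # Learnable regression token and positional embedding.
        self.cls = nn.Parameter(torch.zeros(1, 1, dim))
        self.pos = nn.Parameter(torch.zeros(1, n_tokens + 1, dim))
        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, dim_feedforward=4 * dim, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.head = nn.Linear(dim, n_outputs)

    def forward(self, x):
        b, t, h, w = x.shape
        # Embed every frame independently: (b*t, dim, h/p, w/p).
        tokens = self.embed(x.reshape(b * t, 1, h, w))
        # Flatten the patch grid: (b*t, patches_per_frame, dim).
        tokens = tokens.flatten(2).transpose(1, 2)
        # Merge frames so attention spans space and time: (b, t*patches, dim).
        tokens = tokens.reshape(b, -1, tokens.shape[-1])
        tokens = torch.cat([self.cls.expand(b, -1, -1), tokens], dim=1) + self.pos
        # Regress from the output at the regression-token position.
        return self.head(self.encoder(tokens)[:, 0])


model = TinyViTRegressor().eval()
with torch.no_grad():
    out = model(torch.randn(2, 8, 16, 16))
print(out.shape)
```

In this sketch, `patch_size`, `dim`, and `n_outputs` play the roles of the patch size, internal vector size, and number of outputs mentioned above; training such a model for regression would typically use a loss such as `nn.MSELoss` on the predicted D.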
The outPoster folder contains the images and graphs used in the poster submitted to SMLMS 2025.
The real-data folder contains the real images used for prediction with our models. Results are not as accurate as expected, which shows that further work is needed for this task.
This is a Master's project (12 ECTS) carried out at EPFL in the Biomedical Imaging Group (BIG) by Emilien Silly (Master's student in Data Science) under the supervision of Daniel Sage.