July 2025: revamped repository with LLM coding assist.
This repository contains the Python code accompanying my peer-reviewed publication in Computer Physics Communications.
During my doctoral studies, I combined machine learning with "traditional" computer science techniques to automate a biophysical analysis. In particular, my combination of hierarchical clustering and Breadth-First Search (BFS) allowed me to quickly analyze simulated synaptic molecular clusters, the organization of which is critical to brain function. More details can be found in my peer-reviewed publication in Physical Review E.
Technical problem statement:
- We have a 2D grid of 0s and 1s. We want to count how many connected clusters of 1s there are.
- The catch is that there are periodic boundary conditions: grid edges wrap around and connect to the opposite side. This is a common method for running compute-limited simulations of a very large grid.
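- That is why we need both BFS and hierarchical clustering: hierarchical clustering identifies the clusters, and BFS identifies the clusters that touch across the periodic boundaries.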
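To make the problem statement concrete, here is a minimal brute-force sketch that counts clusters directly with a BFS flood fill, with and without wrap-around. It is not the code in this repository: the function name `count_clusters`, the `periodic` flag, and the choice of 4-connectivity (up, down, left, right neighbours only) are assumptions made for illustration. It can serve as a reference for checking results on small grids.

```python
from collections import deque


def count_clusters(grid, periodic=True):
    """Count clusters of 1s in a rectangular grid of 0s and 1s.

    Assumes 4-connectivity (an illustrative choice). With periodic=True,
    indices wrap around so each edge touches the opposite edge.
    """
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r0 in range(rows):
        for c0 in range(cols):
            if grid[r0][c0] != 1 or seen[r0][c0]:
                continue
            # Unvisited 1: start a new cluster and flood-fill it with BFS.
            count += 1
            seen[r0][c0] = True
            queue = deque([(r0, c0)])
            while queue:
                r, c = queue.popleft()
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    if periodic:
                        # Wrap indices around to the opposite side.
                        nr, nc = nr % rows, nc % cols
                    elif not (0 <= nr < rows and 0 <= nc < cols):
                        continue
                    if grid[nr][nc] == 1 and not seen[nr][nc]:
                        seen[nr][nc] = True
                        queue.append((nr, nc))
    return count


grid = [
    [1, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [1, 0, 0, 0],
]

print(count_clusters(grid, periodic=False))  # 4
print(count_clusters(grid, periodic=True))   # 2
```

In this example the three corner cells are separate clusters on a plain grid, but they merge into a single cluster once the edges wrap around.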
Tested on Python 3.5, 3.6, 3.7, and 3.12. Please see the requirements.txt file for dependencies.
All code files are located under src. For a visual demonstration of hierarchical clustering, see ClusteringConcept-Showcase.ipynb. For a possible workflow using all code files, see CompletWorkflow-Showcase.ipynb.
Here are the purposes of the code files, briefly:
clustering.py
- Wrappers for hierarchical clustering routines provided by SciPy; output feeds into functions defined in grouping.py
grouping.py
- Handles periodic boundary conditions; invokes BFS.py
BFS.py
- Executes breadth-first search to identify clusters touching across periodic boundaries
tests.py
- For now: basic unit tests on BFS.py. More to be implemented.
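The merging step described for BFS.py can be sketched as a breadth-first search over a graph of cluster labels. This is a minimal illustration, not the repository's implementation: the function name `merge_touching_clusters` and the input format (a dict mapping each cluster label to the set of labels it touches across a periodic boundary) are hypothetical.

```python
from collections import deque


def merge_touching_clusters(adjacency):
    """Group cluster labels into connected components using BFS.

    `adjacency` (hypothetical format) maps each cluster label to the set
    of labels it touches across a periodic boundary. Labels in the same
    returned group form a single cluster once wrap-around is considered.
    """
    seen = set()
    groups = []
    # Sorted iteration keeps the output order deterministic.
    for start in sorted(adjacency):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        group = []
        while queue:
            label = queue.popleft()
            group.append(label)
            for neighbor in adjacency.get(label, ()):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        groups.append(sorted(group))
    return groups


# Clusters 1 and 2 touch across one edge, 2 and 4 across another;
# cluster 3 touches nothing.
adjacency = {1: {2}, 2: {1, 4}, 3: set(), 4: {2}}
groups = merge_touching_clusters(adjacency)
print(groups)       # [[1, 2, 4], [3]]
print(len(groups))  # 2 clusters after merging
```

The number of groups returned is the final cluster count: here four clusters found without wrap-around reduce to two.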