
FCM Cluster Annotation 🔬

This script detects and visualizes heterogeneous and homogeneous expression clusters in flow cytometry (FCM) data. It was originally developed to identify patient-specific and subtype-specific leukemic clusters, serving as a form of automated gating for FCM data. It was built primarily for CLL and B-ALL, but it can be adapted to other leukemia types on request. Direct any questions to lorenzo.olmomarchal@gmail.com.

Installation ⚙️

Clone the repository:

git clone https://github.com/lolmomarchal/FCM_cluster_annotation.git

Install the FlowSOM dependency: The script uses the FlowSOM algorithm for its clustering analysis. If you plan to use the Conda environment described below, create and activate it first so that the package is installed into that environment.

pip install git+https://github.com/saeyslab/FlowSOM_Python

Set up the environment: It's recommended to use a Conda environment to manage dependencies.

# Create and activate a new conda environment
conda create -n fcm_env python=3.8
conda activate fcm_env

# Install the required packages
pip install -r requirements.txt

(Note: You will need a requirements.txt file listing all necessary packages.)
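After installation, a quick check that the key packages can be imported from the active environment may save a failed run later. The following is a minimal sketch, not part of the repository. The module list is an assumption: `flowsom` is the import name used by FlowSOM_Python, while `umap`, `numpy`, and `pandas` are guesses based on the options this README documents.

```python
import importlib.util

# Assumed import names; adjust to match your requirements.txt.
# "flowsom" comes from FlowSOM_Python; "umap" is provided by the umap-learn package.
ASSUMED_MODULES = ["flowsom", "umap", "numpy", "pandas"]


def check_modules(names):
    """Return a dict mapping each top-level module name to True if it is importable."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_modules(ASSUMED_MODULES).items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

Any module reported as MISSING needs to be installed into the environment before running the script.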

How to Run ▶️

You can run the script from the command line using the following parameters:

--significance_value (default: 0.05): Sets the p-value threshold for statistical significance.
--input_directory: Path to the directory containing your FCM data files in .csv or .txt format.
--metadata_csv: Path to the CSV file with metadata for the FCM data.
--percentile (default: 0.9): Threshold for filtering data points based on percentile.
--standard_dev_threshold (default: 3): Threshold for filtering based on standard deviations.
--neighbors (default: 30): Number of neighbors used by UMAP.
--min_dist (default: 0.5): Minimum distance parameter for the UMAP embedding.
--leukemia_type (default: "CLL"): Specifies the leukemia type, such as CLL or ALL, for specific processing.
--percent_sampling (default: 1): The fraction of cells to sample (e.g., 0.5 for 50%).
--times_sampled (default: 1): The number of times to resample the data.
--output_directory: The directory where processed data and results will be saved.
--processes (default: os.cpu_count()): The number of CPU cores to use for parallel processing.
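The parameters above could be declared with `argparse` roughly as follows. This is an illustrative sketch under stated assumptions, not the script's actual parser. The argument types, which options are required, and the treatment of `--plot_UMAP` (which appears only in the example command) as a boolean switch are all assumptions.

```python
import argparse
import os


def build_parser():
    """Build a parser mirroring the documented options (types are assumed)."""
    p = argparse.ArgumentParser(description="FCM cluster annotation (illustrative)")
    # Paths: assumed to be required, since no defaults are documented.
    p.add_argument("--input_directory", required=True, help="Directory with .csv or .txt FCM files")
    p.add_argument("--metadata_csv", required=True, help="CSV file with metadata for the FCM data")
    p.add_argument("--output_directory", required=True, help="Where results are saved")
    # Statistical and filtering thresholds, with the documented defaults.
    p.add_argument("--significance_value", type=float, default=0.05)
    p.add_argument("--percentile", type=float, default=0.9)
    p.add_argument("--standard_dev_threshold", type=float, default=3)
    # UMAP settings.
    p.add_argument("--neighbors", type=int, default=30)
    p.add_argument("--min_dist", type=float, default=0.5)
    # Leukemia type and sampling.
    p.add_argument("--leukemia_type", default="CLL")
    p.add_argument("--percent_sampling", type=float, default=1)
    p.add_argument("--times_sampled", type=int, default=1)
    p.add_argument("--processes", type=int, default=os.cpu_count())
    # Seen only in the example command; assumed here to be an on/off switch.
    p.add_argument("--plot_UMAP", action="store_true")
    return p


if __name__ == "__main__":
    demo = ["--input_directory", "in", "--metadata_csv", "meta.csv", "--output_directory", "out"]
    print(build_parser().parse_args(demo))
```

Options that are not passed fall back to the defaults listed above.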

Example Command

python preprocessing.py \
    --input_directory /data/fcm_input \
    --metadata_csv /data/metadata.csv \
    --output_directory /data/fcm_output \
    --significance_value 0.01 \
    --plot_UMAP \
    --percentile 0.95 \
    --neighbors 15 \
    --min_dist 0.3 \
    --leukemia_type ALL \
    --processes 4
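For repeated runs with different settings, it can help to assemble the command from Python instead of retyping it. The sketch below is a hypothetical helper, not part of the repository: `build_command` and the placeholder script path are illustrative names, and the script path should be replaced with the script name used in the command above.

```python
import shlex
import sys


def build_command(script, options, flags=()):
    """Return an argv list: interpreter, script, --name value pairs, then bare flags."""
    argv = [sys.executable, script]
    for name, value in options.items():
        argv += [f"--{name}", str(value)]
    argv += [f"--{flag}" for flag in flags]
    return argv


if __name__ == "__main__":
    cmd = build_command(
        "path/to/script.py",  # placeholder: substitute the actual script
        {
            "input_directory": "/data/fcm_input",
            "metadata_csv": "/data/metadata.csv",
            "output_directory": "/data/fcm_output",
            "significance_value": 0.01,
            "neighbors": 15,
            "min_dist": 0.3,
            "leukemia_type": "ALL",
        },
        flags=["plot_UMAP"],
    )
    # Print the command for inspection; pass cmd to subprocess.run(cmd, check=True) to execute it.
    print(shlex.join(cmd))
```

Passing the list form to `subprocess.run` avoids shell quoting problems with paths that contain spaces.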

Citations 📖

This project uses the FlowSOM algorithm. Please cite the following papers if you use this tool in your work:

A. Couckuyt, B. Rombaut, Y. Saeys, and S. Van Gassen, “Efficient cytometry analysis with FlowSOM in Python boosts interoperability with other single-cell tools,” Bioinformatics, vol. 40, no. 4, p. btae179, Apr. 2024.

S. Van Gassen et al., “FlowSOM: Using self-organizing maps for visualization and interpretation of cytometry data,” Cytometry Part A, vol. 87, no. 7, pp. 636–645, 2015.
