Description
Background
Galaxy is an open-source platform designed to make advanced bioinformatics analyses accessible and reproducible. Among its many applications, constraint-based metabolic modeling (CBM) plays a pivotal role in exploring cellular metabolism through predictive simulations of flux distributions in metabolic networks.
Current tools in the Galaxy ecosystem, such as MaREA4Galaxy, provide robust capabilities for analyzing metabolic networks based on gene expression data. However, these tools are primarily focused on bulk RNA-seq data and do not yet fully support single-cell RNA-seq (scRNA-seq) or spatial transcriptomics data. High-resolution single-cell and spatial data offer unprecedented opportunities to study metabolic heterogeneity and spatially localized metabolic activities but require significant adaptations to workflows and computational tools.
Existing Python libraries, such as COBRApy, implement key CBM techniques, including flux balance analysis (FBA) and flux variability analysis (FVA). Efforts like COBRAxy have already ported some of these functionalities into Galaxy, but gaps remain, particularly in the support for single-cell FBA (scFBA) and the integration of spatial transcriptomics workflows. Moreover, there is a need to improve the computational efficiency of sampling algorithms by interfacing directly with solvers such as Gurobi or GLPK.
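For readers new to CBM, the two techniques named above can be written as linear programs. The block below is a textbook-style sketch, not a description of how COBRApy or COBRAxy implement them; the symbols (stoichiometric matrix S, flux vector v, objective weights c, bounds v^lb and v^ub, optimality fraction gamma) are notation assumed here for illustration.

```latex
% FBA: find a steady-state flux distribution that maximizes a linear objective
\begin{aligned}
Z^{*} = \max_{v} \quad & c^{\top} v \\
\text{s.t.} \quad & S v = 0, \\
& v^{\mathrm{lb}} \le v \le v^{\mathrm{ub}}
\end{aligned}

% FVA: for each reaction i, find the flux range compatible with
% a fraction \gamma \in [0, 1] of the FBA optimum
\begin{aligned}
v_i^{\min},\; v_i^{\max} = \min_{v} / \max_{v} \quad & v_i \\
\text{s.t.} \quad & S v = 0, \\
& v^{\mathrm{lb}} \le v \le v^{\mathrm{ub}}, \\
& c^{\top} v \ge \gamma Z^{*}
\end{aligned}
```

In this notation FVA solves two linear programs per reaction, which is one reason solver overhead matters for large models.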
This project aims to address these gaps by extending Galaxy’s CBM capabilities to support single-cell and spatial data integration, along with optimizations to statistical testing and computational efficiency.
Goal
This project builds on the foundation of MaREA4Galaxy, a Galaxy tool designed for metabolic reaction enrichment analysis, expanding its scope to:
- Support single-cell metabolic analysis: Implement models like scFBA, which integrate transcriptomics data into population-based flux models to capture metabolic heterogeneity at single-cell resolution.
- Integrate spatial transcriptomics workflows: Enable mapping of metabolic activities onto physical tissue architectures and co-localization analyses.
- Improve computational efficiency: Optimize the sampling algorithm by directly interfacing with solvers like Gurobi or GLPK, bypassing intermediate steps in COBRApy to reduce runtime.
- Enhance statistical testing: Introduce advanced methods for pathway enrichment analyses, including p-value adjustments (e.g., Bonferroni, FDR) and Bayesian approaches to improve result reliability.
- Develop visualization tools: Enable spatial overlays and interactive visualizations for flux distributions and pathway activities.
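As an illustration of the p-value adjustments mentioned in the statistical testing goal, here is a minimal pure-Python sketch of the Bonferroni and Benjamini-Hochberg (FDR) corrections. The function names and the example p-values are hypothetical and are not taken from MaREA4Galaxy or COBRAxy; an actual tool would more likely rely on an established statistics library.

```python
def bonferroni(pvalues):
    """Bonferroni adjustment: multiply each p-value by the number of tests, cap at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjustment (controls the false discovery rate).

    Returns adjusted p-values in the same order as the input.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values, e.g. one per pathway in an enrichment analysis.
example_pvalues = [0.01, 0.04, 0.03, 0.005]
bonferroni_adjusted = bonferroni(example_pvalues)
bh_adjusted = benjamini_hochberg(example_pvalues)
```

Bonferroni is the more conservative of the two; with many pathways tested at once, the FDR-based adjustment usually retains more findings at the cost of tolerating a controlled proportion of false positives.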
Difficulty Level: Medium
This project is categorized as medium difficulty because the integration of existing Python tools into Galaxy workflows is straightforward but requires careful adaptation to handle single-cell and spatial data effectively.
Size and Length of Project
- medium: 175 hours
- 12 weeks
Note that the project length for small projects should be 10-12 weeks.
Skills
Essential skills:
- Python programming
- Constraint-based metabolic modeling (e.g., FBA, scFBA)
Nice to have skills:
- Galaxy tool development
- Experience with single-cell RNA-seq (scRNA-seq) data
Public Repository
The existing COBRAxy tools can be found in the following repository:
COBRAxy on Galaxy ToolShed
Potential Mentors
Chiara Damiani, chiara.damiani@unimib.it
Bruno Galuzzi, brunogiovanni.galuzzi@cnr.it
Getting started:
- have a look at the demo of the current COBRAxy version here: http://marea4galaxy.cloud.ba.infn.it/galaxy
- read the references listed in the readme of each tool in the tool suite
- try performing an analysis using this tutorial: https://drive.google.com/file/d/125l1SqjJ2wrnQl9e7D6_UdxqmK9afsbI/view?usp=sharing
- read the preprint on spatial flux balance analysis for flux visualization ideas: https://www.biorxiv.org/content/10.1101/2024.11.28.625842v2