Skip to content

tsinghua-fib-lab/EcoPretrain

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Multiscale structure-encoded pretraining

A python implementation of our manuscript "Multiscale structure-encoded pretraining for modeling evolutionary microbial communities”.

This work introduces a unified, self-supervised pretraining framework to model the complex dynamics of evolutionary microbial communities. Evolution unfolds across individual (species traits), community (species interactions), and generational (temporal sequences) scales, creating high-dimensional and stochastic processes that are challenging to analyze.

framework

This repository contains the source code for preprocessing data and reproducing experimental results. The source code is implemented in Python.


Installation Guide

  • Install Anaconda https://docs.anaconda.com/anaconda/install/index.html

  • Create an experimental environment and install request packages: (this will take about 0.5 hours)

    conda create --name ecopretrain python=3.9
    conda activate ecopretrain
    pip install omegaconf tqdm scikit-learn torchdiffeq torch

Data

  • Landscape: The Landscape/data folder provides the original data, preprocessing scripts, and processed data for the three experimental adaptive landscape systems discussed in the manuscript:
    • E. coli toxin-antitoxin system
    • Saccharomyces cerevisiae transfer RNA (tRNA) system
    • Entacmaea quadricolor fluorescent protein system
  • MacArthur: The MacArthur/ folder provides the simulation scripts and data preprocessing code for the resource competition model, adapted from the framework developed by Tikhonov et al.
  • Coli: The Coli/data folder provides the original data, preprocessing scripts, and processed data for the real-world E. coli laboratory evolution system with multi-antibiotic resistance.

Experiments

This repository provides the source code for all experiments in the main text and supplementary materials. The specific scripts are organized as follows:

  • Landscape/: Contains the experimental scripts to reproduce Figures 2, 3, and 4 of the main text and Supplementary Figure 2.
  • MacArthur/: Contains the experimental scripts to reproduce Figure 5 of the main text and Supplementary Figure 1.
  • Coli/: Contains the experimental scripts to reproduce Figure 6 of the main text.

Acknowledgments

The real-world and simulated datasets used in this paper were collected and adapted from the following open-source studies and projects:

  • CCU: Saccharomyces cerevisiae tRNA system
  • eqFP611: Entacmaea quadricolor fluorescent protein system
  • ParD3: E. coli toxin-antitoxin system
  • MacArthur: Resource-competition model
  • E. Coli: Multi-antibiotic resistance E. coli evolution experiment.

License

This repo is covered under the MIT License.

About

Multiscale structure-encoded pretraining for modeling evolutionary microbial communities

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published