Plant-Net/TomTom_GRN_Framework

🍅 Tomato bulk RNA-seq GRN and TDA analysis

Conditions 🧪

The dataset covers tomato under 7 infection conditions (Meloidogyne incognita at 7 and 14 dpi, Botrytis cinerea, Phytophthora infestans, Cladosporium fulvum, and Potato spindle tuber viroid mild and severe strains), for a total of 83 samples. The data are publicly available through their respective BioProjects:

| Infection | Tissue | Controls hpi | Controls replicates | Infected hpi | Infected replicates | BioProject | Reference DOI |
|---|---|---|---|---|---|---|---|
| M. incognita | Root | 168 | 8 | 168 | 8 | PRJNA734743 | DOI |
| M. incognita | Root | 336 | 8 | 336 | 8 | PRJNA734743 | DOI |
| PSTVd mild strain | Root | 408 | 6 | 408 | 6 | PRJNA515609 | DOI |
| PSTVd severe strain | Root | | | 408 | 6 | PRJNA515609 | DOI |
| B. cinerea | Leaf | 0 | 7 | 30 | 8 | PRJNA662936 | DOI |
| P. infestans | Leaf | 0 | 6 | 72 | 6 | PRJNA505207 | DOI |
| C. fulvum | Leaf | 72 | 3 | 72 | 3 | PRJNA781749 | DOI |

Summary 🧭

We applied HIVE to the multi-condition bulk RNA-seq data. HIVE returned a list of genes, available in the Data/ folder, but the framework below can be applied to any gene list. From this list, we retrieved the GRN from the TomTom neo4j database, then curated it to balance confidence and sparsity.

We used decoupleR's ULM to infer TF activities and kept the significant ones. The inputs were the curated GRN and the test statistic reported by DESeq2 (the Wald stat), with DESeq2 run on each infection independently. The analysis yielded 43 significant regulatory TFs out of the 71 present in the GRN. We then used decoupleR's MLM to infer pathway activities from KEGG pathways (also available through TomTom).
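
To make the ULM step concrete, here is a minimal pure-Python sketch of the idea behind a univariate linear model score: for each TF, the gene-level statistics are regressed on that TF's target weights (0 for non-targets), and the t-value of the slope is taken as the activity. This is an illustration under stated assumptions, not decoupleR's implementation: the function name `ulm_activity`, the dictionary inputs, and the +1/-1 weights are hypothetical, and no p-values or minimum-target filtering are included.

```python
import math


def ulm_activity(stats, net):
    """Return {tf: t-value of the slope} for one contrast.

    stats: {gene: statistic}, e.g. a per-gene Wald stat (assumed input format).
    net:   {tf: {target_gene: weight}}, e.g. +1 activation, -1 repression.
    """
    genes = sorted(stats)
    n = len(genes)
    y = [stats[g] for g in genes]
    y_mean = sum(y) / n
    syy = sum((v - y_mean) ** 2 for v in y)

    activities = {}
    for tf, targets in net.items():
        # Weight vector over all measured genes; non-targets get 0.
        x = [float(targets.get(g, 0.0)) for g in genes]
        x_mean = sum(x) / n
        sxx = sum((v - x_mean) ** 2 for v in x)
        if sxx == 0 or syy == 0:
            continue  # the slope is undefined when either vector is constant
        sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
        r = sxy / math.sqrt(sxx * syy)
        if abs(r) >= 1:
            # Perfect linear fit: the t-value diverges.
            activities[tf] = math.copysign(math.inf, r)
            continue
        # t-value of the slope in y = a + b * x, with n - 2 degrees of freedom.
        activities[tf] = r * math.sqrt((n - 2) / (1 - r * r))
    return activities
```

A positive value means the TF's activated targets tend to have higher statistics than the other genes; flipping the sign of every weight flips the sign of the activity.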

Topological Data Analysis (TDA) was performed on the same GRN with the corresponding TF activities. We first applied the Mapper algorithm to obtain a simpler representation of the GRN, then used the ToMATo algorithm to find groups on the resulting Mapper graph.
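
The Mapper step can be pictured with the simplified sketch below. It assumes a one-dimensional filter value per node (for example a TF activity), covers the filter range with overlapping intervals, and replaces the usual clustering step with connected components of the subgraph induced by each interval. Two resulting clusters are linked when they share a node. The function names, the parameters `n_intervals` and `overlap`, and the undirected adjacency dictionary are all assumptions for illustration. The repository's TDA/mapper.py may differ, and the ToMATo grouping is not covered here.

```python
def interval_cover(lo, hi, n_intervals, overlap):
    """Cover [lo, hi] with equal-length intervals overlapping by a fraction `overlap` (0 <= overlap < 1)."""
    length = (hi - lo) / (1 + (n_intervals - 1) * (1 - overlap))
    step = length * (1 - overlap)
    cover = [(lo + i * step, lo + i * step + length) for i in range(n_intervals)]
    cover[-1] = (cover[-1][0], hi)  # guard against rounding at the upper end
    return cover


def components(nodes, adj):
    """Connected components of the subgraph induced by `nodes` (adj is undirected)."""
    nodes, seen, comps = set(nodes), set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        comp, stack = set(), [start]
        seen.add(start)
        while stack:
            v = stack.pop()
            comp.add(v)
            for w in adj.get(v, ()):
                if w in nodes and w not in seen:
                    seen.add(w)
                    stack.append(w)
        comps.append(frozenset(comp))
    return comps


def mapper(adj, filt, n_intervals=3, overlap=0.5):
    """Return (clusters, edges): clusters are node sets, edges link overlapping clusters."""
    lo, hi = min(filt.values()), max(filt.values())
    clusters = []
    for a, b in interval_cover(lo, hi, n_intervals, overlap):
        preimage = [v for v, f in filt.items() if a <= f <= b]
        clusters.extend(components(preimage, adj))
    edges = {
        (i, j)
        for i in range(len(clusters))
        for j in range(i + 1, len(clusters))
        if clusters[i] & clusters[j]
    }
    return clusters, edges
```

Fewer, wider intervals give a coarser graph; more intervals or less overlap split the network into more clusters.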

To find representative nodes in each group, we performed hub detection using degree and betweenness as metrics. Within each group, a hub is a node with the maximum degree or the maximum betweenness.
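
The hub rule can be sketched as follows. This is a minimal illustration that makes two assumptions not stated in the text: the network is treated as undirected, and both metrics are computed on the subgraph induced by each group. Betweenness is computed with Brandes' algorithm for unweighted graphs. The function names and input layout are hypothetical and are not taken from TDA/Hub_TDA.ipynb.

```python
from collections import deque


def degree(adj):
    """Number of neighbours per node."""
    return {v: len(nbrs) for v, nbrs in adj.items()}


def betweenness(adj):
    """Unnormalised betweenness centrality (Brandes, unweighted, undirected)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths.
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        dist = {s: 0}
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies from the farthest nodes back to s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each undirected pair was counted from both endpoints.
    return {v: b / 2 for v, b in bc.items()}


def find_hubs(adj, groups):
    """Return {group: sorted nodes with max degree or max betweenness in that group}."""
    hubs = {}
    for name, members in groups.items():
        members = set(members)
        # Subgraph induced by the group (assumption: metrics are group-local).
        sub = {v: [w for w in adj[v] if w in members] for v in members}
        deg, btw = degree(sub), betweenness(sub)
        top_deg, top_btw = max(deg.values()), max(btw.values())
        hubs[name] = sorted(
            v for v in members if deg[v] == top_deg or btw[v] == top_btw
        )
    return hubs
```

The two metrics can disagree: in two dense clusters joined through a single bridge node, the bridge has the highest betweenness while its neighbours have the highest degree, so all of them are reported as hubs. In a group where every node ties on both metrics, every node is returned.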

Installation ⚙️

conda create --name ENV_NAME python=3.12
pip install -r requirements.txt

You also need the R packages used in the R scripts, such as DESeq2, edgeR, ggplot2, ...

Run the framework ▶️

You need the HIVE selection present in Data/, or any other matrix.

From the raw transcriptomics table: run DEA/DEA.R to perform the DEA and get the required Wald stats, then DEA/Merge_matrix.ipynb to build the merged data used in the following steps.

GRN and activities: run GRN.ipynb to retrieve the necessary networks (GRN and KEGG) from TomTom and check them, then TF_pathway_activity.ipynb to compute TF and pathway activities.

TDA: run TDA/Prepare_data.ipynb to format the data for TDA, then TDA/mapper.py to obtain the TDA network colored for all pathogens and the four configurations. Finally, run TDA/Hub_TDA.ipynb to detect hubs in each configuration and TDA/Pathway_acts_hubs.ipynb to check pathway activity in the sub-GRN of the hubs.

Most of the plots are obtained with Plot/Plot_clean.ipynb or directly within TF_pathway_activity.ipynb.

Reference ✍️

You can find all the detailed and explained results here

DOI
