The dataset covers tomato under 7 infection conditions (Meloidogyne incognita at 7 and 14 dpi, Botrytis cinerea, Phytophthora infestans, Cladosporium fulvum, and Potato spindle tuber viroid mild and severe strains), for a total of 83 samples. The data are publicly available through their respective BioProjects:
Infection | Tissue | Controls hpi | Controls replicates | Infected hpi | Infected Replicates | BioProject | Reference DOI |
---|---|---|---|---|---|---|---|
M.incognita | Root | 168 | 8 | 168 | 8 | PRJNA734743 | DOI |
M.incognita | Root | 336 | 8 | 336 | 8 | PRJNA734743 | DOI |
PSTVd Mild strain | Root | 408 | 6 | 408 | 6 | PRJNA515609 | DOI |
PSTVd Severe strain | Root | 408 | 6 (shared with mild strain) | 408 | 6 | PRJNA515609 | DOI |
B.cinerea | Leaf | 0 | 7 | 30 | 8 | PRJNA662936 | DOI |
P.infestans | Leaf | 0 | 6 | 72 | 6 | PRJNA505207 | DOI |
C.fulvum | Leaf | 72 | 3 | 72 | 3 | PRJNA781749 | DOI |
We applied HIVE to the bulk RNA-seq data from the multiple transcriptomic studies above. HIVE returned a list of genes, available in the data folder, but the following framework can be applied to any gene list. From this list, we retrieved the gene regulatory network (GRN) from the TomTom neo4j database. We then curated the GRN to balance confidence and sparsity.
We used decoupleR's ULM to infer TF activities and kept the significant ones. As input, we used the GRN above and the test statistics (Wald statistics) output by DESeq2, run on each infection independently. The analysis yielded 43 significant regulatory TFs out of the 71 present in the GRN. We then used decoupleR's MLM to infer pathway activities from KEGG pathways (also available through TomTom).
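To make the ULM step concrete, here is a minimal pure-Python sketch of the idea behind a univariate linear model: for each TF, the gene-level statistics are regressed on that TF's target weights, and the t-value of the slope is taken as the activity score. This is not decoupleR's implementation and does not compute p-values; the function names (`ulm_score`, `ulm`), the toy genes, and the toy network are all hypothetical.

```python
import math

def ulm_score(stats, weights):
    """t-value of the slope when regressing gene statistics on one TF's weights.

    stats:   {gene: statistic} for every measured gene (e.g. a Wald statistic)
    weights: {gene: weight} for the TF's targets; non-targets count as 0
    """
    genes = sorted(stats)
    n = len(genes)
    if n < 3:
        raise ValueError("need at least 3 genes to compute a t-value")
    x = [weights.get(g, 0.0) for g in genes]
    y = [stats[g] for g in genes]
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0  # no variation, so no slope can be estimated
    r = sxy / math.sqrt(sxx * syy)
    if 1 - r * r <= 0:
        return math.copysign(math.inf, r)  # perfect linear fit
    # For simple linear regression, t(slope) = r * sqrt((n - 2) / (1 - r^2))
    return r * math.sqrt((n - 2) / (1 - r * r))

def ulm(stats, net):
    """Score every TF in `net`, a list of (source, target, weight) tuples."""
    by_tf = {}
    for source, target, weight in net:
        by_tf.setdefault(source, {})[target] = weight
    return {tf: ulm_score(stats, w) for tf, w in by_tf.items()}

# Hypothetical per-gene statistics for one infection
toy_stats = {"g1": 3.0, "g2": 2.5, "g3": -2.0, "g4": 0.1, "g5": -0.2, "g6": 0.0}
# Hypothetical GRN: TF_A activates g1 and g2 and represses g3; TF_B activates g4 and g5
toy_net = [
    ("TF_A", "g1", 1.0), ("TF_A", "g2", 1.0), ("TF_A", "g3", -1.0),
    ("TF_B", "g4", 1.0), ("TF_B", "g5", 1.0),
]

activities = ulm(toy_stats, toy_net)
for tf, score in sorted(activities.items()):
    print(tf, round(score, 2))
```

In this toy example, TF_A gets a positive score because its activated targets are up and its repressed target is down, while TF_B gets a negative score because its targets sit below the average statistic.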
Topological Data Analysis (TDA) was performed on the same GRN with the corresponding TF activities. We first applied the Mapper algorithm to obtain a simpler representation of the GRN, then used the ToMATo algorithm to find groups on the resulting Mapper graph.
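As an illustration of the Mapper step only, the sketch below builds a Mapper graph from a small network: nodes are covered by overlapping intervals of a lens value, each interval's nodes are clustered by connected components, and two clusters are linked when they share a node. It assumes a single scalar lens per node (for example one TF activity value) and an undirected adjacency dictionary. The actual lens, cover, and clustering used in `TDA/mapper.py` may differ, and the ToMATo grouping is not shown. All names here are hypothetical.

```python
def interval_cover(lo, hi, n_intervals, overlap):
    """Split [lo, hi] into equal intervals, each widened by an `overlap` fraction."""
    width = (hi - lo) / n_intervals
    pad = width * overlap / 2
    return [(lo + i * width - pad, lo + (i + 1) * width + pad)
            for i in range(n_intervals)]

def components(nodes, adj):
    """Connected components of the subgraph induced by `nodes`."""
    nodes = set(nodes)
    seen, comps = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        comp, todo = set(), [start]
        while todo:
            v = todo.pop()
            if v in comp:
                continue
            comp.add(v)
            todo.extend(w for w in adj[v] if w in nodes and w not in comp)
        seen |= comp
        comps.append(frozenset(comp))
    return comps

def mapper(adj, lens, n_intervals=3, overlap=0.5):
    """Return (clusters, edges): clusters are node sets, edges link overlapping clusters."""
    lo, hi = min(lens.values()), max(lens.values())
    clusters = []
    for a, b in interval_cover(lo, hi, n_intervals, overlap):
        members = [v for v in adj if a <= lens[v] <= b]
        clusters.extend(components(members, adj))
    edges = {(i, j)
             for i in range(len(clusters))
             for j in range(i + 1, len(clusters))
             if clusters[i] & clusters[j]}
    return clusters, edges

# Hypothetical toy network: a path n0 - n1 - ... - n5, with the node index as lens
names = [f"n{i}" for i in range(6)]
toy_adj = {v: set() for v in names}
for u, v in zip(names, names[1:]):
    toy_adj[u].add(v)
    toy_adj[v].add(u)
toy_lens = {v: float(i) for i, v in enumerate(names)}

clusters, edges = mapper(toy_adj, toy_lens)
for i, c in enumerate(clusters):
    print(i, sorted(c))
print(sorted(edges))
```

On this toy path, the six nodes collapse into three overlapping clusters connected in a chain, which is the kind of simplified representation Mapper is meant to provide.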
To find representative nodes in each group, we performed hub detection using degree and betweenness as metrics. A node is identified as a hub if it has the maximum degree or the maximum betweenness within its group.
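The hub rule above can be sketched in pure Python as follows. This sketch assumes the network is treated as undirected and unweighted, that degree and betweenness are computed on the whole network and then compared within each group, and that ties all count as hubs. These choices are assumptions and may not match `TDA/Hub_TDA.ipynb`. Betweenness is computed with Brandes' algorithm, and the toy graph and group labels are hypothetical.

```python
from collections import deque

def betweenness(adj):
    """Unnormalized betweenness centrality (Brandes) for an undirected, unweighted graph."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints
    return {v: b / 2 for v, b in bc.items()}

def find_hubs(adj, groups):
    """Per group, return the nodes with maximum degree or maximum betweenness."""
    degree = {v: len(adj[v]) for v in adj}
    between = betweenness(adj)
    members = {}
    for node, group in groups.items():
        members.setdefault(group, []).append(node)
    hubs = {}
    for group, nodes in members.items():
        max_deg = max(degree[v] for v in nodes)
        max_btw = max(between[v] for v in nodes)
        hubs[group] = {v for v in nodes
                       if degree[v] == max_deg or between[v] == max_btw}
    return hubs

# Hypothetical toy network and group labels
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("D", "E"),
             ("E", "F"), ("E", "G"), ("E", "H")]
toy_adj = {}
for u, v in toy_edges:
    toy_adj.setdefault(u, set()).add(v)
    toy_adj.setdefault(v, set()).add(u)
toy_groups = {"A": 0, "B": 0, "C": 0, "D": 0, "E": 1, "F": 1, "G": 1, "H": 1}

hubs = find_hubs(toy_adj, toy_groups)
for group in sorted(hubs):
    print(group, sorted(hubs[group]))
```

In the toy network, group 0 ends up with two hubs: A has the highest degree, while D has the highest betweenness because it bridges the two groups. This shows why the two metrics are combined with "or" rather than relying on one of them.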
conda create --name ENV_NAME python=3.12

pip install -r requirements.txt
You also need the R packages used in the R scripts, such as DESeq2, edgeR, ggplot2, ...
You need the HIVE selection present in Data/, or any other matrix.
From the raw transcriptomics table:
Run `DEA/DEA.R` to perform the differential expression analysis (DEA) and get the required Wald statistics. Then run `DEA/Merge_matrix.ipynb` to get the merged data used in the following steps.
GRN and Activities:
Run `GRN.ipynb` to retrieve the necessary networks (GRN and KEGG) from TomTom and check them. Then run `TF_pathway_activity.ipynb` to infer TF and pathway activities.
TDA:
Run `TDA/Prepare_data.ipynb` to format the data for TDA. Run `TDA/mapper.py` to obtain the TDA network colored for all pathogens and the four configurations. Finally, run `TDA/Hub_TDA.ipynb` to detect hubs in each configuration and `TDA/Pathway_acts_hubs.ipynb` to check pathway activities in the sub-GRN of the hubs.
Most of the plots are obtained with `Plot/Plot_clean.ipynb` or directly within `TF_pathway_activity.ipynb`.
You can find all the detailed and explained results here