This repository is a PyTorch re-implementation of PANDA (Prioritization of autism‐genes using network‐based deep‐learning approach).
The code environment for this project is managed with Anaconda.
To create an environment and install all dependencies:
conda env create -f environment.yml
To clone and run this project locally:
git clone https://github.com/ChengyuanSha/PANDA
- `data` folder:
  - HMIN_edgelist.csv: Human molecular interaction network (HMIN) in edge list format
  - labeled_genes.csv: Graph node labels corresponding to HMIN
- `src` folder contains the implementation code:
  - GCN.py: definition of the graph convolutional network model
  - main.ipynb: the training and evaluation Jupyter notebook
  - read_data.py: data preprocessing
  - baselines.ipynb: other algorithms to compare with
- `tests` folder: tests for quality assurance
- `experiments` folder: other testing experiments
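
As a rough illustration of how the two data files relate, the sketch below reads an edge list into an adjacency map and reads node labels into a dictionary. It is not the code in read_data.py. The column layout (two node IDs per edge row, a node ID followed by a label per label row, no header row) and the helper names `load_edge_list` and `load_labels` are assumptions made for this example, and the sample rows are invented.

```python
import csv
import io
from collections import defaultdict


def load_edge_list(handle):
    """Read an undirected edge list (two node IDs per row) into an adjacency map."""
    adjacency = defaultdict(set)
    for row in csv.reader(handle):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        source, target = row[0].strip(), row[1].strip()
        adjacency[source].add(target)
        adjacency[target].add(source)
    return dict(adjacency)


def load_labels(handle):
    """Read (node ID, label) rows into a dictionary keyed by node ID."""
    return {
        row[0].strip(): row[1].strip()
        for row in csv.reader(handle)
        if len(row) >= 2
    }


# Tiny in-memory stand-ins for the real CSV files (hypothetical contents).
sample_edges = io.StringIO("G1,G2\nG2,G3\n")
sample_labels = io.StringIO("G1,1\nG3,0\n")

adjacency = load_edge_list(sample_edges)
labels = load_labels(sample_labels)

print(sorted(adjacency["G2"]))  # neighbours of G2
print(labels)
```

To try this on the real files, pass a file handle instead of the in-memory samples, for example `open("data/HMIN_edgelist.csv", newline="")`, and adjust the column handling if the files contain a header row or extra columns.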
Our model trains quickly because the dataset is small.
To train our model, run `main.ipynb` in `src`. The training code is under the `Model training` section.
To save the model, uncomment the code in the `Save model` section.
To evaluate our model, run `main.ipynb` in `src`. The evaluation code is under the `Model Evaluation` section.
Our model was trained on HMIN with the labeled autism genes dataset.
The pretrained model is available at `src/pretrained_model.pth`.
The baseline code can be found in `src/baselines.ipynb`, under the sections `Random Forest`, `SVM`, and `LGP`.
Our model (PANDA) achieves the following performance compared with Random Forest (RF), Support Vector Machine (SVM), and Linear Genetic Programming (LGP) on the HMIN dataset:
Model name | Accuracy | Precision | Recall | F1 score |
---|---|---|---|---|
PANDA | 88% | 1.00 | 0.88 | 0.93 |
RF | 86% | 0.98 | 0.84 | 0.91 |
SVM | 85% | 0.99 | 0.85 | 0.92 |
LGP | 86% | 0.98 | 0.84 | 0.91 |
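
For readers unfamiliar with the four metrics in the table, here is a minimal sketch of how they are defined for a binary classifier. It is not the evaluation code from the notebooks. The function name `classification_metrics` and the toy labels are hypothetical, and the averaging scheme behind the reported numbers is not stated here, so this example is not expected to reproduce the table.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels for illustration only (not taken from the HMIN dataset).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

With these toy labels there are three true positives, three true negatives, one false positive, and one false negative, so all four metrics come out to 0.75.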
If you'd like to contribute or have any suggestions, you can open an issue on this GitHub repository. All contributions are welcome!
All content in this repository is licensed under the MIT license.
Zhang, Y., Chen, Y., & Hu, T. (2020). PANDA: Prioritization of autism‐genes using network‐based deep‐learning approach. Genetic Epidemiology, 44(4), 382–394.