Hyperparameter tuning using bio-inspired algorithms in GCN and GAN models for link prediction in a PPI network
This project compares biologically inspired algorithms (Genetic Algorithm, Particle Swarm Optimization, Ant Colony Optimization, Artificial Bee Colony) with classic methods (Grid Search, Simulated Annealing, Hill Climbing, Random Search), as well as Optuna and Bayesian Optimization, for hyperparameter optimization of Graph Convolutional Network (GCN) and Generative Adversarial Network (GAN) models. The models are trained for link prediction in a protein-protein interaction network from Stanford SNAP.
This repository explores the PP-Pathways dataset from the Stanford SNAP BioData collection. It represents a large-scale protein-protein interaction (PPI) network derived from pathway databases.
- Nodes: 21,554 proteins
- Edges: 342,338 interactions
- Data Type: Undirected, unweighted graph
- Source: Pathway-based protein associations
- Format: Edge list (.csv) with each row representing a protein-protein interaction
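As a rough illustration of the edge-list format described above, the following sketch reads such a file into an index of proteins and a set of undirected edges. It assumes two identifier columns per row and no header row; the function name `load_edge_list` and the sample identifiers are hypothetical and not taken from the repository. Dropping self-loops and reversed duplicates is a choice made for this sketch.

```python
import csv
import io


def load_edge_list(handle):
    """Read an undirected edge list and return (node_index, edges).

    Each row is expected to hold two protein identifiers. Self-loops and
    duplicate edges (including reversed duplicates) are dropped.
    """
    node_index = {}  # protein ID -> contiguous integer index
    edges = set()    # undirected edges stored as (low, high) index pairs
    for row in csv.reader(handle):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        u = node_index.setdefault(row[0].strip(), len(node_index))
        v = node_index.setdefault(row[1].strip(), len(node_index))
        if u != v:
            edges.add((min(u, v), max(u, v)))
    return node_index, sorted(edges)


if __name__ == "__main__":
    # Made-up sample rows; the second row repeats the first in reverse.
    sample = io.StringIO("1394,2778\n2778,1394\n6331,17999\n")
    node_index, edges = load_edge_list(sample)
    print(len(node_index), len(edges))  # 4 nodes, 2 edges
```

For the real dataset, pass an open file handle instead of the in-memory sample. The resulting counts may differ slightly from the figures listed above if the file contains a header, self-loops, or duplicate rows.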
The following results were obtained after training each GCN and GAN model for 10 epochs.
Model | Algorithm | F1 | AUC | Loss / Avg Loss | NDCG | Hidden Channels | Learning Rate | # Layers | Dropout | Time |
---|---|---|---|---|---|---|---|---|---|---|
GCN | None | 0.8071 | 0.8775 | 1.2861 | 0.9875 | 256 | 0.01 | 2 | 0 | 1m 19s |
GCN | GA | 0.8506 | 0.9125 | 1.2876 | 0.9913 | 73 | 0.0122 | 3 | 0.4 | 9m 20s |
GCN | PSO | 0.8506 | 0.779 | 1.392 | 0.968 | 107 | 0.01339 | 3 | 0.55 | 11m 56s |
GCN | ABC | 0.8496 | 0.8831 | 1.4161 | 0.9885 | 106 | 0.00691 | 3 | 0.15 | 14m 28s |
GCN | Simulated Annealing | 0.8435 | 0.7841 | 1.3751 | 0.9685 | 160 | 0.00974 | 3 | 0.23 | 13m 18s |
GCN | Hill Climbing | 0.843 | 0.9109 | 1.3705 | 0.9914 | 244 | 0.01102 | 3 | 0.66 | 9m 10s |
GCN | Random Search | 0.8493 | 0.9147 | 1.2535 | 0.992 | 38 | 0.01378 | 3 | 0.18 | 12m 20s |
GCN | ACO | 0.8419 | 0.9145 | 1.2698 | 0.9918 | 224 | 0.00215 | 3 | 0.7 | 4m 30s |
GCN | Bayesian Search | 0.8504 | 0.9154 | 1.2674 | 0.9911 | 80 | 0.00785 | 4 | 0.1 | 10m 22s |
GCN | Grid Search | 0.8497 | 0.9151 | 1.2031 | 0.992 | 64 | 0.01 | 3 | 0.3 | 9m 13s |
GCN | Optuna | 0.8512 | 0.9166 | 1.2522 | 0.992 | 277 | 0.00153 | 4 | 0 | 22m 22s |
GAN | None | 0.7337 | 0.7528 | -0.3044 | 0.9699 | 256 | 1e-4 | - | 0.3 | 1m 21s |
GAN | GA | 0.7538 | 0.7772 | 0.0088 | 0.9723 | 459 | 0.0019 | - | 0.33 | 8m 59s |
GAN | PSO | 0.7571 | 0.7781 | 0.0147 | 0.9706 | 512 | 0.002 | - | 0.31 | 15m 22s |
GAN | ABC | 0.7545 | 0.7773 | 0.0421 | 0.9724 | 480 | 0.00136 | - | 0.22 | 8m 32s |
GAN | Simulated Annealing | 0.7584 | 0.7583 | -0.026 | 0.9699 | 412 | 0.002 | - | 0.1 | 18m 12s |
GAN | Hill Climbing | 0.7559 | 0.7773 | 0.0015 | 0.9726 | 393 | 0.002 | - | 0.4 | 11m 47s |
GAN | Random Search | 0.7541 | 0.7752 | 0.0432 | 0.9725 | 367 | 0.00166 | - | 0.2 | 8m 36s |
GAN | ACO | 0.7790 | 0.7736 | 0.5720 | 0.9716 | 64 | 0.00167 | - | 0.4 | 13m 29s |
GAN | Bayesian Search | 0.7505 | 0.7743 | 0.2257 | 0.9725 | 144 | 0.00062 | - | 0.35 | 9m 55s |
GAN | Grid Search | 0.7428 | 0.763 | 0.0554 | 0.9712 | 512 | 0.0001 | - | 0 | 12m 6s |
GAN | Optuna | 0.756 | 0.7817 | 0.2407 | 0.9738 | 373 | 0.00068 | - | 0.51 | 20m 24s |
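
To give a feel for what the optimizers in the table search over, here is a minimal random-search sketch across the four tuned hyperparameters. It is not the repository's implementation: the bounds in `SEARCH_SPACE` are assumptions chosen only to cover the values reported above, and `toy_objective` is a hypothetical placeholder for "train the model and return a validation score".

```python
import math
import random

# Hypothetical search space: the bounds are assumptions chosen to cover the
# values reported in the table, not the ranges used in the repository.
SEARCH_SPACE = {
    "hidden_channels": (32, 512),   # integer
    "learning_rate": (1e-4, 2e-2),  # float, sampled log-uniformly
    "num_layers": (2, 4),           # integer
    "dropout": (0.0, 0.7),          # float
}


def sample_config(rng):
    """Draw one hyperparameter configuration from SEARCH_SPACE."""
    lr_low, lr_high = SEARCH_SPACE["learning_rate"]
    return {
        "hidden_channels": rng.randint(*SEARCH_SPACE["hidden_channels"]),
        "learning_rate": math.exp(rng.uniform(math.log(lr_low), math.log(lr_high))),
        "num_layers": rng.randint(*SEARCH_SPACE["num_layers"]),
        "dropout": rng.uniform(*SEARCH_SPACE["dropout"]),
    }


def random_search(objective, n_trials, seed=0):
    """Evaluate n_trials random configurations and return the best one.

    `objective` takes a config dict and returns a score to maximize.
    """
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        config = sample_config(rng)
        score = objective(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


def toy_objective(config):
    """Placeholder for 'train the model and return a validation metric'."""
    return -abs(config["dropout"] - 0.3) - abs(math.log10(config["learning_rate"]) + 2)


if __name__ == "__main__":
    best, score = random_search(toy_objective, n_trials=50)
    print(best, round(score, 4))
```

In a real run, `toy_objective` would be replaced by a function that builds the GCN or GAN with the sampled configuration, trains it, and returns a metric such as AUC. The other optimizers differ mainly in how they propose the next configuration to evaluate.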
- Python 3.8+
- scikit-learn
- PyTorch
- PyTorch Geometric
- NiaPy
git clone https://github.com/milagjurovska/PPI-link-prediction-with-optimized-gcn-and-gan.git
cd PPI-link-prediction-with-optimized-gcn-and-gan
The results can be viewed in the provided Jupyter Notebook. To run the code as a plain Python script instead, use the results.py file.