This repository contains the code for our paper "GNNPerf: Towards Effective Performance Profiling and Analysis across GNN Frameworks".
Our Python environment is based on CUDA version 11.8. Please ensure you have a compatible CUDA setup before proceeding.
```bash
# Clone the repository
git clone https://github.com/buaa-hipo/GNNPerf.git

# Create and activate a conda environment
conda create -n gnnperf python=3.11
conda activate gnnperf

# Run the installation script
cd GNNPerf
bash install.sh
```
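Before installing, it can help to confirm that the CUDA toolkit on your `PATH` matches version 11.8. The helper below is a hypothetical convenience (not part of GNNPerf) that parses the version out of the `nvcc --version` banner; `nvcc` may not be visible inside a fresh conda environment until CUDA is set up.

```python
import re
import subprocess


def parse_cuda_version(nvcc_output: str) -> str:
    """Extract the 'release X.Y' version from `nvcc --version` output."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    if match is None:
        raise ValueError("could not find a CUDA release version in the output")
    return match.group(1)


def installed_cuda_version() -> str:
    """Run `nvcc --version` and return the toolkit version, e.g. '11.8'."""
    out = subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout
    return parse_cuda_version(out)


# Example nvcc banner (the parsing is what matters here):
sample = "Cuda compilation tools, release 11.8, V11.8.89"
print(parse_cuda_version(sample))  # -> 11.8
```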
The datasets used in the experiments can be downloaded from here. Download the datasets (`.graph` files) and place them in the `GNNPerf/datasets` directory. Alternatively, you can modify the `dataset_prefix` variable in `GNNPerf/examples/run.py` to point to the location where your datasets are stored.
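If you relocate the datasets, a quick sanity check avoids confusing failures mid-run. The helper below is a hypothetical convenience (not part of GNNPerf), assuming the datasets are stored as `<name>.graph` files as described above:

```python
from pathlib import Path


def missing_datasets(dataset_prefix, names):
    """Return the dataset names whose .graph file is absent under dataset_prefix."""
    prefix = Path(dataset_prefix)
    return [name for name in names if not (prefix / f"{name}.graph").is_file()]


# Usage: check the datasets used in the experiments before starting a run.
# missing_datasets("GNNPerf/datasets", ["cora", "citeseer", "pubmed"])
```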
Our experiments were conducted on a platform with two Intel Xeon E5-2680 v4 CPUs and an NVIDIA V100 GPU (32 GB of memory).
The end-to-end script is `GNNPerf/examples/run.py`, and the training task process is defined in `GNNPerf/examples/task.py`. To reproduce our results, run the following commands:

```bash
cd GNNPerf/examples
python run.py
```

During execution, warnings and errors are redirected to `GNNPerf/examples/profiler_errors.txt`. After completion, the results are written to the `GNNPerf/examples/results` directory:
- Figure 2 - `motivation_training_time.pdf`
- Figure 4(a) - `training_time.pdf`
- Figure 4(b) - `peak_memory.pdf`
- Figure 4(c) - `device.pdf`
- Figure 5 - `operators.pdf`
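After a long run, it is easy to miss that one of the figures failed to render. As a minimal sketch (a hypothetical helper, not part of GNNPerf), the mapping above can be checked programmatically:

```python
from pathlib import Path

# Figures expected in GNNPerf/examples/results after a full run,
# keyed by the figure they reproduce (mapping taken from the list above).
EXPECTED_FIGURES = {
    "Figure 2": "motivation_training_time.pdf",
    "Figure 4(a)": "training_time.pdf",
    "Figure 4(b)": "peak_memory.pdf",
    "Figure 4(c)": "device.pdf",
    "Figure 5": "operators.pdf",
}


def unfinished_figures(results_dir):
    """Return the figures whose PDF is not yet present in results_dir."""
    results = Path(results_dir)
    return [fig for fig, pdf in EXPECTED_FIGURES.items()
            if not (results / pdf).is_file()]


# Usage: unfinished_figures("GNNPerf/examples/results")
```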
We provide several examples in `GNNPerf/examples` to demonstrate the usage of GUL.
The core components of GNNPerf are as follows:

- `gnnperf.translate`: Translates GUL into code for DGL or PyG.
- `gnnperf.profile`: Profiles model training.
- `gnnperf.analyze`: Analyzes and visualizes performance statistics.
Here is a simple example:
```python
from gnnperf import translate, profile, analyze

models = ["GCN", "SAGE"]
datasets = ["cora", "citeseer", "pubmed"]

for model in models:
    # Translation
    translate(
        modes=["dgl", "pyg"], labels=["DGL", "PyG"], source_path=f"./{model}/gul.py"
    )
    # Profiling
    for dataset in datasets:
        print(f"Profiling {model} on {dataset}...")
        profile(
            modes=["dgl", "pyg"],  # Framework modes to use
            labels=["DGL", "PyG"],  # Corresponding labels for the modes
            model_path=f"./{model}",  # Path to the model
            task_path="./task.py",  # Path to the task definition file
            dataset_path=f"../datasets/{dataset}.graph",  # Path to the dataset
            epoch=200,  # Number of training epochs
            device=0,  # GPU device ID
            test_times=20,  # Number of repetitions for performance measurement
        )

# Analyzing and Visualizing
# Configure charts to draw: quick_start_training_time / quick_start_operators
charts = [
    {
        "type": "histogram",  # Type of chart
        "save_fig": "quick_start_training_time.pdf",  # Output file name
        "bar": "training_time",  # Data to plot
        "y_label_bar": "Training time (s)",  # Y-axis label
    },
    {
        "type": "operator",
        "save_fig": "quick_start_operators.pdf",
        "titles": [
            "(a) Operator decomposition for DGL's models.",
            "(b) Operator decomposition for PyG's models.",
        ],  # Subplot titles
    },
]
analyze(
    model_prefix="./",  # Directory containing the models
    models=models,  # List of models to analyze
    datasets=datasets,  # List of datasets used
    labels=["DGL", "PyG"],  # Labels for the frameworks
    GPU_clock_frequency=1230 * 10**6,  # GPU clock frequency in Hz (V100)
    file_name="perf_quick_start.json",  # Output file for analysis results
    # Visualization settings
    charts=charts,  # Configuration for the charts to generate
)
```
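One detail worth flagging when setting `GPU_clock_frequency`: in Python, `^` is bitwise XOR, not exponentiation, so writing the frequency as `1230 * 10 ^ 6` silently produces the wrong number. The snippet below (plain Python, no GNNPerf dependency) shows the difference:

```python
# `**` is exponentiation; `^` is bitwise XOR (and binds looser than `*`).
correct = 1230 * 10**6   # 1,230,000,000 Hz -- the value intended above
wrong = 1230 * 10 ^ 6    # parsed as (1230 * 10) ^ 6 = 12300 XOR 6

print(correct)  # -> 1230000000
print(wrong)    # -> 12298
```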
If you use this project in your research, please cite our paper:
```bibtex
@inproceedings{Ma2025gnnperf,
  title={GNNPerf: Towards Effective Performance Profiling and Analysis across GNN Frameworks},
  author={Kejie Ma and Hailong Yang and Zizheng Zhang and Xin You and Zhibo Xuan and Qingxiao Sun and Zhongzhi Luan and Yi Liu and Depei Qian},
  booktitle={2025 IEEE International Parallel and Distributed Processing Symposium (IPDPS), June 3-7, 2025, Milano, Italy},
  year={2025}
}
```