Advancing Graph Generation through Beta Diffusion (ICLR 2025)

By Xinyang Liu*¹, Yilin He*¹, Bo Chen², Mingyuan Zhou¹
¹The University of Texas at Austin, ²Xidian University
*Equal contribution

We introduce Graph Beta Diffusion (GBD), a model specifically designed to handle the diverse nature of graph data, including both continuous and discrete graph elements. We also propose a modulation technique that enhances the realism of generated graphs by stabilizing critical graph topology while maintaining flexibility for other components. With these two components, GBD competes strongly with existing models across multiple general and biochemical graph benchmarks.

In this repo, we will release (updating):

Environment Dependencies ✅
Checkpoints 📌
Reproducing Experiments ✅
- Dataset preparations
- Configurations
- Training
- Generation and Evaluation
Visualization ✅

Environment Dependencies

Our code is built on DiGress. Please follow the environment installation instructions in the DiGress repository.

Reproducing Experiments

1. Dataset preparations

We provide two commands for generating general graph datasets as follows:

python src/datasets/data_generators_gdss.py --dataset <dataset>

where <dataset> is one of the general graph datasets: community_small, ego_small and grid. This will create the <dataset>.pkl file in the data directory.

python src/datasets/data_generators_grum.py --dataset <dataset> --mmd

where <dataset> is one of the general graph datasets: planar and sbm. This will create the <dataset>.pkl file in the data directory.

For the molecule datasets:

python src/datasets/preprocess.py --dataset <dataset>
python src/datasets/preprocess_for_nspdk.py --dataset <dataset>

where <dataset> is one of the 2D molecule datasets: qm9 and zinc250k.
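When several datasets need to be prepared in a row, the commands above can be collected in a small helper. The sketch below only maps each dataset name to the scripts and flags listed in this section. The function name `preparation_commands` is an illustrative assumption and is not part of the repository.

```python
import sys

# Dataset names, scripts and flags are copied from the commands in this section.
GDSS_DATASETS = ("community_small", "ego_small", "grid")
GRUM_DATASETS = ("planar", "sbm")
MOLECULE_DATASETS = ("qm9", "zinc250k")


def preparation_commands(dataset, python=sys.executable):
    """Return the argv lists that prepare `dataset` (hypothetical helper)."""
    if dataset in GDSS_DATASETS:
        return [[python, "src/datasets/data_generators_gdss.py",
                 "--dataset", dataset]]
    if dataset in GRUM_DATASETS:
        return [[python, "src/datasets/data_generators_grum.py",
                 "--dataset", dataset, "--mmd"]]
    if dataset in MOLECULE_DATASETS:
        # Molecule datasets need both preprocessing scripts.
        return [
            [python, "src/datasets/preprocess.py", "--dataset", dataset],
            [python, "src/datasets/preprocess_for_nspdk.py", "--dataset", dataset],
        ]
    raise ValueError(f"unknown dataset: {dataset!r}")


# Print the commands for one dataset without executing them.
for argv in preparation_commands("planar"):
    print(" ".join(argv))
```

Each returned list can be passed to `subprocess.run(argv, check=True)` from the repository root to actually run the step.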

Note that the .pkl/.pt files are saved in ./data/raw/<dataset>. If this directory does not exist, please create it manually.
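A minimal sketch for creating that directory ahead of time, assuming the scripts are run from the repository root. The helper name `ensure_raw_dir` is hypothetical; only the `data/raw/<dataset>` layout comes from the note above.

```python
from pathlib import Path


def ensure_raw_dir(dataset, root="data/raw"):
    """Create <root>/<dataset> if it is missing and return it as a Path."""
    target = Path(root) / dataset
    # parents=True builds intermediate directories;
    # exist_ok=True makes repeated calls safe.
    target.mkdir(parents=True, exist_ok=True)
    return target
```

For example, `ensure_raw_dir("qm9")` creates `data/raw/qm9` relative to the current working directory.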

Optional:

For Community-small, Ego-small and Grid datasets, please refer to GDSS.
For Planar, SBM, QM9 and ZINC250k datasets, please refer to GruM.

For the evaluation of general graph generation tasks, compile the ORCA program with the following commands:

cd src/analysis/orca
g++ -O2 -std=c++11 -o orca orca.cpp

2. Configurations

The configurations are provided in the configs/ directory in YAML format. The hyperparameters used in the experiments are specified in Appendix E of our paper.

3. Training

Training GBD on general and biochemical graph benchmarks:

python src/main.py +experiment=<dataset>.yaml general.gps=[0,1,2,3]

Training GBD with the concentration modulation technique:

Set concentration_m = True in the experiment config to train GBD with concentration modulation.
We provide specialized concentration modulation strategies for different datasets in concentration.py.

For a new dataset, a custom concentration modulation strategy can be defined in a similar way with the CustomConcentrationModule module.
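To give a rough idea of what a degree-based strategy might look like, here is a standalone sketch in plain Python. It is not the repository's CustomConcentrationModule, and its interface is not taken from concentration.py. The scaling rule, the function name `degree_modulated_eta`, and the constants `base_eta` and `boost` are all assumptions chosen for illustration: node pairs that touch higher-degree nodes simply receive a larger concentration value.

```python
def degree_modulated_eta(adj, base_eta=30.0, boost=2.0):
    """Return a per-edge concentration matrix from a 0/1 adjacency matrix.

    Assumed rule (illustrative only): eta[u][v] grows with the larger
    normalized degree of the two endpoints.
    """
    n = len(adj)
    degrees = [sum(row) for row in adj]
    # Fall back to 1 so graphs without edges do not divide by zero.
    max_degree = max(degrees, default=0) or 1
    eta = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            if u == v:
                continue  # no self-loops, diagonal stays at 0.0
            importance = max(degrees[u], degrees[v]) / max_degree
            eta[u][v] = base_eta * (1.0 + boost * importance)
    return eta


# Path graph 0 - 1 - 2: node 1 has the highest degree.
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
eta = degree_modulated_eta(path)
print(eta[0][1], eta[0][2])  # the pair touching node 1 gets the larger value
```

The strategies actually used for each dataset are the ones defined in concentration.py and described in the paper; this sketch only illustrates the general idea of assigning larger concentration to structurally important positions.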

4. Generation and Evaluation

Set general.resume or general.test_only to the path of a checkpoint, then run the following command to generate samples.

python src/main.py +experiment=<dataset>.yaml

or

python src/main.py +experiment=<dataset>.yaml general.resume='to/your/path'
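For scripted runs over several datasets or checkpoints, the training and generation commands can be assembled programmatically. The sketch below only reproduces the command-line overrides shown in this README. The helper name `main_command` is hypothetical, and the override keys are spelled exactly as they appear in the commands above.

```python
import sys


def main_command(dataset, gpus=None, resume=None, python=sys.executable):
    """Build the argv list for src/main.py (hypothetical helper)."""
    argv = [python, "src/main.py", f"+experiment={dataset}.yaml"]
    if gpus is not None:
        # Key spelled as in the training command of this README.
        argv.append("general.gps=[" + ",".join(str(g) for g in gpus) + "]")
    if resume is not None:
        # Checkpoint path used for generation and evaluation.
        argv.append(f"general.resume={resume}")
    return argv


print(" ".join(main_command("community_small", gpus=[0, 1, 2, 3])))
print(" ".join(main_command("community_small", resume="to/your/path")))
```

Because the result is an argv list, it can be handed to `subprocess.run(argv, check=True)` without shell quoting.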

Visualization

Figure: edge generation process of GBD with concentration modulation defined by node degree on the Community-small dataset.

python src/main.py +experiment=<dataset>.yaml general.sample_visualization=True

Set general.forward_visualization=True or general.sample_visualization=True in the config file to visualize the forward process or the reverse process of graph beta diffusion, respectively. Visualization currently supports general graphs only.

Acknowledgements

This repository builds heavily on DiGress, GDSS and GruM.

Citation

@article{liu2024advancing,
  title={Advancing Graph Generation through Beta Diffusion},
  author={Liu, Xinyang and He, Yilin and Chen, Bo and Zhou, Mingyuan},
  journal={arXiv preprint arXiv:2406.09357},
  year={2024}
}

