Awesome-AI-for-Materials-Generation | arXiv
This repository is for our paper:
Materials Generation in the Era of Artificial Intelligence: A Comprehensive Survey
Zhixun Li1,*, Bin Cao2,*, Rui Jiao3,*, Liang Wang4,*, Ding Wang4, Yang Liu2, Dingshuo Chen4, Jia Li2, Qiang Liu4, Yu Rong5, Liang Wang4, Tong-yi Zhang2,6, Jeffrey Xu Yu1
1The Chinese University of Hong Kong
2Hong Kong University of Science and Technology (Guangzhou)
3Tsinghua University
4Institute of Automation, Chinese Academy of Sciences
5DAMO Academy, Alibaba Group
6Shanghai University
🙋 Please let us know if you find a mistake or have any suggestions!
🌟 If you find this resource helpful, please consider starring this repository and citing our paper!
The discovery of novel materials with desired physical, chemical, or mechanical properties is a longstanding challenge in materials science. Deep generative models (DGMs) have emerged as a powerful tool to design new materials by learning underlying patterns from existing structure-property databases. This repository serves as:
- A taxonomy of DGM-based crystal generation methods.
- A comparative analysis of architectures, conditioning schemes, and model sizes.
- A collection of open datasets used for training generative models.
This repository presents a curated and comprehensive overview of deep generative models for crystal structure generation. It categorizes recent methods by generation mechanism (e.g., VAE, GAN, Diffusion, and LLM), summarizes key datasets, and provides links to implementations and papers for further exploration.
- [VQCrystal] Massive discovery of crystal structures across dimensionalities by leveraging vector quantization [Paper | Code]
- [KLDM] Kinetic Langevin Diffusion for Crystalline Materials Generation
- [WyckoffDiff] WyckoffDiff -- A Generative Diffusion Model for Crystal Symmetry [Paper | Code]
- [Wyckoff Transformer] Wyckoff Transformer: Generation of Symmetric Crystals [Paper | Code]
- [OMG] Open Materials Generation with Stochastic Interpolants [Paper]
- [Chemeleon] Exploration of crystal chemical space using text-guided generative artificial intelligence [Paper | Code]
- [SymmCD] SymmCD: Symmetry-Preserving Crystal Generation with Diffusion Models [Paper | Code]
- [MatterGen] Mattergen: a generative model for inorganic materials design [Paper | Code]
- [ADiT] All-atom Diffusion Transformers: Unified generative modelling of molecules and materials [Paper | Code]
- [DAO] Siamese Foundation Models for Crystal Structure Prediction [Paper]
- [CrystalGRW] CrystalGRW: Generative Modeling of Crystal Structures with Targeted Properties via Geodesic Random Walks [Paper | Code]
- [NatureLM] NatureLM: Deciphering the Language of Nature for Scientific Discovery [Paper | Code]
- [MatLLMSearch] Large Language Models Are Innate Crystal Structure Generators [Paper | Code]
- [Uni-3DAR] Uni-3DAR: Unified 3D Generation and Understanding via Autoregression on Compressed Spatial Tokens [Paper | Code]
- [TGDMat] Periodic Materials Generation using Text-Guided Joint Diffusion Model [Paper | Code]
- [UniGenX] UniGenX: Unified Generation of Sequence and Structure with Autoregressive Diffusion [Paper]
- [CrysBFN] A Periodic Bayesian Flow for Material Generation [Paper | Code]
- [WyCryst] WyCryst: Wyckoff inorganic crystal generator framework [Paper | Code]
- [MagGen] MagGen: A graph aided deep generative model for inverse design of stable, permanent magnets [Paper]
- [GAN-DDLSF] Crystal Structure Prediction Using Generative Adversarial Network with Data-Driven Latent Space Fusion Strategy [Paper]
- [NSGAN] NSGAN: a non-dominant sorting optimisation-based generative adversarial design framework for alloy discovery [Paper | Code]
- [DeepCSP] Organic crystal structure prediction via coupled generative adversarial networks and graph convolutional networks [Paper]
- [CGWGAN] CGWGAN: crystal generative framework based on Wyckoff generative adversarial network [Paper | Code]
- [Cond-CDVAE] Deep learning generative model for crystal structure prediction [Paper | Code]
- [Con-CDVAE] Con-CDVAE: A method for the conditional generation of crystal structures [Paper | Code]
- [StructRepDiff] Representation-space diffusion models for generating periodic materials [Paper]
- [DiffCSP++] Space group constrained crystal generation [Paper | Code]
- [GemsDiff] Vector field oriented diffusion model for crystal material generation [Paper | Code]
- [EquiCSP] Equivariant diffusion for crystal structure prediction [Paper | Code]
- [FlowMM] Flowmm: Generating materials with riemannian flow matching [Paper | Code]
- [SuperDiff] Diffusion models for conditional generation of hypothetical new families of superconductors [Paper | Code]
- [MOFFlow] MOFFlow: Flow Matching for Structure Prediction of Metal-Organic Frameworks [Paper | Code]
- [CrystalFlow] Crystalflow: A flow-based generative model for crystalline materials [Paper | Code]
- [CrystaLLM] Crystal structure generation with autoregressive large language modeling [Paper | Code]
- [CrystalLLM] Fine-tuned language models generate stable inorganic materials as text [Paper | Code]
- [Mat2Seq] Invariant tokenization of crystalline materials for language model enabled generation [Paper | Code]
- [MatExpert] Matexpert: Decomposing materials discovery by mimicking human experts [Paper | Code]
- [FlowLLM] FlowLLM: Flow matching for material generation with large language models as base distributions [Paper | Code]
- [GenMS] Generative hierarchical materials search [Paper]
- [LCMGM] A deep generative modeling architecture for designing lattice-constrained perovskite materials [Paper | Code]
- [VGD-CG] Inverse design of semiconductor materials with deep generative models [Paper | Code]
- [DP-CDVAE] Diffusion probabilistic models enhance variational autoencoder for crystal structure generative modeling [Paper | Code]
- [PCVAE] Pcvae: A physics-informed neural network for determining the symmetry and geometry of crystals [Paper | Code]
- [PGCGM] Evaluating the diversity and utility of materials proposed by generative models [Paper | Code]
- [P-CDVAE] Compositional Search of Stable Crystalline Structures in Multi-Component Alloys Using Generative Diffusion Models [Paper]
- [LCOMs] Latent Conservative Objective Models for Data-Driven Crystal Structure Prediction [Paper]
- [DiffCSP] Crystal structure prediction by joint equivariant diffusion [Paper | Code]
- [UniMat] Scalable diffusion for materials generation [Paper | Code]
- [MOFDiff] Mofdiff: Coarse-grained diffusion for metal-organic framework design [Paper | Code]
- [xyztransformer] Language models can generate molecules, materials, and protein binding sites directly in three dimensions as xyz, cif, and pdb files [Paper | Code]
- [SLI2Cry] An invertible, invariant crystal representation for inverse design of solid-state materials using generative deep learning [Paper | Code]
- [EMPNN] Equivariant message passing neural network for crystal material discovery [Paper | Code]
- [CHGlownet] Hierarchical GFlownet for Crystal Structure Generation [Paper]
- [Crystal-GFN] Crystal-GFN: sampling crystals with desirable properties and constraints [Paper | Code]
- [FTCP] An invertible crystallographic representation for general inverse design of inorganic crystals with targeted properties [Paper | Code]
- [iMatGen] Inverse design of solid-state materials via a continuous representation [Paper | Code]
- [Cond-DFC-VAE] 3-D inorganic crystal structure generation and property prediction via representation learning [Paper | Code]
- [GANCSP] Generative adversarial networks for crystal structure prediction [Paper | Code]
- [CCDCGAN] Constrained crystals deep convolutional generative adversarial network for the inverse design of crystal structures [Paper]
- [ZeoGAN] Inverse design of porous materials using artificial neural networks [Paper | Code]
- [MatGAN] Generative adversarial networks (GAN) based efficient sampling of chemical composition space for inverse design of inorganic materials [Paper]
- [CubicGAN] High‐throughput discovery of novel cubic crystal materials using deep generative neural networks [Paper]
- [G-SchNet] Symmetry-adapted generation of 3d point sets for the targeted discovery of molecules [Paper | Code]
Note: Dataset statistics were last updated on April 18, 2025.
| Dataset | Open Access | # Structures | Attribute | Experimental (E) or Computational (C) | In/Organic | Format | Link |
|---|---|---|---|---|---|---|---|
| COD | ✓ | 523,874 | ✗ | Both | Both | CIF | COD |
| Materials Project | ✓ | 154,718 | ✓ | C | Inorganic | CIF, API | Materials Project |
| JARVIS‑DFT | ✓ | 40,000 (3D) / 1,000 (2D) | ✓ | C | Inorganic | CIF, JSON, API | JARVIS |
| ICSD | ✗ | 318,901 | ✗ | E | Inorganic | CIF | ICSD |
| AFLOW | ✓ | 3,530,330 | ✓ | C | Inorganic | API (JSON), CIF | AFLOW |
| OQMD | ✓ | 1,226,781 | ✓ | C | Inorganic | JSON, API | OQMD |
| ICDD (PDF‑5+) | ✗ | 1,104,137 | ✗ | E | Both | PDF, TXT, CIF | ICDD |
| OMat24 | ✓ | 118,000,000 | ✓ | C | Inorganic | ASEDB (LMDB) | OMat24 |
| HKUST-CrystDB | ✓ | 718,725 | ✓ | Both | Inorganic | ASEDB | HKUST-CrystDB |
| Alexandria | ✗ | 1,500,000+ | ✗ | C | Inorganic | CIF, JSON, DGL, PyG, LMDB | Alexandria |
| CSD | ✗ | 1,250,000+ | ✗ | E | Organic | CIF | CSD |
| NOMAD | ✓ | 19,115,490 | ✓ | C | Inorganic | Raw I/O, Metainfo (JSON) | NOMAD |
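When choosing a training corpus from the table above, it can help to filter the rows programmatically. The following is a minimal sketch, not part of the repository: it copies the table rows into a string (link column omitted), parses them with the standard library, and lists the open-access datasets marked ✓ in the Attribute column, largest first. The helper names (`parse_count`, `parse_rows`) and field names are hypothetical. It also assumes that the Attribute column indicates whether property labels are provided, that a trailing `+` marks a lower bound, and that the JARVIS‑DFT 3D and 2D counts can be summed.

```python
import re

# Data rows copied from the table above (header and link column omitted).
ROWS = """\
| COD | ✓ | 523,874 | ✗ | Both | Both | CIF |
| Materials Project | ✓ | 154,718 | ✓ | C | Inorganic | CIF, API |
| JARVIS‑DFT | ✓ | 40,000 (3D) / 1,000 (2D) | ✓ | C | Inorganic | CIF, JSON, API |
| ICSD | ✗ | 318,901 | ✗ | E | Inorganic | CIF |
| AFLOW | ✓ | 3,530,330 | ✓ | C | Inorganic | API (JSON), CIF |
| OQMD | ✓ | 1,226,781 | ✓ | C | Inorganic | JSON, API |
| ICDD (PDF‑5+) | ✗ | 1,104,137 | ✗ | E | Both | PDF, TXT, CIF |
| OMat24 | ✓ | 118,000,000 | ✓ | C | Inorganic | ASEDB (LMDB) |
| HKUST-CrystDB | ✓ | 718,725 | ✓ | Both | Inorganic | ASEDB |
| Alexandria | ✗ | 1,500,000+ | ✗ | C | Inorganic | CIF, JSON, DGL, PyG, LMDB |
| CSD | ✗ | 1,250,000+ | ✗ | E | Organic | CIF |
| NOMAD | ✓ | 19,115,490 | ✓ | C | Inorganic | Raw I/O, Metainfo (JSON) |
"""


def parse_count(cell: str) -> int:
    """Sum every comma-grouped integer in a cell.

    Assumption: split counts such as '40,000 (3D) / 1,000 (2D)' are added
    together, and a trailing '+' is treated as a lower bound.
    """
    numbers = re.findall(r"\d{1,3}(?:,\d{3})+", cell)
    return sum(int(n.replace(",", "")) for n in numbers)


def parse_rows(text: str) -> list[dict]:
    """Turn pipe-delimited table rows into a list of dictionaries."""
    records = []
    for line in text.strip().splitlines():
        name, open_access, count, attribute, origin, chemistry, formats = (
            cell.strip() for cell in line.strip().strip("|").split("|")
        )
        records.append(
            {
                "name": name,
                "open_access": open_access == "✓",
                "n_structures": parse_count(count),
                "has_attributes": attribute == "✓",
                "origin": origin,        # "E", "C", or "Both"
                "chemistry": chemistry,  # "Inorganic", "Organic", or "Both"
                "formats": [f.strip() for f in formats.split(",")],
            }
        )
    return records


records = parse_rows(ROWS)

# Open-access datasets with attributes, sorted by size (largest first).
selected = sorted(
    (r for r in records if r["open_access"] and r["has_attributes"]),
    key=lambda r: r["n_structures"],
    reverse=True,
)

for r in selected:
    print(f'{r["name"]}: {r["n_structures"]:,} structures ({", ".join(r["formats"])})')
```

The same records can be filtered on other columns, for example `r["origin"] == "E"` to keep only experimentally determined structures.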
This repository is licensed under the MIT License.

