Awesome Auto-regressive in GenerativeAI

A curated list of awesome auto-regressive papers in generative AI, inspired by awesome-NeRF.

(Source: MeshAnythingV2, MeshArt, and MagicArticulate.)

Table of contents

3D Generation
Image Generation
Video Generation

Change Log:

Aug 2, 2025: Add XSpecMesh for 3D generation, SCALAR and X-Omni for image generation.
Jul 25, 2025: Add Lumos-1 for video generation, MENTOR, CSD-VAR, Lumina-mGPT 2.0 and TTS-VAR for image generation/modeling.
Jul 13, 2025: Add OmniPart and Mesh Silksong for 3D generation, DC-AR, Hita, LASADGen and LPD for image generation.
Jul 1, 2025: Add Epona and InfGen for autonomous driving, CycleVAR for image translation.
Jun 24, 2025: Add MV-AR for multi-view generation, Make it Efficient and WMAR for image generation.
Jun 18, 2025: Add Self Forcing, VideoMAR and Seaweed APT2 for video generation, AR-RAG, MADFormer, SkipVAR, TransDiff, Pisces and SpectralAR for image generation.
Jun 6, 2025: Add HMAR (hierarchical masked AR for image generation), MS_SR_VAR (image super-resolution), and AliTok (aligning token modeling between the tokenizer and the AR model).

3D Generation

3D Shape Generation

XSpecMesh: Quality-Preserving Auto-Regressive Mesh Generation Acceleration via Multi-Head Speculative Decoding, Chen et al., arXiv 2025
OmniPart: Part-Aware 3D Generation with Semantic Decoupling and Structural Cohesion, Yang et al., arXiv 2025 | Project
Mesh Silksong: Auto-Regressive Mesh Generation as Weaving Silk, Song et al., arXiv 2025 | Project | Code
LTM3D: Bridging Token Spaces for Conditional 3D Generation with Auto-Regressive Diffusion Framework, Kang et al., arXiv 2025
OctGPT: Octree-based Multiscale Autoregressive Models for 3D Shape Generation, Wei et al., SIGGRAPH 2025 | Code
Efficient Autoregressive Shape Generation via Octree-Based Adaptive Tokenization, Deng et al., arXiv 2025 | Project
TreeMeshGPT: Artistic Mesh Generation with Autoregressive Tree Sequencing, Lionar et al., CVPR 2025 | Code
DeepMesh: Auto-Regressive Artist-mesh Creation with Reinforcement Learning, Zhao et al., arXiv 2025 | Project | Code
MeshPad: Interactive Sketch Conditioned Artistic-designed Mesh Generation and Editing, Li et al., arXiv 2025 | Project | Video
AR-1-to-3: Single Image to Consistent 3D Object Generation via Next-View Prediction, Zhang et al., arXiv 2025 | Project | Code
MARS: Mesh AutoRegressive Model for 3D Shape Detailization, Gao et al., arXiv 2025
Nautilus: Locality-aware Autoencoder for Scalable Mesh Generation, Wang et al., arXiv 2025 | Project | Code
TAR3D: Creating High-Quality 3D Assets via Next-Part Prediction, Zhang et al., arXiv 2024 | Code
3D representation in 512-Byte: Variational tokenizer is the key for autoregressive 3D generation, Zhang et al., arXiv 2024 | Project | Code
3D-WAG: Hierarchical Wavelet-Guided Autoregressive Generation for High-Fidelity 3D Shapes, Medi et al., arXiv 2024
Scaling Mesh Generation via Compressive Tokenization, Weng et al., arXiv 2024 | Project | Code
EdgeRunner: Auto-regressive Auto-encoder for Artistic Mesh Generation, Tang et al., ICLR 2025 | Project | Code
MeshAnything V2: Artist-Created Mesh Generation with Adjacent Mesh Tokenization, Chen et al., arXiv 2024 | Project | Code
MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers, Chen et al., ICLR 2025 | Project | Code
MeshXL: Neural Coordinate Field for Generative 3D Foundation Models, Chen et al., NeurIPS 2024 | Project | Code
PivotMesh: Generic 3D Mesh Generation via Pivot Vertices Guidance, Weng et al., ICLR 2025 | Project | Code
MeshGPT: Generating Triangle Meshes with Decoder-Only Transformers, Siddiqui et al., CVPR 2024 Highlight | Project | Code | Video
Autoregressive 3D Shape Generation via Canonical Mapping, Cheng et al., ECCV 2022 | Code
ShapeFormer: Transformer-based Shape Completion via Sparse Representation, Yan et al., CVPR 2022 | Project | Code
AutoSDF: Shape Priors for 3D Completion, Reconstruction and Generation, Mittal et al., CVPR 2022 | Project | Code
PolyGen: An Autoregressive Generative Model of 3D Meshes, Nash et al., ICML 2020 | Code

Articulated Object Generation

MeshArt: Generating Articulated Meshes with Structure-guided Transformers, Gao et al., arXiv 2024 | Project | Code | Video | Data

Automatic Rigging

MagicArticulate: Make Your 3D Models Articulation-Ready, Song et al., CVPR 2025 | Project | Code | Video | Data
RigAnything: Template-Free Autoregressive Rigging for Diverse 3D Assets, Liu et al., arXiv 2025 | Project | Video

Motion Generation

Towards Robust and Controllable Text-to-Motion via Masked Autoregressive Diffusion, Zhang et al., arXiv 2025
ARTalk: Speech-Driven 3D Head Animation via Autoregressive Model, Chu et al., arXiv 2025 | Project
ScaMo: Exploring the Scaling Law in Autoregressive Motion Generation Model, Lu et al., arXiv 2024 | Project | Code
Synergy and Synchrony in Couple Dances, Maluleke et al., arXiv 2024 | Project
BAMM: Bidirectional Autoregressive Motion Model, Pinyoanuntapong et al., ECCV 2024 | Project | Code
MoMask: Generative Masked Modeling of 3D Human Motions, Guo et al., CVPR 2024 | Project | Code | Demo
HumanTOMATO: Text-aligned Whole-body Motion Generation, Lu et al., ICML 2024 | Project | Code
MotionGPT: Human Motion as Foreign Language, Jiang et al., NeurIPS 2023 | Project | Code | Demo
T2M-GPT: Generating Human Motion from Textual Descriptions with Discrete Representations, Zhang et al., CVPR 2023 | Project | Code | Demo
TM2T: Stochastical and Tokenized Modeling for the Reciprocal Generation of 3D Human Motions and Texts, Guo et al., ECCV 2022 | Project | Code

4D Generation

AR4D: Autoregressive 4D Generation from Monocular Videos, Zhu et al., arXiv 2025 | Project

Multi-View Generation

Auto-Regressively Generating Multi-View Consistent Images, Hu et al., arXiv 2025

Camera Generation

GenDoP: Auto-regressive Camera Trajectory Generation as a Director of Photography, Zhang et al., arXiv 2025 | Project | Code

Garment Generation

GarmentX: Autoregressive Parametric Representations for High-Fidelity 3D Garment Generation, Guo et al., arXiv 2025
DressCode: Autoregressively Sewing and Generating Garments from Text Guidance, He et al., SIGGRAPH 2024 | Project | Code | Video

CAD Generation

SkexGen: Autoregressive Generation of CAD Construction Sequences with Disentangled Codebooks, Xu et al., ICML 2022 | Project | Code | Video

Autonomous Driving

Epona: Autoregressive Diffusion World Model for Autonomous Driving, Zhang et al., arXiv 2025
InfGen: Scenario Generation as Next Token Group Prediction, Peng et al., arXiv 2025 | Project | Code

Image Generation

Image Generation

X-Omni: Reinforcement Learning Makes Discrete Autoregressive Image Generative Models Great Again, Geng et al., arXiv 2025 | Project | Code
SCALAR: Scale-wise Controllable Visual Autoregressive LeARning, Xu et al., arXiv 2025
Lumina-mGPT 2.0: Stand-Alone AutoRegressive Image Modeling, Xin et al., arXiv 2025 | Code
TTS-VAR: A Test-Time Scaling Framework for Visual Auto-Regressive Generation, Chen et al., arXiv 2025 | Code
CSD-VAR: Content-Style Decomposition in Visual Autoregressive Models, Nguyen et al., ICCV 2025
MENTOR: Efficient Multimodal-Conditioned Tuning for Autoregressive Vision Generation Models, Zhao et al., arXiv 2025 | Project | Code
DC-AR: Efficient Masked Autoregressive Image Generation with Deep Compression Hybrid Tokenizer, Wu et al., ICCV 2025 | Code
Holistic Tokenizer for Autoregressive Image Generation, Zheng et al., arXiv 2025 | Code
Autoregressive Image Generation with Linear Complexity: A Spatial-Aware Decay Perspective, Mao et al., arXiv 2025
Locality-aware Parallel Decoding for Efficient Autoregressive Image Generation, Zhang et al., arXiv 2025 | Code
CycleVAR: Repurposing Autoregressive Model for Unsupervised One-Step Image Translation, Liu et al., arXiv 2025
Make It Efficient: Dynamic Sparse Attention for Autoregressive Image Generation, Xiang et al., arXiv 2025
Watermarking Autoregressive Image Generation, Jovanović et al., arXiv 2025 | Code
AR-RAG: Autoregressive Retrieval Augmentation for Image Generation, Qi et al., arXiv 2025
MADFormer: Mixed Autoregressive and Diffusion Transformers for Continuous Image Generation, Chen et al., arXiv 2025
SkipVAR: Accelerating Visual Autoregressive Modeling via Adaptive Frequency-Aware Skipping, Li et al., arXiv 2025 | Code
Marrying Autoregressive Transformer and Diffusion with Multi-Reference Autoregression, Zhen et al., arXiv 2025 | Code
Pisces: An Auto-regressive Foundation Model for Image Understanding and Generation, Xu et al., arXiv 2025
SpectralAR: Spectral Autoregressive Visual Generation, Huang et al., arXiv 2025 | Project | Code
AliTok: Towards Sequence Modeling Alignment between Tokenizer and Autoregressive Model, Wu et al., arXiv 2025 | Code
HMAR: Efficient Hierarchical Masked Auto-Regressive Image Generation, Kumbong et al., arXiv 2025
Multi-scale Image Super Resolution with a Single Auto-Regressive Model, Sanchez et al., arXiv 2025 | Code
ReasonGen-R1: CoT for Autoregressive Image Generation model through SFT and RL, Zhang et al., arXiv 2025 | Project | Code
REOrdering Patches Improves Vision Models, Kutscher et al., arXiv 2025 | Project | Code
D-AR: Diffusion via Autoregressive Models, Gao et al., arXiv 2025 | Code
Fine-Tuning Next-Scale Visual Autoregressive Models with Group Relative Policy Optimization, Gallici et al., arXiv 2025
LayerPeeler: Autoregressive Peeling for Layer-wise Image Vectorization, Wu et al., arXiv 2025 | Project
DetailFlow: 1D Coarse-to-Fine Autoregressive Image Generation via Next-Detail Prediction, Liu et al., arXiv 2025 | Code
DiSA: Diffusion Step Annealing in Autoregressive Image Generation, Zhao et al., arXiv 2025 | Code
Hierarchical Masked Autoregressive Models with Low-Resolution Token Pivots, Zheng et al., ICML 2025 | Code
RestoreVAR: Visual Autoregressive Generation for All-in-One Image Restoration, Rajagopalan et al., arXiv 2025 | Project
Conditional Panoramic Image Generation via Masked Autoregressive Modeling, Li et al., arXiv 2025 | Project
TensorAR: Refinement is All You Need in Autoregressive Image Generation, Cheng et al., arXiv 2025
MVAR: Visual Autoregressive Modeling with Scale and Spatial Markovian Conditioning, Zhang et al., arXiv 2025 | Code
VTBench: Evaluating Visual Tokenizers for Autoregressive Image Generation, Lin et al., arXiv 2025 | Code
Token-Shuffle: Towards High-Resolution Image Generation with Autoregressive Models, Ma et al., arXiv 2025 | Project
Distilling Semantically Aware Orders for Autoregressive Image Generation, Pramanik et al., arXiv 2025
Personalized Text-to-Image Generation with Auto-Regressive Models, Sun et al., arXiv 2025 | Code
Autoregressive Distillation of Diffusion Transformers, Kim et al., CVPR 2025 Oral | Code
SimpleAR: Pushing the Frontier of Autoregressive Visual Generation through Pretraining, SFT, and RL, Wang et al., arXiv 2025 | Code
GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation, Xiong et al., arXiv 2025 | Project | Code
Anchor Token Matching: Implicit Structure Locking for Training-free AR Image Editing, Hu et al., arXiv 2025 | Code
D2C: Unlocking the Potential of Continuous Autoregressive Image Generation with Discrete Tokens, Wang et al., arXiv 2025
Unified Autoregressive Visual Generation and Understanding with Continuous Tokens, Fan et al., arXiv 2025
Neighboring Autoregressive Modeling for Efficient Visual Generation, He et al., arXiv 2025 | Project | Code
NFIG: Autoregressive Image Generation with Next-Frequency Prediction, Huang et al., arXiv 2025
Frequency Autoregressive Image Generation with Continuous Tokens, Yu et al., arXiv 2025 | Project | Code
Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation, Ren et al., arXiv 2025 | Code
FlexVAR: Flexible Visual Autoregressive Modeling without Residual Prediction, Jiao et al., arXiv 2025 | Project | Code
Autoregressive Image Generation Guided by Chains of Thought, Cai et al., arXiv 2025
Generative Autoregressive Transformers for Model-Agnostic Federated MRI Reconstruction, Nezhad et al., arXiv 2025
Infinity: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis, Han et al., arXiv 2024 | Project | Code
Parallelized Autoregressive Visual Generation, Wang et al., CVPR 2025 | Project | Code
RandAR: Decoder-only Autoregressive Visual Generation in Random Orders, Pang et al., CVPR 2025 | Project | Code
Randomized Autoregressive Visual Generation, Yu et al., arXiv 2024 | Project | Code | Demo
Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens, Fan et al., ICLR 2025
Customize Your Visual Autoregressive Recipe with Set Autoregressive Modeling, Liu et al., arXiv 2024 | Project | Code
Autoregressive Image Generation without Vector Quantization, Li et al., NeurIPS 2024 Spotlight | Code
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation, Sun et al., arXiv 2024 | Project | Code
Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction, Tian et al., NeurIPS 2024 Best Paper | Code
Muse: Text-To-Image Generation via Masked Generative Transformers, Chang et al., ICML 2023 | Project
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation, Yu et al., TMLR 2022
Autoregressive Image Generation using Residual Quantization, Lee et al., CVPR 2022 | Code
Vector-quantized Image Modeling with Improved VQGAN, Yu et al., ICLR 2022 | Project
Taming Transformers for High-Resolution Image Synthesis, Esser et al., CVPR 2021 | Code
Generative Pretraining from Pixels, Chen et al., ICML 2020 | Code
Parallel Multiscale Autoregressive Density Estimation, Reed et al., ICML 2017
Conditional Image Generation with PixelCNN Decoders, Van den Oord et al., NIPS 2016 | Code

Video Generation

Video Generation

Lumos-1: On Autoregressive Video Generation from a Unified Model Perspective, Yuan et al., arXiv 2025 | Code
Autoregressive Adversarial Post-Training for Real-Time Interactive Video Generation, Lin et al., arXiv 2025 | Project
Self Forcing: Bridging the Train-Test Gap in Autoregressive Video Diffusion, Huang et al., arXiv 2025 | Project | Code
VideoMAR: Autoregressive Video Generation with Continuous Tokens, Yu et al., arXiv 2025 | Project
Video-GPT via Next Clip Diffusion, Zhuang et al., arXiv 2025 | Project | Code
MAGI-1: Autoregressive Video Generation at Scale, Sand.ai, arXiv 2025 | Project | Code
Packing Input Frame Context in Next-Frame Prediction Models for Video Generation, Zhang et al., arXiv 2025 | Code
AR-Diffusion: Asynchronous Video Generation with Auto-Regressive Diffusion, Sun et al., CVPR 2025 | Project | Code
Autoregressive Video Generation without Vector Quantization, Deng et al., ICLR 2025 | Code
DiCoDe: Diffusion-Compressed Deep Tokens for Autoregressive Video Generation with Language Models, Li et al., arXiv 2024 | Project
Progressive Autoregressive Video Diffusion Models, Xie et al., arXiv 2024 | Project | Code
Loong: Generating Minute-level Long Videos with Autoregressive Language Models, Wang et al., arXiv 2024 | Project
ARLON: Boosting Diffusion Transformers With Autoregressive Models for Long Video Generation, Li et al., ICLR 2025 | Project
LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior, Wang et al., ICLR 2025 Oral | Project | Code
VideoPoet: A Large Language Model for Zero-Shot Video Generation, Kondratyuk et al., ICML 2024 | Project
ART•V: Auto-Regressive Text-to-Video Generation with Diffusion Models, Weng et al., arXiv 2023 | Project
Transframer: Arbitrary Frame Prediction with Generative Models, Nash et al., TMLR 2023 | Code
CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers, Hong et al., ICLR 2023 | Code
VideoGPT: Video Generation using VQ-VAE and Transformers, Yan et al., arXiv 2021 | Project | Code
