A curated list of awesome auto-regressive papers in generative AI, inspired by awesome-NeRF.
(Source: MeshAnythingV2, MeshArt, and MagicArticulate.)
Change Log:
Aug 2, 2025: Add XSpecMesh for 3D generation, SCALAR and X-Omni for image generation.
Jul 25, 2025: Add Lumos-1 for video generation, MENTOR, CSD-VAR, Lumina-mGPT 2.0 and TTS-VAR for image generation/modeling.
Jul 13, 2025: Add OmniPart and Mesh Silksong for 3D generation, DC-AR, Hita, LASADGen and LPD for image generation.
Jul 1, 2025: Add Epona and InfGen for autonomous driving, CycleVAR for image translation.
Jun 24, 2025: Add MV-AR for multi-view generation, Make It Efficient and WMAR for image generation.
Jun 18, 2025: Add Self Forcing, VideoMAR and Seaweed APT2 for video generation, AR-RAG, MADFormer, SkipVAR, TransDiff, Pisces and SpectralAR for image generation.
Jun 6, 2025: Add HMAR (hierarchical masked AR for image generation), MS_SR_VAR (image super-resolution), and AliTok (aligning token modeling between the tokenizer and the AR model).
3D Shape Generation
- XSpecMesh: Quality-Preserving Auto-Regressive Mesh Generation Acceleration via Multi-Head Speculative Decoding, Chen et al., arXiv 2025
- OmniPart: Part-Aware 3D Generation with Semantic Decoupling and Structural Cohesion, Yang et al., arXiv 2025 | Project
- Mesh Silksong: Auto-Regressive Mesh Generation as Weaving Silk, Song et al., arXiv 2025 | Project | Code
- LTM3D: Bridging Token Spaces for Conditional 3D Generation with Auto-Regressive Diffusion Framework, Kang et al., arXiv 2025
- OctGPT: Octree-based Multiscale Autoregressive Models for 3D Shape Generation, Wei et al., SIGGRAPH 2025 | Code
- Efficient Autoregressive Shape Generation via Octree-Based Adaptive Tokenization, Deng et al., arXiv 2025 | Project
- TreeMeshGPT: Artistic Mesh Generation with Autoregressive Tree Sequencing, Lionar et al., CVPR 2025 | Code
- DeepMesh: Auto-Regressive Artist-mesh Creation with Reinforcement Learning, Zhao et al., arXiv 2025 | Project | Code
- MeshPad: Interactive Sketch Conditioned Artistic-designed Mesh Generation and Editing, Li et al., arXiv 2025 | Project | Video
- AR-1-to-3: Single Image to Consistent 3D Object Generation via Next-View Prediction, Zhang et al., arXiv 2025 | Project | Code
- MARS: Mesh AutoRegressive Model for 3D Shape Detailization, Gao et al., arXiv 2025
- Nautilus: Locality-aware Autoencoder for Scalable Mesh Generation, Wang et al., arXiv 2025 | Project | Code
- TAR3D: Creating High-Quality 3D Assets via Next-Part Prediction, Zhang et al., arXiv 2024 | Code
- 3D representation in 512-Byte: Variational tokenizer is the key for autoregressive 3D generation, Zhang et al., arXiv 2024 | Project | Code
- 3D-WAG: Hierarchical Wavelet-Guided Autoregressive Generation for High-Fidelity 3D Shapes, Medi et al., arXiv 2024
- Scaling Mesh Generation via Compressive Tokenization, Weng et al., arXiv 2024 | Project | Code
- EdgeRunner: Auto-regressive Auto-encoder for Artistic Mesh Generation, Tang et al., ICLR 2025 | Project | Code
- MeshAnything V2: Artist-Created Mesh Generation with Adjacent Mesh Tokenization, Chen et al., arXiv 2024 | Project | Code
- MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers, Chen et al., ICLR 2025 | Project | Code
- MeshXL: Neural Coordinate Field for Generative 3D Foundation Models, Chen et al., NeurIPS 2024 | Project | Code
- PivotMesh: Generic 3D Mesh Generation via Pivot Vertices Guidance, Weng et al., ICLR 2025 | Project | Code
- MeshGPT: Generating Triangle Meshes with Decoder-Only Transformers, Siddiqui et al., CVPR 2024 Highlight | Project | Code | Video
- Autoregressive 3D Shape Generation via Canonical Mapping, Cheng et al., ECCV 2022 | Code
- ShapeFormer: Transformer-based Shape Completion via Sparse Representation, Yan et al., CVPR 2022 | Project | Code
- AutoSDF: Shape Priors for 3D Completion, Reconstruction and Generation, Mittal et al., CVPR 2022 | Project | Code
- PolyGen: An Autoregressive Generative Model of 3D Meshes, Nash et al., ICML 2020 | Code
Articulated Object Generation
- MeshArt: Generating Articulated Meshes with Structure-guided Transformers, Gao et al., arXiv 2024 | Project | Code | Video | Data
Automatic Rigging
- MagicArticulate: Make Your 3D Models Articulation-Ready, Song et al., CVPR 2025 | Project | Code | Video | Data
- RigAnything: Template-Free Autoregressive Rigging for Diverse 3D Assets, Liu et al., arXiv 2025 | Project | Video
Motion Generation
- Towards Robust and Controllable Text-to-Motion via Masked Autoregressive Diffusion, Zhang et al., arXiv 2025
- ARTalk: Speech-Driven 3D Head Animation via Autoregressive Model, Chu et al., arXiv 2025 | Project
- ScaMo: Exploring the Scaling Law in Autoregressive Motion Generation Model, Lu et al., arXiv 2024 | Project | Code
- Synergy and Synchrony in Couple Dances, Maluleke et al., arXiv 2024 | Project
- BAMM: Bidirectional Autoregressive Motion Model, Pinyoanuntapong et al., ECCV 2024 | Project | Code
- MoMask: Generative Masked Modeling of 3D Human Motions, Guo et al., CVPR 2024 | Project | Code | Demo
- HumanTOMATO: Text-aligned Whole-body Motion Generation, Lu et al., ICML 2024 | Project | Code
- MotionGPT: Human Motion as Foreign Language, Jiang et al., NeurIPS 2023 | Project | Code | Demo
- T2M-GPT: Generating Human Motion from Textual Descriptions with Discrete Representations, Zhang et al., CVPR 2023 | Project | Code | Demo
- TM2T: Stochastic and Tokenized Modeling for the Reciprocal Generation of 3D Human Motions and Texts, Guo et al., ECCV 2022 | Project | Code
4D Generation
- AR4D: Autoregressive 4D Generation from Monocular Videos, Zhu et al., arXiv 2025 | Project
Multi-View Generation
- Auto-Regressively Generating Multi-View Consistent Images, Hu et al., arXiv 2025
Camera Generation
- GenDoP: Auto-regressive Camera Trajectory Generation as a Director of Photography, Zhang et al., arXiv 2025 | Project | Code
Garment Generation
- GarmentX: Autoregressive Parametric Representations for High-Fidelity 3D Garment Generation, Guo et al., arXiv 2025
- DressCode: Autoregressively Sewing and Generating Garments from Text Guidance, He et al., SIGGRAPH 2024 | Project | Code | Video
CAD Generation
- SkexGen: Autoregressive Generation of CAD Construction Sequences with Disentangled Codebooks, Xu et al., ICML 2022 | Project | Code | Video
Autonomous Driving
- Epona: Autoregressive Diffusion World Model for Autonomous Driving, Zhang et al., arXiv 2025
- InfGen: Scenario Generation as Next Token Group Prediction, Peng et al., arXiv 2025 | Project | Code
Image Generation
- X-Omni: Reinforcement Learning Makes Discrete Autoregressive Image Generative Models Great Again, Geng et al., arXiv 2025 | Project | Code
- SCALAR: Scale-wise Controllable Visual Autoregressive LeARning, Xu et al., arXiv 2025
- Lumina-mGPT 2.0: Stand-Alone AutoRegressive Image Modeling, Xin et al., arXiv 2025 | Code
- TTS-VAR: A Test-Time Scaling Framework for Visual Auto-Regressive Generation, Chen et al., arXiv 2025 | Code
- CSD-VAR: Content-Style Decomposition in Visual Autoregressive Models, Nguyen et al., ICCV 2025
- MENTOR: Efficient Multimodal-Conditioned Tuning for Autoregressive Vision Generation Models, Zhao et al., arXiv 2025 | Project | Code
- DC-AR: Efficient Masked Autoregressive Image Generation with Deep Compression Hybrid Tokenizer, Wu et al., ICCV 2025 | Code
- Holistic Tokenizer for Autoregressive Image Generation, Zheng et al., arXiv 2025 | Code
- Autoregressive Image Generation with Linear Complexity: A Spatial-Aware Decay Perspective, Mao et al., arXiv 2025
- Locality-aware Parallel Decoding for Efficient Autoregressive Image Generation, Zhang et al., arXiv 2025 | Code
- CycleVAR: Repurposing Autoregressive Model for Unsupervised One-Step Image Translation, Liu et al., arXiv 2025
- Make It Efficient: Dynamic Sparse Attention for Autoregressive Image Generation, Xiang et al., arXiv 2025
- Watermarking Autoregressive Image Generation, Jovanović et al., arXiv 2025 | Code
- AR-RAG: Autoregressive Retrieval Augmentation for Image Generation, Qi et al., arXiv 2025
- MADFormer: Mixed Autoregressive and Diffusion Transformers for Continuous Image Generation, Chen et al., arXiv 2025
- SkipVAR: Accelerating Visual Autoregressive Modeling via Adaptive Frequency-Aware Skipping, Li et al., arXiv 2025 | Code
- Marrying Autoregressive Transformer and Diffusion with Multi-Reference Autoregression, Zhen et al., arXiv 2025 | Code
- Pisces: An Auto-regressive Foundation Model for Image Understanding and Generation, Xu et al., arXiv 2025
- SpectralAR: Spectral Autoregressive Visual Generation, Huang et al., arXiv 2025 | Project | Code
- AliTok: Towards Sequence Modeling Alignment between Tokenizer and Autoregressive Model, Wu et al., arXiv 2025 | Code
- HMAR: Efficient Hierarchical Masked Auto-Regressive Image Generation, Kumbong et al., arXiv 2025
- Multi-scale Image Super Resolution with a Single Auto-Regressive Model, Sanchez et al., arXiv 2025 | Code
- ReasonGen-R1: CoT for Autoregressive Image Generation model through SFT and RL, Zhang et al., arXiv 2025 | Project | Code
- REOrdering Patches Improves Vision Models, Kutscher et al., arXiv 2025 | Project | Code
- D-AR: Diffusion via Autoregressive Models, Gao et al., arXiv 2025 | Code
- Fine-Tuning Next-Scale Visual Autoregressive Models with Group Relative Policy Optimization, Gallici et al., arXiv 2025
- LayerPeeler: Autoregressive Peeling for Layer-wise Image Vectorization, Wu et al., arXiv 2025 | Project
- DetailFlow: 1D Coarse-to-Fine Autoregressive Image Generation via Next-Detail Prediction, Liu et al., arXiv 2025 | Code
- DiSA: Diffusion Step Annealing in Autoregressive Image Generation, Zhao et al., arXiv 2025 | Code
- Hierarchical Masked Autoregressive Models with Low-Resolution Token Pivots, Zheng et al., ICML 2025 | Code
- RestoreVAR: Visual Autoregressive Generation for All-in-One Image Restoration, Rajagopalan et al., arXiv 2025 | Project
- Conditional Panoramic Image Generation via Masked Autoregressive Modeling, Li et al., arXiv 2025 | Project
- TensorAR: Refinement is All You Need in Autoregressive Image Generation, Cheng et al., arXiv 2025
- MVAR: Visual Autoregressive Modeling with Scale and Spatial Markovian Conditioning, Zhang et al., arXiv 2025 | Code
- VTBench: Evaluating Visual Tokenizers for Autoregressive Image Generation, Lin et al., arXiv 2025 | Code
- Token-Shuffle: Towards High-Resolution Image Generation with Autoregressive Models, Ma et al., arXiv 2025 | Project
- Distilling Semantically Aware Orders for Autoregressive Image Generation, Pramanik et al., arXiv 2025
- Personalized Text-to-Image Generation with Auto-Regressive Models, Sun et al., arXiv 2025 | Code
- Autoregressive Distillation of Diffusion Transformers, Kim et al., CVPR 2025 Oral | Code
- SimpleAR: Pushing the Frontier of Autoregressive Visual Generation through Pretraining, SFT, and RL, Wang et al., arXiv 2025 | Code
- GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation, Xiong et al., arXiv 2025 | Project | Code
- Anchor Token Matching: Implicit Structure Locking for Training-free AR Image Editing, Hu et al., arXiv 2025 | Code
- D2C: Unlocking the Potential of Continuous Autoregressive Image Generation with Discrete Tokens, Wang et al., arXiv 2025
- Unified Autoregressive Visual Generation and Understanding with Continuous Tokens, Fan et al., arXiv 2025
- Neighboring Autoregressive Modeling for Efficient Visual Generation, He et al., arXiv 2025 | Project | Code
- NFIG: Autoregressive Image Generation with Next-Frequency Prediction, Huang et al., arXiv 2025
- Frequency Autoregressive Image Generation with Continuous Tokens, Yu et al., arXiv 2025 | Project | Code
- Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation, Ren et al., arXiv 2025 | Code
- FlexVAR: Flexible Visual Autoregressive Modeling without Residual Prediction, Jiao et al., arXiv 2025 | Project | Code
- Autoregressive Image Generation Guided by Chains of Thought, Cai et al., arXiv 2025
- Generative Autoregressive Transformers for Model-Agnostic Federated MRI Reconstruction, Nezhad et al., arXiv 2025
- Infinity: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis, Han et al., arXiv 2024 | Project | Code
- Parallelized Autoregressive Visual Generation, Wang et al., CVPR 2025 | Project | Code
- RandAR: Decoder-only Autoregressive Visual Generation in Random Orders, Pang et al., CVPR 2025 | Project | Code
- Randomized Autoregressive Visual Generation, Yu et al., arXiv 2024 | Project | Code | Demo
- Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens, Fan et al., ICLR 2025
- Customize Your Visual Autoregressive Recipe with Set Autoregressive Modeling, Liu et al., arXiv 2024 | Project | Code
- Autoregressive Image Generation without Vector Quantization, Li et al., NeurIPS 2024 Spotlight | Code
- Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation, Sun et al., arXiv 2024 | Project | Code
- Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction, Tian et al., NeurIPS 2024 Best Paper | Code
- Muse: Text-To-Image Generation via Masked Generative Transformers, Chang et al., ICML 2023 | Project
- Scaling Autoregressive Models for Content-Rich Text-to-Image Generation, Yu et al., TMLR 2022
- Autoregressive Image Generation using Residual Quantization, Lee et al., CVPR 2022 | Code
- Vector-quantized Image Modeling with Improved VQGAN, Yu et al., ICLR 2022 | Project
- Taming Transformers for High-Resolution Image Synthesis, Esser et al., CVPR 2021 | Code
- Generative Pretraining from Pixels, Chen et al., ICML 2020 | Code
- Parallel Multiscale Autoregressive Density Estimation, Reed et al., ICML 2017
- Conditional Image Generation with PixelCNN Decoders, Van den Oord et al., NIPS 2016 | Code
Video Generation
- Lumos-1: On Autoregressive Video Generation from a Unified Model Perspective, Yuan et al., arXiv 2025 | Code
- Autoregressive Adversarial Post-Training for Real-Time Interactive Video Generation, Lin et al., arXiv 2025 | Project
- Self Forcing: Bridging the Train-Test Gap in Autoregressive Video Diffusion, Huang et al., arXiv 2025 | Project | Code
- VideoMAR: Autoregressive Video Generation with Continuous Tokens, Yu et al., arXiv 2025 | Project
- Video-GPT via Next Clip Diffusion, Zhuang et al., arXiv 2025 | Project | Code
- MAGI-1: Autoregressive Video Generation at Scale, Sand.ai, arXiv 2025 | Project | Code
- Packing Input Frame Context in Next-Frame Prediction Models for Video Generation, Zhang et al., arXiv 2025 | Code
- AR-Diffusion: Asynchronous Video Generation with Auto-Regressive Diffusion, Sun et al., CVPR 2025 | Project | Code
- Autoregressive Video Generation without Vector Quantization, Deng et al., ICLR 2025 | Code
- DiCoDe: Diffusion-Compressed Deep Tokens for Autoregressive Video Generation with Language Models, Li et al., arXiv 2024 | Project
- Progressive Autoregressive Video Diffusion Models, Xie et al., arXiv 2024 | Project | Code
- Loong: Generating Minute-level Long Videos with Autoregressive Language Models, Wang et al., arXiv 2024 | Project
- ARLON: Boosting Diffusion Transformers With Autoregressive Models for Long Video Generation, Li et al., ICLR 2025 | Project
- LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior, Wang et al., ICLR 2025 Oral | Project | Code
- VideoPoet: A Large Language Model for Zero-Shot Video Generation, Kondratyuk et al., ICML 2024 | Project
- ART•V: Auto-Regressive Text-to-Video Generation with Diffusion Models, Weng et al., arXiv 2023 | Project
- Transframer: Arbitrary Frame Prediction with Generative Models, Nash et al., TMLR 2023 | Code
- CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers, Hong et al., ICLR 2023 | Code
- VideoGPT: Video Generation using VQ-VAE and Transformers, Yan et al., arXiv 2021 | Project | Code