⚡️Check out the lightning speed of LPD! (Demo video: `lpd_demo_compressed.mp4`)
[2025/07] 🔥 We release the code and models for LPD!
We present Locality-aware Parallel Decoding (LPD) to accelerate autoregressive image generation. Traditional autoregressive image generation relies on next-patch prediction, a memory-bound process that leads to high latency. Prior works have tried to accelerate this by shifting from next-patch to multi-patch prediction, but achieve only limited parallelization. To achieve high parallelization while maintaining generation quality, we introduce two key techniques: (1) Flexible Parallelized Autoregressive Modeling, a novel architecture that enables arbitrary generation orderings and degrees of parallelization. It uses learnable position query tokens to guide generation at target positions while ensuring mutual visibility among concurrently generated tokens for consistent parallel decoding. (2) Locality-aware Generation Ordering, a novel schedule that forms groups to minimize intra-group dependencies and maximize contextual support, enhancing generation quality. With these designs, we reduce the number of generation steps from 256 to 20 (256x256 res.) and from 1024 to 48 (512x512 res.) without compromising quality on ImageNet class-conditional generation, and achieve at least 3.4x lower latency than previous parallelized autoregressive models.
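For intuition about the ordering, here is a toy Python sketch of a locality-aware grouping heuristic: each group spreads its positions apart (reducing intra-group dependency) while keeping them near already-generated tokens (preserving contextual support). This is an illustrative approximation with made-up group sizes, not the schedule actually used by LPD.

```python
# Toy sketch of a locality-aware ordering heuristic (illustration only;
# NOT the exact schedule used in the paper or this repository).
import numpy as np

def toy_locality_aware_order(grid=16, group_sizes=(1, 3, 12, 48, 64, 128)):
    """Greedily split a grid x grid token map into generation groups.
    Each new position prefers to be far from other members of its group
    (low intra-group dependency) and near already-generated tokens
    (high contextual support)."""
    assert sum(group_sizes) == grid * grid, "groups must cover every token"
    coords = np.array([(r, c) for r in range(grid) for c in range(grid)], float)
    remaining = set(range(grid * grid))
    generated = []                      # indices generated in earlier groups
    order = []
    for size in group_sizes:
        group = []
        for _ in range(size):
            best, best_score = None, -np.inf
            for idx in remaining:
                # Distance to the nearest member of the current group: maximize.
                d_group = min((np.linalg.norm(coords[idx] - coords[j])
                               for j in group), default=float(grid))
                # Distance to the nearest already-generated token: minimize.
                d_ctx = min((np.linalg.norm(coords[idx] - coords[j])
                             for j in generated), default=0.0)
                score = d_group - d_ctx
                if score > best_score:
                    best, best_score = idx, score
            group.append(best)
            remaining.remove(best)
        generated.extend(group)
        order.append(group)
    return order   # e.g., 6 groups jointly covering all 256 positions
```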
To get started, clone this repository and set up the environment:

```bash
git clone https://github.com/mit-han-lab/lpd
cd lpd
bash environment_setup.sh lpd
```
Download the LlamaGen tokenizer and place it in `tokenizers/`. Download the LPD models from Hugging Face.
Model | #Params | #Steps | FID-50K | IS | Latency (s) | Throughput (img/s) |
---|---|---|---|---|---|---|
LPD-L-256 | 337M | 20 | 2.40 | 284.5 | 0.28 | 139.11 |
LPD-XL-256 | 752M | 20 | 2.10 | 326.7 | 0.41 | 75.20 |
LPD-XXL-256 | 1.4B | 20 | 2.00 | 337.6 | 0.55 | 45.07 |
LPD-L-256 | 337M | 32 | 2.29 | 282.7 | 0.46 | 110.34 |
LPD-XL-256 | 752M | 32 | 1.92 | 319.4 | 0.66 | 61.24 |
LPD-L-512 | 337M | 48 | 2.54 | 292.2 | 0.69 | 35.16 |
LPD-XL-512 | 752M | 48 | 2.10 | 326.0 | 1.01 | 18.18 |
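If you prefer to script the download, here is a minimal sketch using `huggingface_hub`; the repo id below is a placeholder, not the actual model repo (check the project page for the real one):

```python
# Hedged sketch: fetch checkpoints with huggingface_hub.
from huggingface_hub import snapshot_download

snapshot_download(repo_id="<org>/<lpd-model-repo>",  # placeholder repo id
                  local_dir="checkpoints/lpd")       # assumed local layout
```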
If you plan to train, please download the ImageNet dataset and place it in your `IMAGENET_PATH`. To accelerate training, we recommend precomputing the tokenizer latents and saving them to `CACHED_PATH`. Please set `--img_size` to either 256 or 512.
```bash
torchrun --nproc_per_node=8 --nnodes=1 \
main_cache.py \
--img_size 256 --vqgan_path tokenizers/vq_ds16_c2i.pt \
--data_path ${IMAGENET_PATH} --cached_path ${CACHED_PATH}
```
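Conceptually, the caching step does something like the following hedged sketch (not the actual `main_cache.py`; `tokenizer.encode` is a hypothetical stand-in for the LlamaGen VQ tokenizer call):

```python
# Conceptual sketch of latent caching: encode each image once and store the
# discrete token indices, so training epochs skip the tokenizer forward pass.
from pathlib import Path
import torch

@torch.no_grad()
def cache_latents(dataloader, tokenizer, cached_path):
    out = Path(cached_path)
    out.mkdir(parents=True, exist_ok=True)
    for i, (images, labels) in enumerate(dataloader):
        tokens = tokenizer.encode(images.cuda())  # hypothetical API
        torch.save({"tokens": tokens.cpu(), "labels": labels},
                   out / f"batch_{i:06d}.pt")
```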
First, generate the LPD orders. Alternatively, you may download the pre-generated orders and place them in `orders/lpd_orders_generated`.

```bash
bash orders/run_lpd_order.sh
```
Then, run the evaluation scripts located in `scripts/eval`. For example, to evaluate LPD-L-256 using 20 steps:

```bash
bash scripts/eval/lpd_l_res256_steps20.sh
```
Note: please set `--pretrained_ckpt` to the path of the downloaded LPD model, and specify `--output_dir`.
Run the training scripts located in `scripts/train`. For example, to train LPD-L-256:

```bash
python scripts/cli/run.py -J lpd_l_256 -p your_slurm_partition -A your_slurm_account -N 4 bash scripts/train/lpd_l_256.sh
```
Thanks to MAR for the wonderful open-source codebase.
We thank MIT-IBM Watson AI Lab, National Science Foundation, Hyundai, and Amazon for supporting this research.
If you find LPD useful or relevant to your project or research, please cite our paper:
```bibtex
@article{zhang2025locality,
  title={Locality-aware Parallel Decoding for Efficient Autoregressive Image Generation},
  author={Zhang, Zhuoyang and Huang, Luke J and Wu, Chengyue and Yang, Shang and Peng, Kelly and Lu, Yao and Han, Song},
  journal={arXiv preprint arXiv:2507.01957},
  year={2025}
}
```