
D2fEngine

A vLLM implementation for Diffusion LLMs. D2F is integrated as the core inference strategy, while training-free strategies such as Fast-dLLM are also supported.

Foundation of Our vLLM Implementation

Based on Nano-vLLM.

Installation

Easy Install D2F-vLLM

pip install d2f_vllm
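
A minimal usage sketch after installation. The entry points below (LLM, SamplingParams, generate) and the model path are assumptions based on the vLLM/Nano-vLLM-style API this project builds on, not confirmed by this README:

# Usage sketch only: the names below are assumed to mirror the
# vLLM/Nano-vLLM interface on which this project is based.
from d2f_vllm import LLM, SamplingParams

llm = LLM(model="path/to/your-diffusion-llm")  # placeholder model path
sampling_params = SamplingParams(temperature=0.0, max_tokens=128)
outputs = llm.generate(["Explain diffusion language models."], sampling_params)
print(outputs[0])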

Configure the Project from Source (for Developers)

We use UV to manage the whole project.

Install UV

Follow the UV installation guide.

Initialize the Project

uv sync
source .venv/bin/activate
uv pip install -e .

For easy activation:

echo 'alias uvon="source .venv/bin/activate"' >> ~/.zshrc # If using bash, change to .bashrc
source ~/.zshrc

Then run uvon from the project root to activate the environment.

Install vLLM

uv pip install vllm

D2F-vLLM still depends on some vLLM modules; however, due to issues with UV's virtual-environment management, vLLM has to be installed separately.

Install Flash Attention (not required right now)

uv pip install flash-attn --no-build-isolation

If that does not work, build flash-attn from source. This may take a while (most of the time is spent compiling CUTLASS).

git submodule update --init --recursive
cd third_party/flash-attn
MAX_JOBS=$(nproc) python setup.py install --verbose

User Guide

Setting Generation Mode

Setting add_new_block_threshold < 1.0, together with our D2F training strategy, enables the D2F-specific decoding paradigm.

In contrast, setting add_new_block_threshold = 1.0 enables compatibility with Fast-dLLM inference, which is training-free.
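
For illustration, a hedged sketch of selecting the two modes. Where add_new_block_threshold is passed (engine constructor, sampling parameters, or a config file) is an assumption; only the option name and its semantics come from the description above:

# Sketch only: passing add_new_block_threshold to the LLM constructor
# is an assumption about where this option lives.
from d2f_vllm import LLM  # assumed vLLM-style entry point

# D2F decoding paradigm (requires a model trained with the D2F strategy);
# the threshold value 0.9 is illustrative.
llm_d2f = LLM(model="path/to/d2f-model", add_new_block_threshold=0.9)

# Training-free inference compatible with Fast-dLLM.
llm_fast = LLM(model="path/to/base-dllm", add_new_block_threshold=1.0)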

TODO List

  • Implement KV Cache loading kernel
  • Tensor Parallel
  • Data Parallel
  • Implement Async Engine and Streaming Generation
  • Faster Flash Attention Kernel
  • Diffusion LM CUDA Graph Capturing
