Speculate, then Collaborate (CoSD)

ICML 2025 Poster · Speculate, then Collaborate: Fusing Knowledge of Language Models during Decoding

CoSD is a plug-in algorithm for speculative decoding that fuses knowledge from a draft model and an assistant model during decoding. It is model-agnostic and can be integrated into existing speculative decoding implementations with minimal changes.

This repository builds on MCSD and evaluates with tinyBenchmarks.
Base code: https://github.com/NJUNLP/MCSD
Evaluation: https://github.com/felipemaiapolo/tinyBenchmarks

Highlights

Drop-in plugin for any speculative decoding pipeline
Knowledge fusion that combines signals from the draft model and the assistant (target) model
Reproducible evaluation with tinyBenchmarks

Installation

# Python ≥ 3.9 with CUDA is recommended
git clone <this-repo-url>
cd <this-repo-dir>

# (Optional) create environment
# conda create -n cosd python=3.10 -y
# conda activate cosd

# Install common dependencies
pip install torch transformers accelerate datasets ...

Data

We follow tinyBenchmarks. You can:

Clone tinyBenchmarks and point --datapath to its prepared data files, or
Provide your own JSON/JSONL in the format expected by evaluation.py.

git clone https://github.com/felipemaiapolo/tinyBenchmarks
# Prepare paths and pass to --datapath (see below)

Quick Start

Minimal command to run CoSD on tinyBenchmarks (PATH_TO_DATA can be an empty file, since the evaluation is done by tinyBenchmarks):

python evaluation.py \
  --draft-model PATH_TO_DRAFT_MODEL \
  --target-model PATH_TO_TARGET_MODEL \
  --fp16 \
  --k-config 4,2,2 \
  --datapath PATH_TO_DATA \
  --sampling-type sampling

Example (Hugging Face model ids)

python evaluation.py \
  --draft-model mistralai/Mistral-7B-v0.1 \
  --target-model meta-math/MetaMath-Mistral-7B \
  --fp16 \
  --k-config 4,2,2 \
  --datapath ./data/empty.jsonl \
  --sampling-type sampling
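
The example above passes ./data/empty.jsonl, which has to exist before the command runs. A minimal sketch for creating such a placeholder file follows; the helper name make_placeholder_datafile is hypothetical and not part of this repository.

```python
from pathlib import Path


def make_placeholder_datafile(path: str) -> Path:
    """Create the file at `path` (and its parent directories) if it is missing.

    Hypothetical helper: the README only states that --datapath may point
    to an empty file, not how that file should be created.
    """
    p = Path(path)
    p.parent.mkdir(parents=True, exist_ok=True)  # e.g. creates ./data
    p.touch(exist_ok=True)                       # leaves an existing file untouched
    return p
```

Calling make_placeholder_datafile("./data/empty.jsonl") once before running evaluation.py should be enough.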

Key Arguments

--draft-model (str): Draft model path or Hugging Face id
--target-model (str): Assistant (target) model path or Hugging Face id
--fp16 (flag): Enable FP16 inference
--k-config (str): Comma-separated speculation schedule, e.g., 4,2,2 (an argument inherited from MCSD)
--datapath (str): Evaluation data path (may point to an empty file; our code does not use it)
--sampling-type (str): Decoding mode, e.g., sampling or greedy

Tip: run python evaluation.py -h for full options.
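
To make the argument types concrete, here is a sketch of how flags like these could be declared with argparse. It is an illustration only, not the parser used in evaluation.py: the function names, the defaults, and the validation of --k-config are all assumptions.

```python
import argparse


def parse_k_config(value: str) -> tuple[int, ...]:
    """Turn a string such as "4,2,2" into the tuple (4, 2, 2)."""
    try:
        ks = tuple(int(part) for part in value.split(","))
    except ValueError as exc:
        raise argparse.ArgumentTypeError(f"invalid k-config: {value!r}") from exc
    if any(k <= 0 for k in ks):
        raise argparse.ArgumentTypeError("k-config entries must be positive")
    return ks


def build_parser() -> argparse.ArgumentParser:
    # Illustrative parser; defaults below are assumptions, not repo values.
    parser = argparse.ArgumentParser(description="CoSD evaluation (sketch)")
    parser.add_argument("--draft-model", required=True,
                        help="draft model path or Hugging Face id")
    parser.add_argument("--target-model", required=True,
                        help="assistant model path or Hugging Face id")
    parser.add_argument("--fp16", action="store_true",
                        help="enable FP16 inference")
    parser.add_argument("--k-config", type=parse_k_config, default=(4, 2, 2),
                        help="comma-separated speculation schedule")
    parser.add_argument("--datapath", default="",
                        help="evaluation data path")
    parser.add_argument("--sampling-type", default="sampling",
                        help="decoding mode, e.g. sampling or greedy")
    return parser
```

Parsing the schedule into a tuple of integers up front keeps malformed values such as 4,,2 from reaching the decoding loop.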

Integrating CoSD into Your Pipeline

CoSD is a lightweight plugin:

Initialize draft and target models as usual
If using CoSD-Tree, train a decision tree on a small set of data samples
Replace the speculative accept/reject step with CoSD’s fusion step
Call generate(...) as usual; log both quality and speed statistics

See cosd/ or the CoSD class in this repository for a minimal integration example.
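
As a rough illustration of the third step, the sketch below replaces a plain accept/reject check with a per-token fusion decision. This README does not spell out the actual rule, so everything here is an assumption: the threshold rule, the alpha and beta parameters, and the names TokenProposal, should_replace, and fuse_block are hypothetical and may differ from the CoSD class in this repository.

```python
from dataclasses import dataclass


@dataclass
class TokenProposal:
    token_id: int   # token chosen by a model at this position
    prob: float     # probability that model assigned to its own token


def should_replace(draft: TokenProposal, assistant: TokenProposal,
                   alpha: float = 0.5, beta: float = 0.5) -> bool:
    """Assumed threshold rule: prefer the assistant's token when the models
    disagree, the draft is unsure, and the assistant is comparatively confident."""
    return (
        draft.token_id != assistant.token_id
        and draft.prob < alpha
        and assistant.prob > beta * draft.prob
    )


def fuse_block(draft_tokens: list[TokenProposal],
               assistant_tokens: list[TokenProposal],
               alpha: float = 0.5, beta: float = 0.5) -> list[int]:
    """Keep draft tokens until the first replacement; at that position emit
    the assistant's token and drop the rest of the block, so that drafting
    can resume from the corrected prefix."""
    fused: list[int] = []
    for d, a in zip(draft_tokens, assistant_tokens):
        if should_replace(d, a, alpha, beta):
            fused.append(a.token_id)
            break
        fused.append(d.token_id)
    return fused
```

For CoSD-Tree, the should_replace predicate would presumably be swapped for the trained decision tree's prediction on the same kind of inputs, while the surrounding loop stays unchanged.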

Acknowledgments

MCSD: https://github.com/NJUNLP/MCSD
tinyBenchmarks: https://github.com/felipemaiapolo/tinyBenchmarks
