ziyaow1010/CoSD


Speculate, then Collaborate (CoSD)

ICML 2025 Poster · Speculate, then Collaborate: Fusing Knowledge of Language Models during Decoding

CoSD is a plug-in algorithm for speculative decoding that fuses knowledge from a draft model and an assistant model during decoding. It is model-agnostic and can be integrated into existing speculative decoding implementations with minimal changes.

This repository builds on MCSD and evaluates with tinyBenchmarks.
Base code: https://github.com/NJUNLP/MCSD
Evaluation: https://github.com/felipemaiapolo/tinyBenchmarks


Highlights

  • Drop-in plugin for any speculative decoding pipeline
  • Knowledge fusion that combines signals from the draft model and the assistant (target) model
  • Reproducible evaluation with tinyBenchmarks

Installation

# Python ≥ 3.9 with CUDA is recommended
git clone <this-repo-url>
cd <this-repo-dir>

# (Optional) create environment
# conda create -n cosd python=3.10 -y
# conda activate cosd

# Install common dependencies
pip install torch transformers accelerate datasets ...

Data

We follow tinyBenchmarks. You can:

  1. Clone tinyBenchmarks and point --datapath to its prepared data files, or
  2. Provide your own JSON/JSONL in the format expected by evaluation.py.
git clone https://github.com/felipemaiapolo/tinyBenchmarks
# Prepare paths and pass to --datapath (see below)

Quick Start

Minimal command to run CoSD on tinyBenchmarks:

# --datapath may point to an empty file, since the evaluation is done by tinyBenchmarks
python evaluation.py \
  --draft-model PATH_TO_DRAFT_MODEL \
  --target-model PATH_TO_TARGET_MODEL \
  --fp16 \
  --k-config 4,2,2 \
  --datapath PATH_TO_DATA \
  --sampling-type sampling

Example (Hugging Face model ids)

python evaluation.py \
  --draft-model mistralai/Mistral-7B-v0.1 \
  --target-model meta-math/MetaMath-Mistral-7B \
  --fp16 \
  --k-config 4,2,2 \
  --datapath ./data/empty.jsonl \
  --sampling-type sampling
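
The example above passes ./data/empty.jsonl as a placeholder data file. A minimal sketch for creating it is shown here; the helper name make_placeholder is hypothetical and not part of this repository, and the default path is simply the one used in the example command.

```python
from pathlib import Path


def make_placeholder(path: str = "./data/empty.jsonl") -> Path:
    """Create an empty file (and its parent directory) to pass as --datapath."""
    target = Path(path)
    # Create ./data (or any other parent directory) if it does not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    # touch() creates the file when it is missing and leaves existing content alone.
    target.touch(exist_ok=True)
    return target


if __name__ == "__main__":
    print(make_placeholder())
```

Running this once before the example command should be enough; an existing file at that path is left untouched.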

Key Arguments

  • --draft-model (str): Draft model path or Hugging Face id
  • --target-model (str): Assistant (target) model path or Hugging Face id
  • --fp16 (flag): Enable FP16 inference
  • --k-config (str): Comma-separated speculation schedule, e.g., 4,2,2; this argument is inherited from MCSD
  • --datapath (str): Evaluation data path; it may point to an empty file, since this code does not read it and evaluation is done by tinyBenchmarks
  • --sampling-type (str): Decoding mode, e.g., sampling or greedy

Tip: run python evaluation.py -h for full options.
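
To make the argument types concrete, here is a rough argparse sketch of the flags listed above. It is a reconstruction from this README only: the function name build_parser, the defaults, and the parsing of --k-config into a tuple of integers are assumptions, and the actual evaluation.py may define these options differently.

```python
import argparse
from typing import Tuple


def parse_k_config(value: str) -> Tuple[int, ...]:
    """Turn a string such as '4,2,2' into (4, 2, 2)."""
    return tuple(int(part) for part in value.split(","))


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical reconstruction of the documented flags; defaults are assumptions.
    parser = argparse.ArgumentParser(description="CoSD evaluation (sketch)")
    parser.add_argument("--draft-model", type=str, required=True,
                        help="Draft model path or Hugging Face id")
    parser.add_argument("--target-model", type=str, required=True,
                        help="Assistant model path or Hugging Face id")
    parser.add_argument("--fp16", action="store_true",
                        help="Enable FP16 inference")
    parser.add_argument("--k-config", type=parse_k_config, default=(4, 2, 2),
                        help="Comma-separated speculation schedule, e.g. 4,2,2")
    parser.add_argument("--datapath", type=str, default="",
                        help="Evaluation data path")
    parser.add_argument("--sampling-type", type=str, default="sampling",
                        help="Decoding mode, e.g. sampling or greedy")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

For the authoritative list of options and defaults, rely on the output of python evaluation.py -h rather than on this sketch.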


Integrating CoSD into Your Pipeline

CoSD is a lightweight plugin:

  1. Initialize draft and target models as usual
  2. Train a decision tree with a few data samples if using CoSD-Tree
  3. Replace the speculative accept/reject step with CoSD’s fusion step
  4. Call generate(...) as usual; log both quality and speed statistics

See cosd/ or the CoSD class in this repository for a minimal integration example.
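
Step 3 above can be pictured with a small, framework-free sketch. It is not taken from this repository: the function name rule_based_fusion, the thresholds alpha and beta, and the specific replacement rule are assumptions chosen only to illustrate what a fusion step looks like in place of accept/reject. The inputs are the tokens proposed by the draft model, the tokens the assistant model would pick at the same positions, and each model's probability for its own choice.

```python
from typing import List, Sequence, Tuple


def rule_based_fusion(
    draft_tokens: Sequence[int],
    draft_probs: Sequence[float],
    assistant_tokens: Sequence[int],
    assistant_probs: Sequence[float],
    alpha: float = 0.5,  # hypothetical threshold: draft counts as unsure below this
    beta: float = 0.5,   # hypothetical factor: how confident the assistant must be
) -> Tuple[List[int], bool]:
    """Return the fused prefix of a drafted block and whether a token was replaced.

    Assumed rule: at each position, take the assistant's token when the two
    models disagree, the draft probability is below alpha, and the assistant
    probability exceeds beta times the draft probability. After a replacement
    the rest of the draft is dropped, because it was conditioned on the old token.
    """
    fused: List[int] = []
    for d_tok, d_p, a_tok, a_p in zip(
        draft_tokens, draft_probs, assistant_tokens, assistant_probs
    ):
        if d_tok != a_tok and d_p < alpha and a_p > beta * d_p:
            fused.append(a_tok)
            return fused, True
        fused.append(d_tok)
    return fused, False


if __name__ == "__main__":
    tokens, replaced = rule_based_fusion(
        [1, 2, 3], [0.9, 0.2, 0.8], [1, 5, 3], [0.9, 0.7, 0.8]
    )
    print(tokens, replaced)
```

In a real pipeline the returned prefix would be appended to the running sequence and drafting would resume from there. A CoSD-Tree variant (step 2) would replace the fixed thresholds with a trained decision tree over the same probabilities; see the repository code for the actual implementation.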


Acknowledgments
