
Commit b08e3de

rahul-tuli and claude committed
fix: Complete speculators Eagle support fixes
- Updated llama_eagle.py to skip transformer weights (loaded separately)
- Added num_lookahead_tokens to speculators config (required for Eagle)
- Together these fixes allow speculators Eagle models to work with the V1 engine

Signed-off-by: rtuli@redhat.com

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Rahul Tuli <rtuli@redhat.com>
1 parent 7ad0c07 commit b08e3de
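
To make the end-to-end effect concrete, here is a hedged usage sketch of running a speculators-format Eagle draft model through vLLM's offline API. The model names are placeholders, and the exact `speculative_config` keys may differ across vLLM versions; this is an illustration of the workflow the commit enables, not code from the commit itself.

```python
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",  # target model (placeholder)
    speculative_config={
        "method": "eagle",
        # Speculators-format Eagle draft checkpoint (placeholder path)
        "model": "path/to/speculators-eagle-draft",
        # Mirrors the num_lookahead_tokens default added in this commit
        "num_speculative_tokens": 5,
    },
)
outputs = llm.generate(["Hello"], SamplingParams(max_tokens=32))
```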

File tree

2 files changed: +3, -3 lines

vllm/model_executor/models/llama_eagle.py

Lines changed: 2 additions & 3 deletions
```diff
@@ -117,9 +117,8 @@ def load_weights(self, weights: Iterable[tuple[str,
             if name in speculators_name_map:
                 name = speculators_name_map[name]
             elif name.startswith("transformer."):
-                # transformer.* -> model.layers.0.*
-                suffix = name[len("transformer."):]
-                name = f"model.layers.0.{suffix}"
+                # Skip transformer weights - they're loaded separately
+                continue
 
             for param_name, weight_name, shard_id in stacked_params_mapping:
                 if weight_name not in name:
```
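
For context, here is a minimal sketch of the name-filtering pattern this diff sits inside. The per-parameter `weight_loader` attribute and the `speculators_name_map` remapping follow vLLM conventions; everything outside the lines shown in the diff above is assumed, not taken from this commit.

```python
from typing import Iterable

import torch
import torch.nn as nn


def load_filtered_weights(
    module: nn.Module,
    weights: Iterable[tuple[str, torch.Tensor]],
    speculators_name_map: dict[str, str],
) -> None:
    params_dict = dict(module.named_parameters())
    for name, loaded_weight in weights:
        if name in speculators_name_map:
            name = speculators_name_map[name]
        elif name.startswith("transformer."):
            # Skip transformer weights - per the commit, they are
            # loaded separately rather than remapped to model.layers.0.*
            continue
        param = params_dict.get(name)
        if param is None:
            continue  # no matching parameter on this module
        # vLLM parameters may carry a custom weight_loader for sharding.
        weight_loader = getattr(param, "weight_loader", None)
        if weight_loader is not None:
            weight_loader(param, loaded_weight)
        else:
            with torch.no_grad():
                param.copy_(loaded_weight)
```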

vllm/transformers_utils/configs/speculators_eagle.py

Lines changed: 1 addition & 0 deletions
```diff
@@ -89,6 +89,7 @@ def _convert_speculators_to_vllm(cls, speculators_config: dict) -> dict:
             "eagle_fc_bias": speculators_config.get("fusion_bias", False),
             "truncated_vocab_size": transformer_config.get("vocab_size"),
             "method": speculators_config.get("speculators_model_type", "eagle"),  # Use speculators_model_type
+            "num_lookahead_tokens": 5,  # Default number of speculative tokens for Eagle
         }
 
         # Preserve any additional fields that might be needed
```
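
To show what this mapping produces, here is a hedged sketch of the conversion as a standalone function. The dict keys come from the diff above; the example input values are invented for illustration, and the real `_convert_speculators_to_vllm` classmethod contains more fields than shown here.

```python
def convert_speculators_to_vllm(speculators_config: dict,
                                transformer_config: dict) -> dict:
    """Sketch of the field mapping shown in the diff (not the full method)."""
    return {
        "eagle_fc_bias": speculators_config.get("fusion_bias", False),
        "truncated_vocab_size": transformer_config.get("vocab_size"),
        "method": speculators_config.get("speculators_model_type", "eagle"),
        # Without this key the Eagle proposer has no draft length to use,
        # which is what the commit message calls out as required.
        "num_lookahead_tokens": 5,
    }


# Illustrative input (values invented):
vllm_cfg = convert_speculators_to_vllm(
    {"fusion_bias": True, "speculators_model_type": "eagle"},
    {"vocab_size": 32000},
)
assert vllm_cfg["num_lookahead_tokens"] == 5
```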
