Commit db892e7

Fix input layer norm mismatch for Eagle Speculative Decoding compatibility (#548)
* Fix input layer norm mismatch for Eagle Speculative Decoding compatibility

  The LLaMA decoder layer applies input layer normalization at every layer, whereas Eagle omits it for the initial layer, using a dummy InputLayerNorm class instead. Recently, LLaMA's input layer norm implementation (https://github.com/ROCm/vllm/blob/262ed1e16c5bd71f0612b700186854b8c932565d/vllm/model_executor/models/llama.py#L326) was updated to accept at most 3 inputs. To maintain compatibility and prevent Eagle Speculative Decoding from failing, the dummy class needs to be updated accordingly.

* Update eagle.py
1 parent d1d3ff9 commit db892e7

File tree

1 file changed: +1, -1 lines

vllm/model_executor/models/eagle.py

Lines changed: 1 addition & 1 deletion
```diff
@@ -29,7 +29,7 @@ def __init__(self, weight=None, bias=None):
         self.weight = nn.Parameter(weight) if weight is not None else None
         self.bias = nn.Parameter(bias) if bias is not None else None
 
-    def forward(self, x):
+    def forward(self, x, residual=None, scale=None):
         return x
 
```
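For context, the dummy class after this change can be sketched as below. This is a minimal sketch, not the verbatim vllm source: the class name `DummyInputLayerNorm` and the docstring are assumptions here; only the `__init__` body and the new three-argument `forward` signature are taken from the diff above.

```python
import torch
import torch.nn as nn


class DummyInputLayerNorm(nn.Module):
    """Identity stand-in for the first decoder layer's input layer norm.

    Eagle skips input layer normalization on the initial layer, so this
    class only needs to accept the same arguments as LLaMA's input layer
    norm (up to 3 inputs) and return the hidden states unchanged.
    """

    def __init__(self, weight=None, bias=None):
        super().__init__()
        self.weight = nn.Parameter(weight) if weight is not None else None
        self.bias = nn.Parameter(bias) if bias is not None else None

    def forward(self, x, residual=None, scale=None):
        # Extra arguments are accepted for signature compatibility only
        # and are ignored; x passes through untouched.
        return x
```

With the old one-argument signature, a caller passing a residual tensor positionally would raise a `TypeError`; the widened signature accepts those calls while keeping the identity behavior.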
