
Commit 0d5e880

rahul-tuli and claude committed
feat: Add verification scripts for Eagle models
- Add comprehensive verification script for Eagle-1 and Eagle-3 models
- Add quick test script for config loading and imports
- Include documentation for usage and troubleshooting

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
1 parent e6027f1 commit 0d5e880

3 files changed: +276 -0 lines changed

local/README.md

Lines changed: 92 additions & 0 deletions
@@ -0,0 +1,92 @@
# Eagle Models Verification Scripts

This folder contains verification scripts for testing Eagle and Eagle-3 speculator models with vLLM.

## Scripts

### `verify_eagle_models.py`
Full verification script that tests both Eagle-1 and Eagle-3 models with actual model loading and generation.

**Models tested:**
- **Eagle-1**: `nm-testing/eagle-llama3.1-8b-instruct` with `meta-llama/Meta-Llama-3.1-8B-Instruct`
- **Eagle-3**: `nm-testing/eagle3-llama3.1-8b-instruct-speculators` with `meta-llama/Meta-Llama-3.1-8B-Instruct`

**Usage:**
```bash
cd /home/rahul/vllm
source .venv/bin/activate
python local/verify_eagle_models.py
```

**Requirements:**
- Sufficient GPU memory (the target model has ~8B parameters, and a draft model is loaded alongside it)
- Network access to download models from Hugging Face

### `quick_eagle_test.py`
Lightweight test script that verifies configuration loading and imports without full model initialization.

**Tests:**
- Config detection for speculators vs regular Eagle models
- Model class imports
- Engine argument handling
- Speculative config creation

**Usage:**
```bash
cd /home/rahul/vllm
source .venv/bin/activate
python local/quick_eagle_test.py
```

**Requirements:**
- Minimal: only tests imports and config loading

## Expected Output

### Successful Run
`verify_eagle_models.py` prints the following lines, interleaved with vLLM's own log output:
```
Eagle Models Verification

Testing Eagle-1...
Eagle-1 output: <generated text>
✓ Eagle-1 works!

Testing Eagle-3...
Eagle-3 output: <generated text>
✓ Eagle-3 works!

🎉 All tests passed!
```

### Model Configuration Details

**Eagle-1 (Regular Format):**
- Uses standard vLLM Eagle configuration
- Model: `nm-testing/eagle-llama3.1-8b-instruct`
- Target: `meta-llama/Meta-Llama-3.1-8B-Instruct`

**Eagle-3 (Speculators Format):**
- Uses speculators library configuration format
- Model: `nm-testing/eagle3-llama3.1-8b-instruct-speculators`
- Target: `meta-llama/Meta-Llama-3.1-8B-Instruct`
- Automatically detected and converted by `SpeculatorsEagleConfig`

## Troubleshooting

### Common Issues

1. **CUDA out of memory**: Reduce `max_model_len` or use a machine with more GPU memory
2. **Model download errors**: Ensure network connectivity and Hugging Face access (the `meta-llama` target is a gated repository, so an authenticated token may be required)
3. **Import errors**: Verify the vLLM installation and that you're in the correct environment
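
For the import errors in item 3, a small check like the one below can report which modules resolve in the active environment before anything heavy is loaded. This is only a sketch: the helper name `check_modules` is made up for illustration and is not part of vLLM, and the module paths listed are the ones `quick_eagle_test.py` imports in this commit.

```python
import importlib.util


def check_modules(names):
    """Return {module_name: bool} saying whether each module can be located.

    Hypothetical helper, not part of vLLM. find_spec() locates a module
    without executing it, but for dotted names it imports the parent
    packages first, so any error raised there counts as "not available".
    """
    results = {}
    for name in names:
        try:
            results[name] = importlib.util.find_spec(name) is not None
        except Exception:
            # Typically ModuleNotFoundError for a missing parent package;
            # a broken install can raise other errors while importing it.
            results[name] = False
    return results


if __name__ == "__main__":
    modules = [
        "vllm",
        "vllm.model_executor.models.llama_eagle",
        "vllm.transformers_utils.configs.speculators_eagle",
    ]
    for name, ok in check_modules(modules).items():
        print(f"{'ok     ' if ok else 'MISSING'} {name}")
```

A `MISSING` line for `vllm` itself usually points at the wrong virtual environment, while a `MISSING` line only for the `speculators_eagle` module suggests the checkout predates the speculators support.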

### Debug Steps

1. Run `quick_eagle_test.py` first to verify basic functionality
2. Check that the draft model's config is accessible (repeat with the target model ID to check the target):
   ```bash
   python -c "from transformers import AutoConfig; print(AutoConfig.from_pretrained('nm-testing/eagle3-llama3.1-8b-instruct-speculators'))"
   ```
3. Test with smaller models if memory is limited
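
To judge how much memory is in play before switching to a smaller model, a back-of-the-envelope estimate can help. The sketch below is an approximation under stated assumptions, not a description of vLLM's allocator: it assumes the Llama-3.1-8B target has 32 layers, 8 KV heads, and a head dimension of 128 with 16-bit values (check the model's `config.json`), and it ignores the draft model, activations, and vLLM's own preallocation settings such as `gpu_memory_utilization`. Both function names are illustrative.

```python
def kv_cache_bytes(num_tokens, num_layers, num_kv_heads, head_dim, dtype_bytes=2):
    """Estimate KV-cache size: one key and one value vector per KV head,
    per layer, per token."""
    return 2 * num_layers * num_kv_heads * head_dim * dtype_bytes * num_tokens


def weight_bytes(num_params, dtype_bytes=2):
    """Estimate memory for weights stored at dtype_bytes per parameter."""
    return num_params * dtype_bytes


if __name__ == "__main__":
    # Assumed target-model shape (see the lead-in): 32 layers, 8 KV heads,
    # head dimension 128, 2 bytes per value.
    for max_model_len in (1024, 8192):
        mib = kv_cache_bytes(max_model_len, 32, 8, 128) / 2**20
        print(f"max_model_len={max_model_len}: ~{mib:.0f} MiB of KV cache per sequence")
    gb = weight_bytes(8e9) / 1e9
    print(f"weights: ~{gb:.0f} GB at 16 bits per parameter")
```

Under these assumptions the weights dominate, so lowering `max_model_len` helps mainly when the GPU is only slightly too small; otherwise a smaller target model is the more effective change.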

## Development Notes

These scripts verify the speculators Eagle support implementation:

- **Config Translation**: `SpeculatorsEagleConfig` converts speculators format to vLLM format
- **Model Detection**: `is_speculators_eagle_config()` identifies speculators models
- **V1 Engine Support**: Uses V1 engine with `llama_eagle.py` implementation
- **Weight Mapping**: Handles speculators weight name translation
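
The model-detection step can be pictured as a check on the checkpoint's `config.json`. The sketch below is illustrative only and is not the implementation of `is_speculators_eagle_config()`: it assumes speculators-format checkpoints carry a `speculators_model_type` key, and the helper names `looks_like_speculators_config` and `load_local_config` are hypothetical.

```python
import json
from pathlib import Path

# Assumption: a speculators-format config.json marks itself with this key.
# The real check lives in vLLM's is_speculators_eagle_config().
SPECULATORS_MARKER_KEY = "speculators_model_type"


def looks_like_speculators_config(config: dict) -> bool:
    """Return True if the config dict carries the assumed marker key."""
    return SPECULATORS_MARKER_KEY in config


def load_local_config(model_dir) -> dict:
    """Read config.json from a local checkpoint directory."""
    with open(Path(model_dir) / "config.json", encoding="utf-8") as f:
        return json.load(f)


if __name__ == "__main__":
    # Two made-up configs standing in for the Eagle-3 and Eagle-1 checkpoints
    eagle3_like = {"speculators_model_type": "eagle3", "model_type": "llama"}
    eagle1_like = {"model_type": "llama"}
    print(looks_like_speculators_config(eagle3_like))
    print(looks_like_speculators_config(eagle1_like))
```

Keeping detection separate from translation means a regular Eagle checkpoint never touches the conversion path, which matches the behavior `quick_eagle_test.py` checks for.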

local/quick_eagle_test.py

Lines changed: 130 additions & 0 deletions
@@ -0,0 +1,130 @@
#!/usr/bin/env python3
"""
Quick Eagle test script for faster verification during development.

This script runs a minimal test to verify Eagle models are working without
full model initialization overhead.
"""

import sys


def test_config_loading():
    """Test that Eagle configs can be loaded properly."""
    print("Testing Eagle config loading...")

    try:
        from vllm.transformers_utils.configs.speculators_eagle import (
            SpeculatorsEagleConfig,
            is_speculators_eagle_config,
        )

        # The Eagle-3 checkpoint must be detected as speculators format
        is_speculators = is_speculators_eagle_config(
            "nm-testing/eagle3-llama3.1-8b-instruct-speculators"
        )
        if not is_speculators:
            print("✗ Eagle-3 checkpoint not detected as speculators format")
            return False
        print("✓ Eagle-3 speculators format detected")

        # The regular Eagle checkpoint must NOT be detected as speculators
        is_regular = is_speculators_eagle_config(
            "nm-testing/eagle-llama3.1-8b-instruct"
        )
        if is_regular:
            print("✗ Regular Eagle checkpoint wrongly detected as speculators")
            return False
        print("✓ Regular Eagle checkpoint not detected as speculators")

        # Load the speculators config and show its key fields
        config = SpeculatorsEagleConfig.from_pretrained(
            "nm-testing/eagle3-llama3.1-8b-instruct-speculators"
        )
        print("✓ Config loaded successfully")
        print(f"  - Method: {getattr(config, 'method', 'N/A')}")
        print(f"  - Num lookahead tokens: {getattr(config, 'num_lookahead_tokens', 'N/A')}")
        print(f"  - Model type: {getattr(config, 'model_type', 'N/A')}")

        return True

    except Exception as e:
        print(f"✗ Config test failed: {e}")
        return False


def test_model_imports():
    """Test that Eagle model classes can be imported."""
    print("\nTesting Eagle model imports...")

    try:
        # Test V1 Eagle model import
        from vllm.model_executor.models.llama_eagle import EagleLlamaForCausalLM  # noqa: F401
        print("✓ V1 Eagle model imported successfully")

        # Test V0 Eagle model import
        from vllm.model_executor.models.eagle import EAGLEModel  # noqa: F401
        print("✓ V0 Eagle model imported successfully")

        # Test engine args import
        from vllm.engine.arg_utils import EngineArgs  # noqa: F401
        print("✓ Engine args imported successfully")

        return True

    except Exception as e:
        print(f"✗ Import test failed: {e}")
        return False


def test_engine_args():
    """Test that speculative config can be created."""
    print("\nTesting engine argument handling...")

    try:
        from vllm.engine.arg_utils import EngineArgs

        # Test creating engine args with Eagle-3 speculative config
        args = EngineArgs(
            model="meta-llama/Meta-Llama-3.1-8B-Instruct",
            speculative_config={
                "method": "eagle",
                "model": "nm-testing/eagle3-llama3.1-8b-instruct-speculators",
                "num_spec_tokens": 5,
            },
        )

        print("✓ EngineArgs created successfully")

        # Test speculative config creation.
        # NOTE: this call assumes create_speculative_config() accepts these
        # arguments; adjust it to the signature in your vLLM checkout.
        spec_config = args.create_speculative_config(
            args.speculative_config,
            model_config=None,  # We're just testing creation
        )

        if not spec_config:
            print("✗ Speculative config was not created")
            return False

        print("✓ Speculative config created successfully")
        print(f"  - Method: {spec_config.method}")
        print(f"  - Draft model: {spec_config.model}")
        print(f"  - Spec tokens: {spec_config.num_spec_tokens}")

        return True

    except Exception as e:
        print(f"✗ Engine args test failed: {e}")
        return False


def main():
    """Run quick tests and return a process exit code."""
    print("Running Quick Eagle Verification Tests")
    print("=" * 50)

    tests = [
        test_config_loading,
        test_model_imports,
        test_engine_args,
    ]

    passed = 0
    for test in tests:
        if test():
            passed += 1

    print(f"\n{'=' * 50}")
    print(f"Quick Tests Summary: {passed}/{len(tests)} passed")

    if passed == len(tests):
        print("🎉 All quick tests passed! Eagle support is working.")
    else:
        print("⚠️ Some tests failed. Check the output above.")

    return 0 if passed == len(tests) else 1


if __name__ == "__main__":
    sys.exit(main())

local/verify_eagle_models.py

Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
#!/usr/bin/env python3
"""
Simple verification script for Eagle and Eagle-3 speculator models.
"""

import sys

from vllm import LLM, SamplingParams


def test_eagle1():
    """Test Eagle-1 model."""
    print("Testing Eagle-1...")

    # Target model plus the regular-format Eagle-1 draft model
    llm = LLM(
        model="meta-llama/Meta-Llama-3.1-8B-Instruct",
        speculative_config={
            "method": "eagle",
            "model": "nm-testing/eagle-llama3.1-8b-instruct",
            "num_spec_tokens": 5,
        },
        max_model_len=1024,
        enforce_eager=True,
    )

    # Greedy decoding (temperature=0) keeps the output deterministic
    outputs = llm.generate(["AI is"], SamplingParams(max_tokens=20, temperature=0))
    print(f"Eagle-1 output: {outputs[0].outputs[0].text}")
    print("✓ Eagle-1 works!")


def test_eagle3():
    """Test Eagle-3 model."""
    print("\nTesting Eagle-3...")

    # Target model plus the speculators-format Eagle-3 draft model
    llm = LLM(
        model="meta-llama/Meta-Llama-3.1-8B-Instruct",
        speculative_config={
            "method": "eagle",
            "model": "nm-testing/eagle3-llama3.1-8b-instruct-speculators",
            "num_spec_tokens": 5,
        },
        max_model_len=1024,
        enforce_eager=True,
    )

    outputs = llm.generate(["AI is"], SamplingParams(max_tokens=20, temperature=0))
    print(f"Eagle-3 output: {outputs[0].outputs[0].text}")
    print("✓ Eagle-3 works!")


if __name__ == "__main__":
    print("Eagle Models Verification\n")

    try:
        test_eagle1()
        test_eagle3()
        print("\n🎉 All tests passed!")
    except Exception as e:
        print(f"\n❌ Test failed: {e}")
        sys.exit(1)  # Report the failure through the exit code
