Open
Labels
enhancement (New feature or request), good first issue (Good for newcomers)
Description
Summary
Implement and finalize src/speculators/models/independent.py to support initial speculative decoding algorithms requiring an independent or separate draft model as the speculator.
References
- Speculative Decoding: Exploiting Speculative Execution for Accelerating Seq2seq Generation
- Fast Inference from Transformers via Speculative Decoding
Acceptance Criteria
Classes and Test Cases
- Implement IndependentSpeculatorConfig and IndependentSpeculator following the example in src/speculators/models/eagle.py.
- Ensure compatibility with SpeculatorModelConfig.from_pretrained and SpeculatorModel.from_pretrained.
- Implement full test cases following the examples in tests/unit/models/test_eagle_config.py and tests/unit/models/test_eagle_model.py.
IndependentSpeculatorConfig
- Include all relevant hyperparameters expected to change or be configured to construct a working Speculator model as defined in the referenced papers.
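As a rough illustration of the kind of hyperparameters this could cover, here is a minimal, self-contained sketch. It does not use the real SpeculatorModelConfig base class; the class name IndependentSpeculatorConfigSketch and every field name are assumptions made for illustration only. The one parameter taken from the references is the number of draft tokens proposed per verification step (written γ in the second paper).

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class IndependentSpeculatorConfigSketch:
    """Hypothetical stand-in, NOT the real speculators config class."""

    # Name or path of the separate draft model (assumed field name).
    draft_model: str
    # Draft tokens proposed per verification step (gamma in the second reference).
    speculative_tokens: int = 5
    # Sampling temperature applied to the draft model (assumed field name).
    draft_temperature: float = 1.0

    def __post_init__(self) -> None:
        # Reject values that cannot describe a working speculator.
        if self.speculative_tokens < 1:
            raise ValueError("speculative_tokens must be at least 1")
        if self.draft_temperature < 0:
            raise ValueError("draft_temperature must be non-negative")

    def to_json(self) -> str:
        # Serialize to JSON so the config can be saved next to the weights.
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, payload: str) -> "IndependentSpeculatorConfigSketch":
        # Rebuild the config from its JSON form; validation runs again.
        return cls(**json.loads(payload))
```

A save/load round trip such as `IndependentSpeculatorConfigSketch.from_json(cfg.to_json()) == cfg` is the sort of property the config test cases could assert, whatever the final field names turn out to be.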
IndependentSpeculator
- Correctly create the required architecture from a given IndependentSpeculatorConfig.
- Enable loading and saving of weights.
- Integrate seamlessly with the existing system.
TokenProposal Functionality
- Implement any missing TokenProposal methods or functionality as defined in the papers, or expand current implementations as needed.
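For context, the verification step described in the second reference accepts a draft token x with probability min(1, p(x)/q(x)), where p is the target model's distribution and q is the draft model's. On rejection it resamples from the normalized residual max(0, p - q). The sketch below shows that rule for a single token in plain Python. The function name verify_draft_token and its signature are hypothetical and do not reflect the existing TokenProposal interface.

```python
import random
from typing import Sequence, Tuple


def verify_draft_token(
    token: int,
    target_probs: Sequence[float],
    draft_probs: Sequence[float],
    rng: random.Random,
) -> Tuple[bool, int]:
    """Return (accepted, token_to_emit) for one drafted token.

    Hypothetical helper for illustration; not part of the speculators API.
    """
    p = target_probs[token]
    q = draft_probs[token]
    if q <= 0.0:
        # A token sampled from the draft model must have non-zero draft probability.
        raise ValueError("drafted token has zero probability under the draft model")

    # Accept with probability min(1, p / q).
    if rng.random() < min(1.0, p / q):
        return True, token

    # Rejected: resample from the residual distribution max(0, p - q).
    # random.choices normalizes the weights internally.
    residual = [max(0.0, pt - qd) for pt, qd in zip(target_probs, draft_probs)]
    replacement = rng.choices(range(len(residual)), weights=residual, k=1)[0]
    return False, replacement
```

When the draft and target distributions agree, every drafted token is accepted. The more they diverge, the more often the residual resampling path is taken.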
Out of Scope (Future Targets)
- Implement a functioning forward pass for SpeculatorModel compatible with training flows.
- Implement a functioning generate pass for SpeculatorModel compatible with generation flows.
- Create an Algorithm factory to handle preconfigured hyperparameters for the desired supported algorithms.