This repository contains the official implementation for the paper "Greedy, Not Needy: A General Paradigm for Efficient Decoding in Large Language Models".
Three decoding strategies are implemented:
- AdaSearch - An adaptive search algorithm for efficient decoding (Ours)
- Best-of-N Sampling - Generate N candidates and select the best one using a reward model
- Vanilla Decoding - Standard greedy/sampling-based generation
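
As a rough illustration of the Best-of-N idea described above (generate N candidates, then keep the one the reward model scores highest), here is a minimal sketch. The names `best_of_n`, `toy_generate`, and `toy_reward` are hypothetical and are not taken from this repository; the toy generator and reward function merely stand in for a language model and a learned reward model, so the actual `bon.py` may be organized quite differently.

```python
import random
from typing import Callable, List

# Hypothetical canned responses standing in for language model samples.
TOY_RESPONSES = ["bad", "okay", "great answer"]
_rng = random.Random(0)  # fixed seed so repeated runs give the same samples


def best_of_n(
    prompt: str,
    generate: Callable[[str], str],
    reward: Callable[[str, str], float],
    n: int = 4,
) -> str:
    """Sample n candidate responses and return the one with the highest reward."""
    if n < 1:
        raise ValueError("n must be at least 1")
    candidates: List[str] = [generate(prompt) for _ in range(n)]
    # max() returns the first best-scoring candidate when several tie.
    return max(candidates, key=lambda response: reward(prompt, response))


def toy_generate(prompt: str) -> str:
    """Assumption: stands in for sampling one response from a language model."""
    return _rng.choice(TOY_RESPONSES)


def toy_reward(prompt: str, response: str) -> float:
    """Assumption: stands in for a reward model; here longer simply scores higher."""
    return float(len(response))


if __name__ == "__main__":
    print(best_of_n("How do I stay safe online?", toy_generate, toy_reward, n=8))
```

With `n=1` this reduces to drawing a single sample, which corresponds to sampling-based vanilla decoding; larger `n` trades extra generation cost for a better chance of a high-reward response.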
Install the required dependencies:

    pip install -r requirements.txt

Before running the scripts, configure the appropriate YAML files in the configs/ directory.

Run the AdaSearch algorithm to generate responses:

    python run-adasearch.py

Run the Best-of-N sampling approach:

    python bon.py

Run standard vanilla decoding:

    python vanilla.py

Compare AdaSearch results against vanilla decoding:

    python blockwise-vs-vanilla.py

Compare Best-of-N results against vanilla decoding:

    python best-of-n-vs-vanilla.py

We have extensively evaluated our method on three alignment directions, using the following datasets:
- HarmfulQA - Questions requiring careful handling to avoid harmful responses
- HH-RLHF - Human preference data from Anthropic for helpful and harmless responses
- IMDB - Movie review sentiment data for generating positive sentiment
- OpenAI GSM - Grade school math problems for improving mathematical reasoning capabilities
This project is licensed under the MIT License - see the LICENSE file for details.
