AdaSearch: A General Paradigm for Efficient Decoding in Large Language Models

This repository contains the official implementation for the paper "Greedy, Not Needy: A General Paradigm for Efficient Decoding in Large Language Models".

Overview

This repository implements three decoding strategies:

AdaSearch - An adaptive search algorithm for efficient decoding (Ours)
Best-of-N Sampling - Generate N candidates and select the best one using a reward model
Vanilla Decoding - Standard greedy/sampling-based generation
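
As a rough illustration of the Best-of-N idea described above (generate N candidates, keep the one the reward model scores highest), here is a minimal sketch. It is not the code in bon.py: the names `best_of_n`, `generate`, and `reward` are hypothetical, and the generator and reward model are passed in as plain callables so the example stays self-contained.

```python
from typing import Callable


def best_of_n(
    prompt: str,
    generate: Callable[[str], str],       # hypothetical: samples one response for a prompt
    reward: Callable[[str, str], float],  # hypothetical: scores a (prompt, response) pair
    n: int = 4,
) -> str:
    """Sample n candidate responses and return the highest-scoring one."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Draw n independent candidates from the generator.
    candidates = [generate(prompt) for _ in range(n)]
    # Keep the candidate the reward function scores highest
    # (max returns the first one encountered on ties).
    return max(candidates, key=lambda response: reward(prompt, response))
```

In practice `generate` would wrap a sampling call to the language model and `reward` a forward pass of the reward model; both are assumptions here, since the repository's actual interfaces are not shown in this README.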

Requirements

Install the required dependencies:

pip install -r requirements.txt

Configuration

Before running the scripts, configure the appropriate YAML files in the configs/ directory.

Run Decoding Algorithms

Step 1: Generate AdaSearch Results

Run the AdaSearch algorithm to generate responses:

python run-adasearch.py

Step 2: Generate Best-of-N Results

Run the Best-of-N sampling approach:

python bon.py

Step 3: Generate Vanilla Results

Run standard vanilla decoding:

python vanilla.py

Evaluation

AdaSearch vs Vanilla Comparison

Compare AdaSearch results against vanilla decoding:

python blockwise-vs-vanilla.py

Best-of-N vs Vanilla Comparison

Compare Best-of-N results against vanilla decoding:

python best-of-n-vs-vanilla.py
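
The decoding and evaluation steps above can also be chained from a single driver. The sketch below is an assumption about how one might do that, not a script shipped with the repository: `build_commands` and `run_all` are hypothetical helpers, and only the script names come from this README. It assumes it is run from the repository root after the configs/ files have been set up.

```python
import subprocess
import sys
from typing import List

# Script names as listed in this README; the decoding scripts run first
# because the comparison scripts work on their results.
DECODING_SCRIPTS = ["run-adasearch.py", "bon.py", "vanilla.py"]
EVALUATION_SCRIPTS = ["blockwise-vs-vanilla.py", "best-of-n-vs-vanilla.py"]


def build_commands(scripts: List[str], python: str = sys.executable) -> List[List[str]]:
    """Turn script names into argument lists suitable for subprocess.run."""
    return [[python, script] for script in scripts]


def run_all(scripts: List[str]) -> None:
    """Run each script in order, stopping at the first non-zero exit code."""
    for command in build_commands(scripts):
        subprocess.run(command, check=True)


# Example (uncomment to run the full pipeline from the repository root):
# run_all(DECODING_SCRIPTS + EVALUATION_SCRIPTS)
```

Using `check=True` makes a failed decoding run abort the pipeline before any comparison script reads incomplete results.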

Alignment Directions (Datasets)

We evaluate our method on three alignment directions:

1. Harmlessness

HarmfulQA - Questions requiring careful handling to avoid harmful responses
HH-RLHF - Human preference data from Anthropic for helpful and harmless responses

2. Positive Sentiment

IMDB - Movie review sentiment data for generating positive sentiment

3. Mathematical Reasoning

OpenAI GSM - Grade school math problems for improving mathematical reasoning capabilities

License

This project is licensed under the MIT License - see the LICENSE file for details.

purseclab/AdaSearch
