Active Reward Learning and Iterative Trajectory Improvement from Comparative Language Feedback (Journal Paper Under Review)

Authors: Eisuke Hirota*, Zhaojing Yang*, Ayano Hiranaka, Miru Jun, Jeremy Tien, Stuart Russell, Anca Dragan, Erdem Bıyık

Set-Up

First, create a virtual environment using venv or conda (the conda commands are shown here):

conda create -n lal python=3.8
conda activate lal

Next, clone the repo and install the requirements:

git clone https://github.com/USC-Lira/language_active_learning.git
cd language_active_learning/
pip install -r requirements.txt

Run

There are 6 different methods to run:

1. Comparison (non-active, random queries)
2. Language (non-active, random queries)
3. ActiveLanguage (active, information gain, linear nn)
4. PurelyBayesian (active, information gain, Bayesian ML)
5. BALD (active, information gain, deep ensemble)
6. QbC (active, uncertainty, deep ensemble)
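
To illustrate how the two ensemble-based acquisition rules in this list differ, here is a minimal sketch. It is not the repo's implementation: it assumes a pairwise preference model with a logistic likelihood, uses linear weight vectors as stand-ins for deep ensemble members, and all function names (bald_score, vote_entropy_score, select_query) are hypothetical. The BALD-style score is the mutual information between the answer and the model, and the QbC-style score is the entropy of the members' votes.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def binary_entropy(p):
    # Entropy (in nats) of a Bernoulli(p) variable; 0 at p = 0 or p = 1.
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)

def preference_probs(ensemble, diff):
    # Each member w predicts P(first trajectory preferred) = sigmoid(w . diff),
    # where diff is the feature difference between the two trajectories.
    return [sigmoid(sum(wi * di for wi, di in zip(w, diff))) for w in ensemble]

def bald_score(ensemble, diff):
    # Information gain: entropy of the mean prediction
    # minus the mean entropy of the individual predictions.
    probs = preference_probs(ensemble, diff)
    mean_p = sum(probs) / len(probs)
    mean_entropy = sum(binary_entropy(p) for p in probs) / len(probs)
    return binary_entropy(mean_p) - mean_entropy

def vote_entropy_score(ensemble, diff):
    # Query-by-committee: entropy of the fraction of members voting "first".
    probs = preference_probs(ensemble, diff)
    frac_first = sum(p > 0.5 for p in probs) / len(probs)
    return binary_entropy(frac_first)

def select_query(ensemble, candidates, score_fn):
    # Return the index of the candidate query with the highest score.
    return max(range(len(candidates)), key=lambda i: score_fn(ensemble, candidates[i]))

# Hypothetical ensemble: members agree on feature 0 and disagree on feature 1.
ensemble = [[1.0, 0.0], [1.0, 2.0], [1.0, -2.0]]
candidates = [[1.0, 0.0], [0.0, 1.0]]  # feature differences of two candidate queries

best_bald = select_query(ensemble, candidates, bald_score)
best_qbc = select_query(ensemble, candidates, vote_entropy_score)
```

In this toy setup both rules pick the second candidate, since it is the only query on which the members disagree. A random-query baseline, as in methods 1 and 2, would instead sample a candidate index uniformly.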

Methods 3-5 use some form of approximate Bayesian inference or variational inference. Although this repo provides code for several such algorithms, the paper uses only the Laplace approximation, chosen for speed. The other algorithms provided are Metropolis-Hastings, Metropolis-within-Gibbs, and Expectation Propagation; using them may require refactoring the code.
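
For readers unfamiliar with the Laplace approximation, the sketch below shows the idea on a deliberately small problem. It is an illustration under stated assumptions, not the code in this repo: the reward has a single scalar weight w, the prior is Gaussian, the likelihood of a preference is logistic, and the function name laplace_approximation and its parameters are hypothetical. The posterior is approximated by a Gaussian centered at the MAP estimate, with variance equal to the inverse of the second derivative of the negative log posterior at that point.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def laplace_approximation(diffs, labels, prior_var=1.0, max_iters=50, tol=1e-10):
    """Return (mean, variance) of a Gaussian approximation to p(w | data).

    diffs[i]  : scalar feature difference between the two trajectories in query i
    labels[i] : 1 if the first trajectory was preferred, 0 otherwise
    Model (assumed): P(label = 1) = sigmoid(w * diff), prior w ~ N(0, prior_var).
    """
    w = 0.0
    for _ in range(max_iters):
        # Gradient and second derivative of the negative log posterior at w.
        grad = w / prior_var
        hess = 1.0 / prior_var
        for d, y in zip(diffs, labels):
            p = sigmoid(w * d)
            grad += (p - y) * d
            hess += p * (1.0 - p) * d * d
        step = grad / hess  # Newton step toward the MAP estimate
        w -= step
        if abs(step) < tol:
            break
    # Curvature at the MAP estimate gives the precision of the Gaussian.
    hess = 1.0 / prior_var
    for d in diffs:
        p = sigmoid(w * d)
        hess += p * (1.0 - p) * d * d
    return w, 1.0 / hess

# Three hypothetical answers that all suggest a positive weight.
w_map, w_var = laplace_approximation([1.0, 1.0, -1.0], [1, 1, 0])
```

Because this only needs an optimization and one curvature evaluation, it is typically much cheaper than sampling-based alternatives such as Metropolis-Hastings, which is consistent with the speed motivation given above.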

To run the code, we suggest submitting one of the Slurm scripts in scripts/ with sbatch, for example:

/home/user/language_active_learning$ sbatch scripts/nn_active_rs.slurm
