guidj/BOE

BOE

Bandit Observation Engine: simulating context-free bandits.

Running an Experiment

First, create a YAML file similar to egreedy-basics.yaml.

To see the run options, including parameter names, run:

python -m boe.experimenting --help

Note that the contents of the py directory must be included in PYTHONPATH.

Features

  • Delayed reward (initial delay)
  • Periodic update (periodic delay)
  • Periodic snapshot of bandit state
  • HTML report on experiment

Supported algorithms

Reported Metrics

There are four reported metrics, which are saved at each snapshot:

  • Probability of selection per arm
  • Average reward per arm
  • Cumulative reward per arm
  • Global cumulative reward
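As a rough illustration, the four metrics can be derived from two per-arm counters. The sketch below assumes that "probability of selection" means the empirical selection frequency; BOE may compute it differently, for example from the policy itself. The function name and dictionary keys are hypothetical.

```python
def snapshot_metrics(selections, reward_sums):
    """Compute the four metrics from per-arm selection counts and reward sums.

    Hypothetical helper for illustration; not part of BOE.
    """
    total_selections = sum(selections)
    return {
        # Assumption: empirical frequency with which each arm was selected.
        "selection_probability": [
            n / total_selections if total_selections else 0.0
            for n in selections
        ],
        # Mean reward per arm; 0.0 for arms that were never selected.
        "average_reward": [
            s / n if n else 0.0 for s, n in zip(reward_sums, selections)
        ],
        "cumulative_reward": list(reward_sums),
        "global_cumulative_reward": sum(reward_sums),
    }


metrics = snapshot_metrics(selections=[30, 70], reward_sums=[6.0, 56.0])
```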

Average reward per arm: Upper Confidence Bound

The Upper Confidence Bound (UCB) can optionally be reported along with the average reward per arm. The report-ucb parameter controls this behavior.

When the UCB is reported, it can optionally be discounted linearly. This is strictly for plotting purposes. The UCB is a relative measure: use it to compare how confident the estimate for one arm is relative to the other arms, not as an absolute measure.

The UCB computed at each snapshot is stored as is, without the discount.
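The README does not state which UCB formula is used, so the sketch below assumes the standard UCB1 bound, average reward plus sqrt(2 ln t / n), and reads "linear discount" as scaling the exploration bonus by a constant factor. The names `ucb_bonus`, `ucb_report`, and `plot_discount` are hypothetical and only illustrate how a stored value can differ from a plotted one.

```python
import math


def ucb_bonus(total_pulls, arm_pulls):
    """UCB1 exploration bonus: sqrt(2 * ln(total_pulls) / arm_pulls)."""
    if arm_pulls <= 0:
        raise ValueError("the bonus is undefined for an arm that was never pulled")
    return math.sqrt(2.0 * math.log(total_pulls) / arm_pulls)


def ucb_report(avg_rewards, pulls, plot_discount=1.0):
    """Return the stored UCB and a discounted copy for plotting, per arm.

    Hypothetical helper for illustration; not part of BOE.
    """
    total = sum(pulls)
    rows = []
    for avg, n in zip(avg_rewards, pulls):
        bonus = ucb_bonus(total, n)
        rows.append({
            "average_reward": avg,
            # Stored as computed, with no discount.
            "ucb": avg + bonus,
            # Assumption: the discount scales the bonus, for display only.
            "ucb_for_plot": avg + plot_discount * bonus,
        })
    return rows


rows = ucb_report([0.2, 0.8], pulls=[10, 90], plot_discount=0.5)
```

The arm with fewer pulls gets the larger bonus, which reflects that less is known about it. That makes the bound useful for comparing arms rather than as a standalone value.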

Environment

Use Python 3.
