A series of bandit algorithms in Swift, built with functional programming and immutable data structures. Inspired by johnmyleswhite/BanditsBook.
To run on the command line:
- `$ swift build` // requires you to be in the `./Swiper` directory
- `$ ./build/debug/Swiper` // builds a `results_swift.tsv` file in your `~/Documents/` directory
The epsilon in the Epsilon-greedy strategy controls the proportion of exploration vs. exploitation.
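To make the role of epsilon concrete, here is a minimal sketch of an ε-greedy choice. The function name and the `values` parameter (per-arm reward estimates) are illustrative assumptions, not the library's actual API:

```swift
import Foundation

// Hypothetical sketch, not Swiper's API: with probability epsilon we
// explore (uniform random arm); otherwise we exploit (best-valued arm).
func selectArmSketch(epsilon: Double, values: [Double]) -> Int {
    if Double.random(in: 0..<1) < epsilon {
        // Explore: pick a uniformly random arm.
        return Int.random(in: 0..<values.count)
    }
    // Exploit: pick the arm with the highest estimated value.
    return values.firstIndex(of: values.max()!)!
}
```

With `epsilon: 0.1`, roughly 10% of calls explore and 90% exploit the current best arm.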
Example usage:
let epsilonGreedy = EpsilonGreedy(epsilon: 0.1, nArms: 2)
let selectedArm = epsilonGreedy.selectArm()
somethingWithCallback(color: selectedArm) { (reward) in
  let updatedEpsilonGreedy = epsilonGreedy.update(selectedArm, reward: reward)
}

The Annealing Softmax object selects arms based on a softmax function. It does not require a temperature parameter; the algorithm manages the temperature automatically via simulated annealing.
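A sketch of what annealing the temperature can look like, assuming a common formulation where the temperature decays with the total number of plays (the function and parameter names here are assumptions for illustration, not Swiper's API):

```swift
import Foundation

// Illustrative annealing softmax: as total plays grow, the temperature
// falls, so the distribution sharpens toward the best-valued arm.
func annealingSoftmaxProbabilities(counts: [Int], values: [Double]) -> [Double] {
    // One common annealing schedule: temperature = 1 / log(t + epsilon).
    let t = Double(counts.reduce(0, +)) + 1.0
    let temperature = 1.0 / log(t + 1e-7)
    // Standard softmax over the estimated values at this temperature.
    let exps = values.map { exp($0 / temperature) }
    let total = exps.reduce(0, +)
    return exps.map { $0 / total }
}
```

Early on (low counts, high temperature) the probabilities are nearly uniform; as counts accumulate, probability mass concentrates on the highest-valued arm.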
Example usage:
let softmax = Softmax(nArms: 4)
let selectedArm = softmax.selectArm()
somethingWithCallback(copy: selectedArm) { (reward) in
  let updatedSoftmax = softmax.update(selectedArm, reward: reward)
}

The UCB strategy uses context to select its next arm. UCB1 assumes that your maximum reward is a value of 1.
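For context, UCB1 typically scores each arm as its mean reward plus an exploration bonus, sqrt(2 ln(total plays) / plays of that arm), and picks the highest score. A sketch under those assumptions (names are hypothetical, not Swiper's API; it assumes every arm has been played at least once, as UCB1 does before applying this formula):

```swift
import Foundation

// Hypothetical sketch of UCB1 scoring, not Swiper's API.
// Assumes counts[i] >= 1 for every arm (UCB1 plays each arm once first).
func ucb1Scores(counts: [Int], values: [Double]) -> [Double] {
    let total = Double(counts.reduce(0, +))
    return zip(counts, values).map { count, value in
        // Mean reward plus a bonus that shrinks as the arm is played more.
        value + sqrt(2.0 * log(total) / Double(count))
    }
}
```

Rarely played arms get a large bonus, so the strategy keeps exploring them until their estimates are trustworthy.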
Example usage:
let ucb = UCB1(nArms: 3)
let selectedArm = ucb.selectArm()
somethingWithCallback(displayPopup: selectedArm) { (reward) in
  let updatedUcb = ucb.update(selectedArm, reward: reward)
}