This project implements an AI agent to play a deterministic variant of the "Einstein würfelt nicht!" (EWN) board game. The AI utilizes the Monte Carlo Tree Search (MCTS) algorithm to make decisions.
The core source code is organized as follows:
main.cpp
: Main entry point, handles communication with the game client.decide.cpp
: Contains the core MCTS algorithm implementation.makefile
: Used to compile the project into an executable namedagent
.pcg_extras.hpp
,pcg_random.hpp
: Headers for the PCG (Permuted Congruential Generator) random number generation library, used for simulations.board/
:board.cpp
: Implements theBoard
class, managing game state and logic.board.hpp
: Defines theBoard
struct and related constants.tables.hpp
: Contains precomputed lookup tables for efficient move generation.
math_lib/
:maths.cpp
: Provides mathematical utility functions, including UCB calculation.maths.hpp
: Declarations for math functions.
To compile the project, run make
:
make
This will create an executable named agent
.
The agent
executable is designed to communicate with an external game client. It reads game state updates from standard input and outputs its chosen moves to standard error.
Example interaction (conceptual):
# Client sends game state
# AI processes and outputs move
====> [dice_value] [moving_piece] [step_destination]
The game is a two-player deterministic variant of "Einstein würfelt nicht!".
-
Board: A 5x5 grid. Each player (Red and Blue) has 6 pieces, numbered 1 through 6.
-
Dice: Unlike the original game, the dice value is not random. The current player chooses the dice value for their opponent.
-
Moving Procedure:
- Determine Movable Piece(s):
- If the chosen dice number matches an existing piece, only that matched piece can be moved.
- If the dice number does not match any existing piece, the player can choose to move:
- The smallest piece whose number is greater than the dice number (if such a piece exists).
- The largest piece whose number is smaller than the dice number (if such a piece exists).
- Piece Movement and Capture:
- After selecting a piece, it can be moved in 3 specific directions:
- Red Side: Down, Right, Down-Right.
- Blue Side: Up, Left, Up-Left.
- If a piece is at the destination square, it is captured and removed from the board.
- After selecting a piece, it can be moved in 3 specific directions:
- Determine Movable Piece(s):
-
Win Conditions:
- Goal Position: A player wins if one of their pieces reaches the opponent's goal corner:
- Red Goal: Bottom-right corner.
- Blue Goal: Top-left corner.
- Capture All: A player wins if all of the opponent's pieces are captured.
- Goal Position: A player wins if one of their pieces reaches the opponent's goal corner:
The AI agent employs the Monte Carlo Tree Search (MCTS) algorithm, which involves four main phases:
- Selection: The algorithm traverses the game tree from the root, selecting child nodes based on the Upper Confidence Bound (UCB) formula. The
fast_UCB
function inmath_lib/maths.cpp
is used for this purpose, adjusting for MIN (opponent) and MAX (player) nodes. - Expansion: When a leaf node (a node not yet fully explored) is reached, new child nodes are generated for all possible next moves. The
Board::expand()
method handles this, creating newBoard
objects for each child. - Simulation (Playout): From each newly expanded child node, a random playout (simulation) is performed until the game reaches a terminal state (win/loss). The
Board::simulate()
method uses the PCG random number generator for move selection during playouts. - Backpropagation: The result of each simulation (win or loss) is propagated back up the tree to update the statistics of all nodes along the traversed path. The
Board::update()
method updatesNtotal
(total visits) andsum1
(total wins) for each node.
Key Features:
- Pre-allocated Game Tree: The game tree is implemented as a fixed-size array (
Board game_tree[MAX_NODE]
) to avoid dynamic memory allocation overhead. - Incremental Node Updates: Tree node statistics are updated incrementally during backpropagation for efficiency.
- Precomputed Tables:
movable_piece_table
andstep_table
are used to optimize move generation, ensuring efficient calculation of possible moves. - Fast Math Utilities: Functions like
fast_log2
andinverse_rsqrt
are included inmath_lib/
for performance optimization in UCB calculations.
During the development of this AI agent, several MCTS refinements were explored, as detailed in the original report.md
. The primary goal was to enhance the agent's performance against various baselines.
- UCB with Variance: An attempt was made to incorporate variance into the UCB calculation. While this increased the complexity of UCB calculation and node maintenance, it did not improve overall performance and actually slowed down simulations.
- Node Expansion Refinement (Transition Probability): A policy was tested that checked if a node's score (Mean - 0.1 * Variance) was high enough compared to its siblings' Mean scores before expansion. This method proved to be time-consuming and ineffective.
- Node Expansion Refinement (Visited Count): This method modifies the tree node structure to include a
visit_count
field. If a leaf node'svisit_count
is less than 2, its count is incremented, and a batch simulation is performed. Otherwise, normal expansion and simulation proceed. This method was chosen as the final MCTS implementation due to its simplicity and tangible, albeit limited, effects.
Experiments were conducted with MAX_SIMULATION_COUNT
=2,000,000. The table below shows the win rates of each agent versus the others (each result from 100 tests):
Basic | Variance | Transition probability | Visited count | |
---|---|---|---|---|
Basic | x | 45 | 5 | 45 |
Variance | 55 | x | 3 | 44 |
Transition probability | 95 | 97 | x | 99 |
Visited count | 55 | 56 | 1 | x |
Overall Performance: Transition probability < Visited count < Variance < Basic.
The experiments indicated that the basic MCTS implementation surprisingly outperformed the enhanced versions (UCB+Variance and Visited Count). The "Transition probability" method was the weakest. This phenomenon is attributed to the nature of EWN, where random movements during the simulation phase can lead to data discrepancies, and additional modifications might amplify this issue. Despite this, the "Visited count" method was retained for its simplicity and some observed effects.