Skip to content

LEE-CHI-HSUAN/Deterministic-EWN_MCTS-AI

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

13 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Built with C++

EWN AI Agent

This project implements an AI agent to play a deterministic variant of the "Einstein würfelt nicht!" (EWN) board game. The AI utilizes the Monte Carlo Tree Search (MCTS) algorithm to make decisions.

⚙️ Project Structure

The core source code is organized as follows:

  • main.cpp: Main entry point, handles communication with the game client.
  • decide.cpp: Contains the core MCTS algorithm implementation.
  • makefile: Used to compile the project into an executable named agent.
  • pcg_extras.hpp, pcg_random.hpp: Headers for the PCG (Permuted Congruential Generator) random number generation library, used for simulations.
  • board/:
    • board.cpp: Implements the Board class, managing game state and logic.
    • board.hpp: Defines the Board struct and related constants.
    • tables.hpp: Contains precomputed lookup tables for efficient move generation.
  • math_lib/:
    • maths.cpp: Provides mathematical utility functions, including UCB calculation.
    • maths.hpp: Declarations for math functions.

🛠️ Build Instructions

To compile the project, run make:

make

This will create an executable named agent.

🚀 Usage

The agent executable is designed to communicate with an external game client. It reads game state updates from standard input and outputs its chosen moves to standard error.

Example interaction (conceptual):

# Client sends game state
# AI processes and outputs move
====> [dice_value] [moving_piece] [step_destination]

🎲 Game Rules (Deterministic EWN)

The game is a two-player deterministic variant of "Einstein würfelt nicht!".

  • Board: A 5x5 grid. Each player (Red and Blue) has 6 pieces, numbered 1 through 6.

  • Dice: Unlike the original game, the dice value is not random. The current player chooses the dice value for their opponent.

  • Moving Procedure:

    1. Determine Movable Piece(s):
      • If the chosen dice number matches an existing piece, only that matched piece can be moved.
      • If the dice number does not match any existing piece, the player can choose to move:
        • The smallest piece whose number is greater than the dice number (if such a piece exists).
        • The largest piece whose number is smaller than the dice number (if such a piece exists).
    2. Piece Movement and Capture:
      • After selecting a piece, it can be moved in 3 specific directions:
        • Red Side: Down, Right, Down-Right.
        • Blue Side: Up, Left, Up-Left.
      • If a piece is at the destination square, it is captured and removed from the board.
  • Win Conditions:

    1. Goal Position: A player wins if one of their pieces reaches the opponent's goal corner:
      • Red Goal: Bottom-right corner.
      • Blue Goal: Top-left corner.
    2. Capture All: A player wins if all of the opponent's pieces are captured.

🧠 AI Implementation Details (MCTS)

The AI agent employs the Monte Carlo Tree Search (MCTS) algorithm, which involves four main phases:

  1. Selection: The algorithm traverses the game tree from the root, selecting child nodes based on the Upper Confidence Bound (UCB) formula. The fast_UCB function in math_lib/maths.cpp is used for this purpose, adjusting for MIN (opponent) and MAX (player) nodes.
  2. Expansion: When a leaf node (a node not yet fully explored) is reached, new child nodes are generated for all possible next moves. The Board::expand() method handles this, creating new Board objects for each child.
  3. Simulation (Playout): From each newly expanded child node, a random playout (simulation) is performed until the game reaches a terminal state (win/loss). The Board::simulate() method uses the PCG random number generator for move selection during playouts.
  4. Backpropagation: The result of each simulation (win or loss) is propagated back up the tree to update the statistics of all nodes along the traversed path. The Board::update() method updates Ntotal (total visits) and sum1 (total wins) for each node.

Key Features:

  • Pre-allocated Game Tree: The game tree is implemented as a fixed-size array (Board game_tree[MAX_NODE]) to avoid dynamic memory allocation overhead.
  • Incremental Node Updates: Tree node statistics are updated incrementally during backpropagation for efficiency.
  • Precomputed Tables: movable_piece_table and step_table are used to optimize move generation, ensuring efficient calculation of possible moves.
  • Fast Math Utilities: Functions like fast_log2 and inverse_rsqrt are included in math_lib/ for performance optimization in UCB calculations.

📈 MCTS Refinements and Performance

During the development of this AI agent, several MCTS refinements were explored, as detailed in the original report.md. The primary goal was to enhance the agent's performance against various baselines.

Explored Refinements:

  • UCB with Variance: An attempt was made to incorporate variance into the UCB calculation. While this increased the complexity of UCB calculation and node maintenance, it did not improve overall performance and actually slowed down simulations.
  • Node Expansion Refinement (Transition Probability): A policy was tested that checked if a node's score (Mean - 0.1 * Variance) was high enough compared to its siblings' Mean scores before expansion. This method proved to be time-consuming and ineffective.
  • Node Expansion Refinement (Visited Count): This method modifies the tree node structure to include a visit_count field. If a leaf node's visit_count is less than 2, its count is incremented, and a batch simulation is performed. Otherwise, normal expansion and simulation proceed. This method was chosen as the final MCTS implementation due to its simplicity and tangible, albeit limited, effects.

Experiment Results:

Experiments were conducted with MAX_SIMULATION_COUNT=2,000,000. The table below shows the win rates of each agent versus the others (each result from 100 tests):

Basic Variance Transition probability Visited count
Basic x 45 5 45
Variance 55 x 3 44
Transition probability 95 97 x 99
Visited count 55 56 1 x

Overall Performance: Transition probability < Visited count < Variance < Basic.

The experiments indicated that the basic MCTS implementation surprisingly outperformed the enhanced versions (UCB+Variance and Visited Count). The "Transition probability" method was the weakest. This phenomenon is attributed to the nature of EWN, where random movements during the simulation phase can lead to data discrepancies, and additional modifications might amplify this issue. Despite this, the "Visited count" method was retained for its simplicity and some observed effects.

Reference

About

No description or website provided.

Topics

Resources

Stars

Watchers

Forks

Packages

No packages published