GitHub - jalpp/chess-llm-bench: benchmarking LLMs playing chess in combination with the Chess Context Protocol

Chess Context Protocol + LLM chess playing benchmark

This repo contains the source code for benchmarking how well an LLM combined with CCP can play chess.

What is CCP?

CCP (Chess Context Protocol) supplies an LLM with chess context such as legal moves, board state, and chess themes. You can read more about it here
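As a rough illustration of what such a context layer might hand to the model, the sketch below bundles a board state, legal moves, and themes into prompt text. The `ChessContext` shape and the `buildContextPrompt` function are hypothetical names invented for this example; they are not the actual CCP schema or API.

```typescript
// Hypothetical shape for the context passed to the LLM (not the real CCP schema).
interface ChessContext {
  fen: string;          // board state in FEN notation
  legalMoves: string[]; // legal moves in SAN, e.g. "e4", "Nf3"
  themes: string[];     // chess themes detected in the position
}

// Render the context as plain text that can be prepended to the model prompt.
function buildContextPrompt(ctx: ChessContext): string {
  const themes = ctx.themes.length > 0 ? ctx.themes.join(", ") : "none";
  return [
    `Board (FEN): ${ctx.fen}`,
    `Legal moves: ${ctx.legalMoves.join(", ")}`,
    `Themes: ${themes}`,
    "Reply with exactly one move from the legal moves list.",
  ].join("\n");
}

const startContext: ChessContext = {
  fen: "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1",
  legalMoves: ["e4", "d4", "Nf3", "c4"], // subset shown for brevity
  themes: ["opening"],
};
const contextPrompt = buildContextPrompt(startContext);
```

Note that the prompt lists options but never ranks them, so the choice of move stays with the model.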

Isn't this cheating?

No. CCP does not give the model the best or correct move; it provides only the board state, chess themes, and legal moves, which the model uses to its advantage. LLMs are not designed to play chess; adding the CCP layer makes them chess-aware, and this project tests how well the LLM + CCP combination works.
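One way to picture this division of labour: the model proposes a move in free text, and the harness only checks that the proposal appears in the legal moves list. The helper below is a minimal sketch under that assumption; `extractLegalMove` is a hypothetical name, and the real benchmark may parse replies differently.

```typescript
// Return the first legal move mentioned in the model's reply, or null if none is found.
// Hypothetical helper for illustration only.
function extractLegalMove(reply: string, legalMoves: string[]): string | null {
  const legal = new Set(legalMoves);
  for (const raw of reply.split(/\s+/)) {
    // Strip surrounding quotes/brackets and trailing punctuation such as "." or "!".
    const token = raw.replace(/^["'(\[]+|["')\].,;:!?]+$/g, "");
    if (legal.has(token)) {
      return token;
    }
  }
  return null; // the reply contained no legal move; the caller can retry or forfeit
}

const chosen = extractLegalMove("I will play Nf3.", ["e4", "Nf3"]);
const rejected = extractLegalMove("Qh5 looks strong", ["e4", "Nf3"]);
```

Here `chosen` is `"Nf3"` and `rejected` is `null`, because `Qh5` is not among the legal moves supplied.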

Achievements

Gemini-2.5-pro + Agine system prompt + CCP defeated Stockfish 1000 running at depth 15. The bench folder contains the victory JSON and text files.
Gemini-2.5-pro + Agine system prompt + CCP defeated Stockfish 1200 running at depth 15. The bench folder contains the victory JSON file.

Game:

Future plans

Playing is one benchmark category; more benchmark tests will be added.

Setup

cd chess-llm-bench\src\bench

npm i

In the .env file, add the following:

AGINE_PROVIDER=

# Model name
AGINE_MODEL=

# API Key (required for google, openai, anthropic)
AGINE_API_KEY=

npx tsx .\playTest.ts 4 # number of games: 4; local WASM Stockfish depth: 15; API delay: 5 seconds

Or run:

npm run bench

Then watch the benchmark run live.
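Before starting a run, the three `.env` variables listed above could be validated with something like the following sketch. The `loadAgineConfig` function and `AgineConfig` type are hypothetical and not part of this repo; the only details taken from the README are the variable names and the note that an API key is required for google, openai, and anthropic.

```typescript
// Hypothetical config shape; only the variable names come from the README.
interface AgineConfig {
  provider: string;
  model: string;
  apiKey: string | undefined;
}

// Providers the README lists as requiring an API key.
const PROVIDERS_REQUIRING_KEY = ["google", "openai", "anthropic"];

// Validate an environment-like object (e.g. process.env after the .env file is loaded).
function loadAgineConfig(env: Record<string, string | undefined>): AgineConfig {
  const provider = (env["AGINE_PROVIDER"] ?? "").trim().toLowerCase();
  const model = (env["AGINE_MODEL"] ?? "").trim();
  const apiKey = (env["AGINE_API_KEY"] ?? "").trim();

  if (!provider) throw new Error("AGINE_PROVIDER is not set");
  if (!model) throw new Error("AGINE_MODEL is not set");
  if (PROVIDERS_REQUIRING_KEY.includes(provider) && !apiKey) {
    throw new Error(`AGINE_API_KEY is required for provider "${provider}"`);
  }
  return { provider, model, apiKey: apiKey || undefined };
}

const exampleConfig = loadAgineConfig({
  AGINE_PROVIDER: "google",
  AGINE_MODEL: "gemini-2.5-pro",
  AGINE_API_KEY: "your-key-here", // placeholder, not a real key
});
```

Failing fast on a missing key avoids discovering the problem only after the first game has started.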

Output

The benchmark generates a detailed benchmark.json file that contains game info, win rates, game PGNs, and moves.


{
  "summary": {
    "totalGames": 2,
    "completedGames": 1,
    "agentWins": 0,
    "stockfishWins": 1,
    "draws": 0,
    "stockfishDepth": 15,
    "apiDelaySeconds": 5,
    "agentTimeoutSeconds": 60,
    "model": "gemini-2.5-pro",
    "provider": "google",
    "ccpEnabled": true,
    "stats": {
      "agentAsWhite": {
        "wins": 0,
        "losses": 1,
        "draws": 0
      },
      "agentAsBlack": {
        "wins": 0,
        "losses": 0,
        "draws": 0
      }
    }
  },
  "games": [
    {
      "winner": "stockfish",
      "reason": "checkmate",
      "moves": [
        "e4",
        "e6",

...
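To show how the summary fields above might be consumed, the sketch below derives a win rate and an unfinished-game count from them. The `BenchmarkSummary` type covers only a subset of the fields shown in the sample output, and `summarizeResults` is a hypothetical helper, not something shipped with the benchmark.

```typescript
// Subset of the "summary" object shown in the sample output above.
interface BenchmarkSummary {
  totalGames: number;
  completedGames: number;
  agentWins: number;
  stockfishWins: number;
  draws: number;
}

// Derive simple statistics; the win rate is computed over completed games only.
function summarizeResults(s: BenchmarkSummary): { unfinishedGames: number; agentWinRate: number } {
  const unfinishedGames = s.totalGames - s.completedGames;
  const agentWinRate = s.completedGames > 0 ? s.agentWins / s.completedGames : 0;
  return { unfinishedGames, agentWinRate };
}

// Values copied from the sample output above.
const sampleSummary: BenchmarkSummary = {
  totalGames: 2,
  completedGames: 1,
  agentWins: 0,
  stockfishWins: 1,
  draws: 0,
};
const sampleStats = summarizeResults(sampleSummary);
```

For the sample data this yields one unfinished game and an agent win rate of 0. In practice the summary would be obtained by parsing the generated benchmark.json file and reading its `summary` property.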

Authors:

@jalpp
