LLM Agent Duel ⚔️

Duel Large Language Model (LLM) agents in the classic Towers of Hanoi puzzle and the game of Nim! Benchmark any model (e.g., Gemma 3, Llama 3.2) across a variety of prompts!
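
For readers new to the puzzle, the rules an agent has to respect in Towers of Hanoi can be sketched in a few lines of Python. This is a minimal illustration and not the repository's implementation: the names `apply_moves`, `is_solved`, and `min_moves`, and the 0-indexed three-peg layout, are assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical move format: (source peg, destination peg), pegs indexed 0..2.
Move = Tuple[int, int]


def apply_moves(num_disks: int, moves: List[Move]) -> List[List[int]]:
    """Replay moves on a 3-peg board, raising ValueError on an illegal move."""
    # Peg 0 starts with every disk; the largest (num_disks) is at the bottom.
    pegs = [list(range(num_disks, 0, -1)), [], []]
    for step, (src, dst) in enumerate(moves, start=1):
        if src not in (0, 1, 2) or dst not in (0, 1, 2) or src == dst:
            raise ValueError(f"move {step}: invalid pegs {src}->{dst}")
        if not pegs[src]:
            raise ValueError(f"move {step}: peg {src} is empty")
        # A disk may only be placed on an empty peg or on a larger disk.
        if pegs[dst] and pegs[dst][-1] < pegs[src][-1]:
            raise ValueError(f"move {step}: larger disk placed on smaller disk")
        pegs[dst].append(pegs[src].pop())
    return pegs


def is_solved(num_disks: int, pegs: List[List[int]]) -> bool:
    """Solved when every disk sits on the last peg, largest at the bottom."""
    return pegs[2] == list(range(num_disks, 0, -1))


def min_moves(num_disks: int) -> int:
    """The shortest solution for n disks takes 2**n - 1 moves."""
    return 2 ** num_disks - 1
```

A checker like this gives a simple yardstick for a model's answer: whether its move list is legal, whether it solves the puzzle, and how its length compares with `min_moves`.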

Screenshots 📸

Single game of Nim via main.py.

Game and prompt selection for benchmark.py.

Getting Started 🚀

To contribute to this project or run the program, follow these steps:

Download the Project Files:

Clone the repository:

```
git clone https://github.com/gbsierra/LLM-Agent-Duel.git
```

Create a Python virtual environment:

```
# Windows (Command Prompt)
python -m venv venv
venv\Scripts\activate

# Windows (PowerShell)
python -m venv venv
.\venv\Scripts\Activate.ps1

# macOS/Linux
python3 -m venv venv
source venv/bin/activate
```
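
If you are unsure whether activation worked, a short check like the one below can confirm it before you install anything. It is a sketch rather than part of the project: the function name `in_virtualenv` is made up for this example, and it relies only on the standard library's `sys.prefix` and `sys.base_prefix`.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter:", sys.executable)
```

Run it with the same `python` command you plan to use for `pip install`; if it reports that no environment is active, repeat the activation step for your shell.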

Install Dependencies:
```
pip install -r requirements.txt
```

Run your benchmarks, then visualize:

```
# Start a single game, watching each model's move
# (You will select your game and prompt type)
python main.py

# Start benchmarking, viewing only game completions
# (You will select your game and prompt types)
python benchmark.py

# Create plots of benchmarks
# (Ensure your Nim and Hanoi benchmarks are in /benchmarks/nim and /benchmarks/hanoi, respectively)
python visualization.py
```
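
Before running `visualization.py`, it may help to confirm that both benchmark directories are in place. The snippet below is a hedged sketch: the helper `missing_benchmark_dirs` is hypothetical, and it assumes the `/benchmarks/nim` and `/benchmarks/hanoi` paths mentioned above are relative to the repository root.

```python
from pathlib import Path
from typing import List

# Directory names come from the README; treating them as relative to the
# repository root is an assumption, and the helper itself is hypothetical.
EXPECTED_DIRS = ("benchmarks/nim", "benchmarks/hanoi")


def missing_benchmark_dirs(root: Path) -> List[str]:
    """Return the expected benchmark directories that do not exist under root."""
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_benchmark_dirs(Path.cwd())
    if missing:
        print("Missing benchmark directories:", ", ".join(missing))
    else:
        print("Both benchmark directories are present.")
```

Run it from the repository root; an empty result means both directories exist, though it does not check what files they contain.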

That is all! You are now ready to explore, contribute to, and run the application. Head to the 'agents' directory to customize any prompt, or choose a model from the full list at https://ollama.com/search.
