This repository contains an AI-powered agent system that automatically generates and verifies Lean 4 code, turning natural language problem statements into formally proven Lean programs. It's part of a project developed during the LLM Agents Learning MOOC, focused on reasoning, theorem proving, and agent-based AI systems.
The Lean Coding Agent combines multi-agent reasoning with code generation and verification. Given:
- A natural language description (
description.txt
) - A Lean code template (
task.lean
)
It outputs:
- An implemented function
- A formal proof of correctness
All in valid Lean syntax — ready to compile and check!
The workflow involves multiple distinct steps that illustrate the cooperation between the two agents:
The Reasoning Agent (o3-mini
) begins by receiving the problem description and the Lean code template. It:
- Breaks down the problem description into smaller components
- Suggests a general approach for solving the problem
- Prepares a structured analysis to guide the next step
The LLM Agent (gpt-4o
) uses the Reasoning Agent's analysis to:
- Generate the Lean code required to solve the problem
- Produce the corresponding proof that supports the solution
- Insert the solution into the designated template sections
The generated code is executed in the Lean environment:
- If successful, the solution is considered complete
- If errors occur, the system collects the error messages
When errors are detected, the system enters a refinement phase where the LLM Agent:
- Reanalyzes the problem and code
- Refines the implementation and proof to ensure correctness
full explanation is on Lean4 Ai agent Doc
Implemented in src/main.py
, the key function is:
def main_workflow(problem_description: str, task_lean_code: str = "") -> LeanCode:
"""
Takes a problem description and a Lean template, and returns completed code and proof.
"""
The output is a LeanCode
dictionary:
{
"code": "<function implementation>",
"proof": "<formal proof>"
}
To run the agent and setup code you should follow the guidelines in INSTRUCTIONS.md
for task setup and Lean 4 implementation.
Run the automated test suite to verify the agent against all test tasks:
make test
This command executes the agent on predefined problems and verifies that the generated code passes Lean's verification.
- Python 3.12+
- Lean 4
- OpenAI-compatible LLMs (for GPT-4o access)
- Dependencies in
requirements.txt
This project was developed as part of the LLM Agents MOOC offered by UC Berkeley, led by Professor Dawn Song, Xinyun Chen, and Kaiyu Yang. The course covers:
- Inference-time techniques for reasoning
- LLM post-training methods
- Code generation and verification
- Autoformalization and theorem proving
More details: llmagents-learning.org/sp25
PRs are welcome! Whether it's bug fixes, improvements, or ideas — feel free to fork and open a Pull Request.