This is my personal repo for RL environment engineering. I’m iterating on an agent that interacts with a custom RL-style environment and uses Anthropic tools to adapt behavior over time.
- The initial policy/agent should intentionally fail about 60% of the time (i.e., succeed roughly 40%) to ensure the environment is challenging and informative.
- Over repeated interactions, the agent should learn and demonstrate improvement, approaching the targeted behavior as the number of repetitions grows (a toy calibration sketch follows this list).
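
To make the calibration target concrete, here is a minimal, self-contained sketch (not the repo's actual environment or agent) in which a baseline succeeds roughly 40% of the time and improves with repetition; `BASELINE_SUCCESS` and `learning_rate` are illustrative assumptions:

```python
import random

BASELINE_SUCCESS = 0.40  # assumed initial success rate, per the goal above

def attempt(repetition: int, learning_rate: float = 0.05) -> bool:
    """Simulate one episode; success probability grows with repetitions."""
    p = min(0.95, BASELINE_SUCCESS + learning_rate * repetition)
    return random.random() < p

if __name__ == "__main__":
    runs = 10
    successes = 0
    for rep in range(runs):
        ok = attempt(rep)
        successes += ok
        print(f"rep {rep}: {'success' if ok else 'failure'}")
    print(f"success rate: {successes / runs:.0%}")
```

A ~40% floor keeps early episodes informative: failures are frequent enough to expose weaknesses without drowning out the learning signal.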
Prerequisites:

- Git (latest)
- Python 3.11+ (see `.python-version` for the exact version)
- uv for Python project management (`pip install uv`, or see https://docs.astral.sh/uv/)
- An Anthropic API key
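
As a quick sanity check for these prerequisites, here is a small Python sketch (illustrative only, not a script in this repo):

```python
import shutil
import sys

# Illustrative prerequisite check; adjust or skip as needed.
assert sys.version_info >= (3, 11), "Python 3.11+ required (see .python-version)"
assert shutil.which("git") is not None, "git not found on PATH"
assert shutil.which("uv") is not None, "uv not found on PATH (pip install uv)"
print("Prerequisites look good.")
```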
Setup:

- Clone and enter the repo

  ```bash
  git clone <repo-url>
  ```

- Install dependencies with uv

  ```bash
  uv sync
  ```

- Set your Anthropic API key in the current shell (PowerShell syntax shown)

  ```powershell
  $env:ANTHROPIC_API_KEY="your_api_key_here"
  ```
- Run the agent

  ```bash
  uv run main.py
  ```
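
Before a full run, you can optionally confirm the key is visible to the SDK. This sketch assumes the `anthropic` Python SDK is among the synced dependencies; the model name is an arbitrary small model chosen for a cheap ping, not necessarily what `main.py` uses:

```python
import os

import anthropic  # assumption: installed via `uv sync`

# Optional smoke test (not part of this repo): verify ANTHROPIC_API_KEY works.
if not os.environ.get("ANTHROPIC_API_KEY"):
    raise SystemExit("ANTHROPIC_API_KEY is not set in this shell")

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
reply = client.messages.create(
    model="claude-3-5-haiku-latest",  # hypothetical choice for a cheap ping
    max_tokens=16,
    messages=[{"role": "user", "content": "ping"}],
)
print(reply.content[0].text)
```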
The agent can run tests concurrently or sequentially. Change the `concurrent` flag at the bottom of `main.py`:

```python
asyncio.run(main(concurrent=True))   # concurrent
asyncio.run(main(concurrent=False))  # sequential
```

When running concurrently, results print as they complete (not in run order), which speeds up overall execution.
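
The out-of-order printing falls out of completion-order scheduling. Here is a standalone sketch (not the repo's `main.py`) of the two modes, assuming a `run_test` coroutine with varying duration:

```python
import asyncio
import random

async def run_test(i: int) -> str:
    # Simulated test with a random duration.
    await asyncio.sleep(random.uniform(0.1, 0.5))
    return f"test {i} done"

async def main(concurrent: bool = True) -> None:
    tests = [run_test(i) for i in range(5)]
    if concurrent:
        # Concurrent: results print in completion order, not submission order.
        for future in asyncio.as_completed(tests):
            print(await future)
    else:
        # Sequential: one test at a time, results in run order.
        for coro in tests:
            print(await coro)

if __name__ == "__main__":
    asyncio.run(main(concurrent=True))
```

Concurrent mode trades deterministic output order for wall-clock speed; sequential mode is easier to read when debugging a single test.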
This project references and draws inspiration from large-scale software engineering benchmarks such as SWE-Bench Pro. For details about the dataset, task structure, and evaluation setup, see the dataset page on Hugging Face: ScaleAI/SWE-bench_Pro.