This project reproduces the iterative agent framework introduced in the PaperBench paper, providing a practical implementation that researchers and developers can explore and extend.
git clone https://github.com/Just-Curieous/inspect-agent.git
cd inspect-agent/
docker build --platform=linux/amd64 -t pb-env -f Dockerfile.base .
- Build the Docker image.
- Copy env.sh.example to env.sh.
- Replace the system prompt in instructions.txt with your own.
- Run the Inspect AI agent with your codebase and questions:
python entry_point.py --research_task <path_to_research_paper> --code_repo_path <path_to_code_repo> --inspect_path $(pwd)
For example:
python entry_point.py --research_task /home/ubuntu/Benchmark-Construction/logs/neurips2024/95262.json --code_repo_path /home/ubuntu/Benchmark-Construction/logs/neurips2024/MoE-Jetpack --inspect_path /home/ubuntu/inspect-agent
cd inspect-agent/; docker run -it --name my-pb-env -v $(pwd):/workspace -v /:/all pb-env
docker exec -it my-pb-env bash
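Running docker run a second time with the same --name fails because the container already exists. The following is a minimal sketch of one way to handle that, not part of this repository: the function name enter_pb_env is hypothetical, and it assumes the pb-env image and my-pb-env container name used above.

```shell
# Sketch (hypothetical helper): reuse the container if it exists, otherwise create it.
enter_pb_env() {
    local name="${1:-my-pb-env}"
    # "docker ps -a" also lists stopped containers; the ^...$ anchors
    # restrict the name filter to an exact match.
    if [ -n "$(docker ps -aq --filter "name=^${name}\$")" ]; then
        docker start "$name" >/dev/null   # no effect if it is already running
        docker exec -it "$name" bash
    else
        # Same invocation as above: mount the repo at /workspace and / at /all.
        docker run -it --name "$name" -v "$(pwd)":/workspace -v /:/all pb-env
    fi
}
```

Call enter_pb_env from inside inspect-agent/ so that $(pwd) resolves to the repository root.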
Copy env.sh.example to env.sh, then configure:
- Your model and API key.
- The directories for your code and your paper/questions.
Remember to:
- Put your system prompt in instructions.txt.
- Put your code repo under $CODE_DIR.
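The setup steps above can be checked before launching the agent. This is a hedged sketch rather than a script shipped with the project: the function name preflight is hypothetical, and it assumes that env.sh and instructions.txt sit in the same directory and that sourcing env.sh sets CODE_DIR.

```shell
# Sketch (hypothetical helper): verify env.sh, instructions.txt and CODE_DIR.
# Usage: preflight [dir]   (dir defaults to the current directory)
preflight() {
    local dir="${1:-.}"
    local status=0

    if [ ! -f "$dir/env.sh" ]; then
        echo "missing: $dir/env.sh (copy it from env.sh.example)" >&2
        status=1
    fi
    # -s is true only if the file exists and is non-empty.
    if [ ! -s "$dir/instructions.txt" ]; then
        echo "missing or empty: $dir/instructions.txt" >&2
        status=1
    fi
    if [ -f "$dir/env.sh" ]; then
        # Source env.sh in a subshell so the caller's environment is untouched.
        if ! ( . "$dir/env.sh" >/dev/null 2>&1 \
               && [ -n "${CODE_DIR:-}" ] && [ -d "$CODE_DIR" ] ); then
            echo "CODE_DIR is unset or not a directory after sourcing env.sh" >&2
            status=1
        fi
    fi
    return "$status"
}
```

For example, run preflight /workspace inside the container and continue to start.sh only if it returns 0.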
cd /workspace
bash start.sh <PATH_TO_CODE> <PATH_TO_PAPER>