Agent for AI Research

This project reproduces the iterative agent framework introduced in the PaperBench paper. It provides a practical implementation that researchers and developers can explore and extend.

Installation

git clone https://github.com/Just-Curieous/inspect-agent.git
cd inspect-agent/
docker build --platform=linux/amd64 -t pb-env -f Dockerfile.base .

Run the agent on your research tasks

Build the Docker image (see Installation).
Copy env.sh.example to env.sh.
Replace the system prompt in instructions.txt with your own.
Run the Inspect AI agent with your code base and questions:

python entry_point.py --research_task <path_to_research_paper> --code_repo_path <path_to_code_repo> --inspect_path $(pwd)

For example:

python entry_point.py --research_task /home/ubuntu/Benchmark-Construction/logs/neurips2024/95262.json --code_repo_path /home/ubuntu/Benchmark-Construction/logs/neurips2024/MoE-Jetpack --inspect_path /home/ubuntu/inspect-agent
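If you launch many tasks, it can help to check the paths before calling entry_point.py. The following is a minimal sketch, not part of this repository: the helper name build_command is hypothetical, and it only assumes the three flags shown above (--research_task, --code_repo_path, --inspect_path) and that the research task is an existing file or directory.

```python
from pathlib import Path
from typing import List


def build_command(research_task: str, code_repo_path: str, inspect_path: str) -> List[str]:
    """Validate the three paths and return the argv list for entry_point.py.

    Hypothetical helper: only the flag names come from the README.
    """
    task = Path(research_task)
    repo = Path(code_repo_path)
    inspect_dir = Path(inspect_path)

    # Fail early with a clear message instead of failing inside the agent.
    if not task.exists():
        raise FileNotFoundError(f"research task not found: {task}")
    if not repo.is_dir():
        raise NotADirectoryError(f"code repo is not a directory: {repo}")
    if not inspect_dir.is_dir():
        raise NotADirectoryError(f"inspect path is not a directory: {inspect_dir}")

    return [
        "python", "entry_point.py",
        "--research_task", str(task),
        "--code_repo_path", str(repo),
        "--inspect_path", str(inspect_dir),
    ]
```

The returned list can be passed to subprocess.run(cmd, cwd=inspect_path, check=True) from the standard library, which runs the command from the inspect-agent directory and raises an error if it exits with a non-zero status.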

Manual Setup

Setup experiment

cd inspect-agent/
docker run -it --name my-pb-env -v $(pwd):/workspace -v /:/all pb-env
docker exec -it my-pb-env bash

Copy env.sh.example to env.sh and configure:

Your model and API key.
The directories containing your code and your paper/questions.

Remember to:

Put your system prompt in instructions.txt.
Put your code repo under $CODE_DIR.
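The checklist above can be verified with a short pre-flight script before starting the agent. This is a sketch under stated assumptions, not part of this repository: the function name preflight is hypothetical, it assumes env.sh and instructions.txt sit in the workspace directory, and it checks only CODE_DIR because that is the only variable name this README mentions. The variables for the model and API key are defined in env.sh.example and are not checked here.

```python
import os
from pathlib import Path
from typing import List, Mapping, Optional


def preflight(workspace: str, env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return a list of setup problems; an empty list means the checks passed.

    Hypothetical helper: file locations are assumed to be relative to `workspace`.
    """
    env = os.environ if env is None else env
    root = Path(workspace)
    problems: List[str] = []

    # env.sh should have been copied from env.sh.example.
    if not (root / "env.sh").is_file():
        problems.append("env.sh is missing; copy env.sh.example to env.sh")

    # The system prompt lives in instructions.txt and should not be empty.
    prompt = root / "instructions.txt"
    if not prompt.is_file() or prompt.stat().st_size == 0:
        problems.append("instructions.txt is missing or empty")

    # The code repo is expected under $CODE_DIR.
    code_dir = env.get("CODE_DIR", "")
    if not code_dir:
        problems.append("CODE_DIR is not set")
    elif not Path(code_dir).is_dir():
        problems.append(f"CODE_DIR does not point to a directory: {code_dir}")

    return problems
```

Calling preflight("/workspace") inside the container and printing each returned message gives a quick summary of what still needs to be configured.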

Start the agent

cd /workspace 
bash start.sh <PATH_TO_CODE> <PATH_TO_PAPER>
