Awesome-List Researcher

A Docker-based tool that automatically finds and suggests new resources for GitHub Awesome Lists.

Overview

This tool ingests any GitHub Awesome-style repository, parses its README.md, and uses OpenAI's agents to discover new, non-duplicate, spec-compliant resources that can be added to the list. The result is an updated Markdown file that passes awesome-lint checks.
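To make the "new, non-duplicate" requirement concrete, here is a minimal sketch of URL-based deduplication. It is not taken from the tool's source: the function names `normalize_url` and `filter_new_links`, and the normalization rules (ignore scheme, `www.`, letter case of the host, trailing slash, and query string), are assumptions chosen for illustration.

```python
from urllib.parse import urlsplit


def normalize_url(url: str) -> str:
    """Build a comparison key: lowercase host without 'www.', path without trailing slash.

    Hypothetical helper; scheme, query string, and fragment are deliberately ignored.
    """
    parts = urlsplit(url.strip())
    host = parts.netloc.lower().removeprefix("www.")
    path = parts.path.rstrip("/")
    return f"{host}{path}"


def filter_new_links(existing: list[str], candidates: list[str]) -> list[str]:
    """Keep HTTPS candidates whose normalized URL is not already present."""
    seen = {normalize_url(u) for u in existing}
    fresh = []
    for url in candidates:
        key = normalize_url(url)
        # Skip non-HTTPS links and anything already listed (or already accepted).
        if urlsplit(url.strip()).scheme != "https" or key in seen:
            continue
        seen.add(key)
        fresh.append(url)
    return fresh
```

A real implementation would likely also compare resource titles and follow redirects, which this sketch does not attempt.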

Features

Fully containerized - runs completely inside Docker
Enforces cost and time constraints
Configurable OpenAI model selection
Deduplication of existing resources
Validation of new resources (HTTPS links, URL reachability, etc.)
Structured logging of all operations
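The cost and time constraints listed above could be enforced with a small guard object that the agents consult between API calls. The sketch below is an assumption about how such a check might look, not the tool's actual code; the class name `BudgetGuard` and its methods are hypothetical, and only the two limits (`--wall_time` in seconds, `--cost_ceiling` in USD) come from this README.

```python
import time


class BudgetGuard:
    """Track elapsed wall time and accumulated API cost against fixed limits."""

    def __init__(self, wall_time: float, cost_ceiling: float, clock=time.monotonic):
        self.wall_time = wall_time        # seconds, as in --wall_time
        self.cost_ceiling = cost_ceiling  # USD, as in --cost_ceiling
        self._clock = clock               # injectable so tests can fake time
        self._start = clock()
        self.spent = 0.0

    def record(self, cost_usd: float) -> None:
        """Add the estimated cost of one API call to the running total."""
        self.spent += cost_usd

    def exceeded(self) -> bool:
        """Return True once either limit has been reached."""
        elapsed = self._clock() - self._start
        return elapsed >= self.wall_time or self.spent >= self.cost_ceiling
```

A caller would check `guard.exceeded()` before starting each new research task and stop scheduling work once it returns True.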

Requirements

Docker
OpenAI API key

Usage

# Build and run the tool
./build-and-run.sh --repo_url https://github.com/username/awesome-repo \
  --wall_time 600 \
  --cost_ceiling 10.00 \
  --output_dir runs/ \
  --model_planner gpt-4.1 \
  --model_researcher o3 \
  --model_validator o3

Parameters

| Parameter | Description | Default |
| --- | --- | --- |
| `--repo_url` | GitHub URL of the Awesome list (required) | - |
| `--wall_time` | Maximum execution time in seconds | 600 |
| `--cost_ceiling` | Maximum OpenAI API cost in USD | 10.00 |
| `--output_dir` | Directory for output artifacts | runs/ |
| `--seed` | Random seed for reproducible runs | chosen at random |
| `--model_planner` | Model for planning research queries | gpt-4.1 |
| `--model_researcher` | Model for researching new resources | o3 |
| `--model_validator` | Model for validating new resources | o3 |
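For readers who want to see the parameters and defaults in code form, the following is a sketch of an `argparse` parser that mirrors the table. How the tool actually parses its arguments is not stated in this README, so the function name `build_parser`, the argument types, and the use of `None` to mean "pick a random seed" are all assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser reproducing the documented flags and defaults."""
    parser = argparse.ArgumentParser(prog="awesome-list-researcher")
    parser.add_argument("--repo_url", required=True,
                        help="GitHub URL of the Awesome list")
    parser.add_argument("--wall_time", type=int, default=600,
                        help="Maximum execution time in seconds")
    parser.add_argument("--cost_ceiling", type=float, default=10.00,
                        help="Maximum OpenAI API cost in USD")
    parser.add_argument("--output_dir", default="runs/",
                        help="Directory for output artifacts")
    # None stands in for "choose a seed at random" (an assumption of this sketch).
    parser.add_argument("--seed", type=int, default=None,
                        help="Random seed for reproducible runs")
    parser.add_argument("--model_planner", default="gpt-4.1")
    parser.add_argument("--model_researcher", default="o3")
    parser.add_argument("--model_validator", default="o3")
    return parser
```

Calling `build_parser().parse_args(["--repo_url", "https://github.com/username/awesome-repo"])` yields a namespace with every other option at its documented default.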

Environment Variables

OPENAI_API_KEY (required): Your OpenAI API key

Output

All outputs are saved under <output_dir>/<ISO-TIMESTAMP>/ (runs/<ISO-TIMESTAMP>/ by default) with the following artifacts:

original.json: Parsed content of the original README
plan.json: Research plan generated by the planner agent
candidate_*.json: Candidate resources found by research agents
new_links.json: Validated new resources after deduplication
updated_list.md: Final updated Markdown list
agent.log: Detailed log of all operations
research_report.md: Summary of the research process
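After a run, it can be useful to confirm that the directory contains every artifact listed above. The helper below is a small sketch for that purpose and is not part of the tool; the name `missing_artifacts` is hypothetical, and it checks only that the files exist, not their contents.

```python
from pathlib import Path

# Fixed file names taken from the artifact list in this README.
FIXED_ARTIFACTS = [
    "original.json",
    "plan.json",
    "new_links.json",
    "updated_list.md",
    "agent.log",
    "research_report.md",
]


def missing_artifacts(run_dir: Path) -> list[str]:
    """Return the names of expected artifacts that are absent from run_dir."""
    missing = [name for name in FIXED_ARTIFACTS if not (run_dir / name).is_file()]
    # candidate_*.json is a pattern: at least one matching file is expected.
    if not any(run_dir.glob("candidate_*.json")):
        missing.append("candidate_*.json")
    return missing
```

For example, `missing_artifacts(Path("runs") / "<ISO-TIMESTAMP>")` returns an empty list for a complete run directory, with the placeholder replaced by the actual timestamped folder name.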

Model Selection Strategy

The tool uses different models for different tasks to optimize for cost and quality:

Planner Agent: gpt-4.1 - Deep reasoning to create high-quality search queries
Category Researcher: o3 - Cost-effective option for a large volume of research tasks
Validator: o3 - Lightweight description cleanup and validation

Running Tests

To run the end-to-end test:

# Make sure the OPENAI_API_KEY is set
export OPENAI_API_KEY=your_api_key_here
# Run the test
./tests/run_e2e.sh

Add the --keep flag to preserve test outputs:

./tests/run_e2e.sh --keep

License

MIT

krzemienski/awesome-researcher
