A Google-free, local-friendly reimplementation of the MLE-STAR multi‑agent ML engineering pipeline.
Uses OpenAI‑compatible LLM APIs (OpenRouter; planned local Ollama) and DuckDuckGo search.
Not affiliated with the original authors or Google.
- Multi-agent stages: initialization → refinement → ensembling → submission
- OpenRouter LLM backend (free-tier; rate limits apply)
- Deferred local LLM (Ollama) adapter (see Archive.md for rationale)
- DuckDuckGo search (no API key)
- Automated basic leakage / data usage checks
- Minimal runner for low token usage
- Kaggle submission formatting helper
For extended design rationale, roadmap, and historical notes see Archive.md
.
-
Prerequisites
- Python 3.12+ - Git - (Optional) Poetry
-
Clone & env (Conda)
# Clone
git clone https://github.com/<yourname>/mle-star-open.git
cd mle-star-open
# Create conda env (Python 3.12+)
conda create -n mle-star-open python=3.12 -y
conda activate mle-star-open
# (Optional) Faster resolver upgrade
python -m pip install --upgrade pip
# Install deps
pip install -r requirements.txt
- Configure
.env
OPENROUTER_API_KEY=sk-...
ROOT_AGENT_MODEL=openai/gpt-oss-20b:free
# Optional overrides:
# MAX_AGENT_STEPS=4
# TEMPERATURE=0.2
See machine_learning_engineering/shared_libraries/config.py
(DefaultConfig
) for all keys.
Create a task directory (default path expected under machine_learning_engineering/tasks/
):
machine_learning_engineering/tasks/
california-housing-prices/
task_description.txt
train.csv
test.csv # if producing submission
Minimal task_description.txt
example:
task_name: california-housing-prices
target: median_house_value
id_column: id
metric: rmse
Ensure train.csv
includes the target column; test.csv
omits it (plus an id
column if required for submission).
Full pipeline (all agents):
python .\scripts\run_pipeline.py --task-name california-housing-prices
Minimal (fewer LLM calls):
python .\scripts\run_task.py --task-dir .\machine_learning_engineering\tasks\california-housing-prices
Create Kaggle-style submission (expects prediction file in workspace run folder):
python .\scripts\make_submission.py --output-dir .\machine_learning_engineering\workspace\california-housing-prices\1\output
Example outputs:
machine_learning_engineering/workspace/<task>/<run_id>/
init/
refine/
ensemble/
output/predictions.csv
logs/
Edit config.py
or set environment variables (env vars override defaults). Examples:
$env:MAX_AGENT_STEPS=3
$env:ROOT_AGENT_MODEL="openai/gpt-oss-20b:free"
pytest
Scenario and pipeline tests live under eval/
and tests/
.
Concise live list—historical or deprecated items move to Archive.md
.
- Not sized for official MLE-Bench specs (36 vCPUs / 440GB RAM / A10 24GB); local experimentation focus.
- OpenRouter free-tier ~50 requests/day (subject to change).
- Ollama adapter deferred pending provider + eval stabilization (see Archive.md decision note).
- Test suite may lag new features.
Kaggle error: missing id column? Ensure id_column
in task_description.txt
and that CSVs include it, or run make_submission.py
to add one.
Hit rate limits? Wait for reset, reduce steps (MAX_AGENT_STEPS
), use minimal runner, or upgrade plan.
Force CPU? Leave CUDA_VISIBLE_DEVICES
empty when invoking Python.
Change model? Set ROOT_AGENT_MODEL
in .env
or as an environment variable before running.
Original paper: MLE-STAR: Machine Learning Engineering Agent via Search and Targeted Refinement
https://arxiv.org/abs/2506.15692
MIT (see LICENSE). Project is independent; no affiliation with the original authors or Google.