Curie is the first AI-agent framework designed for automated and rigorous scientific experimentation. Curie helps answer your curiosity through end-to-end experimentation automation, ensuring that every step—from hypothesis formulation to result interpretation—is conducted with precision, reliability, and reproducibility. Our mission is to empower scientists to move research at the speed of thought.
Curie’s Role in the Scientific Research Lifecycle
- [2025/06] We published EXP-Bench: Can AI Conduct AI Research Experiments? → 📄 Paper | 🗂️ Dataset | 📰 Blog
- [2025/05] We launched an AutoML feature to help researchers find the optimal ML solution → 📢 Blog
- [2025/02] We published Curie: Toward Rigorous and Automated Scientific Experimentation with AI Agents → 📄 Paper | 📰 Blog
- 🚀 Automated Experimentation – Covers hypothesis formulation, experiment implementation, experiment execution, result analysis, and reflection on findings.
- 📊 Rigor Enhancement - Built-in verification modules enforce methodical procedure, agent reliability and reproducibility.
- 🔬 Broad Applicability – Supports ML Engineering, system analysis, and scientific discovery.
- 💻 Use Your Starter Code – Works with any user-provided starter code.
- 📂 Bring Your Own Dataset – Works with any user-provided dataset.
- 🧾 Automatic, Insightful Reporting - See a sample report here
Prerequisite: Install Docker from here
sudo chmod 666 /var/run/docker.sock
docker ps # Verify Docker installation
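The same check can be scripted before launching an experiment. The following is a minimal sketch, not part of Curie: `docker_available` is a hypothetical helper that reports whether the `docker` CLI is on the PATH and whether `docker ps` succeeds for the current user.

```python
import shutil
import subprocess


def docker_available(cmd: str = "docker", timeout: float = 10.0) -> bool:
    """Hypothetical helper: True if `<cmd> ps` succeeds for the current user."""
    if shutil.which(cmd) is None:
        return False  # CLI is not on PATH
    try:
        proc = subprocess.run(
            [cmd, "ps"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False  # CLI could not be started or the daemon did not answer in time
    return proc.returncode == 0


if __name__ == "__main__":
    print("Docker reachable:", docker_available())
```

A `False` result usually means Docker is not installed, the daemon is not running, or the current user lacks permission on the Docker socket.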
Option 1: Install via pip
pip install curie-ai
Option 2: Manual Installation for Developers
- It's recommended to use tmux or a similar terminal multiplexer before running Curie, as experiments can take several minutes depending on the task and budget.
- Do not use Jupyter Notebook.
👩🎓: I want to understand sorting algorithm efficiency.
import curie
# Set up your API keys, refer to curie/setup/env.sh.example
key_dict = {
    "MODEL": "claude-3-7-sonnet-20250219",
    "ANTHROPIC_API_KEY": "your-anthropic-key"
}
result = curie.experiment(
    api_keys=key_dict,
    question="How does the choice of sorting algorithm impact runtime performance across different input distributions?",
    max_global_steps=10,
)
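To avoid hardcoding the key in a script, the dictionary can be assembled from environment variables. This is a sketch under assumptions: `build_key_dict` is a hypothetical helper, not a Curie function. Only the `MODEL` and `ANTHROPIC_API_KEY` entries come from the example above.

```python
import os


def build_key_dict(model="claude-3-7-sonnet-20250219", env=None):
    """Hypothetical helper: build the api_keys dict from environment variables."""
    env = os.environ if env is None else env
    api_key = env.get("ANTHROPIC_API_KEY")
    if not api_key:
        raise RuntimeError("ANTHROPIC_API_KEY is not set in the environment")
    return {"MODEL": model, "ANTHROPIC_API_KEY": api_key}


# Intended use (requires curie-ai and an exported ANTHROPIC_API_KEY):
#   key_dict = build_key_dict()
#   result = curie.experiment(api_keys=key_dict, question="...", max_global_steps=10)
```

Failing early on a missing key avoids starting an experiment that cannot reach the model.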
- 🧾 Auto-Generated Experiment Report: logs/research_<ID>.md
- 📊 Experiment Result Notebook: logs/research_*_all_results.txt
- 🪵 The Experimentation Process (generated scripts and code to reproduce experiment results): workspace/research_<ID>/
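Once a run finishes, the newest report can be located programmatically. The sketch below assumes reports follow the `logs/research_<ID>.md` naming listed above and that file modification time reflects run order. `latest_report` is a hypothetical helper, not part of Curie.

```python
from pathlib import Path


def latest_report(log_dir="logs"):
    """Hypothetical helper: newest research_*.md file in log_dir, or None."""
    reports = sorted(
        Path(log_dir).glob("research_*.md"),
        key=lambda p: p.stat().st_mtime,  # oldest first, newest last
    )
    return reports[-1] if reports else None


if __name__ == "__main__":
    report = latest_report()
    print(report if report is not None else "No report found in logs/")
```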
👩🎓: I want to find the most robust ML methods for my noisy data.
result = curie.experiment(
    api_keys=key_dict,
    question="Are ensemble methods (e.g., Random Forests, Gradient Boosting) more robust to added noise in the Breast Cancer Wisconsin dataset compared to linear models like Logistic Regression for a binary classification task?",
)
👨🎓: I have a dataset and some starter code, and I want to train/deploy ML models to achieve specific goals. (GPU is recommended for ML training tasks.)
result = curie.experiment(
    api_keys=key_dict,
    question="E.g. How to improve my prediction accuracy on my dataset.",
    dataset_dir="/abs/path/to/your/dataset",
    codebase_dir="[Optional] /abs/path/to/your/code",
    env_requirements="[Optional] /abs/path/to/requirements.txt",
)
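The placeholders above suggest that paths should be absolute. A small pre-flight check can catch typos before an experiment starts. This is a sketch: `require_abs_path` is a hypothetical helper, and whether Curie itself rejects relative paths is an assumption, not something stated here.

```python
from pathlib import Path


def require_abs_path(path, kind="dir"):
    """Hypothetical helper: return path as str if it is absolute and exists as kind."""
    p = Path(path)
    if not p.is_absolute():
        raise ValueError(f"expected an absolute path, got: {path}")
    exists = p.is_dir() if kind == "dir" else p.is_file()
    if not exists:
        raise FileNotFoundError(f"{kind} not found: {path}")
    return str(p)


# Intended use (paths are placeholders):
#   dataset_dir = require_abs_path("/abs/path/to/your/dataset")
#   env_requirements = require_abs_path("/abs/path/to/requirements.txt", kind="file")
```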
Curie AutoML Feature Overview.
- Check out how Curie finds optimal ML solutions in these examples from MLE-Bench.
- Sample Curie-Generated Experiment Report:
Check out more Machine Learning Use Cases here.
Curie Overview & Demo.
If you use Curie in a research paper, please cite our work:
@article{kon2025expbenchaiconductai,
title={EXP-Bench: Can AI Conduct AI Research Experiments?},
author={Patrick Tser Jern Kon and Jiachen Liu and Xinyi Zhu and Qiuyi Ding and Jingjia Peng and Jiarong Xing and Yibo Huang and Yiming Qiu and Jayanth Srinivasa and Myungjin Lee and Mosharaf Chowdhury and Matei Zaharia and Ang Chen},
journal={arXiv preprint arXiv:2505.24785},
year={2025},
}
@article{kon2025curie,
title={Curie: Toward rigorous and automated scientific experimentation with ai agents},
author={Kon, Patrick Tser Jern and Liu, Jiachen and Ding, Qiuyi and Qiu, Yiming and Yang, Zhenning and Huang, Yibo and Srinivasa, Jayanth and Lee, Myungjin and Chowdhury, Mosharaf and Chen, Ang},
journal={arXiv preprint arXiv:2502.16069},
year={2025}
}
- GitHub Issues - Report bugs or request features
- Schedule a Meeting with Us - Get help from our team
- Join our Slack Community
This project is licensed under the Apache 2.0 License - see the LICENSE file for details.