    Repositories list

    • ldp

      Public
      Framework enabling modular interchange of language agents, environments, and optimizers
      Python
      1310349 Updated Aug 20, 2025
    • aviary

      Public
      A language agent gym with challenging scientific tasks
      Python
      2520084 Updated Aug 20, 2025
    • paper-qa

      Public
      High accuracy RAG for answering questions from scientific documents with citations
      Python
      7667.6k1244 Updated Aug 20, 2025
    • Documentation and tutorials for the FutureHouse platform API
      1010 Updated Aug 4, 2025
    • LAB-Bench

      Public
      Evaluation dataset for AI systems intended to benchmark capabilities foundational to scientific research in biology
      Python
      98150 Updated Jul 18, 2025
    • ether0

      Public
      A scientific reasoning model, dataset, and reward functions for chemistry.
      Python
      1312241 Updated Jun 25, 2025
    • robin

      Public
      Robin: A multi-agent system for automating scientific discovery
      Python
      2921531 Updated Jun 24, 2025
    • An aviary-based data science agent based on Jupyter notebooks
      HTML
      103500 Updated Jun 13, 2025
    • BixBench

      Public
      Benchmark for LLM-based Agents in Computational Biology
      Python
      94920 Updated Jun 12, 2025
    • trl

      Public
      FutureHouse fork of trl
      Python
      2.1k105 Updated Mar 12, 2025
    • llm-client

      Public archive
      Central LLM client for use by Aviary and PaperQA
      Python
      0200 Updated Feb 23, 2025
    • LitQA

      Public archive
      LitQA Eval: A difficult set of scientific questions that require context of full-text research papers to answer
      Python
      54120 Updated Dec 18, 2024
    • WikiCrow

      Public
      23700 Updated Oct 24, 2024
    • SWE-bench

      Public
      Fork of upstream
      Python
      588000 Updated Jul 24, 2024