Pinned

  1. OLMo Public

    Modeling, training, eval, and inference code for OLMo

    Python 6k 656

  2. dolma Public

    Data and tools for generating and inspecting OLMo pre-training data.

    Python 1.3k 151

  3. ai2thor Public

    An open-source platform for Visual AI.

    C# 1.5k 257

  4. olmocr Public

    Toolkit for linearizing PDFs for LLM datasets/training

    Python 14.1k 1k

  5. OLMoE Public

    OLMoE: Open Mixture-of-Experts Language Models

    Jupyter Notebook 866 81

Repositories

Showing 10 of 525 repositories
  • olmo-cookbook Public

    OLMost every training recipe you need to perform data interventions with the OLMo family of models.

    Python 48 Apache-2.0 9 1 30 Updated Sep 19, 2025
  • open-instruct Public

    AllenAI's post-training codebase

    Python 3,196 Apache-2.0 440 16 29 Updated Sep 19, 2025
  • dolma Public

    Data and tools for generating and inspecting OLMo pre-training data.

    Python 1,314 Apache-2.0 151 7 17 Updated Sep 19, 2025
  • OLMo-core Public

    PyTorch building blocks for the OLMo ecosystem

    Python 292 Apache-2.0 55 1 34 Updated Sep 19, 2025
  • safety-eval Public

    A simple evaluation of generative language models and safety classifiers.

    Python 64 18 0 2 Updated Sep 18, 2025
  • rslearn Public

    A tool for developing remote sensing datasets and models.

    Python 41 Apache-2.0 6 16 6 Updated Sep 19, 2025
  • olmocr Public

    Toolkit for linearizing PDFs for LLM datasets/training

    Python 14,106 Apache-2.0 1,049 20 8 Updated Sep 18, 2025
  • rslearn_projects
    Python 12 Apache-2.0 2 16 14 Updated Sep 18, 2025
  • scispacy Public

    A full spaCy pipeline and models for scientific/biomedical documents.

    Python 1,875 Apache-2.0 245 35 (3 issues need help) 4 Updated Sep 17, 2025
  • ai2thor Public

    An open-source platform for Visual AI.

    C# 1,509 Apache-2.0 257 268 5 Updated Sep 17, 2025