    Repositories list

    • Container-based software environments used in the TrustLLM EU project.
      Shell
      3301 Updated Oct 29, 2025
    • A native PyTorch Library for large model training
      Python
      590100 Updated Oct 22, 2025
    • streaming

      Public
      A Data Streaming Library for Efficient Neural Network Training
      Python
      176000 Updated Jan 22, 2025
    • LLM training code for Databricks foundation models
      Python
      575000 Updated Jan 21, 2025
    • composer

      Public
      Supercharge Your Model Training
      Python
      455000 Updated Jan 21, 2025
    • Python
      216000 Updated Jan 21, 2025
    • datatrove

      Public
      Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
      Python
      230000 Updated Dec 17, 2024
    • Ongoing research training transformer models at scale
      Python
      3.2k000 Updated Aug 22, 2024
    • NeMo

      Public
      A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
      Python
      3.2k000 Updated Aug 21, 2024