
Sarat Kannan

I build scalable LLM training/inference systems across distributed multi-GPU setups (PyTorch DDP/FSDP, DeepSpeed ZeRO, Ray, MosaicML LLM Foundry) and decentralised swarms (Hivemind, Petals), and I write the experiments up so others can reproduce them.

Selected work

Writing

Medium: https://medium.com/@kannansarat9

Contact

DMs open: https://x.com/saratkannan

Pinned repositories

  1. multigpu-llm-finetuning

    Hands-on projects using distributed multi-GPU training to fine-tune large language models (LLMs).

    Python

  2. hivemind-qwen2-0.5b

    Internet-scale data-parallel fine-tuning of Qwen2-0.5B-Instruct using Hivemind + TorchTune. The initial peer runs on a public IP; additional peers join from free GPUs (e.g., Kaggle). See the minimal joining sketch after this list.

  3. petals-llama2-70b

    Decentralised inference and prompt tuning of LLaMA-2-70B with Petals (swarm model parallelism). Meta-repo with a summary and links to the two write-ups. See the minimal inference sketch after this list.

  4. hivemind-modified-for-torchtune

    Forked from learning-at-home/hivemind

    Decentralized deep learning in PyTorch, built to train models on thousands of volunteers across the world.

    Python

  5. torchtune

    Forked from pytorch/torchtune

    PyTorch-native fine-tuning library

    Python
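
Below is a minimal sketch of how a second peer might join the collaborative run behind hivemind-qwen2-0.5b, written against the public Hivemind API with a plain PyTorch loop standing in for the TorchTune recipe. The run id, batch sizes, model stub, and initial-peer multiaddress are illustrative placeholders, not values from the repo.

```python
# Minimal Hivemind joining sketch (illustrative placeholders throughout).
import torch
import hivemind

# Connect to the swarm via the initial peer's public multiaddress (placeholder).
dht = hivemind.DHT(
    initial_peers=["/ip4/203.0.113.1/tcp/31337/p2p/QmInitialPeerID"],
    start=True,
)

model = torch.nn.Linear(512, 512)  # stand-in for the Qwen2-0.5B model
opt = torch.optim.AdamW(model.parameters(), lr=1e-5)

# Wrap the local optimizer so gradients are averaged across peers once the
# swarm has jointly accumulated `target_batch_size` samples.
opt = hivemind.Optimizer(
    dht=dht,
    run_id="qwen2_finetune",   # peers sharing the same run_id train together
    optimizer=opt,
    batch_size_per_step=8,     # samples this peer contributes per local step
    target_batch_size=4096,    # global batch size per collaborative update
    use_local_updates=False,
    verbose=True,
)

for _ in range(100):           # stand-in for the real dataloader loop
    loss = model(torch.randn(8, 512)).pow(2).mean()
    loss.backward()
    opt.step()
    opt.zero_grad()
```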
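
And a similarly minimal sketch of the swarm inference described in petals-llama2-70b, using the public Petals client API. The model id and prompt are illustrative, and loading the gated LLaMA-2 weights requires approved Hugging Face access.

```python
# Minimal Petals inference sketch (illustrative): the client keeps only the
# embeddings and LM head locally; transformer blocks run on swarm peers.
from transformers import AutoTokenizer
from petals import AutoDistributedModelForCausalLM

model_name = "meta-llama/Llama-2-70b-chat-hf"  # gated model, requires HF access
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoDistributedModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("Decentralised inference means ", return_tensors="pt")["input_ids"]
outputs = model.generate(inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0]))
```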