I’m a Data Scientist with some industry experience. These are the projects and experiments I do in my free time, purely out of interest. For more professional (industry) info, please refer to my LinkedIn profile.
I’m particularly interested in computer vision and multimodal systems that understand the world by looking at it. I love understanding these systems from the inside out: instead of just using models, I rebuild them to learn how they actually work.
Most of my recent work explores self-supervision, pose estimation, image segmentation, and retrieval-augmented generation. I enjoy turning research into fully working systems—modular, testable, and often local-first (or on some small cloud GPU).
- Title: Data Scientist
- Philosophy: I’m all for ML models that can help us unravel the mysteries of the universe (not just generate adorable cat images)
- Interests: Self-supervised learning, multimodal systems, human-centric AI, or anything that helps us understand how machines see the world (or what they actually perceive)
- Rebuilding ML models from scratch (ViTs, DINO, UNet, etc.)
- Designing data pipelines that scale well and are easy to debug
- Self-supervised learning and training loops
- Fine-tuning large models with small data
- Keeping code readable, minimal, and modular
- Research work (digging up the right papers for the problem at hand)
- Multi-modal learning (vision + text + audio)
- Efficient training on low-resource hardware
- Generative modeling (audio, diffusion, NeRFs)
- Inference optimization (TensorRT, quantization)
- Better unit testing and CI/CD practices for ML pipelines
- Currently obsessed with geometric computer vision and want to get better at it
- Being less of a bozo and writing better code than prose ;)
A full reimplementation of DINO (Self-Distillation with No Labels) using PyTorch and Vision Transformers. The project focuses on contrast-free self-supervised learning using momentum encoders, centering, sharpening, and multi-view alignment.
Repo
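For flavour, here is a minimal sketch of the core objective (temperatures, momentum values, and function names are illustrative, not the repo's exact code): the teacher's outputs are centered and sharpened, the student is trained to match them with cross-entropy, and the teacher itself is a momentum (EMA) copy of the student.

```python
import torch
import torch.nn.functional as F

def dino_loss(student_out, teacher_out, center, tau_s=0.1, tau_t=0.04):
    # Cross-entropy between the sharpened teacher distribution and the student.
    # Centering + sharpening of the teacher is what prevents collapse.
    t = F.softmax((teacher_out - center) / tau_t, dim=-1).detach()
    log_s = F.log_softmax(student_out / tau_s, dim=-1)
    return -(t * log_s).sum(dim=-1).mean()

@torch.no_grad()
def update_center(center, teacher_out, momentum=0.9):
    # EMA of the teacher's batch mean, subtracted from its logits above.
    return momentum * center + (1 - momentum) * teacher_out.mean(dim=0)

@torch.no_grad()
def update_teacher(teacher, student, momentum=0.996):
    # Momentum encoder: the teacher is an exponential moving average of the student.
    for p_t, p_s in zip(teacher.parameters(), student.parameters()):
        p_t.mul_(momentum).add_(p_s, alpha=1 - momentum)
```

In the multi-view setup this loss is averaged over teacher/student view pairs (skipping a view paired with itself), which gives the multi-view alignment mentioned above.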
A transformer-based system that lifts 2D pose keypoints to 3D, inspired by MotionBERT. Includes full training pipeline, temporal modeling, and 3D visualization of predictions.
Repo
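As a rough illustration of the lifting idea (dimensions and the joint layout are placeholders, not the repo's architecture): each frame's 2D keypoints become one token, a transformer encoder models the temporal context, and a linear head regresses 3D coordinates.

```python
import torch
import torch.nn as nn

class PoseLifter(nn.Module):
    """Toy 2D-to-3D lifter: (batch, frames, joints, 2) -> (batch, frames, joints, 3)."""
    def __init__(self, num_joints=17, dim=256, depth=4, heads=8, max_frames=243):
        super().__init__()
        self.embed = nn.Linear(num_joints * 2, dim)               # one token per frame
        self.pos = nn.Parameter(torch.zeros(1, max_frames, dim))  # learned temporal positions
        layer = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)
        self.head = nn.Linear(dim, num_joints * 3)

    def forward(self, kp2d):
        b, t, j, _ = kp2d.shape
        x = self.embed(kp2d.flatten(2)) + self.pos[:, :t]         # temporal modeling
        return self.head(self.encoder(x)).view(b, t, j, 3)

# usage: lift a 27-frame clip of 17-joint (COCO-style) 2D poses
lifter = PoseLifter()
pred_3d = lifter(torch.randn(2, 27, 17, 2))   # -> (2, 27, 17, 3)
```

MotionBERT itself factorizes attention over the spatial (joints) and temporal (frames) axes; the sketch collapses that into a single per-frame token purely to keep the example short.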
Fine-tuned YOLO-NAS (via SuperGradients) on an Indian street sign dataset. Includes support for custom annotations, training scripts, and inference setup.
Repo
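Inference with the fine-tuned model looks roughly like this; the paths, class count, and confidence threshold below are placeholders, and the calls assume the SuperGradients `models.get` / `predict` API rather than the repo's exact scripts.

```python
from super_gradients.training import models

NUM_CLASSES = 15  # assumption: number of street-sign classes in the custom dataset

# Load the fine-tuned checkpoint produced by the training scripts (hypothetical path)
model = models.get(
    "yolo_nas_s",
    num_classes=NUM_CLASSES,
    checkpoint_path="checkpoints/street_signs/ckpt_best.pth",
)

# Run detection on a single image and visualize the boxes
model.predict("samples/street_scene.jpg", conf=0.4).show()
```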
Custom-built 3D U-Net for volumetric CT/MRI segmentation. Includes patch-wise training, organ-level labeling, and volumetric visualization.
Repo
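The patch-wise part is the interesting bit for memory: full CT/MRI volumes rarely fit on a GPU, so training runs on overlapping sub-volumes. A simplified version of the cropping logic (sizes and strides are illustrative, not the repo's values):

```python
import torch

def extract_patches(volume, patch=(64, 64, 64), stride=(32, 32, 32)):
    """Slide a 3D window over a (D, H, W) volume and return overlapping patches.

    Overlap (stride < patch) lets per-patch predictions be averaged back
    into the volume, which smooths seams at patch borders.
    """
    patches, coords = [], []
    D, H, W = volume.shape
    for z in range(0, D - patch[0] + 1, stride[0]):
        for y in range(0, H - patch[1] + 1, stride[1]):
            for x in range(0, W - patch[2] + 1, stride[2]):
                patches.append(volume[z:z + patch[0], y:y + patch[1], x:x + patch[2]])
                coords.append((z, y, x))
    return torch.stack(patches), coords

# usage: a 128^3 scan becomes 27 training patches of 64^3
vol = torch.randn(128, 128, 128)
patches, coords = extract_patches(vol)
print(patches.shape)   # torch.Size([27, 64, 64, 64])
```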
Retrieval-Augmented Generation (RAG) pipeline that reads PDFs and answers natural language questions. Built using LangChain, FAISS, and LLMs with a focus on local-first privacy and fast contextual retrieval.
Repo
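The heart of it is dense retrieval over PDF chunks. The repo does this through LangChain's FAISS wrapper; the snippet below is a deliberately library-light stand-in (plain `faiss` plus `sentence-transformers`, with made-up chunk text) showing the same retrieval step.

```python
import faiss
from sentence_transformers import SentenceTransformer

# In practice these come from splitting the parsed PDFs into chunks
chunks = [
    "Payments are due within 30 days of the invoice date.",
    "The warranty covers manufacturing defects for two years.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")
vectors = embedder.encode(chunks, normalize_embeddings=True)

index = faiss.IndexFlatIP(vectors.shape[1])   # inner product == cosine after normalization
index.add(vectors)

query = embedder.encode(["How long is the warranty?"], normalize_embeddings=True)
scores, ids = index.search(query, 1)
context = chunks[ids[0][0]]    # top chunk is stuffed into the LLM prompt as context
print(context)
```

Every piece (index, embeddings, generation) can run locally, which is where the local-first focus comes from.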
NER system built using Huggingface’s RoBERTa model. Includes token classification, training loop, data preprocessing, and clean inference pipeline.
Repo
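A quick way to poke at a token-classification model like this is the `transformers` pipeline; the checkpoint name below is a public RoBERTa NER model used as a stand-in for the repo's own fine-tuned weights.

```python
from transformers import pipeline

ner = pipeline(
    "token-classification",
    model="Jean-Baptiste/roberta-large-ner-english",  # stand-in checkpoint
    aggregation_strategy="simple",  # merge sub-word pieces into whole entities
)

for entity in ner("Hugging Face was founded in New York City."):
    print(entity["entity_group"], entity["word"], round(entity["score"], 3))
```

The repo builds the same pieces by hand: data preprocessing, a token-classification head on RoBERTa, its own training loop, and a clean inference path.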
- Research collaborations in CV, SSL, pose estimation, segmentation
- Contract or freelance roles building full-stack ML systems
- Full-time roles in data science or ML engineering with depth and autonomy
Always open to opportunities and to learning.