Machine Learning Engineer @ DocuSign Applied Science
Currently wrangling terabytes of documents and building NLP systems that actually work in production. Previously made real estate pricing models less wrong and co-founded an AI startup.
Obsessed with large-scale NLP and the beautiful chaos of distributed systems. Love the challenge of taking transformer models from "works on my laptop" to "processes millions of documents without falling over."
Deep in the Spark + Databricks ecosystem, building ETL pipelines that don't make you want to throw your laptop out the window. Comfortable juggling PyTorch transformers, distributed training, and the occasional Rust side project when Python feels too slow.
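A minimal sketch of that pattern, in case you're curious: a Hugging Face classifier wrapped in a Spark pandas UDF. The model name, paths, and `text` column are placeholders for illustration, not anything from a real pipeline.

```python
# Toy sketch: transformer inference inside a Spark pandas UDF.
# Paths, model name, and the `text` column are illustrative placeholders.
import pandas as pd
from pyspark.sql import SparkSession
from pyspark.sql.functions import pandas_udf
from transformers import pipeline

spark = SparkSession.builder.appName("doc-pipeline").getOrCreate()
docs = spark.read.parquet("/data/documents")  # expects a `text` column

@pandas_udf("string")
def classify(texts: pd.Series) -> pd.Series:
    # Loaded per batch to keep the sketch short; a real job would cache
    # the pipeline per executor (or call out to a model-serving endpoint).
    clf = pipeline(
        "text-classification",
        model="distilbert-base-uncased-finetuned-sst-2-english",
    )
    preds = clf(texts.tolist(), truncation=True)
    return pd.Series([p["label"] for p in preds])

docs.withColumn("label", classify("text")).write.mode("overwrite").parquet("/data/labeled")
```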
Have a soft spot for information extraction, document understanding, and making LLMs do useful things beyond writing poetry. Also enjoy the dark arts of MLOps - because someone has to make sure your beautiful model actually runs in production.
- NLP at Scale: Transformer pipelines for document processing, because regex isn't always the answer
- Research: Combined GPT-2 with vision encoders for visual question answering back when that was still novel (code)
- Side Quest: Built a differentiable tensor engine in Rust because why not (carbon); the core idea is sketched below
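If "differentiable tensor engine" sounds mysterious, the heart of it is reverse-mode autodiff. Here's the core trick as a toy scalar version in Python (the real thing is in Rust and operates on tensors; this is purely illustrative):

```python
# Toy reverse-mode autodiff: build a graph while computing forward,
# then walk it backwards to accumulate gradients.
class Value:
    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._grad_fn = None  # pushes self.grad back to parents

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def grad_fn():
            self.grad += out.grad
            other.grad += out.grad
        out._grad_fn = grad_fn
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def grad_fn():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._grad_fn = grad_fn
        return out

    def backward(self):
        # Topological order, then propagate gradients output -> inputs.
        topo, seen = [], set()
        def build(v):
            if v not in seen:
                seen.add(v)
                for p in v._parents:
                    build(p)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            if v._grad_fn:
                v._grad_fn()

# d(x*y + x)/dx = y + 1 = 4, d(x*y + x)/dy = x = 2
x, y = Value(2.0), Value(3.0)
z = x * y + x
z.backward()
print(x.grad, y.grad)  # 4.0 2.0
```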
Stack: Python • PyTorch • Transformers • Spark • Databricks • Kubernetes • Rust • SQL (yes, it counts)
Currently exploring how far we can push transformer architectures before they become sentient 🤖
📧 Always down to chat about NLP, distributed systems, or why your model isn't converging