I’m a seasoned LLM research engineer, ML engineer/data scientist, and AI product manager with training in Applied Data Science at the University of Chicago. I specialize in LLMs, large multimodal models, and AI agents. I bridge academia and industry—as Senior Staff at an AI company and as an AI researcher at UChicago Booth and the Data Science Institute—turning peer-reviewed research into production systems and leading end-to-end AI implementations.
💻 Expertise: AI Research & Large Language Models (LLM)🤖 • Large multimodal models (LMMs)🎵 • Machine Learning & Deep Learning📚 • Full-Stack GenAI Applications💡 • AI Agents🧠
- Paper: AgentNet: Dynamically Graph Structure Selection for LLM-Based Multi-Agent System.
AgentNet is a dynamic, input-driven **multi-agent system (MAS)** that executes over learned communication graphs (CoT, ToT, GoT). Advantage Actor–Critic (A2C) *reinforcement learning* learns a stable distribution over edges, and the base LLM is fine-tuned with LoRA as a graph selector (**LLM-as-judge**) that chooses the best topology per input. The approach achieves **state-of-the-art** (SOTA) performance on structured **reasoning** tasks (Crossword, Game-of-24, MMLU, BBH) and **code generation** (HumanEval), while keeping latency comparable to CoT/ToT-style and static-swarm baselines. (Paper under review at EMNLP.)
- Paper: MediNotes: A Gen AI Framework for Medical Note Generation.
MediNotes is a first-of-its-kind GenAI framework that enhances clinical consultations by automating documentation and providing a healthcare-domain fine-tuned copilot built on retrieval-augmented generation (RAG), LLMs, and ambient listening. I developed and validated the system with clinicians at UChicago Medicine, culminating in two IEEE publications.
- Paper: IntentVCNet: Bridging Spatio-Temporal Gaps for Intention-Oriented Controllable Video Captioning
IntentVCNet fine-tunes InternVL with LLaMA-Factory. It earned second place in the IntentVC Challenge at ACM MM 2025 (Intention-Oriented Controllable Video Captioning), and the work was published as an ACM MM paper.
- mRAG: Multimodal RAG - Paper Q&A System + Evaluation
Traditional RAG systems suffer from long processing times caused by OCR and text chunking, poor recall quality, and reliance on text-only embeddings. To address these issues, I built an embedding, retrieval, and Q&A pipeline on Qwen2.5-VL’s multimodal capabilities, created a synthetic dataset, and evaluated the system using an LLM as the judge.
- NLP Research: Fine-Tuned LLM Embeddings for Business Insights
In collaboration with the University of Chicago Booth School of Business, I developed, fine-tuned, and optimized LLMs to generate high-quality business-domain embeddings enriched with broad general knowledge. These embeddings support the extraction of CEO-level actionable insights for management and financial decision-making.
📂 More projects and papers
- UChicago AI Hackathon 2024
Won 2nd place at the UChicago DSI AI Hackathon 2024 with a RAG medical Q&A chatbot. Built with LangChain for orchestration, PostgreSQL with vector embeddings for hybrid search, Streamlit for the front end, and Google Cloud Vertex AI to fine-tune and host Llama 3-8B, enabling secure access to patient records and general medical question answering.
- Fine-Tuning Llama 3-8B for Structured Math Reasoning
This project fine-tunes Llama 3-8B to express arithmetic questions as structured JSON, then post-processes that output to perform the calculation. It uses QLoRA, Unsloth, and PEFT, which speed up training and reduce the compute required.
- AI Salesman
Built an AI-powered product recommendation system based on RAG with hybrid search, letting customers search products with filters such as price. Implemented with LangChain, LLMs, and pgvector in PostgreSQL to segment product descriptions, generate embeddings, and deliver relevant recommendations.
- Agentic RAG
Built an Agentic RAG workflow with smolagents, wrapping retrieval as an agent tool for dynamic document search. Compared it against standard RAG (embedding + FAISS + LLM) and evaluated with LLM-as-a-Judge.
- Computer Vision (CV) collection
✦ Style Transfer: Implementing style transfer with TensorFlow/Keras
✦ MLflow: Tutorial on using MLflow for experiment tracking
✦ Image Search RAG: Image-based search system using RAG with Qdrant and Streamlit (search images by image input)
✦ Roboflow: Step-by-step guide to annotating images and training a coin-detection model on Roboflow
✦ Aircraft Detection: Training a YOLO model for military aircraft detection and model evaluation
- Reproduced SOTA Research Papers
✦ Stanford Alpaca 7B – dataset curation and instruction tuning of LLaMA to achieve GPT-3.5-comparable performance.
✦ LLaVA – full training workflow to reproduce the multimodal model.
✦ LLaVA + RAG – semi-structured and multimodal retrieval-augmented generation.
✦ NanoGPT – training a GPT model from scratch to understand Transformer internals.
✦ RAFT – combining fine-tuning and RAG for improved retrieval performance.
- Recommendation System
✦ Instacart Market Basket Analysis using PySpark: Developed a scalable market-basket analysis pipeline on Instacart order data using PySpark MLlib’s FPGrowth. Processed millions of transactions to extract frequent itemsets (≥1% support) and generated association rules (≥20% confidence, lift >1.5) for co-purchase recommendations (“customers who bought X also bought Y”).
✦ Collaborative Filtering Recommendation: Used PySpark to load and clean the data, train an ALS model, and generate Top-10 movie recommendations for all users or for a specified subset of users. Also identified the most likely users for a given set of movies, made rating predictions, and evaluated model performance using RMSE.
✦ Two-Tower Recommendation System: Used PySpark and Spark SQL to clean, join, and engineer user–item interaction features at scale. Encoded movie titles with SentenceTransformer and loaded user/item metadata into pandas for downstream processing. Built and trained a Two-Tower neural network in PyTorch that learns user and item embeddings via contrastive loss. Persisted item embeddings in Redis as a vector database and used RedisVL for approximate nearest-neighbor search to return Top-K movie recommendations.
- Useful apps and tools:
- Video_subtitle_generater: Generates subtitles from an audio/video file using OpenAI's Whisper model. Supports multiple languages. I take notes when learning from videos, so it’s handy to have transcripts, and capturing that data is also useful for model training.
- Google Drive Helper: The code I always reach for in projects that involve Google Cloud Platform. Delete files, download them, edit permissions, and transfer ownership in bulk, all in just a few seconds.
- Blockchain apps: Two apps that run smart contracts and blockchain routes to demonstrate key blockchain principles: decentralization, immutability, Proof of Work (PoW), and transparency.
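
For readers unfamiliar with the thresholds quoted in the Instacart market-basket project above (support, confidence, lift), here is a minimal plain-Python sketch of how those three metrics are computed for item pairs. It is an illustration only, not the project's PySpark FPGrowth pipeline: the baskets are made up, and the names `support` and `pair_rules` are hypothetical helpers introduced for this example. The default thresholds mirror the ones listed in the project description.

```python
from itertools import combinations

# Toy baskets, invented for illustration (not Instacart data).
transactions = [
    {"chips", "salsa"},
    {"chips", "salsa"},
    {"milk", "bread"},
    {"milk", "eggs"},
    {"milk", "bread", "eggs"},
    {"milk"},
    {"bread"},
    {"milk", "chips"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def pair_rules(baskets, min_support=0.01, min_confidence=0.20, min_lift=1.5):
    """Return (antecedent, consequent, support, confidence, lift) for item pairs.

    Hypothetical helper: it brute-forces all pairs, which is fine for a toy
    example but is not how FPGrowth mines frequent itemsets at scale.
    """
    items = sorted(set().union(*baskets))
    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b}, baskets)
        if pair_support < min_support:
            continue  # the pair is not frequent enough to form a rule
        for x, y in ((a, b), (b, a)):
            # confidence(X -> Y) = support(X and Y) / support(X)
            confidence = pair_support / support({x}, baskets)
            # lift(X -> Y) = confidence(X -> Y) / support(Y)
            lift = confidence / support({y}, baskets)
            if confidence >= min_confidence and lift > min_lift:
                rules.append((x, y, pair_support, confidence, lift))
    return rules


for x, y, s, c, l in pair_rules(transactions):
    print(f"{x} -> {y}: support={s:.2f}, confidence={c:.2f}, lift={l:.2f}")
```

A lift above 1 means the two items appear together more often than they would if purchases were independent, which is why a lift cutoff is used on top of confidence when picking "customers who bought X also bought Y" rules.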

