  • University of Chicago
  • Chicago
yuki-2025/README.md


🤗 Hi, I’m @yuki-2025

I’m a seasoned LLM research engineer, ML engineer/data scientist, and AI product manager with training in Applied Data Science at the University of Chicago. I specialize in LLMs, large multimodal models, and AI agents. I bridge academia and industry—as Senior Staff at an AI company and as an AI researcher at UChicago Booth and the Data Science Institute—turning peer-reviewed research into production systems and leading end-to-end AI implementations.

💻 Expertise: AI Research & Large Language Models (LLM)🤖 • Large multimodal models (LMMs)🎵 • Machine Learning & Deep Learning📚 • Full-Stack GenAI Applications💡 • AI Agents🧠

My AI Projects (Open-sourced on GitHub):

  1. Paper: AgentNet: Dynamically Graph Structure Selection for LLM-Based Multi-Agent System.
    AgentNet is a dynamic, input-driven **multi-agent system (MAS)** that executes over learned communication graphs (CoT, ToT, GoT). Advantage Actor–Critic (A2C) *reinforcement learning* learns a stable distribution over edges, and the base LLM is fine-tuned with LoRA as a graph selector (**LLM-as-judge**) that chooses the best topology per input. The approach achieves **state-of-the-art** (SOTA) performance on structured **reasoning** tasks (Crossword, Game-of-24, MMLU, BBH) and **code generation** (HumanEval), while keeping latency comparable to CoT/ToT-style and static-swarm baselines. (Paper under review at EMNLP.)
  2. Paper: MediNotes: A Gen AI Framework for Medical Note Generation.
    MediNotes is a first-of-its-kind GenAI framework that enhances clinical consultations by automating documentation and providing a copilot fine-tuned for the healthcare domain, combining retrieval-augmented generation (RAG), an LLM, and ambient listening. I developed and validated the system with clinicians at UChicago Medicine, culminating in 2 IEEE publications.
  3. Paper: IntentVCNet: Bridging Spatio-Temporal Gaps for Intention-Oriented Controllable Video Captioning
    IntentVCNet fine-tunes InternVL with LLaMA-Factory. It earned second place in the IntentVC Challenge at ACM MM 2025 (Intention-Oriented Controllable Video Captioning) and led to a published ACM MM paper.
  4. mRAG: Multimodal RAG - Paper Q&A System + Evaluation
    To address the issues of traditional RAG systems—long processing times caused by OCR and text chunking, poor recall quality, and reliance on text-only embeddings—I built an embedding–retrieval–Q&A pipeline based on Qwen2.5VL’s multimodal capabilities, created a synthetic dataset, and evaluated the system using an LLM as the judge.
  5. NLP Research: Fine-Tuned LLM Embeddings for Business Insights
    In collaboration with the University of Chicago Booth School of Business, I developed, fine-tuned, and optimized LLMs to generate high-quality business-domain embeddings enriched with broad general knowledge, enabling the extraction of CEO-level actionable insights for management and financial decision-making.
📂 More projects and papers
  1. UChicago AI Hackathon 2024: Won 2nd place at the UChicago DSI AI Hackathon 2024 with a RAG medical Q&A chatbot. Built using LangChain for orchestration, PostgreSQL with vector embeddings for hybrid search, Streamlit for the front end, and Google Cloud Vertex AI to fine-tune and host Llama 3-8B, enabling secure access to patient records and general medical question answering.
  2. Fine-Tuning Llama 3-8B for Structured Math Reasoning: Fine-tuned Llama 3-8B to express arithmetic questions as JSON, then post-processed the output to perform the calculations. The project uses recent fine-tuning tools such as QLoRA, Unsloth, and PEFT, which speed up training and reduce compute requirements.
  3. AI Salesman: Built an AI-powered recommendation system with RAG and hybrid search that lets customers search products with filters such as price. Implemented with LangChain, LLMs, and pgvector in PostgreSQL to segment product descriptions, generate embeddings, and deliver relevant recommendations.
  4. Agentic RAG: Built an agentic RAG workflow with smolagents that wraps retrieval as an agent tool for dynamic document search, compared it against standard RAG (embedding + FAISS + LLM), and evaluated both with LLM-as-a-Judge.
  5. Computer Vision (CV) collection
    Style Transfer: Implementing style transfer with TensorFlow/Keras
    MLflow: Tutorial on using MLflow for experiment tracking
    Image Search RAG: Image-based search system using RAG with Qdrant and Streamlit (search images by image input)
    Roboflow: Step-by-step guide to annotating images and training a coin-detection model on Roboflow
    Aircraft Detection: Training a YOLO model for military aircraft detection and model evaluation
  6. Reproduced SOTA Research Papers
    Stanford Alpaca 7B – dataset curation and instruction tuning of LLaMA to achieve GPT-3.5-comparable performance.
    LLaVA – full training workflow to reproduce the multimodal model.
    LLaVA + RAG – semi-structured and multimodal retrieval-augmented generation.
    NanoGPT – training a GPT model from scratch to understand Transformer internals.
    RAFT – combining fine-tuning and RAG for improved retrieval performance.
  7. Recommendation System
    Instacart Market Basket Analysis using PySpark: Developed a scalable market-basket analysis pipeline on Instacart order data using PySpark MLlib’s FPGrowth. Processed millions of transactions to extract frequent itemsets (≥1% support) and generated association rules (≥20% confidence, lift >1.5) for co-purchase recommendations (“customers who bought X also bought Y”).
    Collaborative Filtering Recommendation: Uses PySpark to load and clean the data, train an ALS model, and generate Top-10 movie recommendations for all users or for a specified subset of users. Also identifies the most likely users for a given set of movies, makes rating predictions, and evaluates model performance using RMSE.
    Two-Tower Recommendation System: Uses PySpark and Spark SQL to clean, join, and engineer user–item interaction features at scale. Encodes movie titles with SentenceTransformer and loads user/item metadata into pandas for downstream processing. Builds and trains a Two-Tower neural network in PyTorch that learns user and item embeddings via contrastive loss. Persists item embeddings in Redis as a vector database and uses RedisVL for approximate nearest-neighbor search to return Top-K movie recommendations.
  8. Useful apps and tools:
    • Video_subtitle_generater: Generates subtitles from an audio/video file using OpenAI's Whisper model, with support for multiple languages. I take notes when learning from videos, so transcripts are handy, and the captured text is also useful for model training.
    • Google Drive Helper: Code I reuse in my projects whenever I work with Google Cloud Platform. Deletes files, downloads them, edits permissions, and transfers ownership in bulk in just a few seconds.
    • Blockchain apps: Two apps that run smart contracts and blockchain routes to demonstrate key blockchain principles: decentralization, immutability, Proof of Work (PoW), and transparency.
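
The thresholds quoted for the Instacart market-basket analysis above (support ≥1%, confidence ≥20%, lift >1.5) can be illustrated with a small plain-Python sketch. This is not the PySpark FPGrowth pipeline from the project: it only scores single-item pairs by brute force, and the function name `pair_rules` and the toy baskets are hypothetical, chosen purely for illustration.

```python
from itertools import permutations


def pair_rules(transactions, min_support=0.01, min_confidence=0.20, min_lift=1.5):
    """Return (antecedent, consequent, support, confidence, lift) for item pairs.

    Illustrative only: brute-force over pairs, not FPGrowth.
    """
    n = len(transactions)
    baskets = [set(t) for t in transactions]
    items = sorted(set().union(*baskets))
    # Fraction of baskets that contain each single item.
    item_support = {i: sum(i in b for b in baskets) / n for i in items}

    rules = []
    for x, y in permutations(items, 2):
        # support(X, Y): fraction of baskets containing both items.
        support = sum(x in b and y in b for b in baskets) / n
        if support < min_support:
            continue
        # confidence(X -> Y) = support(X, Y) / support(X)
        confidence = support / item_support[x]
        # lift(X -> Y) = confidence(X -> Y) / support(Y)
        lift = confidence / item_support[y]
        if confidence >= min_confidence and lift > min_lift:
            rules.append((x, y, support, confidence, lift))
    return rules


# Hypothetical toy data, not Instacart orders.
toy_baskets = [
    ["bananas", "yogurt"],
    ["bananas", "yogurt"],
    ["bananas", "bread"],
    ["bread"],
    ["milk"],
    ["milk", "bread"],
]

for x, y, support, confidence, lift in pair_rules(toy_baskets):
    print(f"{x} -> {y}: support={support:.2f}, confidence={confidence:.2f}, lift={lift:.2f}")
```

On the toy data, only the bananas/yogurt pair passes all three filters (in both directions). Pairs such as milk and bread are dropped because their lift is about 1, meaning they co-occur no more often than chance would predict.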

🛠️ Tech Stack

Python PyTorch NumPy Pandas R Scikit-learn Matplotlib Keras CUDA Apache Spark LangChain GraphQL FastAPI Flask Power BI Tableau C# CSS Dart Flutter Go HTML Swift TypeScript Next.js NodeJS jQuery Chart.js C C++ Java JavaScript JSON Solidity Linux AWS Google Cloud Microsoft Azure Alibaba Cloud Vercel Terraform GitHub Actions GitLab CI Jenkins Ubuntu Snowflake Databricks Docker Kubernetes ETL Figma Canva MongoDB MySQL Neo4J Oracle Postgres Redis SQLite Supabase Teradata ChatGPT Claude Deepseek GitHub Copilot Google Assistant Hugging Face Google Gemini

Pinned

  1. Dyna_Swarm (Public)

    AgentNet: Dynamically Graph Structure Selection for LLM-Based Multi-Agent System

    Python 81 9

  2. MediNotes (Public)

    MediNotes: SOAP Note Generation through Ambient Listening, Large Language Model Fine-Tuning, and RAG

    Python 48

  3. Ai-hackathon (Public)

    AI Hackathon 2024 @ UChicago DSI – 2nd Place

    Python 27

  4. thqiu0419/IntentVCNet (Public)

    IntentVCNet: Bridging Spatio-Temporal Gaps for Intention-Oriented Controllable Video Captioning

    Python 16

  5. RAG_projects (Public)

    RAG_workshops_collection

    Jupyter Notebook 1

  6. mRAG (Public template)

    Multimodal RAG — Paper Q&A System Based on Qwen2VL mini-4o

    Python 1