Vensen Flink-ddd

Hi there, I'm Mu Xiaohui 👋

How to contact me 🔽

I am a Senior Software Engineer with over 7 years of experience in designing and building large-scale, high-concurrency distributed systems using Java and microservices.

Currently, I am expanding my expertise into AI/ML Infrastructure and Large Language Model (LLM) Applications, applying my deep engineering background to build robust and efficient systems that power intelligent solutions. I am passionate about creating value at the intersection of solid software architecture and cutting-edge AI.

Based in Singapore, I am seeking senior roles in either Java System Architecture or AI/ML Infrastructure.

🛠️ Core Expertise & Technology Stack

My skills cover the full spectrum from foundational backend architecture to modern AI/ML infrastructure.

| Core Java & Distributed Systems | AI/ML & Big Data |
| --- | --- |
| Java, Spring Boot, Spring Cloud, MyBatis-Plus | Python, PyTorch, Transformers, DeepSpeed, vLLM |
| Microservices, SaaS, Domain-Driven Design (DDD) | LLM Fine-tuning (LoRA), RAG |
| Docker, Kubernetes, DevOps, gRPC, OpenFeign | Vector DB (Milvus, FAISS, Elasticsearch) |
| Kafka, ZooKeeper, Alibaba Nacos, WebSocket | Apache Flink, Flink-CDC, Prometheus + Grafana |
| MySQL, MongoDB, Neo4j, PostgreSQL, Elasticsearch, Redis, MinIO, SolrCloud, HBase | ELK Stack, Flume, ClickHouse |
| System Design & Scalable Architecture | MLOps & Inference Optimization |

🏆 Featured Projects

Here are some projects that highlight my capabilities across both domains.

AI Infrastructure & LLM Applications

RAG System for Domain-Specific Q&A
- Engineered a Retrieval-Augmented Generation (RAG) pipeline using Mistral-7B, ChatGPT o4-mini, and Elasticsearch for a specialized knowledge domain.
- Optimized the system for real-time interaction through efficient data processing and a high-throughput inference server deployment.
LLM Inference Acceleration & Fine-tuning
- Customized open-source LLMs (Mistral, Qwen, Llama) using LoRA fine-tuning techniques on specific datasets.
- Accelerated model inference significantly using vLLM and FlashAttention, deploying the models as scalable API endpoints on cloud platforms such as GCP and Azure.
Transformers, DeepSpeed & vLLM Contributions (Ongoing)

Java Microservices & System Architecture

Group-Level Multi-functional Payment Platform
- Architected and developed a highly available, enterprise-grade payment center using Domain-Driven Design (DDD) and a robust microservices architecture.
- The system reliably handles millions of transactions over a given period and smoothly processes tens of thousands or more every day, ensuring data consistency and security across various payment channels.
High-Concurrency Instant Messaging (IM) System
- Built a distributed IM system from the ground up to support millions of concurrent users.
- Leveraged Spring Boot, WebSocket, Kafka for message queuing, and ZooKeeper for service coordination, achieving high throughput and low latency.
Enterprise Search & Real-Time Data Platform
- Designed a high-performance search engine using Elasticsearch and Flink-CDC, capable of indexing and searching billions of records with sub-second latency.
- Built the underlying real-time data synchronization pipeline, providing a unified data backbone for multiple business units.
Enterprise Flink Computing Platform
- Enterprise-level Flink cluster: Built a unified Flink-based computing center for the real-time data lake, statistics center, and ETL pipelines.

And more.

✍️ Writing & Sharing

I believe in continuous learning and sharing knowledge. I write about my journey in software architecture, distributed systems, and AI on my Medium blog.
