FastAPI-based boilerplate for building LangGraph-powered agents with streaming, observability, and modularity in mind.
This is a minimal and extendable template for running LangGraph agents over HTTP.
It supports:
- Server-Sent Events (SSE) streaming for real-time response updates
- Swagger/OpenAPI docs out of the box (thanks to FastAPI)
- Langfuse observability and prompt tracing
- User feedback tracking and thread history
- Clean FastAPI + Uvicorn setup with Docker, ruff, mypy, and pyright
- Plug-and-play LangGraph integration, including tool support and memory
- Dev-friendly structure with clear separation between core app and agent logic
Use it as a starting point for building custom AI agents with solid HTTP and observability layers.
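To make the SSE streaming feature concrete, here is a minimal sketch of how token-by-token SSE frames can be produced (function names and payload shape are illustrative, not this template's actual API):

```python
import json
from typing import Iterator, Optional


def sse_event(data: dict, event: Optional[str] = None) -> str:
    """Format a payload as a Server-Sent Events frame (data line + blank line)."""
    lines = []
    if event:
        lines.append(f"event: {event}")
    lines.append(f"data: {json.dumps(data)}")
    return "\n".join(lines) + "\n\n"


def stream_tokens(tokens: Iterator[str]) -> Iterator[str]:
    """Yield each model token as an SSE frame, then a terminal 'done' frame."""
    for token in tokens:
        yield sse_event({"token": token})
    yield sse_event({}, event="done")
```

In a FastAPI app, a generator like `stream_tokens` would typically be wrapped in a `StreamingResponse` with `media_type="text/event-stream"` so the client receives updates as they are produced.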
- Clone the repository and copy the env template:
  ```shell
  cp .env.dist .env
  ```
  Fill in the required environment variables, then start the stack:
  ```shell
  docker compose up
  ```
- Open your browser and go to http://localhost:8000
Set up local Langfuse (https://langfuse.com/self-hosting/docker-compose):
- Clone the langfuse repository and start it:
  ```shell
  git clone https://github.com/langfuse/langfuse.git
  cd langfuse
  docker compose up -d
  ```
- Open your browser, go to http://localhost:3000, create a new project, and copy the `LANGFUSE_API_KEY` and `LANGFUSE_API_URL` values to your `.env` file.
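Once the project is created, the Langfuse entries in `.env` might look like this (the key value below is a placeholder):

```
LANGFUSE_API_KEY=pk-lf-your-key-here
LANGFUSE_API_URL=http://localhost:3000
```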
To run locally without Docker, install dependencies, build the frontend, and start the dev server:
```shell
uv venv
uv sync
cd frontend
npm run build
cd ..
uv run dev
```
This is still a work in progress, but usable. Yes, the frontend is vibe-coded. No, I won’t fix it. Feel free to rewrite it.
PRs welcome for:
- OAuth2
- Guardrails
- Multi-agent support (with framework abstraction)
- LangFuse abstraction
- Evaluations
- Anything that reduces my boilerplate suffering
- Add a way to add tools to the agent
- Add a way to add a database to the agent (memory, checkpoints, feedback?, etc)
- Implement a graph instead of a simple agent
- Refactor the project structure: separate the general FastAPI app from the agent app.
- Add more model configuration options (temperature, top_p, etc)
- Add a way to get a thread history
- Normalize FastAPI headers/requests/middleware
- Add Langfuse integration
- Add tests
- Refactor the messy checkpointer factory
- 🟠 [raw API implemented] Thread management: create/update/delete (thread model: ulid, user_id, created/updated, title, status[])
- 🟠 [50/50 DONE] Store the thread history in the database (with all custom messages and metadata)
- 🟠 Add a way to define a custom agent in config?
- Add Langsmith integration
- Keep the SSE connection alive until the user closes the browser tab (??)
- 🟡 Add a way to validate the user's access token (OAuth2)
- 🟡 Add evaluation metrics
- 🔴 Add one more abstraction layer so the agent can use different frameworks (LangGraph, LlamaIndex, etc.)
- 🟠 Add even more abstractions to make it independent of observability tools (LangFuse, LangSmith, Grafana Alloy, or whatever else)
- ⚪ Long-term memory for each user. I want to add real-time per-thread prompt tuning to the chat application - memory insights, response strategies, etc. But this is more about the agent implementation than the template core. A graph node as an "addon package"? LOL! https://i.imgur.com/k1jk3cx.png here we go again!
- ⚪ Guardrails (LLMGuard implementation or handle by LiteLLM)
⚪ - LOWEST priority | 🟡 - LOW priority | 🟠 - MID priority | 🔴 - HIGH priority | 🟣 - BLOCKER
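The thread model sketched in the roadmap above could be expressed as a dataclass like this (field names come from the roadmap item; the types and defaults are assumptions):

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class Thread:
    # Field names follow the roadmap item; types/defaults are assumptions.
    id: str                 # ULID string
    user_id: str
    title: str = ""
    status: List[str] = field(default_factory=list)
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
    updated_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
```

A real implementation would likely map this onto an ORM model or a table used by the checkpointer/feedback storage, but the shape above captures the fields listed in the roadmap.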