Caution
The project is very much in the pre-alpha stage. It is more of an experiment and is not meant for production workloads.
This is a concept for what a Rails-inspired small data platform for startups and SMEs could look like. After using a variety of end-to-end solutions like DOMO, Keboola, Mozart Data and others, I keep wishing for something that would do 80% of ELT + BI out of the box, without the pricing surprises.
This project is an attempt to stitch together a set of solid and reliable open-source tools that combine into a lean platform where one data engineer can own the entire lifecycle. From ELT, to data modelling, to deploying and scaling in production.
- 🧪 From laptop to production in minutes - Develop locally with DuckDB, deploy with the same code. No more "it works on my machine" problems.
- ⚡ Lightning-fast analytics on any data size - DuckDB's column-oriented engine handles gigabytes of data on modest hardware. Query billions of rows in seconds.
- 📊 Beautiful dashboards - Drag-and-drop dataviz with Metabase. Perfect for everyone, tech and non-tech alike.
- 💸 Scale without breaking the bank - An enterprise-grade data stack for as little as $30/month. DuckDB + SQLMesh's efficiency means lower compute costs than Snowflake or BigQuery.
- 🔄 30+ ready-to-use integrations - Instant integrations via dlt for Stripe, GitHub, Salesforce, and more. Connect your SaaS tools with minimal code.
- 🤖 Just ask your DB - Ask questions in plain English through an MCP server for DuckDB. Get immediate answers without writing complex queries.
- 🔍 End-to-end data lineage - SQLMesh tracks transformations from raw to gold data. Understand exactly where metrics come from and debug easily.
- Local-first development for the entire stack.
- Support companies that can't afford heavy, expensive data tools or large teams.
- No "SSO tax" - all tools should be either fully free, or affordable once deployed for serious production use.
- No k8s, so a small data team can be self-sufficient.
- Cheap path to production and scaling.
- Extract (planned): dlt
- Transform: SQLMesh
- Data Storage: DuckDB
- BI / data viz: Metabase
- Deployment: Dokku
You'll need the following tools installed before getting started:
- uv
- mise
- claude (recommended)
- Clone this repository
- Download the DuckDB driver for Metabase:

  ```shell
  make download-duckdb-driver
  ```

- Start the services:

  ```shell
  docker-compose up -d
  ```

- Access Metabase at http://localhost:3000
TODO
This project can be deployed to DigitalOcean using Dokku with the following architecture:
- Metabase Container:
  - Dedicated hostname (e.g., metabase.yourdomain.com)
  - Access to the mounted DuckDB volume
- dlt + SQLMesh Container:
  - Combined container for data processing
  - Access to the same DuckDB volume
- Shared Storage:
  - DigitalOcean Volume for persistent DuckDB storage
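The shared-volume wiring above could be provisioned with Dokku's storage plugin along these lines (app names, the storage directory, and the domain are all illustrative; if you attach a DigitalOcean Volume, point the mounts at its path instead):

```shell
# Run on the droplet. Names and paths are placeholders.
dokku apps:create metabase
dokku apps:create kitsuna-data

# Host directory that both apps mount, so they share one DuckDB file.
dokku storage:ensure-directory kitsuna-duckdb
dokku storage:mount metabase /var/lib/dokku/data/storage/kitsuna-duckdb:/data
dokku storage:mount kitsuna-data /var/lib/dokku/data/storage/kitsuna-duckdb:/data

dokku domains:set metabase metabase.yourdomain.com
```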
- Create a Dokku-enabled droplet on DigitalOcean. See the Dokku docs: https://dokku.com/docs/getting-started/install/digitalocean/
- Deploy using the `app.json` configuration:

  ```shell
  # Clone the repository on your local machine
  git clone https://github.com/yourusername/kitsuna-data.git
  cd kitsuna-data

  # Add Dokku remote
  git remote add dokku dokku@your-droplet-ip:kitsuna-data

  # Push to Dokku - this will use the app.json configuration
  git push dokku main
  ```
Dokku will automatically:

- Create the apps defined in `app.json`
- Set up the specified resources
- Configure the mounts for shared storage
- Set up the domains
- Set up SSL (recommended):

  ```shell
  dokku letsencrypt:enable metabase
  ```
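Note that Let's Encrypt support comes from a Dokku plugin, and recent plugin versions require a contact e-mail before certificates can be issued. If the plugin isn't installed yet, the one-time setup looks like this (the e-mail address is a placeholder):

```shell
# Run once on the droplet.
sudo dokku plugin:install https://github.com/dokku/dokku-letsencrypt.git
dokku letsencrypt:set metabase email you@example.com
```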
This deployment approach gives you:
- Separate containers for Metabase and data processing
- Shared persistent storage for DuckDB
- Simple deployment through Dokku
- Custom domain for Metabase
- Add SQLMesh
- Add MCP for DuckDB
- Add dlt
- Add Dokku deployment configuration
- Create a DigitalOcean box for a public demo
- Add installation docs
- Add usage docs
- Add Aider docs
Greg Goltsov - @gregoltsov, gregoltsov.bsky.social.
Here are some projects which inspired my thinking and this project: