kot-behemoth/kitsunadata


Kitsuna Data

Self-hosted one-person data platform

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Roadmap
  5. Contact
  6. Inspiration

About The Project

Caution

The project is very much in the pre-alpha stage. It is more of an experiment and is not meant for production workloads.

This is a concept for what a Rails-inspired small data platform for startups and SMEs could look like. After using a variety of end-to-end solutions like DOMO, Keboola, Mozart Data and others, I keep wishing there was something that would cover 80% of ELT + BI out of the box, without the price surprises.

This project is an attempt to stitch together a set of solid and reliable open-source tools into a lean platform where one data engineer can own the entire lifecycle, from ELT and data modelling to deploying and scaling in production.

Main Features

  1. 🧪 From laptop to production in minutes - Develop locally with DuckDB, deploy with the same code. No more "it works on my machine" problems.

  2. ⚡ Lightning-fast analytics on any data size - DuckDB's column-oriented design handles gigabytes of data on modest hardware. Query billions of rows in seconds.

  3. 📊 Beautiful dashboards - Drag-and-drop dataviz with Metabase. Perfect for everyone - tech and non-tech alike.

  4. 💸 Scale without breaking the bank - Enterprise-grade data stack for as little as $30/month. The efficiency of DuckDB and SQLMesh means lower compute costs than Snowflake or BigQuery.

  5. 🔄 30+ ready-to-use integrations - Instant integrations with dlt for Stripe, GitHub, Salesforce, and more. Connect your SaaS tools with minimal code.

  6. 🤖 Just ask your DB - Ask questions in plain English with DuckDB's MCP. Get immediate answers without writing complex queries.

  7. 🔍 End-to-end data lineage - SQLMesh tracks transformations from raw to gold data. Understand exactly where metrics come from and debug easily.

Goals

  • Local-first development for the entire stack.
  • Support companies that can't afford heavy, expensive data tools or large teams.
  • No "SSO tax" - all tools should be either fully free or affordable once deployed for serious production use.
  • No k8s, so a small data team can be self-sufficient.
  • Cheap path to production and scaling.

Tech Stack

(back to top)

Getting Started

Prerequisites

Install the following tools before you start. The installation steps below also use make and docker-compose.

  • uv
  • mise (recommended)
  • claude (recommended)

Installation

  1. Clone this repository
  2. Download the DuckDB driver for Metabase:
    make download-duckdb-driver
  3. Start the services:
    docker-compose up -d
  4. Access Metabase at http://localhost:3000
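
Metabase can take a while to initialise after `docker-compose up -d` returns, so the URL in step 4 may not respond immediately. A minimal sketch of a readiness check, assuming `curl` is installed and that Metabase exposes its `/api/health` endpoint on the port used above; the function name `wait_for_url` is made up for this example.

```shell
# Poll a URL until it answers with a successful HTTP status
# or the attempts run out.
# Usage: wait_for_url URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_url() {
  url="$1"
  attempts="${2:-30}"
  delay="${3:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # --fail makes curl exit non-zero on HTTP errors such as 503.
    if curl --silent --fail --output /dev/null "$url"; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Example (not run here):
#   wait_for_url http://localhost:3000/api/health && echo "Metabase is up"
```

With the defaults this waits for roughly a minute before giving up; raise the attempt count on slower machines.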

(back to top)

Usage

TODO

(back to top)

Deployment

Deploying to DigitalOcean with Dokku

This project can be deployed to DigitalOcean using Dokku with the following architecture:

  1. Metabase Container:

    • Dedicated hostname (e.g., metabase.yourdomain.com)
    • Access to mounted DuckDB volume
  2. dlt + SQLMesh Container:

    • Combined container for data processing
    • Access to the same DuckDB volume
  3. Shared Storage:

    • DigitalOcean Volume for persistent DuckDB storage
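
The shared-volume layout above could be wired up with Dokku's `storage:mount` command. The sketch below only prints the commands, so they can be reviewed before being run on the Dokku host. The app names (`metabase`, `pipeline`), the host directory, and the `/data` container path are assumptions for illustration, not names taken from this repository.

```shell
# Print the Dokku commands that would mount one host directory
# into several apps.
# Usage: print_mount_commands HOST_DIR CONTAINER_DIR APP [APP...]
print_mount_commands() {
  host_dir="$1"
  container_dir="$2"
  shift 2
  for app in "$@"; do
    # storage:mount takes <app> <host-dir>:<container-dir>
    printf 'dokku storage:mount %s %s:%s\n' "$app" "$host_dir" "$container_dir"
  done
}

# Hypothetical values: a DigitalOcean Volume attached to the droplet,
# shared by both apps.
print_mount_commands /mnt/kitsuna_volume/duckdb /data metabase pipeline
```

Mounting the same host directory into both apps is what lets the Metabase container and the dlt + SQLMesh container see the same DuckDB file. A new mount typically takes effect only after the app is restarted or redeployed.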

Deployment Steps

  1. Create a Dokku-enabled droplet on DigitalOcean. See the Dokku docs: https://dokku.com/docs/getting-started/install/digitalocean

  2. Deploy using the app.json configuration:

    # Clone the repository on your local machine
    git clone https://github.com/yourusername/kitsuna-data.git
    cd kitsuna-data
    
    # Add Dokku remote
    git remote add dokku dokku@your-droplet-ip:kitsuna-data
    
    # Push to Dokku - this will use the app.json configuration
    git push dokku main

    Dokku will automatically:

    • Create the apps defined in app.json
    • Set up the specified resources
    • Configure the mounts for shared storage
    • Set up the domains
  3. Set up SSL (recommended):

    dokku letsencrypt:enable metabase
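
The `letsencrypt:enable` command comes from the dokku-letsencrypt plugin, which is not part of a stock Dokku install. A minimal sketch of the fuller sequence follows, printed rather than executed so it can be reviewed first. The email address is a placeholder, and the `letsencrypt:set` form assumes a recent version of the plugin.

```shell
# Print the commands to install the Let's Encrypt plugin and
# enable TLS for one Dokku app.
# Usage: print_ssl_commands APP EMAIL
print_ssl_commands() {
  app="$1"
  email="$2"
  # Plugin installation needs root on the Dokku host.
  printf 'sudo dokku plugin:install https://github.com/dokku/dokku-letsencrypt.git\n'
  # A contact email must be set before a certificate is requested.
  printf 'dokku letsencrypt:set %s email %s\n' "$app" "$email"
  printf 'dokku letsencrypt:enable %s\n' "$app"
}

print_ssl_commands metabase you@example.com
```

The app's domain needs to resolve to the droplet before the last command is run, since Let's Encrypt validates the domain over HTTP.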

This deployment approach gives you:

  • Separate containers for Metabase and data processing
  • Shared persistent storage for DuckDB
  • Simple deployment through Dokku
  • Custom domain for Metabase

Roadmap

(back to top)

Contact

Greg Goltsov - @gregoltsov, gregoltsov.bsky.social.

(back to top)

Inspiration

Here are some projects which inspired my thinking and this project:

(back to top)
