# Simple Snowplow

Simple Snowplow is a lightweight, self-hosted analytics collector compatible with the Snowplow analytics protocol. It lets you collect event data from websites and applications while keeping full control over your data and infrastructure.
## Features

- Compatible with the Snowplow JavaScript tracker
- Collects web analytics data including page views, events, and user information
- Stores data in ClickHouse for high-performance analytics queries
- Optional SendGrid event tracking integration
- Configurable data retention and storage settings
- Horizontal scaling capabilities with ClickHouse cluster support
- Built with FastAPI for high performance
- Optional demo mode for easy testing
## Architecture

Simple Snowplow consists of the following components:
- FastAPI Backend: Handles incoming tracking events and forwards them to ClickHouse
- ClickHouse Database: Stores and processes analytics data
- JavaScript Tracking Libraries: Compatible with standard Snowplow JavaScript trackers
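To make the data flow concrete, the sketch below builds the kind of payload a Snowplow tracker POSTs to the collector. The field names (`e`, `url`, `aid`, `p`) and the `payload_data` schema come from the Snowplow tracker protocol; the endpoint path is the standard Snowplow one and is assumed, not confirmed, for Simple Snowplow:

```python
import json

# Standard Snowplow POST endpoint path (assumption for Simple Snowplow;
# check your deployment's routes).
COLLECTOR_PATH = "/com.snowplowanalytics.snowplow/tp2"

def build_page_view(app_id: str, page_url: str) -> dict:
    """Wrap a single page-view event in the Snowplow payload_data envelope."""
    return {
        "schema": "iglu:com.snowplowanalytics.snowplow/payload_data/jsonschema/1-0-4",
        "data": [
            {
                "e": "pv",        # event type: page view
                "url": page_url,  # page URL
                "aid": app_id,    # application ID
                "p": "web",       # platform
            }
        ],
    }

payload = build_page_view("my-app", "https://example.com/")
print(json.dumps(payload, indent=2))
```

The FastAPI backend receives payloads shaped like this and writes the parsed events to ClickHouse.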
## Installation

### Local development

To install Simple Snowplow for local development:
- Make sure you have Docker and Docker Compose installed
- Clone the repository:

  ```bash
  git clone https://github.com/yourusername/simple-snowplow.git
  cd simple-snowplow
  ```
- Download the required JavaScript files:

  ```bash
  ./simple_snowplow/utils/download_scripts.sh
  ```
- Start the application using Docker Compose:

  ```bash
  docker compose up
  ```
### Production deployment

For a production environment:
- Install ClickHouse version 22.11 or later (follow the ClickHouse documentation)
- Build the Docker image:

  ```bash
  docker build -t simple-snowplow ./simple_snowplow
  ```
- Create a custom configuration file (see Configuration)
- Run the Docker container:

  ```bash
  docker run -d \
    -p 8000:80 \
    -v /path/to/your/config.toml:/app/settings.toml \
    -e SNOWPLOW_ENV=production \
    simple-snowplow
  ```
For Kubernetes deployment, check the example manifests in the `.github/k8s` directory.
## Configuration

Simple Snowplow uses a combined configuration system: Dynaconf loads settings and Pydantic validates their types. Configuration can be managed through:
- Default settings from `settings.toml`
- Secret settings from `.secrets.toml` (if it exists)
- Environment variables with the `SNOWPLOW_` prefix
- A custom settings file specified with `SNOWPLOW_SETTINGS_FILE`
The configuration system prioritizes these sources in the order listed.
The configuration is organized into logical sections:
- `common`: Basic application settings
- `clickhouse`: Database connection and table settings
- `logging`: Log formatting and levels
- `security`: Security settings including rate limiting
- `proxy`: Configuration for proxy endpoints
- `performance`: Application performance tuning
- `elastic_apm`: APM monitoring configuration
- `prometheus`: Metrics and monitoring settings
| Setting | Description | Default |
|---|---|---|
| `common.service_name` | Application name | `simple-snowplow` |
| `common.debug` | Enable debug mode | `false` |
| `common.demo` | Enable demo mode | `false` |
| `logging.level` | Log level (DEBUG, INFO, WARNING, ERROR) | `WARNING` |
| `logging.json` | Use JSON formatting for logs | `false` |
| `security.rate_limiting.enabled` | Enable request rate limiting | `false` |
| `clickhouse.connection.host` | ClickHouse host | `clickhouse` |
| `clickhouse.connection.port` | ClickHouse port | `8123` |
| `clickhouse.configuration.database` | ClickHouse database | `snowplow` |
| `clickhouse.configuration.cluster_name` | ClickHouse cluster name (if using a cluster) | `""` |
| `performance.max_concurrent_connections` | Maximum concurrent connections | `100` |
| `performance.db_pool_size` | Database connection pool size | `5` |
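Putting a few of these together, a minimal `settings.toml` might look like the fragment below. The values shown are the defaults from the table above, so you only need to include the keys you want to change:

```toml
[common]
service_name = "simple-snowplow"
debug = false
demo = false

[logging]
level = "WARNING"
json = false

[clickhouse.connection]
host = "clickhouse"
port = 8123

[clickhouse.configuration]
database = "snowplow"
```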
For a complete list of configuration options, refer to the `settings.toml` file.
### Environment variable overrides

You can override any configuration setting using environment variables with the `SNOWPLOW_` prefix and double underscores to represent nested keys:
```bash
SNOWPLOW_COMMON__DEMO=true
SNOWPLOW_CLICKHOUSE__CONNECTION__HOST=my-clickhouse-server
SNOWPLOW_SECURITY__RATE_LIMITING__ENABLED=true
```
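The double-underscore convention maps flat environment variable names onto nested configuration keys. The following is a simplified illustration of that mapping, not Dynaconf's actual implementation (in the real system, Pydantic also coerces the string values to their declared types):

```python
import os

def nested_overrides(prefix: str = "SNOWPLOW_") -> dict:
    """Translate PREFIX_A__B__C=x env vars into {'a': {'b': {'c': 'x'}}}."""
    result: dict = {}
    for name, value in os.environ.items():
        if not name.startswith(prefix):
            continue
        node = result
        # Split on "__" to recover the nesting; the last piece is the leaf key.
        *parents, leaf = name[len(prefix):].lower().split("__")
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = value  # values stay strings here; Pydantic coerces types
    return result

os.environ["SNOWPLOW_CLICKHOUSE__CONNECTION__HOST"] = "my-clickhouse-server"
print(nested_overrides())
```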
To use a custom configuration file:

```bash
export SNOWPLOW_SETTINGS_FILE=/path/to/your/custom.toml
```
The custom file only needs to include settings you want to override.
Simple Snowplow supports environment-specific settings through the `SNOWPLOW_ENV` variable:

```bash
export SNOWPLOW_ENV=production
```
Settings for specific environments can be defined in the configuration file:
```toml
[development]
logging.level = "DEBUG"

[production]
logging.level = "WARNING"
security.disable_docs = true
```
## Tracking events

To track events from your website:
- Include the Snowplow tracker in your HTML:

  ```html
  <script type="text/javascript">
  (function(p,l,o,w,i,n,g){if(!p[i]){p.GlobalSnowplowNamespace=p.GlobalSnowplowNamespace||[];
  p.GlobalSnowplowNamespace.push(i);p[i]=function(){(p[i].q=p[i].q||[]).push(arguments)
  };p[i].q=p[i].q||[];n=l.createElement(o);g=l.getElementsByTagName(o)[0];n.async=1;
  n.src=w;g.parentNode.insertBefore(n,g)}}(window,document,"script","//your-server.com/static/sp.js","snowplow"));

  snowplow('newTracker', 'sp', 'your-server.com', {
    appId: 'my-app',
    platform: 'web',
    post: true,
    forceSecureTracker: true
  });

  snowplow('trackPageView');
  </script>
  ```
- Replace `your-server.com` with your Simple Snowplow server address.
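Events can also be sent server-side. The sketch below builds a URL for the Snowplow GET "pixel" endpoint; the `/i` path and the `e`/`url`/`aid`/`p` parameter names come from the Snowplow tracker protocol, and whether Simple Snowplow serves `/i` is an assumption to verify against your deployment's routes:

```python
from urllib.parse import urlencode

def pixel_url(collector: str, app_id: str, page_url: str) -> str:
    """Build a page-view tracking URL for the Snowplow GET pixel endpoint."""
    params = {
        "e": "pv",        # event type: page view
        "url": page_url,  # page being viewed
        "aid": app_id,    # application ID
        "p": "srv",       # platform: server
    }
    return f"https://{collector}/i?{urlencode(params)}"

print(pixel_url("your-server.com", "my-app", "https://example.com/docs"))
```

Fetching the resulting URL (e.g. with `urllib.request.urlopen`) records one page-view event.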
### Demo mode

To test Simple Snowplow with the built-in demo:
- Set `common.demo = true` in your configuration
- Access the demo page at `http://your-server.com/demo/`
- Events will be tracked and stored in your ClickHouse database
## Querying the data

To query the collected data, connect to your ClickHouse instance:

```bash
docker exec -it simple-snowplow-ch clickhouse-client
```
Example queries:

```sql
-- Get page views from the last 24 hours
SELECT time, page, refr, device_id, session_id
FROM snowplow.local
WHERE event_type = 'page_view'
  AND time > now() - INTERVAL 1 DAY
ORDER BY time DESC;

-- Count events by type
SELECT event_type, count() AS events
FROM snowplow.local
GROUP BY event_type
ORDER BY events DESC;
```
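You can also run these queries programmatically over ClickHouse's HTTP interface (port 8123, matching the configuration defaults above). The sketch below only builds the request; pass it to `urllib.request.urlopen` to execute. The `snowplow.local` table name matches the example queries and should be adjusted if you configured a custom name:

```python
from urllib.parse import urlencode
from urllib.request import Request

# Count events by type, returning one JSON object per row.
SQL = """
SELECT event_type, count() AS events
FROM snowplow.local
GROUP BY event_type
ORDER BY events DESC
FORMAT JSONEachRow
"""

def clickhouse_request(host: str = "clickhouse", database: str = "snowplow") -> Request:
    """Build a POST request carrying the SQL in the body, per ClickHouse's HTTP API."""
    query_string = urlencode({"database": database})
    return Request(
        f"http://{host}:8123/?{query_string}",
        data=SQL.encode(),
        method="POST",
    )

req = clickhouse_request()
print(req.full_url)
```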
## Troubleshooting

- **Connection refused to ClickHouse**
  - Check that ClickHouse is running and accessible
  - Verify the host and port in your configuration
  - Ensure firewall rules allow connections
- **No data being collected**
  - Check the browser console for JavaScript errors
  - Verify the tracking endpoint is correctly configured in your tracker
  - Check the server logs for errors
- **Missing data in queries**
  - Verify the table name (it may differ if you configured a custom name)
  - Check that data partitioning is working as expected
To view application logs:

```bash
docker logs -f simple-snowplow
```

For more verbose logging, set `logging.level = "DEBUG"` in your configuration.
## Development

- Create a virtual environment and install uv (if not already installed):

  ```bash
  # Install uv (if not already installed)
  curl -sSf https://astral.sh/uv/install.sh | sh
  # Or on macOS with Homebrew
  # brew install uv

  # Create a virtual environment with uv
  uv venv .venv
  source .venv/bin/activate  # On Windows: .venv\Scripts\activate
  ```
- Install dependencies:

  ```bash
  uv sync
  ```
- Install development tools:

  ```bash
  uv pip install -G dev
  pre-commit install
  ```
- Run the application locally:

  ```bash
  cd simple_snowplow
  uv run uvicorn main:app --reload
  ```
- Run the tests:

  ```bash
  uv run pytest
  ```
To develop with live reload inside Docker:

```bash
docker compose up --watch
```

This activates development mode, which automatically:
- Syncs your local code changes to the container without rebuilding
- Rebuilds the container only when dependencies change (when uv.lock is modified)
## License

This project is licensed under the terms of the LICENSE file.