A complete observability stack using Docker Compose with Grafana, Tempo, Loki, Prometheus, and OpenTelemetry Collector. This stack provides unified logging, metrics, and tracing capabilities with TraceQL metrics support.
┌──────────────────┐     ┌──────────────────┐     ┌──────────────────┐
│   Applications   │────▶│  OpenTelemetry   │────▶│      Tempo       │
│                  │     │    Collector     │     │     (Traces)     │
└──────────────────┘     └──────────────────┘     └──────────────────┘
                                  │                        │
                                  ▼                        ▼
┌──────────────────┐     ┌──────────────────┐     ┌──────────────────┐
│     Promtail     │────▶│       Loki       │     │    Prometheus    │
│   (Log Agent)    │     │      (Logs)      │     │    (Metrics)     │
└──────────────────┘     └──────────────────┘     └──────────────────┘
                                  │                        ▲
                                  ▼                        │
                         ┌──────────────────┐              │
                         │     Grafana      │──────────────┘
                         │ (Visualization)  │
                         └──────────────────┘
- Grafana (Port 3000): Visualization and dashboards
- Tempo (Port 3200): Distributed tracing with TraceQL metrics support
- Loki (Port 3100): Log aggregation
- Prometheus (Port 9090): Metrics collection and storage
- OpenTelemetry Collector (Ports 4317/4318): Telemetry data collection
- Promtail: Log collection agent
- Unified Observability: Logs, metrics, and traces in one stack
- TraceQL Metrics: Generate metrics from traces using TraceQL queries
- Correlation: Link traces to logs and metrics
- Real-time Monitoring: Live dashboards and alerting
- Docker Compose: Easy deployment and management
- Production Path: Sensible defaults for local use, plus a hardening checklist below for production workloads
- Docker and Docker Compose installed
- At least 4GB RAM available
- Ports 3000, 3100, 3200, 4317, 4318, 8888, 8889, 9090, 13133 available
git clone <your-repo-url>
cd ObservabilityStack
docker-compose up -d
- Grafana: http://localhost:3000 (admin/admin)
- Prometheus: http://localhost:9090
- Tempo: http://localhost:3200
- Loki: http://localhost:3100
# Check all services are running
docker-compose ps
# Check logs if needed
docker-compose logs [service-name]
This stack includes TraceQL metrics support, allowing you to generate metrics from traces.
In Grafana's Explore view with the Tempo data source:
# Overall request rate
{ } | rate()
# Request rate by service
{ } | rate() by(resource.service.name)
# Error rate
{ status = error } | rate()
# Success rate by service and status
{ status != error } | rate() by(resource.service.name, status)
# Request duration histogram
{ kind = server } | histogram_over_time(duration)
# Rate of specific operations
{ name = "GET /api/users" } | rate()
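The same expressions can also be run outside Grafana against Tempo's metrics query-range API. The sketch below assumes the `/api/metrics/query_range` endpoint exposed by recent Tempo releases (path and parameter formats can vary by version) and GNU `date` for the timestamps:

```bash
# Run a TraceQL metrics query directly against Tempo
# (assumes a recent Tempo release exposing /api/metrics/query_range)
curl -G http://localhost:3200/api/metrics/query_range \
  --data-urlencode 'q={ } | rate() by(resource.service.name)' \
  --data-urlencode "start=$(date -d '5 minutes ago' +%s)" \
  --data-urlencode "end=$(date +%s)" \
  --data-urlencode 'step=15s'
```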
The stack includes pre-configured data sources:
- Tempo: http://tempo:3200
- Loki: http://loki:3100
- Prometheus: http://prometheus:9090
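To confirm the provisioned data sources were actually loaded, list them through Grafana's HTTP API using the default credentials from this stack (`jq` is optional, for readability):

```bash
# List data sources registered in Grafana (default admin/admin credentials)
curl -s -u admin:admin http://localhost:3000/api/datasources | jq '.[] | {name, type, url}'
```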
| File | Purpose |
|---|---|
| `docker-compose.yml` | Main orchestration file |
| `tempo-config.yml` | Tempo configuration with TraceQL metrics |
| `loki-config.yml` | Loki log aggregation configuration |
| `prometheus-config.yml` | Prometheus metrics configuration |
| `otel-collector-config.yml` | OpenTelemetry Collector configuration |
| `promtail-config.yml` | Promtail log collection configuration |
| `grafana/provisioning/` | Grafana data source provisioning |
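After editing any of these files, it can help to validate the Compose file and recreate only the affected service; the `tempo` service name below is assumed from this stack's compose file:

```bash
# Validate docker-compose.yml and print the fully resolved configuration
docker-compose config

# Recreate a single service after a config change, e.g. tempo-config.yml
docker-compose up -d --force-recreate tempo
```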
- TraceQL Metrics: Enabled with the `local-blocks` processor
- Storage: Local filesystem with proper permissions
- Retention: 1 hour for testing (configurable)
- Metrics Generation: Sends metrics to Prometheus via remote write
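To double-check that these Tempo settings were picked up, recent Tempo releases expose their effective configuration over HTTP; the `/status/config` path below is an assumption and may differ by version:

```bash
# Dump Tempo's effective configuration and inspect the metrics-generator section
# (/status/config is exposed by recent Tempo releases; the path may vary by version)
curl -s http://localhost:3200/status/config | grep -A 5 metrics_generator
```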
- Receivers: OTLP (gRPC: 4317, HTTP: 4318), Prometheus
- Processors: Batch, memory limiter, resource
- Exporters: Tempo (traces), Loki (logs), Prometheus (metrics), Debug
- Remote Write: Enabled for receiving metrics from Tempo
- Retention: 200 hours
- Scrape Targets: OTel Collector metrics
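Both the Collector and Prometheus can be spot-checked from the host. The health-check port (13133) and the Collector's own metrics port (8888) come from this stack's port list; how the remote-write flag is exposed can vary between Prometheus versions:

```bash
# OTel Collector health-check extension (port 13133 in this stack)
curl -s http://localhost:13133/

# Confirm the Collector's exporters are sending data downstream
curl -s http://localhost:8888/metrics | grep -E 'otelcol_exporter_sent_(spans|metric_points|log_records)'

# Check that Prometheus was started with the remote-write receiver enabled
# (flag name may differ between Prometheus versions)
curl -s http://localhost:9090/api/v1/status/flags | grep -o '"web.enable-remote-write-receiver":"[^"]*"'
```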
# OTLP gRPC endpoint
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
# OTLP HTTP endpoint
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
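Most OpenTelemetry SDKs also honor a few other standard environment variables; the values below are only examples:

```bash
# Common OTel SDK settings (values are examples; exact support depends on the SDK)
export OTEL_SERVICE_NAME=my-service
export OTEL_EXPORTER_OTLP_PROTOCOL=grpc        # use http/protobuf with port 4318
export OTEL_RESOURCE_ATTRIBUTES=deployment.environment=local
export OTEL_TRACES_EXPORTER=otlp
```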
Promtail automatically collects logs from the /var/log directory. To send custom logs to Loki:
curl -X POST http://localhost:3100/loki/api/v1/push \
-H "Content-Type: application/json" \
-d '{"streams": [{"stream": {"job": "test"}, "values": [["'$(date +%s%N)'", "test log message"]]}]}'
Send metrics to the OpenTelemetry Collector over OTLP, or push them directly to Prometheus through its remote-write receiver.
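As a minimal smoke test, a single counter data point can be pushed to the Collector's OTLP/HTTP endpoint. The payload follows the OTLP JSON encoding; the service and metric names are just examples:

```bash
# Push one counter data point to the Collector over OTLP/HTTP (names are examples)
curl -X POST http://localhost:4318/v1/metrics \
  -H "Content-Type: application/json" \
  -d '{
    "resourceMetrics": [{
      "resource": { "attributes": [
        { "key": "service.name", "value": { "stringValue": "test-service" } }
      ]},
      "scopeMetrics": [{ "metrics": [{
        "name": "test.requests",
        "unit": "1",
        "sum": {
          "aggregationTemporality": 2,
          "isMonotonic": true,
          "dataPoints": [{ "asInt": "1", "timeUnixNano": "'$(date +%s%N)'" }]
        }
      }]}]
    }]
  }'
```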
- Port Conflicts
  # Check what's using the ports
  lsof -i :3000,3100,3200,4317,4318,9090
- Permission Issues with Tempo
  # Fix tempo-data permissions
  chmod 777 tempo-data/
- Container Won't Start
  # Check logs
  docker-compose logs [service-name]
  # Restart a specific service
  docker-compose restart [service-name]
- TraceQL Metrics Not Working
  - Ensure Tempo has the `local-blocks` processor enabled
  - Check that Prometheus is receiving remote-write data
  - Verify traces are being sent to Tempo
# Check Tempo health
curl http://localhost:3200/ready
# Check Prometheus targets
curl http://localhost:9090/api/v1/targets
# Check Loki health
curl http://localhost:3100/ready
# Check OTel Collector metrics
curl http://localhost:8888/metrics
- Tempo: `tempo_ingester_live_traces`, `tempo_request_duration_seconds`
- Loki: `loki_ingester_streams`, `loki_request_duration_seconds`
- Prometheus: `prometheus_tsdb_head_samples_appended_total`
- OTel Collector: `otelcol_receiver_accepted_spans_total`
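Any of these can be queried ad hoc through the Prometheus HTTP API, for example:

```bash
# Span ingest rate seen by the OTel Collector over the last 5 minutes
curl -sG http://localhost:9090/api/v1/query \
  --data-urlencode 'query=rate(otelcol_receiver_accepted_spans_total[5m])'
```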
Import these dashboard IDs for monitoring:
- Tempo: 16050
- Loki: 13407
- Prometheus: 3662
- OpenTelemetry Collector: 15983
- Change default Grafana credentials in production (see the sketch after this list)
- Use proper authentication for external access
- Configure network policies for Kubernetes deployments
- Enable TLS for production environments
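For the first item, Grafana's admin credentials can be overridden with standard Grafana environment variables, for example via an env file referenced from the grafana service with `env_file:`; the values below are placeholders:

```bash
# grafana.env, referenced from the grafana service via env_file:
# (GF_* variable names are standard Grafana settings; values are placeholders)
GF_SECURITY_ADMIN_USER=observability-admin
GF_SECURITY_ADMIN_PASSWORD=change-me-to-a-strong-password
```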
For production use:
- Use External Storage: Configure object storage (S3, GCS, Azure)
- Scale Components: Use multiple replicas
- Add Monitoring: Monitor the monitoring stack itself
- Configure Retention: Set appropriate data retention policies
- Secure Access: Add authentication and authorization
- Resource Limits: Set CPU and memory limits
- [Grafana Documentation](https://grafana.com/docs/grafana/latest/)
- [Tempo Documentation](https://grafana.com/docs/tempo/latest/)
- [Loki Documentation](https://grafana.com/docs/loki/latest/)
- [Prometheus Documentation](https://prometheus.io/docs/)
- [OpenTelemetry Documentation](https://opentelemetry.io/docs/)
- [TraceQL Documentation](https://grafana.com/docs/tempo/latest/traceql/)
- Fork the repository
- Create a feature branch
- Make your changes
- Test the configuration
- Submit a pull request
This project is open source. See LICENSE file for details.
Happy Observing!