
🚨 Alert Fatigue No More: Building an AI-Powered Correlation Engine for Prometheus

A powerful alert correlation system that helps reduce alert fatigue by intelligently grouping and analyzing related alerts from your monitoring stack. Built with Prometheus, AlertManager, Grafana, and Loki.
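To give a feel for what "correlation" means here, the sketch below groups alerts that share a label value and start close together in time. It is a simplified, conceptual example only; the actual engine in backend/alert_correlator.py is AI-powered and considerably more sophisticated.

from datetime import datetime, timedelta

def group_alerts(alerts, key_label="instance", window=timedelta(minutes=5)):
    """Conceptual sketch only -- not the project's actual implementation.

    Groups AlertManager-style alerts that share the same value for
    `key_label` and start within `window` of the first alert in the group.
    Each alert is a dict with 'labels' (dict) and 'starts_at' (datetime).
    """
    groups = []  # each entry: {"key": ..., "started": ..., "alerts": [...]}
    for alert in sorted(alerts, key=lambda a: a["starts_at"]):
        key = alert["labels"].get(key_label, "unknown")
        for group in groups:
            if group["key"] == key and alert["starts_at"] - group["started"] <= window:
                group["alerts"].append(alert)  # correlated with an existing group
                break
        else:
            groups.append({"key": key, "started": alert["starts_at"], "alerts": [alert]})
    return groups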

🌟 Features

  • Intelligent alert correlation and grouping
  • Real-time alert processing
  • Beautiful Grafana dashboards for visualization
  • Integrated logging with Loki
  • Comprehensive runbooks for common scenarios
  • Docker-based deployment for easy setup

🎥 Demo

Watch the Alert Correlator in action:

Alert Correlator Demo

See how to use local models and switch between different LLM providers:

Alert Correlator LLM Features

📁 Project Structure

The project is organized into the following main directories:

  • backend/: Contains the core application logic
    • alert_correlator.py: Main correlation engine
    • server.js: API server
  • config/: Configuration files for monitoring and logging
    • logging/: Loki and Promtail configurations
    • monitoring/: Prometheus and AlertManager configurations
  • docker/: Docker-related files
    • Dockerfile: Main application container
    • docker-compose.yml: Service orchestration
    • .env: Environment variables
  • frontend/: Web interface files
  • grafana/: Grafana provisioning and dashboards
  • runbooks/: Documentation for various scenarios
  • logs/: Application logs

🔧 Prerequisites

  • Docker and Docker Compose
  • Git
  • Python 3.8 or higher (if running locally)
  • ~2GB of free RAM for all containers

🚀 Quick Start

  1. Clone the repository:
git clone https://github.com/yourusername/alert_correlator.git
cd alert_correlator
  2. Set up the environment configuration. Create or modify the .env file in the docker directory with your OpenAI API key:
echo "OPENAI_API_KEY=your_openai_api_key_here" > docker/.env
  3. Create the logs directory and log file:
mkdir -p logs
touch logs/alert_correlator.log
  4. Start the services:
docker compose -f docker/docker-compose.yml up -d

📊 Accessing the Services

After starting the containers, you can access the following services:

📈 Grafana Dashboard

  1. Log in to Grafana at http://localhost:3000
  2. Navigate to Dashboards -> Browse
  3. Look for "Alert Correlator Dashboard"
  4. The dashboard includes:
    • Alert correlation statistics
    • Active alert groups
    • Historical correlation data
    • Alert patterns and trends

🛠️ Configuration

Alert Correlator

The main configuration is done through environment variables and the following files:

  • backend/alert_correlator.py: Main correlation logic
  • config/monitoring/alertmanager.yml: AlertManager configuration
  • config/monitoring/prometheus.yml: Prometheus configuration
  • config/logging/loki-config.yaml: Loki configuration
  • config/logging/promtail-config.yaml: Log shipping configuration

LLM Service Configuration

The Alert Correlator supports multiple LLM providers:

  1. Local Models with Ollama: Run AI models locally on your machine
    • Supports a range of open-source models
    • No internet connection required for inference
    • Lower latency, since requests never leave your machine
  2. OpenAI: Use OpenAI's hosted GPT models
    • Requires an API key and internet access
    • Generally the strongest model quality

You can switch between these providers at runtime through the LLM service interface, so you can pick the model that fits your needs.
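As a rough sketch of what provider switching can look like, the snippet below builds a configuration for either backend. The class and function names, the model names, and the Ollama URL are illustrative assumptions, not the project's actual interface.

import os
from dataclasses import dataclass

@dataclass
class LLMConfig:
    provider: str       # "ollama" or "openai"
    model: str          # illustrative model names below, not a project default
    base_url: str = ""  # Ollama endpoint when running locally
    api_key: str = ""   # OpenAI key, read from docker/.env in this project

def build_config(provider: str) -> LLMConfig:
    if provider == "ollama":
        # Local inference: no API key needed; Ollama listens on port 11434 by default.
        return LLMConfig(provider="ollama", model="llama3",
                         base_url="http://localhost:11434")
    # Hosted inference: requires OPENAI_API_KEY (see Quick Start above).
    return LLMConfig(provider="openai", model="gpt-4o",
                     api_key=os.environ.get("OPENAI_API_KEY", ""))

# Switching providers at runtime amounts to rebuilding the active configuration:
active = build_config("ollama")   # local, offline-capable
active = build_config("openai")   # hosted, needs the API key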

Runbooks

Predefined runbooks are available in the runbooks/ directory, organized by component:

  • alertmanager/: AlertManager related issues
  • prometheus/: Prometheus troubleshooting
  • kubernetes/: Kubernetes cluster issues
  • grafana/: Grafana-related problems
  • azure-devops/: Azure DevOps pipeline issues

🔍 Testing

You can test the alert correlation system using the provided Makefile. List the available targets with:

make help
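You can also fire a synthetic alert by hand to watch correlation kick in. The sketch below posts one alert to AlertManager's v2 API; it assumes AlertManager is published on localhost:9093 (its default port), so adjust the URL to match the port mapping in docker/docker-compose.yml.

import requests

# One synthetic alert in the shape AlertManager's v2 API expects.
alert = {
    "labels": {
        "alertname": "TestHighCPU",
        "severity": "warning",
        "instance": "demo-host:9100",
    },
    "annotations": {"summary": "Synthetic alert for testing correlation"},
}

# POST /api/v2/alerts takes a JSON array of alerts.
resp = requests.post("http://localhost:9093/api/v2/alerts", json=[alert], timeout=5)
resp.raise_for_status()
print("Alert accepted by AlertManager:", resp.status_code)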

📝 Logging

Logs are available in:

  • Docker logs: docker compose -f docker/docker-compose.yml logs -f [service_name]
  • Application logs: logs/alert_correlator.log
  • Through Loki in Grafana

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

⚖️ License

This project is licensed under a Custom Attribution-NonCommercial License - see the LICENSE file for details.

👤 Author

Vasile Bogdan Bujor

⚠️ Important Notes

  • This is a monitoring tool - please ensure you have proper security measures in place
  • Always backup your configuration before making changes
  • Test thoroughly in a non-production environment first
  • Commercial use requires explicit permission from the author
