arnabdey73/azure-core-governance-sre-agent

Azure Core Governance SRE Agent

An automated Site Reliability Engineering (SRE) agent for enforcing and monitoring governance compliance across Azure core landing zone resources. This project helps detect and auto-remediate policy violations, configuration drift, and security misconfigurations using Azure-native services and automation scripts.

🌐 Overview

The Azure Core Governance SRE Agent automatically checks Azure resources against organizational governance policies and remediates those that fall out of compliance. It is built from the following components:

Component         Purpose
Python Scripts    Perform checks and trigger remediations
Azure Functions   Scheduled or triggered remediation workflows
Bicep/Terraform   Infra provisioning templates
Playbooks         Incident response and escalation procedures

📋 Prerequisites

  • Azure subscription
  • Azure CLI installed
  • Azure Functions Core Tools installed (provides the func command used under Usage)
  • Python 3.8+
  • Appropriate Azure RBAC permissions

🛠️ Installation

  1. Clone this repository
  2. Set up the required Azure resources:
    cd infra/bicep  # or infra/terraform
    ./deploy.sh
  3. Configure your environment variables (see Configuration)

📁 Project Structure

azure-core-governance-sre-agent/
├── scripts/               # Compliance check scripts
│   └── remediation/      # Remediation scripts
├── azure-functions/       # Azure Functions code
│   ├── http-trigger-remediator/  # HTTP-triggered remediation function
│   └── timer-trigger-checker/    # Timer-triggered compliance check function
├── infra/                # Infrastructure as Code templates
│   ├── bicep/           # Bicep templates
│   └── terraform/       # Terraform templates
└── playbooks/            # Incident response procedures

🚀 Usage

Compliance Checks

python scripts/check_compliance.py --subscription <subscription-id>
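
The repository's check_compliance.py is not reproduced here. As a rough sketch of how a script with this command line could be organized, the following parses --subscription, runs a list of check functions over resource records, and returns a non-zero exit code when any violation is found. The resource dictionaries, the check_required_tags rule, the tag names, and the stubbed list_resources function are all hypothetical; a real script would query Azure rather than return canned data.

```python
import argparse
from typing import Callable, Dict, List

Resource = Dict[str, object]
Check = Callable[[Resource], List[str]]

REQUIRED_TAGS = ("owner", "environment")  # hypothetical policy, not from the repo


def check_required_tags(resource: Resource) -> List[str]:
    """Return one violation message per required tag missing from the resource."""
    tags = resource.get("tags") or {}
    return [f"missing tag '{t}'" for t in REQUIRED_TAGS if t not in tags]


def run_checks(resources: List[Resource], checks: List[Check]) -> Dict[str, List[str]]:
    """Map resource id -> violations, omitting compliant resources."""
    findings: Dict[str, List[str]] = {}
    for resource in resources:
        violations = [msg for check in checks for msg in check(resource)]
        if violations:
            findings[str(resource["id"])] = violations
    return findings


def list_resources(subscription_id: str) -> List[Resource]:
    """Stub with canned data; a real implementation would query Azure."""
    prefix = f"/subscriptions/{subscription_id}/resourceGroups"
    return [
        {"id": f"{prefix}/rg-a", "tags": {"owner": "sre", "environment": "prod"}},
        {"id": f"{prefix}/rg-b", "tags": {"owner": "sre"}},
    ]


def main(argv=None) -> int:
    parser = argparse.ArgumentParser(description="Compliance check sketch")
    parser.add_argument("--subscription", required=True, help="Azure subscription ID")
    args = parser.parse_args(argv)

    findings = run_checks(list_resources(args.subscription), [check_required_tags])
    for resource_id, violations in findings.items():
        print(f"{resource_id}: {', '.join(violations)}")
    # Non-zero exit code signals non-compliance to a calling shell or pipeline.
    return 1 if findings else 0
```

Calling main(["--subscription", "<subscription-id>"]) prints one line per non-compliant resource and returns 1 when any violation is found, so a CI job wrapping the script can fail on non-compliance.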

Deploy to Azure

# Publish the timer-triggered compliance checker
cd azure-functions/timer-trigger-checker
func azure functionapp publish <function-app-name>

# Publish the HTTP-triggered remediator
cd ../http-trigger-remediator
func azure functionapp publish <function-app-name>

Automation

This project can be integrated with CI/CD pipelines:

  • GitHub Actions workflows
  • Azure DevOps pipelines

⚙️ Configuration

Environment Variables

Create a .env file with the following variables:

AZURE_SUBSCRIPTION_ID=your-subscription-id
AZURE_TENANT_ID=your-tenant-id
LOG_LEVEL=INFO
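
One way to consume these settings from Python without extra dependencies is sketched below. It reads KEY=VALUE lines from a .env file, lets real environment variables take precedence, and fails early if a required variable is missing. The parse_env_file and load_settings helpers, and the choice to treat LOG_LEVEL as optional with a default of INFO, are assumptions for illustration; the project's own scripts may load configuration differently.

```python
import os
from pathlib import Path
from typing import Dict, Mapping, Optional

REQUIRED = ("AZURE_SUBSCRIPTION_ID", "AZURE_TENANT_ID")


def parse_env_file(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_settings(
    env_path: str = ".env", environ: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Merge .env values with the process environment (the environment wins)."""
    environ = os.environ if environ is None else environ
    path = Path(env_path)
    settings = parse_env_file(path.read_text()) if path.is_file() else {}
    for key in (*REQUIRED, "LOG_LEVEL"):
        if key in environ:
            settings[key] = environ[key]
    missing = [key for key in REQUIRED if not settings.get(key)]
    if missing:
        raise RuntimeError(f"Missing required settings: {', '.join(missing)}")
    settings.setdefault("LOG_LEVEL", "INFO")  # assumed default, see lead-in
    return settings
```

Letting the process environment override the file keeps the same code usable locally (with .env) and in Azure, where application settings are exposed as environment variables.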

Configuration Files

  • config/policies.json - Define custom policy requirements
  • config/notification-settings.json - Configure alerts and notifications

🔄 Remediation Workflows

The agent supports the following remediation workflows:

  1. Automatic Remediation

    • Non-compliant resources are automatically fixed based on predefined rules
    • Logging and audit trail maintained for all changes
  2. Approval-based Remediation

    • Changes requiring approval trigger notification workflows
    • Approvers can review and authorize via Teams or email
  3. Manual Remediation

    • Some issues include guided steps for manual resolution
    • Documentation links provided for complex scenarios
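
The three workflows amount to a routing decision made per finding. The sketch below illustrates one possible shape for that decision. The Finding fields, the rule names, and the two rule sets are hypothetical and are not taken from the repository.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Mode(Enum):
    AUTOMATIC = "automatic"
    APPROVAL = "approval"
    MANUAL = "manual"


@dataclass(frozen=True)
class Finding:
    resource_id: str
    rule: str


# Hypothetical rule sets: which violations are safe to fix unattended,
# and which need sign-off before any change is made.
AUTO_FIX_RULES = {"missing-tag", "diagnostics-disabled"}
APPROVAL_RULES = {"public-network-access", "overly-permissive-nsg"}


def route(finding: Finding) -> Mode:
    """Pick a remediation workflow; unknown rules fall back to manual."""
    if finding.rule in AUTO_FIX_RULES:
        return Mode.AUTOMATIC
    if finding.rule in APPROVAL_RULES:
        return Mode.APPROVAL
    return Mode.MANUAL


def plan(findings: List[Finding]) -> List[str]:
    """Describe, one line per finding, which workflow would handle it."""
    actions = {
        Mode.AUTOMATIC: "apply predefined fix and record the change",
        Mode.APPROVAL: "notify approvers and wait for authorization",
        Mode.MANUAL: "attach guided steps for an operator",
    }
    lines = []
    for finding in findings:
        mode = route(finding)
        lines.append(
            f"{finding.resource_id} [{finding.rule}] -> {mode.value}: {actions[mode]}"
        )
    return lines
```

Defaulting unknown rules to the manual workflow is a conservative design choice: a rule nobody has classified yet never triggers an unattended change.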

🏗️ Architecture

                                           ┌─────────────────┐
                                           │                 │
                                  ┌────────▶ Azure Resources │
                                  │        │                 │
                                  │        └─────────────────┘
                                  │
┌─────────────────┐     ┌─────────────────┐      ┌─────────────────┐
│                 │     │                 │      │                 │
│  Timer Trigger  ├─────▶    Checker      ├──────▶   Remediator    │
│                 │     │                 │      │                 │
└─────────────────┘     └─────────────────┘      └─────────────────┘
                                  │                      │
                                  │                      │
                                  ▼                      ▼
                         ┌─────────────────┐    ┌─────────────────┐
                         │                 │    │                 │
                         │   Logs & Metrics│    │  Notifications  │
                         │                 │    │                 │
                         └─────────────────┘    └─────────────────┘
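
Read left to right, the diagram describes a cycle: a timer starts the checker, the checker inspects Azure resources and records what it found, and violations are handed to the remediator, which sends notifications. A minimal, dependency-free sketch of that data flow follows. The run_cycle function and the injected callables are placeholders for illustration; in the deployed system these roles would be played by the functions under azure-functions/ and by Azure services.

```python
from typing import Callable, Dict, Iterable, List

Resource = Dict[str, str]


def run_cycle(
    list_resources: Callable[[], Iterable[Resource]],
    is_compliant: Callable[[Resource], bool],
    remediate: Callable[[Resource], bool],
    log: Callable[[str], None],
    notify: Callable[[str], None],
) -> Dict[str, int]:
    """One timer tick: check every resource, remediate violations, report."""
    counts = {"checked": 0, "violations": 0, "remediated": 0}
    for resource in list_resources():          # Checker reads Azure resources
        counts["checked"] += 1
        if is_compliant(resource):
            continue
        counts["violations"] += 1
        log(f"violation: {resource['id']}")    # Checker -> Logs & Metrics
        fixed = remediate(resource)            # Checker -> Remediator
        counts["remediated"] += int(fixed)
        status = "remediated" if fixed else "needs attention"
        notify(f"{resource['id']}: {status}")  # Remediator -> Notifications
    log(f"cycle summary: {counts}")
    return counts


# Example wiring with in-memory stand-ins for the real services.
logs: List[str] = []
alerts: List[str] = []
result = run_cycle(
    list_resources=lambda: [
        {"id": "vm-1", "state": "ok"},
        {"id": "vm-2", "state": "drifted"},
    ],
    is_compliant=lambda r: r["state"] == "ok",
    remediate=lambda r: True,
    log=logs.append,
    notify=alerts.append,
)
```

Passing the collaborators in as callables keeps the cycle testable without any Azure connection; each can later be replaced by a call into the corresponding service.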

🔍 Monitoring & Reporting

  • Dashboard - Azure Dashboard template available at infra/dashboard
  • Logging - All activity logged to Application Insights
  • Reporting - Weekly compliance reports generated automatically

🛡️ Security

  • All credentials stored in Azure Key Vault
  • Managed Identities used for service authentication
  • Regular security scanning integrated with CI/CD pipelines

🔧 Troubleshooting

Common Issues

  1. Authentication Failures

    • Verify that the service principal or managed identity has the required RBAC role assignments
    • Check Key Vault access policies
  2. Remediation Failures

    • Review logs in Application Insights
    • Check resource locks that might prevent changes
  3. Timeout Issues

    • For large subscriptions, a check run can exceed the default function timeout; raise the functionTimeout value in host.json
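
In Azure Functions, the timeout is controlled by the functionTimeout property of host.json, written as an hh:mm:ss string. The helper below is a small sketch that sets that property while leaving the rest of the file untouched. The helper name and the ten-minute value are illustrative; the maximum allowed value depends on the hosting plan (on the Consumption plan the documented maximum is 10 minutes).

```python
import json
import re

# Two-digit hours, then minutes and seconds in the range 00-59.
TIMESPAN = re.compile(r"\d{2}:[0-5]\d:[0-5]\d")


def set_function_timeout(host_json: str, timeout: str) -> str:
    """Return host.json content with functionTimeout set to an hh:mm:ss value."""
    if not TIMESPAN.fullmatch(timeout):
        raise ValueError(f"expected hh:mm:ss, got {timeout!r}")
    config = json.loads(host_json)
    config["functionTimeout"] = timeout
    return json.dumps(config, indent=2)


# Example: start from a minimal host.json and raise the timeout to 10 minutes.
original = '{"version": "2.0"}'
updated = set_function_timeout(original, "00:10:00")
print(updated)
```

If check runs still time out at the plan's maximum, splitting the work (for example, one invocation per resource group) is usually a better fix than a longer timeout.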

📚 Additional Resources

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.
