Azure API Management OpenAI Chargeback Environment

Custom end-to-end chargeback solution for generative AI workloads. Tracks token usage per user and offers a GUI for summary insights, usage breakdowns, and analytics, enabling cost transparency and internal billing.


TL;DR

Enterprise-ready solution for OpenAI usage tracking and chargeback through Azure APIM. Solves the 8KB payload limit problem, provides real-time cost tracking, and includes comprehensive Infrastructure-as-Code deployment options.

🚀 Quick Deploy: git clone → set up Python → deploy Bicep → configure policies

💰 What you get: Real-time usage tracking, chargeback reports, scalable architecture, multi-IaC support

🏗️ Tech Stack: Azure Functions (Python), API Management, Redis Cache, Bicep/Terraform

Quick Start

# 1. Clone and setup
git clone https://github.com/your-org/apim-openai-chargeback-environment.git
cd apim-openai-chargeback-environment

# 2. Set up the Python environment
cd function-app && python -m venv venv && source venv/bin/activate
pip install -r requirements.txt

# 3. Deploy infrastructure (Bicep - recommended)
cd ../infrastructure/bicep
az deployment group create \
  --resource-group myResourceGroup \
  --template-file main.bicep \
  --parameters @parameters/dev.json

# 4. Deploy Function App
cd ../../function-app && func azure functionapp publish <function-app-name>

# 5. Done! 🎉

⚡ Alternative deployment options: Terraform | ARM Templates | Pulumi

The Problem We Solve

| Challenge | Impact | Our Solution |
|---|---|---|
| APIM logs truncate at 8KB | Large OpenAI payloads get cut off | ✅ Handle payloads up to 100MB |
| No usage tracking | Cost overruns, no chargeback | ✅ Real-time cost tracking & reports |
| Manual deployment | Inconsistent environments | ✅ Multiple IaC options (Bicep, Terraform) |
| Security concerns | Data persistence issues | ✅ Minimal retention, encryption at rest |
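
APIM's diagnostic logging truncates request and response bodies at 8KB, so token usage has to be measured outside the gateway's logging pipeline. The snippet below is a minimal illustration of that general technique, counting prompt tokens in the Functions layer with the tiktoken library; it is a sketch only and may differ from this repository's actual implementation.

import tiktoken

def count_prompt_tokens(messages: list[dict], model: str = "gpt-4") -> int:
    # Count tokens in the function layer instead of relying on APIM's
    # truncated diagnostic logs. Illustrative: ignores the per-message
    # overhead tokens that the chat format adds.
    encoding = tiktoken.encoding_for_model(model)
    return sum(len(encoding.encode(m["content"])) for m in messages)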

Solution Overview

graph LR
    A[Client Apps] --> B[APIM Gateway]
    B --> C[Azure Functions]
    C --> D[Azure OpenAI]
    C --> E[Redis Cache]
    E --> F[Chargeback Reports]

Core Components:

  • 🔐 API Management - Gateway, auth, rate limiting
  • Azure Functions (Python) - Usage tracking, cost calculation
  • 💾 Redis Cache - Temporary data storage
  • 📊 Chargeback Engine - Real-time cost allocation

📐 Need detailed architecture? See our comprehensive Architecture Guide with detailed component diagrams, data flows, and system design.
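
To make the chargeback calculation concrete, here is a minimal sketch of how a per-request usage record could be priced. The dataclass fields, price table, and function names are illustrative placeholders, not the repository's actual schema or rate card.

from dataclasses import dataclass

@dataclass
class UsageRecord:
    user_id: str
    model: str
    prompt_tokens: int
    completion_tokens: int

# Hypothetical per-1K-token prices; a real deployment would load its
# rate card from configuration.
PRICES_PER_1K = {
    "gpt-4": {"prompt": 0.03, "completion": 0.06},
}

def request_cost(record: UsageRecord) -> float:
    # Price prompt and completion tokens separately, per 1K tokens.
    rates = PRICES_PER_1K[record.model]
    return (record.prompt_tokens / 1000) * rates["prompt"] + (
        record.completion_tokens / 1000
    ) * rates["completion"]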

Key Features

  • 🚀 Scalable: Handles large payloads (>8192 bytes)
  • 💰 Cost Tracking: Real-time usage monitoring and chargeback
  • 🔒 Secure: Minimal data persistence, Azure AD integration
  • 🏗️ Multi-IaC: Bicep (primary), Terraform, ARM templates, Pulumi
  • 📈 Enterprise Ready: Multi-region, HA, comprehensive monitoring
  • 🔄 CI/CD Ready: GitHub Actions, Azure DevOps pipelines

Documentation

| Topic | Description | Link |
|---|---|---|
| Architecture | Detailed system design and components | 📖 Architecture Guide |
| Deployment | Step-by-step deployment for all IaC tools | 🚀 Deployment Guide |
| API Reference | Complete API documentation | 📚 API Docs |
| Usage Examples | Code examples in multiple languages | 💻 Usage Examples |
| Monitoring | Dashboards, alerts, and troubleshooting | 📊 Monitoring Guide |
| Security | Security best practices and compliance | 🔒 Security Guide |
| Contributing | Development setup and guidelines | 🤝 Contributing |

Usage Examples

Python (Recommended)

import requests

# Call Azure OpenAI through the APIM gateway; the subscription key
# identifies the caller for usage tracking and chargeback.
response = requests.post(
    "https://your-apim.azure-api.net/api/v1/openai/chat/completions",
    headers={"Ocp-Apim-Subscription-Key": "your-key"},
    json={"model": "gpt-4", "messages": [{"role": "user", "content": "Hello!"}]},
)
response.raise_for_status()
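
Assuming the gateway passes the standard OpenAI chat completions response through unchanged, the token counts that drive the chargeback reports can be read from the usage block of the same response:

# Token usage as reported by Azure OpenAI (illustrative field access).
usage = response.json().get("usage", {})
print(usage.get("prompt_tokens"), usage.get("completion_tokens"), usage.get("total_tokens"))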

Other Languages

Examples in additional languages are available in the Usage Examples guide linked in the Documentation section above.

Chargeback Dashboard Preview

(Screenshot: Chargeback Example)

Real-time usage tracking with detailed cost breakdown by user, model, and time period.

Prerequisites

  • ✅ Azure subscription with appropriate permissions
  • ✅ Azure CLI v2.50+
  • ✅ Python 3.9+
  • ✅ Azure Functions Core Tools v4.x
  • ✅ Docker Desktop (for local testing)

Optional: Bicep CLI, Terraform, or ARM templates depending on your IaC preference

Configuration

Environment Variables

# Essential configuration
AZURE_OPENAI_ENDPOINT=https://your-openai.openai.azure.com/
REDIS_CONNECTION_STRING=your-redis-connection-string
APIM_SUBSCRIPTION_KEY=your-apim-subscription-key

📋 Complete configuration guide: Configuration Reference
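
As a sketch of how the Function App might consume these settings at startup, the helper below reads the variables listed above and fails fast if any are missing; the function itself is illustrative and not part of the repository's code.

import os

REQUIRED_SETTINGS = [
    "AZURE_OPENAI_ENDPOINT",
    "REDIS_CONNECTION_STRING",
    "APIM_SUBSCRIPTION_KEY",
]

def load_settings() -> dict:
    # Fail fast with a clear error if a required setting is absent.
    missing = [name for name in REQUIRED_SETTINGS if not os.environ.get(name)]
    if missing:
        raise RuntimeError(f"Missing required settings: {', '.join(missing)}")
    return {name: os.environ[name] for name in REQUIRED_SETTINGS}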

Repository Structure

├── infrastructure/           # IaC templates
│   ├── bicep/               # 🎯 Primary (recommended)
│   ├── terraform/           # Alternative
│   ├── arm/                 # Alternative  
│   └── pulumi/              # Alternative
├── function-app/            # Python Azure Functions
├── policies/                # APIM policies
├── scripts/                 # Deployment scripts
└── docs/                    # Detailed documentation

📝 Note: The app works as-is, but we're tidying up the structure. Work in progress! 🚧

Deployment Options

| Method | Best For | Command |
|---|---|---|
| Bicep | Azure-native teams | az deployment group create --template-file main.bicep |
| Terraform | Multi-cloud or existing Terraform | terraform apply -var-file="dev.tfvars" |
| ARM Templates | Legacy Azure deployments | az deployment group create --template-file azuredeploy.json |
| Pulumi | Programming-first approach | pulumi up |

🔗 Detailed deployment guides: All deployment methods

FAQ

❓ What's the max payload size?
✅ Up to 100MB (vs the 8KB APIM limit)

❓ How long is data retained?
✅ 30 days by default (configurable)

❓ Multi-region support?
✅ Yes, built for HA and disaster recovery

❓ Can I use .NET instead of Python?
✅ Yes, Azure Functions supports multiple runtimes

📚 More questions: Complete FAQ

Monitoring & Alerts

  • 📊 Real-time dashboards for usage, costs, and performance
  • 🚨 Automated alerts for anomalies and thresholds
  • 📈 Custom metrics for business KPIs
  • 🔍 Detailed logging for troubleshooting

🔧 Setup guide: Monitoring Configuration
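
As a minimal sketch of the detailed-logging bullet above, and assuming the Function App uses Python's standard logging module (whose output Azure Functions forwards to Application Insights), a per-request usage event could be emitted like this; the event fields are placeholders, not the repository's actual schema.

import logging

logger = logging.getLogger("chargeback")

def log_usage(user_id: str, model: str, total_tokens: int, cost_usd: float) -> None:
    # One structured line per request, suitable for building dashboards
    # and threshold alerts on top of the log stream.
    logger.info(
        "usage user=%s model=%s total_tokens=%d cost_usd=%.4f",
        user_id, model, total_tokens, cost_usd,
    )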

Support & Community

  • 📖 Documentation: Comprehensive guides in /docs/
  • 🐛 Issues: GitHub Issues
  • 💬 Discussions: GitHub Discussions
  • 🏢 Enterprise: Contact the support team for professional services

Roadmap

| Quarter | Features |
|---|---|
| Q2 2025 | Multi-region deployment, advanced analytics |
| Q3 2025 | ML-based cost optimization, enhanced reporting |
| Q4 2025 | Additional AI service integrations |

Contributing

We welcome contributions! See our Contributing Guide for:

  • 🛠️ Development setup
  • 🧪 Testing guidelines
  • 📋 Code standards
  • 🔄 PR process

License

MIT License - see LICENSE for details.


🎯 Ready to get started? Follow the Quick Start above or dive into the Deployment Guide.

💡 Need help? Check our FAQ or create an issue.
