The Nephoran Intent Operator is a proof-of-concept cloud-native platform that demonstrates the potential of intent-driven telecommunications orchestration. This MVP showcases how natural language intents can be translated into structured network function configurations through AI-powered processing, providing a foundation for future production-ready telecommunications automation.
Key Value Proposition (MVP Scope):
- Natural Language Interface: Translate network intents into structured configurations using AI
- Kubernetes-Native: CRD-based controller for NetworkIntent resource management
- LLM Integration: GPT-4o-mini processing with optional RAG enhancement (behind build tag)
- Extensible Architecture: Foundation for O-RAN interface implementations and production features
- Proof-of-Concept: Demonstrates intent-driven orchestration potential with simulated network functions
Currently an MVP/proof-of-concept demonstrating intent-driven network orchestration capabilities. The system includes core functionality for NetworkIntent processing, LLM integration, and basic controller operations with comprehensive testing coverage.
- LLM Integration: GPT-4o-mini for natural language processing
- Optional RAG System: Weaviate vector database integration available (when enabled via build tags)
- Context Enhancement: Framework for semantic retrieval and knowledge augmentation
- Extensible Architecture: Support for multiple LLM providers
- NetworkIntent CRD: Kubernetes custom resource for capturing network intents
- Intent Controller: Processes NetworkIntent resources and manages lifecycle
- LLM Processor: Translates natural language to structured network configurations
- RAG System: Optional context enhancement (enabled via build tags)
- FCAPS Simulator: Automated scaling decisions based on telecom events (docs)
- Kubernetes-Native: Custom resources, operators, and webhooks following K8s best practices
- GitOps Foundation: Basic package generation capabilities for future Nephio integration
- Cloud-Native Patterns: Service-oriented architecture ready for production scaling
- Observability: Prometheus metrics and health endpoints for monitoring
- Critical Security Fixes: Comprehensive input validation, path traversal prevention, and command injection protection
- Secure Patch Generation: Migration from internal/patch to internal/patchgen with enhanced security validation
- Timestamp Security: RFC3339 format with collision prevention and replay attack mitigation
- JSON Schema Validation: Strict input validation with JSON Schema 2020-12 compliance
- HTTP Security: Basic authentication and configurable endpoint access
- Kubernetes RBAC: Standard service account and role-based permissions
- Container Security: Base image scanning in CI pipeline
- Configuration Security: Environment-based secrets management
- Basic Metrics: Prometheus metrics for LLM requests, controller operations, and system health
- Health Endpoints: Kubernetes liveness and readiness probes
- Structured Logging: JSON-formatted logs with request tracing
- Debugging Support: Comprehensive error handling and status reporting
- Intent Translation: Convert natural language to network function parameters
- Configuration Generation: Create structured YAML/JSON for network deployments
- Status Management: Track intent processing lifecycle and deployment state
- Extensibility: Framework for integrating real network function deployments
Get from zero to your first deployed (simulated) network function in about 15 minutes.
Ensure you have these tools installed:
# Check required tools
docker --version # Docker 20.10+
kubectl version --client # Kubernetes v1.30+
git --version # Git 2.30+
go version # Go 1.24+
Quick install if needed:
# Linux/WSL
curl -fsSL https://get.docker.com | sh
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
# macOS
brew install docker kubectl kind
# Windows (PowerShell as Administrator)
winget install Docker.DockerDesktop Kubernetes.kubectl
# Clone the repository
git clone https://github.com/thc1006/nephoran-intent-operator.git
cd nephoran-intent-operator
# Create Kind cluster with optimal configuration
cat <<EOF > kind-config.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: nephoran-quickstart
nodes:
- role: control-plane
- role: worker
- role: worker
EOF
kind create cluster --config=kind-config.yaml
# Install CRDs and deploy core services
kubectl create namespace nephoran-system
kubectl apply -f deployments/crds/
kubectl apply -f deployments/kustomize/base/llm-processor/
kubectl apply -f deployments/kustomize/base/nephio-bridge/
# Create a production-ready AMF network function using natural language
kubectl apply -f - <<EOF
apiVersion: nephoran.com/v1
kind: NetworkIntent
metadata:
  name: deploy-amf-production
  namespace: default
spec:
  intent: |
    Deploy a production-ready AMF (Access and Mobility Management Function)
    for a 5G core network with:
    - High availability with 3 replicas
    - Auto-scaling (min: 3, max: 10 pods)
    - Resource limits: 2 CPU cores, 4GB memory per pod
    - Prometheus monitoring on port 9090
    - Standard 3GPP interfaces (N1, N2, N11)
    - Support for 100k concurrent UE connections
EOF
# Watch the intent being processed
kubectl get networkintent deploy-amf-production -w
# View generated resources
kubectl get all -l generated-from=deploy-amf-production
Run our automated validation:
# Use the included quickstart script for full automation
./scripts/quickstart.sh
# Or run just the validation portion
./scripts/quickstart.sh --skip-prereq
# Expected output: All checks passed!
Time-Saving Alternative: Run the entire quickstart with a single command:
# Automated 15-minute setup (includes validation)
./scripts/quickstart.sh --demo
If you encounter issues:
- Check our comprehensive QUICKSTART.md for detailed steps
- View Documentation for organized guides and references
- Open GitHub Issues for support
The Nephoran Intent Operator implements a sophisticated five-layer cloud-native architecture:
graph TB
A[Natural Language Intent] --> B[LLM/RAG Processing Layer]
B --> C[Nephio R5 Control Plane]
C --> D[O-RAN Interface Bridge]
D --> E[Network Function Orchestration]
B1[GPT-4o-mini + RAG] --> B
B2[Weaviate Vector DB] --> B
C1[Porch Package Orchestration] --> C
C2[GitOps Workflows] --> C
D1[A1/O1/O2/E2 Interfaces] --> D
E1[5G Core + RAN Functions] --> E
- Intent Capture: Natural language requirements captured via NetworkIntent CRD
- AI Processing: LLM analyzes intent with RAG-enhanced telecommunications knowledge
- Package Generation: Structured parameters create Nephio-compliant packages
- GitOps Deployment: Multi-cluster orchestration via ConfigSync and ArgoCD
- O-RAN Integration: Standards-compliant network function deployment
- Monitoring & Feedback: Comprehensive observability with status propagation
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────────┐
│  Natural Lang   │    │     LLM/RAG      │    │    NetworkIntent    │
│  Intent Input   │───▶│    Processor     │───▶│     Controller      │
│                 │    │                  │    │                     │
└─────────────────┘    └──────────────────┘    └─────────────────────┘
                                ▲                         │
                                │                         ▼
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────────┐
│    Weaviate     │    │    Knowledge     │    │     KRM Package     │
│    Vector DB    │◀───│    Base + RAG    │    │     Generation      │
│                 │    │                  │    │                     │
└─────────────────┘    └──────────────────┘    └─────────────────────┘
                                                          │
                                                          ▼
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────────┐
│   Monitoring    │    │  O-RAN Network   │    │  GitOps Repository  │
│   & Feedback    │◀───│    Functions     │◀───│    (ConfigSync)     │
│                 │    │                  │    │                     │
└─────────────────┘    └──────────────────┘    └─────────────────────┘
| Metric | MVP Status | Notes |
|---|---|---|
| Intent Processing Latency | Not optimized | Depends on LLM API response time |
| Concurrent Intents | Limited testing | Bounded by Kubernetes resources |
| Throughput | Development mode | Not tested for production load |
| Availability | Basic K8s patterns | No SLA targets |
| Knowledge Base | Basic telco docs | Expandable via RAG system |
| LLM Integration | GPT-4o-mini | Multi-provider ready |
The latest release includes a significant security-focused migration from the internal/patch module to the internal/patchgen module, with these enhanced features:
- Enhanced Security: Comprehensive input validation and path traversal prevention
- Timestamp Security: RFC3339 format with collision prevention
- JSON Schema Validation: Strict validation using JSON Schema 2020-12
- Secure File Operations: Proper permissions and error handling
// Before (internal/patch)
import "github.com/thc1006/nephoran-intent-operator/internal/patch"
// After (internal/patchgen)
import "github.com/thc1006/nephoran-intent-operator/internal/patchgen"
// New validation requirement: build a validator, then validate the intent before using it
validator, err := patchgen.NewValidator(logger)
if err != nil {
	return err
}
intent, err := validator.ValidateIntent(intentData)
if err != nil {
	return err
}
- Path traversal attack prevention
- Command injection protection
- Secure timestamp generation
- Enhanced input validation framework
For detailed migration information, see CHANGELOG.md and SECURITY.md.
apiVersion: nephoran.com/v1
kind: NetworkIntent
spec:
  intent: |
    Deploy a production-ready AMF (Access and Mobility Management Function)
    for a 5G core network with:
    - High availability with 3 replicas
    - Auto-scaling configuration (min: 3, max: 10 pods)
    - Resource limits: 2 CPU cores, 4GB memory per pod
    - Prometheus monitoring on port 9090
    - Standard 3GPP interfaces (N1, N2, N11)
apiVersion: nephoran.com/v1
kind: NetworkIntent
spec:
  intent: |
    Configure basic O-RAN components for testing:
    - E2 node simulation with configurable parameters
    - Basic RIC integration testing framework
    - Container-based network function templates
    - Development and testing resource specifications
apiVersion: nephoran.com/v1
kind: NetworkIntent
spec:
  intent: |
    Generate network function configurations:
    - Basic QoS parameter specification
    - Resource allocation templates
    - Container deployment manifests
    - Development environment setup
# Start the intent ingest service
go run ./cmd/intent-ingest &
# Run FCAPS simulator with telecom events
./fcaps-sim --verbose
# Automatically detects:
# - Critical faults → Scale up by 2 replicas
# - High PRB utilization (>0.8) → Scale up by 1
# - High latency (>100ms) → Scale up by 1
# Generated intent (automatic):
{
"intent_type": "scaling",
"target": "nf-sim",
"replicas": 3,
"reason": "Critical fault detected: LINK_DOWN",
"source": "planner"
}
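The detection rules above can be sketched as a small decision function. This is a minimal illustration, not the simulator's actual code: the Event and ScalingIntent types, the decide function, and the starting replica count are hypothetical. Only the thresholds (critical fault, PRB utilization above 0.8, latency above 100 ms) and the JSON keys come from the example above. How the real simulator combines several simultaneous triggers is not stated here, so the sketch simply applies the first rule that matches.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Event is a hypothetical, simplified telecom event.
type Event struct {
	Severity       string  // e.g. "critical"
	FaultCode      string  // e.g. "LINK_DOWN"
	PRBUtilization float64 // fraction in [0, 1]
	LatencyMs      float64
}

// ScalingIntent mirrors the JSON keys of the generated intent shown above.
type ScalingIntent struct {
	IntentType string `json:"intent_type"`
	Target     string `json:"target"`
	Replicas   int    `json:"replicas"`
	Reason     string `json:"reason"`
	Source     string `json:"source"`
}

// decide applies the three documented rules in order and returns nil
// when no rule fires.
func decide(ev Event, target string, current int) *ScalingIntent {
	step, reason := 0, ""
	switch {
	case ev.Severity == "critical":
		step, reason = 2, "Critical fault detected: "+ev.FaultCode
	case ev.PRBUtilization > 0.8:
		step, reason = 1, fmt.Sprintf("High PRB utilization: %.2f", ev.PRBUtilization)
	case ev.LatencyMs > 100:
		step, reason = 1, fmt.Sprintf("High latency: %.0fms", ev.LatencyMs)
	}
	if step == 0 {
		return nil
	}
	return &ScalingIntent{
		IntentType: "scaling",
		Target:     target,
		Replicas:   current + step,
		Reason:     reason,
		Source:     "planner",
	}
}

func main() {
	// Assumes the target currently runs 1 replica (an illustrative value).
	intent := decide(Event{Severity: "critical", FaultCode: "LINK_DOWN"}, "nf-sim", 1)
	out, _ := json.MarshalIndent(intent, "", "  ")
	fmt.Println(string(out))
}
```

Starting from one replica, a critical LINK_DOWN fault produces an intent with the same keys and values as the generated example above.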
- 15-Minute Quickstart: Complete tutorial from zero to deployed network function
- Developer Guide: Architecture deep-dive and contribution guidelines
- Operator Manual: Production deployment and operations
- API Reference: Complete REST and gRPC API documentation
- Deployment Fixes Guide: Latest infrastructure improvements and fixes
- CI/CD Infrastructure: Comprehensive build pipeline documentation
- Enhanced Troubleshooting: Updated with recent fixes and solutions
The latest release includes critical CI/CD and infrastructure improvements:
- GitHub Actions Registry Cache: Fixed GHCR authentication and Docker buildx configurations
- Multi-platform Builds: Enhanced AMD64/ARM64 support with improved caching (85% hit rate)
- Build Pipeline: Resolved Makefile syntax errors and Go 1.24+ compatibility issues
- Quality Gates: Updated golangci-lint to v1.62.0, fixed gocyclo installation issues
- Performance: Reduced average build time from 5.4 to 3.2 minutes (a 41% reduction)
For complete details, see Deployment Fixes Guide and CI/CD Infrastructure Documentation.
The system provides standardized health endpoints for Kubernetes liveness and readiness probes:
- Liveness Endpoint: /healthz - Basic service availability check
- Readiness Endpoint: /readyz - Indicates the service is ready to accept traffic
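As a rough sketch of how two such probe endpoints are commonly wired up with Go's standard library (this is not the operator's actual handler code; the ready flag, newHealthMux, and probe are hypothetical names, and the response bodies are illustrative):

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// ready is flipped to true once (hypothetical) startup work has finished.
var ready atomic.Bool

// newHealthMux registers the two probe paths described above.
func newHealthMux() *http.ServeMux {
	mux := http.NewServeMux()
	// Liveness: the process is up and able to serve HTTP.
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
		fmt.Fprintln(w, "ok")
	})
	// Readiness: answer 503 until the service can accept traffic.
	mux.HandleFunc("/readyz", func(w http.ResponseWriter, r *http.Request) {
		if !ready.Load() {
			http.Error(w, "not ready", http.StatusServiceUnavailable)
			return
		}
		fmt.Fprintln(w, "ready")
	})
	return mux
}

// probe issues an in-process GET and returns the HTTP status code.
func probe(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	mux := newHealthMux()
	fmt.Println(probe(mux, "/healthz"), probe(mux, "/readyz")) // prints "200 503"
	ready.Store(true)
	fmt.Println(probe(mux, "/healthz"), probe(mux, "/readyz")) // prints "200 200"
}
```

Keeping liveness unconditional and gating only readiness means Kubernetes withholds traffic during startup without restarting the pod.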
The RAG (Retrieval-Augmented Generation) system supports multiple API endpoints:
- Preferred Endpoints:
  - POST /process - Primary intent processing endpoint
  - POST /stream - Streaming intent processing with Server-Sent Events
- Legacy Support:
  - POST /process_intent - Legacy endpoint (supported when enabled via configuration)
Enhanced security features include:
- Metrics Exposure Control: Configure metrics endpoint exposure via the METRICS_ENABLED flag
- IP Allowlist: Restrict metrics endpoint access using the METRICS_ALLOWED_IPS configuration
- HTTP Security Headers: Automatically applied security headers including:
  - Strict-Transport-Security (HSTS) for HTTPS enforcement
  - Content-Security-Policy (CSP) for XSS protection
  - X-Frame-Options for clickjacking prevention
  - X-Content-Type-Options for MIME type sniffing protection
The archive/ directory contains essential example YAML configurations and reference templates actively used throughout the project. Despite its name, these are not deprecated files but rather canonical examples that serve critical purposes:
- Reference Templates: Canonical YAML configurations used by deployment scripts and quickstart guides
- Active Examples: Referenced by 12+ scripts and documentation files for demonstrations
- Testing Resources: Used in continuous integration and system validation workflows
- Learning Materials: Comprehensive examples for understanding NetworkIntent specifications and system deployment
Key files include:
- my-first-intent.yaml: Basic NetworkIntent example used by quickstart scripts
- test-deployment.yaml: Complete system deployment manifest with LLM Processor and Nephio Bridge
- test-networkintent.yaml: Advanced E2 node deployment example for O-RAN testing
For detailed information about each file and usage instructions, see archive/README.md.
The Nephoran Intent Operator provides comprehensive configuration options through environment variables, enabling flexible deployment across different environments without code changes.
The operator supports 8 key environment variables for controlling system behavior:
Variable | Type | Default | Description |
---|---|---|---|
ENABLE_NETWORK_INTENT | Boolean | true | Enable/disable NetworkIntent controller |
ENABLE_LLM_INTENT | Boolean | false | Enable/disable LLM Intent processing |
LLM_TIMEOUT_SECS | Integer | 15 | Timeout for individual LLM requests (seconds) |
LLM_MAX_RETRIES | Integer | 2 | Maximum retry attempts for LLM requests |
LLM_CACHE_MAX_ENTRIES | Integer | 512 | Maximum entries in LLM cache |
HTTP_MAX_BODY | Integer | 1048576 | Maximum HTTP request body size (bytes) |
METRICS_ENABLED | Boolean | false | Enable/disable metrics endpoint |
METRICS_ALLOWED_IPS | String | "" | Comma-separated IPs allowed to access metrics |
# Development configuration (open metrics access)
export ENABLE_NETWORK_INTENT=true
export ENABLE_LLM_INTENT=true
export LLM_TIMEOUT_SECS=5
export METRICS_ENABLED=true
export METRICS_ALLOWED_IPS="*" # Open access for development
# Restricted configuration (metrics limited to monitoring systems)
export ENABLE_NETWORK_INTENT=true
export ENABLE_LLM_INTENT=true
export LLM_TIMEOUT_SECS=30
export LLM_MAX_RETRIES=3
export METRICS_ENABLED=true
export METRICS_ALLOWED_IPS="10.0.0.50,10.0.0.51" # Monitoring systems only
For detailed information about all environment variables, including:
- Complete variable reference with examples
- Security considerations and best practices
- Troubleshooting guide and common issues
- Migration guide for version upgrades
See: Environment Variables Reference Guide
The Nephoran Intent Operator provides comprehensive observability through Prometheus metrics, distributed tracing, and structured logging, enabling complete visibility into system performance and operational health.
The system exposes 11 specialized Prometheus metrics across two main categories:
- nephoran_llm_requests_total: Total LLM requests by model and status
- nephoran_llm_errors_total: LLM errors categorized by type and model
- nephoran_llm_processing_duration_seconds: Processing latency histograms with P95/P99 analysis
- nephoran_llm_cache_hits_total / nephoran_llm_cache_misses_total: Cache efficiency tracking
- nephoran_llm_fallback_attempts_total: Model fallback frequency monitoring
- nephoran_llm_retry_attempts_total: Request retry pattern analysis

- networkintent_reconciles_total: Controller reconciliation success/failure rates
- networkintent_reconcile_errors_total: Error categorization for troubleshooting
- networkintent_processing_duration_seconds: Phase-by-phase processing timing
- networkintent_status: Real-time resource status (Failed=0, Processing=1, Ready=2)
# Enable metrics collection
export METRICS_ENABLED=true
# Optional: Restrict metrics access (production recommended)
export METRICS_ALLOWED_IPS="10.0.0.50,10.0.0.51"
# Verify metrics endpoint
curl -s http://localhost:8080/metrics | grep -E 'nephoran_|networkintent_'
Monitor these essential metrics for production health:
Metric | Ideal Range | Alert Threshold | Business Impact |
---|---|---|---|
LLM Success Rate | > 95% | < 90% | Intent processing failures |
Cache Hit Rate | > 70% | < 50% | Increased costs and latency |
P95 Processing Time | < 2s | > 5s | User experience degradation |
Controller Error Rate | < 5% | > 10% | Deployment failures |
Fallback Frequency | < 2% | > 5% | Primary model reliability issues |
System Health Overview:
# Overall system success rate
(sum(rate(nephoran_llm_requests_total{status="success"}[5m])) +
 sum(rate(networkintent_reconciles_total{result="success"}[5m]))) /
(sum(rate(nephoran_llm_requests_total[5m])) +
 sum(rate(networkintent_reconciles_total[5m]))) * 100
Performance Monitoring:
# 95th percentile end-to-end processing time
histogram_quantile(0.95,
rate(networkintent_processing_duration_seconds_bucket{phase="total"}[5m]))
Cost Optimization:
# Cache efficiency by model
rate(nephoran_llm_cache_hits_total[5m]) /
(rate(nephoran_llm_cache_hits_total[5m]) + rate(nephoran_llm_cache_misses_total[5m]))
Our pre-configured dashboard provides:
- Executive Summary: High-level KPIs and system health status
- LLM Performance: Model-specific latency, error rates, and cost tracking
- Controller Operations: NetworkIntent lifecycle and processing efficiency
- Troubleshooting Views: Error categorization and debugging assistance
- Capacity Planning: Resource utilization trends and scaling recommendations
Essential alerting rules for operational teams:
# High-priority alerts for immediate attention
- alert: LLMProcessingFailures
  expr: sum(rate(nephoran_llm_errors_total[5m])) / sum(rate(nephoran_llm_requests_total[5m])) > 0.1
  labels:
    severity: critical
- alert: SlowIntentProcessing
  expr: histogram_quantile(0.95, sum by (le) (rate(networkintent_processing_duration_seconds_bucket[5m]))) > 10
  labels:
    severity: warning
For complete metrics reference including:
- Detailed metric descriptions with example values
- Label specifications and cardinality considerations
- Performance tuning and troubleshooting guides
- Advanced Prometheus queries and alerting rules
- Grafana dashboard configuration and best practices
See: Complete Prometheus Metrics Documentation
- O-RAN Compliance Certification: Standards compliance details
- Security Documentation: Complete security implementation guide
- OAuth2 Security Guide: Comprehensive OAuth2 implementation
- CORS Configuration: CORS security setup
- Operational Runbooks: Production operations and incident response
- Performance Optimization: Tuning and scaling guides
- Multi-Region Deployment: Global architecture patterns
- Network Slicing Guide: End-to-end slice deployment with NetworkIntent
- xApp Development: Custom application integration
- GitOps Workflows: CI/CD pipeline integration
- Production Examples: Real-world deployment configurations
We welcome contributions from telecommunications engineers, cloud-native developers, AI/ML researchers, and network operators!
# Fork and clone
git clone https://github.com/yourusername/nephoran-intent-operator.git
cd nephoran-intent-operator
# Run comprehensive test suite
make test-all # Unit, integration, E2E, security, and performance tests
# Build and validate
make build docker-build validate-all
# Submit PR with required checks
# ✅ All tests passing (90%+ coverage)
# ✅ Security scans clean
# ✅ Documentation updated
# ✅ Performance benchmarks maintained
Area | Difficulty | Impact | Examples |
---|---|---|---|
LLM/RAG Enhancement | Advanced | High | Prompt optimization, model fine-tuning |
O-RAN Interface Development | Advanced | High | E2AP codec implementation, xApp SDK |
Security Hardening | Intermediate | High | mTLS automation, vulnerability scanning |
Performance Optimization | Intermediate | Medium | Caching layers, connection pooling |
Documentation & Tutorials | Beginner | Medium | Use cases, troubleshooting guides |
Testing & Quality | Intermediate | Medium | Chaos engineering, load testing |
Contributors receive recognition through:
- Hall of Fame: Top contributors featured in documentation
- Expert Status: Technical advisor program for significant contributions
- Conference Speaking: Present at telecommunications and cloud-native events
- Professional Network: Connect with industry leaders and potential employers
# AWS EKS with Terraform
cd deployments/multi-region/terraform
terraform init && terraform apply
# Azure AKS with ARM templates
az deployment group create --resource-group <your-resource-group> --template-file deployments/azure/aks-cluster.json
# Google GKE with Helm
helm install nephoran deployments/helm/nephoran-operator \
--set cloudProvider=gcp \
--set monitoring.enabled=true
# Red Hat OpenShift
oc apply -k deployments/kustomize/overlays/production/
# VMware Tanzu
kubectl apply -f deployments/kubernetes/ --recursive
# Bare Metal with kubeadm
./scripts/deploy-production.sh --target bare-metal
# Edge computing deployment
./scripts/deploy-edge.sh --regions us-west,eu-central,asia-southeast
# Hybrid cloud with GitOps
kubectl apply -k deployments/kustomize/overlays/gitops/
The operator includes optimized GitOps settings for concurrent operations:
- GIT_CONCURRENT_PUSH_LIMIT (Environment Variable)
  - Default: 4 concurrent operations per process
  - Behavior: Limits the number of simultaneous CommitAndPush operations to prevent git repository lock contention and improve overall system stability
  - Tuning: Increase for high-throughput environments with robust git infrastructure; decrease for environments with limited git server resources
Example configuration:
# Set via environment variable
export GIT_CONCURRENT_PUSH_LIMIT=8
# Or in Kubernetes deployment
env:
  - name: GIT_CONCURRENT_PUSH_LIMIT
    value: "8"
# Or in Helm values
git:
  concurrentPushLimit: 8
This setting helps prevent git operation bottlenecks in high-load scenarios while maintaining data consistency.
- ✅ NetworkIntent CRD and controller implementation
- ✅ Basic LLM integration with GPT-4o-mini
- ✅ Optional RAG system (behind build tags)
- ✅ Kubernetes-native deployment patterns
- ✅ Prometheus metrics and observability foundation
- Full O-RAN Interface Implementation: Complete A1, O1, O2, E2 interface specifications
- Production GitOps: Nephio R5 integration with Porch package orchestration
- Real Network Functions: Integration with actual 5G Core and RAN components
- Enterprise Security: OAuth2, mTLS, RBAC, and comprehensive audit trails
- Production Observability: 99.95% availability targets, comprehensive SLI/SLO tracking
- High-Scale Performance: 200+ concurrent intents, sub-2-second processing
- Multi-Cloud Orchestration: AWS, Azure, GCP, and edge deployment
- Service Mesh Integration: Native Istio/Linkerd with advanced traffic management
- ML-Driven Optimization: Autonomous network optimization and self-healing
- 6G Readiness: Next-generation wireless standards integration
- Visual Intent Designer: Low-code interface for network operators
- Industry Verticals: Specialized templates for automotive, manufacturing, healthcare
- Availability: 99.95% uptime SLA
- Performance: Sub-2-second P95 intent processing
- Scale: 200+ concurrent intent operations
- Standards: Full O-RAN Alliance compliance certification
- Security: SOC 2 Type II, GDPR/CCPA compliance
- GitHub Issues: Bug reports and feature requests
- GitHub Discussions: Questions and community help
- Documentation: Comprehensive guides and tutorials
- Stack Overflow: Questions tagged nephoran-operator
- Priority Support: 24/7 technical assistance with SLA guarantees
- Professional Services: Custom deployment, training, and consulting
- Dedicated Success Manager: Ongoing optimization and best practices
- Custom Development: Feature development for specific requirements
- SOC 2 Type II Certified: Annual security audits and compliance reporting
- GDPR/CCPA Compliant: Data privacy and protection standards
- NIST Framework: Security controls aligned with cybersecurity framework
- Supply Chain Security: SLSA Level 3 compliant with attestation signatures
Licensed under the Apache License, Version 2.0.
Enterprise licenses with additional features, support, and compliance certifications are available. Contact us for details.
Star us on GitHub • Report Issues • Discuss on GitHub • Read Docs • Contribute
Transforming telecommunications through intelligent automation
Documentation • Getting Started • API Reference • GitHub Issues