npmulder/homelab

🏠 Neil's Homelab Infrastructure

Talos · Kubernetes · Flux

Status-Page · Alertmanager

Age-Days · Uptime-Days · Node-Count · Pod-Count · CPU-Usage · Memory-Usage · Alerts


Welcome to my homelab! This repository contains my complete GitOps-driven Kubernetes infrastructure running on Talos Linux. Everything is declarative, automated, and immutable, exactly how modern infrastructure should be.

📖 Overview

This homelab showcases enterprise-grade practices in a home environment, featuring:

  • Immutable infrastructure with Talos Linux
  • GitOps workflow using Flux CD
  • Comprehensive monitoring with Prometheus and Grafana
  • Automated dependency management via Renovate
  • Security-first approach with encrypted secrets and network policies

Every change to the cluster goes through Git: no manual kubectl changes, no SSH access, no exceptions.

⚡ Technology Stack

| Component         | Technology                | Purpose                                            |
|-------------------|---------------------------|----------------------------------------------------|
| Operating System  | Talos Linux               | API-driven, immutable Kubernetes OS                |
| GitOps            | Flux CD                   | Continuous delivery and cluster synchronization    |
| Container Network | Cilium                    | eBPF-based networking with BGP support             |
| Storage           | OpenEBS                   | Local persistent storage with hostPath provisioner |
| Secret Management | SOPS + Age                | Encrypted secrets in Git                           |
| Certificates      | cert-manager              | Automated Let's Encrypt certificates               |
| Ingress           | NGINX + Cloudflare Tunnel | Internal and external application access           |
| Monitoring        | Prometheus + Grafana      | Metrics collection and visualization               |
| Logging           | Loki + Promtail           | Centralized log aggregation                        |

🎯 Key Features

  • 🔒 Immutable Infrastructure: Zero SSH access, all changes via GitOps
  • 🤖 Automated Everything: Renovate handles dependency updates
  • 📊 Enterprise Monitoring: 20+ Grafana dashboards with comprehensive alerting
  • 🔐 Security First: Encrypted secrets, network policies, security contexts
  • 🎭 High Availability: 3-node control plane with local persistent storage
  • 🚀 Zero Downtime: Rolling updates with proper health checks
  • 🌐 Hybrid Networking: Internal and external application access

🏗️ Architecture

┌─────────────────────────────────────────────────────────────────┐
│                        🌐 Internet                              │
└─────────────────────────┬───────────────────────────────────────┘
                          │
┌─────────────────────────▼───────────────────────────────────────┐
│                   ☁️  Cloudflare                                │
│                                                                 │
│  ┌─────────────────┐    ┌─────────────────┐                     │
│  │   DNS & Proxy   │    │   Zero Trust    │                     │
│  │                 │    │     Tunnel      │                     │
│  └─────────────────┘    └─────────────────┘                     │
└─────────────────────────┬───────────────────────────────────────┘
                          │
┌─────────────────────────▼───────────────────────────────────────┐
│                   🏠 Homelab Network                            │
│                                                                 │
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐  │
│  │   Talos Node    │  │   Talos Node    │  │   Talos Node    │  │
│  │ (Control+Work)  │  │ (Control+Work)  │  │ (Control+Work)  │  │
│  └─────────────────┘  └─────────────────┘  └─────────────────┘  │
│                                                                 │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │                  📦 Kubernetes Cluster                    │  │
│  │                                                           │  │
│  │    ┌─────────────┐   ┌─────────────┐   ┌─────────────┐    │  │
│  │    │    Media    │   │ Monitoring  │   │ Networking  │    │  │
│  │    │    Stack    │   │    Stack    │   │    Stack    │    │  │
│  │    └─────────────┘   └─────────────┘   └─────────────┘    │  │
│  │                                                           │  │
│  │  ┌─────────────────────────────────────────────────────┐  │  │
│  │  │              🗄️  OpenEBS Local Storage              │  │  │
│  │  └─────────────────────────────────────────────────────┘  │  │
│  └───────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘

🚀 Getting Started

Prerequisites

# Install development tools
mise trust && mise install

# Bootstrap the cluster
task bootstrap:talos        # Install Talos Linux
task bootstrap:apps         # Deploy applications

Daily Operations

# Monitor cluster health
kubectl get pods -A --watch
cilium status
flux get hr -A

# Force synchronization
task reconcile

# View logs
kubectl logs -n <namespace> <pod> -f
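
`task reconcile` is defined in this repository's Taskfile, which is not reproduced here. As a rough sketch of what a forced sync with the Flux CLI could look like, the function below asks Flux to refresh the Git source and re-apply the root Kustomization. The `flux-system` name and namespace are the Flux bootstrap defaults and are an assumption about this cluster. `force_sync`, `FLUX_NAMESPACE`, `FLUX_KUSTOMIZATION`, and `DRY_RUN` are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Sketch of a forced sync. Assumption: the root Kustomization is named
# "flux-system" and lives in the "flux-system" namespace (Flux defaults).
force_sync() {
  ns="${FLUX_NAMESPACE:-flux-system}"
  name="${FLUX_KUSTOMIZATION:-flux-system}"

  # --with-source refreshes the Git source before re-applying the Kustomization.
  set -- flux reconcile kustomization "$name" --namespace "$ns" --with-source

  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"   # print the command instead of running it
  else
    "$@"
  fi
}
```

Running `DRY_RUN=1 force_sync` prints the command without touching the cluster, which is a cheap way to check the names before running it for real.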

📁 Repository Structure

📁 homelab/
├── 🔧 bootstrap/                 # Initial cluster setup
├── 📁 kubernetes/
│   ├── 📁 apps/                  # Applications by namespace
│   │   ├── 📁 cert-manager/      # Certificate management
│   │   ├── 📁 default/           # Media applications
│   │   ├── 📁 network/           # Ingress, DNS, tunnels
│   │   ├── 📁 observability/     # Monitoring stack
│   │   └── 📁 openebs-system/    # Local storage provisioner
│   ├── 📁 components/            # Reusable components
│   └── 📁 flux/                  # GitOps configuration
├── 📁 talos/                     # OS configuration
└── 📁 scripts/                   # Automation scripts

🔧 Application Highlights

📺 Media Stack

  • Sonarr/Radarr: Automated TV show and movie management
  • Prowlarr: Indexer management
  • Plex: Media server with hardware transcoding

📊 Monitoring Stack

  • Prometheus: Metrics collection with 20+ pre-configured alerts
  • Grafana: 20+ dashboards covering infrastructure and applications
  • Loki: Centralized logging with retention policies
  • Alertmanager: Multi-channel alerting (Discord, email)

🌐 Networking

  • Internal Access: k8s-gateway for local DNS resolution
  • External Access: Cloudflare Tunnel for secure remote access
  • Load Balancing: NGINX ingress controllers
  • Network Security: Cilium network policies

🗄️ Storage

  • Local Storage: OpenEBS hostPath provisioner for persistent volumes
  • Media Storage: NFS integration with TrueNAS for media files
  • AI Workloads: Dedicated storage for Ollama models and inference

🎨 Configuration Management

Every application follows a consistent pattern:

app/
├── helmrelease.yaml      # Helm chart deployment
├── kustomization.yaml    # Kustomize configuration
├── externalsecret.yaml   # SOPS-encrypted secrets
└── resources/            # Additional K8s resources
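
To make the pattern concrete, here is a minimal sketch that scaffolds this layout for a new application. It only creates empty placeholders plus a bare `kustomization.yaml`; the real manifests in this repository will contain more. `scaffold_app` and the `demo` path in the usage note are hypothetical names, not something taken from the repository.

```shell
#!/bin/sh
# Sketch: create the per-app layout described above under a given directory.
# The placeholder files are empty and must be filled in before Flux applies them.
scaffold_app() {
  app_dir="$1/app"
  mkdir -p "$app_dir/resources"

  # Empty placeholders for the Helm release and the secret definition.
  : > "$app_dir/helmrelease.yaml"
  : > "$app_dir/externalsecret.yaml"

  # Minimal Kustomize file that lists the manifests in this directory.
  cat > "$app_dir/kustomization.yaml" <<'EOF'
---
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ./helmrelease.yaml
  - ./externalsecret.yaml
EOF
}
```

For example, `scaffold_app kubernetes/apps/default/demo` would create `kubernetes/apps/default/demo/app/` with the four entries shown in the tree.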

🔒 Security Features

  • 🔐 Encrypted Secrets: All sensitive data encrypted with SOPS + Age
  • 🛡️ Network Policies: Micro-segmentation with Cilium
  • 📜 Security Contexts: Non-root containers with minimal privileges
  • 🔒 Pod Security Standards: Enforced security policies
  • 🌐 Zero Trust: Cloudflare Access for external services
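
As an illustration of the SOPS + Age item, the sketch below writes a `.sops.yaml` that tells SOPS to encrypt only the `data` and `stringData` fields of matching files with a given age recipient. The path pattern and the `*.sops.yaml` naming convention are assumptions about this repository's layout, and `write_sops_config` is a hypothetical helper. The recipient is the public key printed by `age-keygen`.

```shell
#!/bin/sh
# Sketch: generate a SOPS config that encrypts only Secret payload fields.
# Assumption: secret files match kubernetes/**/*.sops.yaml.
write_sops_config() {
  age_recipient="$1"            # age public key (starts with "age1")
  out_file="${2:-.sops.yaml}"   # where to write the config

  cat > "$out_file" <<EOF
---
creation_rules:
  - path_regex: kubernetes/.*\.sops\.ya?ml
    encrypted_regex: "^(data|stringData)\$"
    age: "$age_recipient"
EOF
}
```

With such a config in the repository root, `sops --encrypt --in-place <file>` picks up the matching rule, so the key never has to be passed on the command line.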

🤖 Automation

Renovate Configuration

  • Automated Updates: Container images, Helm charts, GitHub Actions
  • Grouped Dependencies: Related updates bundled together
  • Scheduled Updates: Weekend update cycles
  • Pre-commit Validation: flux-local ensures manifests are valid

CI/CD Pipeline

  • Manifest Validation: Pre-commit hooks with flux-local
  • Diff Generation: Automated PR comments showing changes
  • Security Scanning: SOPS validation for encrypted secrets
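
The pipeline's actual SOPS check is not shown in this README. One simple way to approximate it, sketched here under the assumption that encrypted files use a `*.sops.yaml` suffix, is to look for the top-level `sops:` metadata key that SOPS adds to every YAML file it encrypts. `find_unencrypted` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: list *.sops.yaml files that lack SOPS metadata and fail if any exist.
find_unencrypted() {
  root="${1:-kubernetes}"

  # grep -L prints the names of files that contain no matching line.
  missing=$(find "$root" -type f -name '*.sops.yaml' \
    -exec grep -L '^sops:' {} + 2>/dev/null) || true

  if [ -n "$missing" ]; then
    echo "$missing"
    return 1
  fi
  return 0
}
```

Calling `find_unencrypted kubernetes` in a pre-commit hook or CI step would then fail the run and print the offending paths whenever a plaintext secret slips in.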

📈 Monitoring & Observability

Grafana Dashboards

  • Infrastructure: Node metrics, storage, networking
  • Applications: Application-specific metrics and health
  • Kubernetes: Cluster resources and workload status
  • Media Stack: Download statistics and performance

Alerting

  • Infrastructure Alerts: Node down, disk space, memory usage
  • Application Alerts: Pod crashes, certificate expiry
  • Network Alerts: Ingress failures, DNS resolution issues

🧪 Development Workflow

  1. Local Changes: Edit manifests in your IDE
  2. Validation: flux-local validates changes locally
  3. Git Push: Changes pushed to repository
  4. Automatic Sync: Flux applies changes to cluster
  5. Monitoring: Grafana dashboards show deployment status

🔧 Troubleshooting

Common Commands

# Flux troubleshooting
flux check
flux get sources git -A
flux get ks -A
flux get hr -A

# Application debugging
kubectl -n <namespace> describe pod <pod-name>
kubectl -n <namespace> logs <pod-name> -f
kubectl -n <namespace> get events --sort-by='.metadata.creationTimestamp'

# Network debugging
cilium status
nmap -Pn -n -p 443 <ingress-ip>

Resource Recovery

# Force reconciliation
task reconcile

# Restart failed pods
kubectl -n <namespace> rollout restart deployment <deployment>

# Certificate issues
kubectl -n cert-manager describe certificates

🙏 Inspiration & Thanks

This homelab draws inspiration from the amazing Kubernetes at Home community.

⭐ If you find this repository helpful, please consider giving it a star!

Built with ❤️ using GitOps principles and powered by the Kubernetes at Home community
