This is a mono repository for my home Kubernetes clusters. I try to adhere to Infrastructure as Code (IaC) and GitOps practices using tools like Terraform, Kubernetes, Flux, Renovate, and GitHub Actions.
The purpose here is to learn Kubernetes while practicing GitOps.
My Kubernetes clusters are deployed with Talos. One is a low-power utility cluster running important services, and the other is a semi-hyper-converged cluster, where workloads and block storage share the same available resources on my nodes, while a separate NAS with ZFS provides NFS/SMB shares, bulk file storage, and backups.
There is a template over at onedr0p/cluster-template if you want to try and follow along with some of the practices I use here.
- actions-runner-controller: self-hosted GitHub runners
- cilium: internal Kubernetes networking plugin
- cert-manager: creates SSL certificates for services in my cluster
- external-dns: automatically syncs DNS records from my cluster ingresses to a DNS provider
- external-secrets: manages Kubernetes secrets using 1Password
- ingress-nginx: ingress controller for Kubernetes using NGINX as a reverse proxy and load balancer
- rook-ceph: cloud-native distributed block storage for Kubernetes
- spegel: stateless cluster-local OCI registry mirror
- tofu-controller: additional Flux component used to run Terraform from within a Kubernetes cluster
- volsync: backup and recovery of persistent volume claims
Flux watches the clusters in my `kubernetes` folder (see Directories below) and makes changes to my clusters based on the state of my Git repository.
The way Flux works for me here is that it will recursively search the `kubernetes/apps` folder until it finds the top-most `kustomization.yaml` per directory and then apply all the resources listed in it. That aforementioned `kustomization.yaml` will generally only have a namespace resource and one or many Flux kustomizations (`ks.yaml`). Under the control of those Flux kustomizations there will be a `HelmRelease` or other resources related to the application which will be applied.
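As a rough sketch of that pattern, a `ks.yaml` Flux Kustomization might look like the following (the app name `echo`, namespace, and path are hypothetical examples, not taken from this repository):

```yaml
# Hypothetical ks.yaml for an app named "echo"
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: echo
  namespace: flux-system
spec:
  targetNamespace: default
  # Path to the folder containing the app's HelmRelease and related resources
  path: ./kubernetes/apps/default/echo/app
  sourceRef:
    kind: GitRepository
    name: flux-system
  interval: 30m
  # Remove resources from the cluster when they are removed from Git
  prune: true
```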
Renovate watches my entire repository for dependency updates; when one is found, a PR is automatically created. When PRs are merged, Flux applies the changes to my cluster.
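For reference, a minimal Renovate configuration enabling its Flux manager could look like this sketch (the `fileMatch` pattern is an assumption about where the manifests live, not necessarily what this repository uses):

```json
{
  "$schema": "https://docs.renovatebot.com/renovate-schema.json",
  "extends": ["config:recommended"],
  "flux": {
    "fileMatch": ["kubernetes/.+\\.ya?ml$"]
  }
}
```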
This Git repository contains the following directories under `kubernetes`.
📁 kubernetes
├── 📁 apps # applications
├── 📁 components # re-useable kustomize components
└── 📁 flux # flux system configuration
This diagram illustrates how Flux manages application deployments with complex dependencies. In this scenario:
- `Kustomization` resources depend on other `Kustomization` resources
- `HelmRelease` resources depend on custom resources (`PostgresCluster`/`Dragonfly`)
- Operators manage stateful components that applications require
The workflow ensures Authentik won't deploy until:
- The PostgreSQL operator is installed and ready
- The Dragonfly operator is installed and ready
- A dedicated PostgreSQL cluster for Authentik is provisioned and healthy
- A Dragonfly caching instance is provisioned and healthy
```mermaid
graph TD
    %% Operator Installation
    A[Kustomization: crunchy-postgres-operator] -->|Creates| B[HelmRelease: crunchy-postgres-operator]
    C[Kustomization: dragonfly-operator] -->|Creates| D[HelmRelease: dragonfly-operator]
    %% Authentik Dependencies
    E[Kustomization: authentik] -->|dependsOn| A
    E -->|dependsOn| C
    E -->|Creates| F[(PostgresCluster: authentik)]
    E -->|Creates| G[(Dragonfly: authentik)]
    E -->|Creates| H[[HelmRelease: authentik]]
    %% Health Dependencies
    H -->|Requires healthy| F
    H -->|Requires healthy| G
    %% Operator Management
    B -.->|Manages| F
    D -.->|Manages| G
    %% External Dependencies
    I[(rook-ceph storage)] -->|Provides PVC| F
```
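In Flux terms, that dependency chain boils down to `dependsOn` entries on the app's Kustomization. A hedged sketch, reusing the names from the diagram (the `path` is a hypothetical example):

```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: authentik
  namespace: flux-system
spec:
  # Flux will not reconcile this Kustomization until both
  # operator Kustomizations have reconciled successfully
  dependsOn:
    - name: crunchy-postgres-operator
    - name: dragonfly-operator
  path: ./kubernetes/apps/security/authentik/app
  sourceRef:
    kind: GitRepository
    name: flux-system
  interval: 30m
  prune: true
  # Wait for all applied resources to become ready before
  # marking this Kustomization as healthy
  wait: true
```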
While most of my infrastructure and workloads are self-hosted, I do rely upon the cloud for certain key parts of my setup. This saves me from having to worry about two things: (1) dealing with chicken/egg scenarios, and (2) services I critically need whether my cluster is online or not.
The alternative solution to these two problems would be to host a Kubernetes cluster in the cloud and deploy applications like HCVault, Vaultwarden, ntfy, and Gatus. However, maintaining another cluster and monitoring another group of workloads is a lot more time and effort than I am willing to put in.
Service | Use | Cost |
---|---|---|
Bitwarden | Secrets with External Secrets | ~$10/yr |
Cloudflare | Domain, DNS, WAF and R2 bucket (S3 Compatible endpoint) | ~$30/yr |
GitHub | Hosting this repository and continuous integration/deployments | Free |
Healthchecks.io | Monitoring internet connectivity and external facing applications | Free |
Total: ~$3.33/mo |
In my cluster there are two instances of ExternalDNS running. One syncs private DNS records to my AdGuard Home using the ExternalDNS webhook provider for AdGuard, while the other syncs public DNS records to Cloudflare. This setup is managed by creating ingresses with two specific classes: `internal` for private DNS and `external` for public DNS. The `external-dns` instances then sync the DNS records to their respective platforms accordingly.
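A sketch of what a private ingress in this split-DNS setup might look like (the app name `echo`, hostnames, and service details are hypothetical placeholders):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: echo
spec:
  # "internal" routes this host to the private DNS instance;
  # "external" would publish it to Cloudflare instead
  ingressClassName: internal
  rules:
    - host: echo.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: echo
                port:
                  number: 80
```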
Name | Device | CPU | OS Disk | Data Disk | RAM | OS | Purpose |
---|---|---|---|---|---|---|---|
Alfheim | Lenovo M920q | i5-8500T | 480GB SSD | 500GB NVME | 64GB | Talos | k8s control |
Alne | Lenovo M720q | i5-8500T | 480GB SSD | 500GB NVME | 32GB | Talos | k8s control |
Ainias | Lenovo M720q | i5-8500T | 480GB SSD | 500GB NVME | 32GB | Talos | k8s control |
Total CPU: 18 threads Total RAM: 128 GB
Name | Device | CPU | OS Disk | Data Disk | RAM | OS | Purpose |
---|---|---|---|---|---|---|---|
Aincrad | DIY | i5-9400 | 32GB USB | 2x14TB + 6x4TB ZFS | 16GB | Unraid | NAS/NFS/Backup |
Device | Purpose |
---|---|
MikroTik RB5009UPr+S+IN | Network - Router |
MikroTik CRS326-24S+2Q+RM | Network - Switch |
CyberPower VP1000ELCD-FR | General - UPS |
Big shout-out to the cluster-template and the Home Operations Discord community. Be sure to check out kubesearch.dev for ideas on how to deploy applications or for inspiration on what you may want to deploy.