I'll nuke it and rebuild it again and again. ... but it's not fun to rebuild again 5 days before the con...
What this doesn't include is spinning up the k8s cluster itself. All of these commands require an existing k8s cluster.
pulumi stack init ckc25
# make sure you select the cackalacky doppler project
doppler run -- pulumi up --stack ckc25 -y
# if an update hangs or gets interrupted, cancel the in-flight operation
pulumi cancel
helm repo add cloudflare https://cloudflare.github.io/helm-charts
helm repo add longhorn https://charts.longhorn.io
helm repo add emberstack https://emberstack.github.io/helm-charts
helm repo add cert-manager https://charts.jetstack.io
helm repo add external-secrets https://charts.external-secrets.io
helm repo add argo https://argoproj.github.io/argo-helm
helm repo add doppler https://helm.doppler.com
helm repo add emqx https://repos.emqx.io/charts
helm repo add metallb https://metallb.github.io/metallb
helm repo add nginx https://helm.nginx.com/stable
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo add grafana https://grafana.github.io/helm-charts
helm repo add signoz https://charts.signoz.io
helm repo add cloudflare https://cloudflare.github.io/helm-charts
We use cloudflare for DNS and tunneling. This helps with exposing services to the internet without jumping through any networking hoops. Most services exposed are not for the con, but rather for monitoring / observing the cluster.
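As a sketch of how the tunnel side works (the hostname and service names below are made up, not our actual config), cloudflared maps public hostnames to in-cluster services so nothing needs an inbound firewall rule:

```yaml
# Hypothetical cloudflared config.yml: route a public hostname to a
# cluster-internal service over an outbound-only tunnel.
tunnel: cackalacky-tunnel
credentials-file: /etc/cloudflared/creds/credentials.json
ingress:
  - hostname: grafana.example.com   # placeholder hostname
    service: http://my-grafana.monitoring.svc.cluster.local:80
  - service: http_status:404        # required catch-all rule
```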
helm repo add longhorn https://charts.longhorn.io
Longhorn is our storage solution. It's a CSI driver that uses local storage on the nodes to back PVCs. It works well for a homelab because it's easy to manage and doesn't require any external storage.
Next year, instead of having USB-C external drives, I'd like to try out a NAS w/Longhorn.
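Once the chart is installed, claiming storage is just a PVC against the `longhorn` StorageClass the chart creates. A minimal sketch (name/namespace/size are illustrative):

```yaml
# Hypothetical PVC backed by Longhorn's default StorageClass.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-data           # placeholder name
  namespace: cackalacky
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: longhorn  # installed by the Longhorn chart
  resources:
    requests:
      storage: 5Gi
```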
helm repo add emberstack https://emberstack.github.io/helm-charts
Reflector (the emberstack chart) is used to reflect secrets from one namespace to another. This is useful when a secret is created in one namespace but needed in another. For example, the Doppler operator creates secrets in the doppler-operator-system namespace, but we need those secrets in the cackalacky namespace; Reflector copies them over so applications/services can use them.
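The reflection is driven by annotations on the source secret. A sketch of what that looks like (secret name and data are placeholders; the annotation keys are Reflector's documented ones):

```yaml
# Hypothetical source secret annotated so Reflector mirrors it
# into the cackalacky namespace automatically.
apiVersion: v1
kind: Secret
metadata:
  name: example-secret            # placeholder
  namespace: doppler-operator-system
  annotations:
    reflector.v1.k8s.emberstack.com/reflection-allowed: "true"
    reflector.v1.k8s.emberstack.com/reflection-allowed-namespaces: "cackalacky"
    reflector.v1.k8s.emberstack.com/reflection-auto-enabled: "true"
    reflector.v1.k8s.emberstack.com/reflection-auto-namespaces: "cackalacky"
type: Opaque
stringData:
  API_KEY: "placeholder"
```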
helm repo add cert-manager https://charts.jetstack.io
Cert manager is used to manage TLS certificates for our services. It's a bit of a pain to get working, but it's worth it in the long run.
You'll also need to understand the create-challenge-resolve flow that the CRDs expect.
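That flow boils down to: you define an Issuer/ClusterIssuer, a Certificate (or ingress annotation) requests a cert, and cert-manager creates and solves an ACME challenge before storing the signed cert in a Secret. A minimal sketch of the issuer side (email and ingress class are placeholders, not our actual setup):

```yaml
# Hypothetical ACME ClusterIssuer using HTTP-01 challenges.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: ops@example.com               # placeholder
    privateKeySecretRef:
      name: letsencrypt-account-key      # ACME account key storage
    solvers:
      - http01:
          ingress:
            class: nginx                 # placeholder ingress class
```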
helm repo add external-secrets https://charts.external-secrets.io
External secrets operator is used to manage secrets from external sources. In our case, we use it to store an access key + secret that is used to retrieve images from ECR.
There is a K8S Cron Job that refreshes the secret every 4 hours.
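For flavor, an ExternalSecret looks roughly like this; the store name and remote keys here are hypothetical, not our actual AWS wiring:

```yaml
# Hypothetical ExternalSecret: sync AWS credentials for ECR pulls
# from an external backend into a k8s Secret, refreshed on a timer.
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: ecr-credentials
  namespace: cackalacky
spec:
  refreshInterval: 4h
  secretStoreRef:
    name: aws-store        # placeholder SecretStore
    kind: SecretStore
  target:
    name: ecr-credentials  # resulting k8s Secret
  data:
    - secretKey: AWS_ACCESS_KEY_ID
      remoteRef:
        key: ecr-access-key   # placeholder remote key
```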
helm repo add argo https://argoproj.github.io/argo-helm
Argo CD is our continuous deployment solution. Once you get past two services that need to be deployed to k8s, it's nice to have something that does automatic rolling releases based on git commits.
NOTE: We use CircleCI to build the images and push them to ECR.
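Each service then gets an Argo CD Application pointing at its manifests in git. A sketch (repo URL, path, and service name are placeholders):

```yaml
# Hypothetical Argo CD Application: auto-sync one service's
# manifests from git into the cackalacky namespace.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: badge-api                 # placeholder service
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/cackalacky25-infra  # placeholder
    targetRevision: main
    path: deploy/badge-api        # placeholder path
  destination:
    server: https://kubernetes.default.svc
    namespace: cackalacky
  syncPolicy:
    automated:
      prune: true     # delete resources removed from git
      selfHeal: true  # revert manual drift
```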
helm repo add doppler https://helm.doppler.com
Doppler is our secret management solution. It's fantastic for development locally and implementing this CRD was a breeze. Highly recommend checking it out.
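The CRD usage is roughly this (names are placeholders; shape follows the Doppler operator docs): a service token secret feeds the operator, which materializes a managed k8s Secret from a Doppler config.

```yaml
# Hypothetical DopplerSecret: sync a Doppler config into a
# k8s Secret that workloads can mount or env-reference.
apiVersion: secrets.doppler.com/v1alpha1
kind: DopplerSecret
metadata:
  name: cackalacky-secrets          # placeholder
  namespace: doppler-operator-system
spec:
  tokenSecret:
    name: doppler-token-secret      # holds the Doppler service token
  managedSecret:
    name: cackalacky-secrets        # Secret the operator creates
    namespace: cackalacky
```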
helm repo add emqx https://repos.emqx.io/charts
EMQX is our MQTT broker. We're using the open source version; it's super easy to get up and running.
This acted as the gateway from the badges to our internal services.
There are some configs that I'm missing for a proper stand up, but this deployment gets us 90% of the way there.
What we're missing is a config connecting our user database in PG to the MQTT broker.
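For reference, that missing piece would look roughly like EMQX 5's PostgreSQL authentication block below. This is an untested sketch; the server address, database, and query are placeholders, not a config we actually ran:

```hocon
# Hypothetical emqx.conf snippet: authenticate MQTT clients
# against a password hash stored in Postgres.
authentication = [
  {
    mechanism = password_based
    backend = postgresql
    server = "postgres.cackalacky.svc:5432"   # placeholder
    database = "badges"                       # placeholder
    username = "emqx"
    password = "changeme"                     # placeholder
    query = "SELECT password_hash FROM mqtt_user WHERE username = ${username} LIMIT 1"
    password_hash_algorithm { name = bcrypt }
  }
]
```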
kubectl --kubeconfig C:\cackalackycon\rke2-direct.yaml apply -f yamls\emqx\exporter.yaml
EMQX Exporter is used to export metrics from EMQX to Prometheus.
This is dependent on EMQX + Prometheus.
helm repo add metallb https://metallb.github.io/metallb
MetalLB is our load balancer, built specifically for bare-metal clusters. In layer 2 mode it assigns and announces IPs for LoadBalancer services, which on bare metal would otherwise sit pending forever with no cloud provider to provision them.
They've made ease of use much better since 0.12.1.
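Since the move to CRDs, configuration is just a pool of addresses plus an L2 advertisement. A sketch (the address range is a placeholder LAN range):

```yaml
# Hypothetical MetalLB config: hand out LoadBalancer IPs from a
# LAN range and announce them via layer 2 (ARP).
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: lan-pool
  namespace: metallb-system
spec:
  addresses:
    - 192.168.1.240-192.168.1.250   # placeholder range
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: lan-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - lan-pool
```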
helm repo add nginx https://helm.nginx.com/stable
NGINX is our reverse proxy. It's used to expose services to the internet and handles TLS termination (if we had remembered to enable it).
I may have been able to use metallb for this, but I knew how to create a streaming / pass through backend proxy with NGINX to allow devices to talk to EMQX.
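That pass-through backend is an nginx `stream` block: no HTTP parsing, no TLS termination, just raw TCP forwarded to the broker. A sketch (the upstream service name and ports are illustrative):

```nginx
# Hypothetical stream (TCP pass-through) proxy: badges connect on
# 1883 and bytes are relayed untouched to EMQX.
stream {
    upstream emqx_mqtt {
        server emqx.cackalacky.svc.cluster.local:1883;  # placeholder
    }
    server {
        listen 1883;
        proxy_pass emqx_mqtt;
    }
}
```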
helm repo add bitnami https://charts.bitnami.com/bitnami
Redis is our game data authority. This was a critical component that stored all cyberpartner data.
It's exceptionally easy to deploy and manage.
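A sketch of values for the bitnami/redis chart in a setup like this (sizes and options are illustrative, not our exact values):

```yaml
# Hypothetical bitnami/redis values: a single standalone instance
# with auth on and persistence backed by Longhorn.
architecture: standalone
auth:
  enabled: true
master:
  persistence:
    enabled: true
    storageClass: longhorn
    size: 8Gi        # placeholder size
```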
kubectl --kubeconfig C:\cackalackycon\rke2-direct.yaml apply -f C:\...\cackalacky25-infra\yamls\prometheus\manifests\setup\ --server-side
kubectl --kubeconfig C:\cackalackycon\rke2-direct.yaml apply -f C:\...\cackalacky25-infra\yamls\prometheus\manifests\ --server-side
kubectl --kubeconfig C:\cackalackycon\rke2-direct.yaml delete -f C:\...\cackalacky25-infra\yamls\prometheus\manifests\setup\
kubectl --kubeconfig C:\cackalackycon\rke2-direct.yaml delete -f C:\...\cackalacky25-infra\yamls\prometheus\manifests\
Prometheus is our application/service monitoring solution. I was having issues with the signoz otel collector, so I ran a separate prometheus instance.
The git repo that generated all 100 yamls is here: https://github.com/prometheus-operator/kube-prometheus
I created my own image and options for this specific deployment.
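With the operator running, scraping a service is a ServiceMonitor rather than a scrape_configs edit. A sketch (service name, labels, and port are placeholders):

```yaml
# Hypothetical ServiceMonitor: tell the Prometheus operator to
# scrape a service's metrics port every 30s.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: badge-api              # placeholder service
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: badge-api           # must match the Service's labels
  namespaceSelector:
    matchNames: ["cackalacky"]
  endpoints:
    - port: metrics            # named port on the Service
      interval: 30s
```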
# Create the CRDs
kubectl --kubeconfig C:\cackalackycon\rke2-direct.yaml create -f 'https://strimzi.io/install/latest?namespace=kafka' -n kafka
# Deploy the cluster
kubectl --kubeconfig C:\cackalackycon\rke2-direct.yaml apply -f C:\...\cackalacky25-infra\yamls\kafka\ -n kafka
Kafka is our event streaming solution for all internal routing. It's used to pass messages between services.
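The cluster itself is a single Strimzi `Kafka` custom resource. A sketch along the lines of Strimzi's stock example (replica counts and sizes are illustrative):

```yaml
# Hypothetical Strimzi Kafka cluster: 3 brokers, persistent
# storage, plaintext internal listener.
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
  namespace: kafka
spec:
  kafka:
    replicas: 3
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
    storage:
      type: persistent-claim
      size: 20Gi               # placeholder size
  zookeeper:
    replicas: 3
    storage:
      type: persistent-claim
      size: 10Gi
  entityOperator:
    topicOperator: {}          # manage topics via KafkaTopic CRs
    userOperator: {}
```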
helm repo add grafana https://grafana.github.io/helm-charts
# https://grafana.com/docs/grafana/latest/setup-grafana/installation/helm/
helm install my-grafana grafana/grafana --namespace monitoring
helm upgrade my-grafana grafana/grafana -f yamls/grafana/values.yaml -n monitoring
Grafana is our dashboarding solution. It's used to visualize app/cluster metrics from Prometheus and as a query layer for ClickHouse.
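The datasource can be provisioned from the chart values instead of clicking through the UI. A sketch of that part of a `values.yaml` (the Prometheus URL is a placeholder):

```yaml
# Hypothetical grafana/grafana chart values: provision the separate
# Prometheus instance as the default datasource at startup.
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
      - name: Prometheus
        type: prometheus
        url: http://prometheus-k8s.monitoring.svc:9090  # placeholder
        access: proxy
        isDefault: true
```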
helm search repo signoz
SigNoz is our single-pane-of-glass logging and query solution.
Due to flakiness with pulumi and signoz, just deploy via helm...
helm install signoz signoz/signoz `
--namespace platform --create-namespace `
--wait `
--timeout 1h `
-f yamls\signoz\values.yaml
SigNoz k8s-infra:
helm install signoz-k8s-infra signoz/k8s-infra `
--namespace platform `
--wait `
--timeout 1h `
-f yamls\signoz\k8s-infra-values.yaml
OTEL Service in platform:
kubectl apply -f yamls\signoz\otel-stuff.yaml
This is not easily updatable from Pulumi because the resources it creates are locked, and changes to StatefulSets are "forbidden".
To update it, you basically have to tear everything down, then bring it all back up...
First, delete the "platform" namespace. Then, check for leftover resources:
kubectl get all --all-namespaces | Select-String "signoz"
# Select-String is case-insensitive by default, so this also catches "Click"
kubectl get all --all-namespaces | Select-String "click"
There will be one resource that hangs on for dear life:
- clickhouseinstallation signoz-release-clickhouse
Force delete it:
kubectl patch clickhouseinstallation signoz-release-clickhouse -n platform --type=json -p '[{"op": "remove", "path": "/metadata/finalizers"}]'
After that, the namespace will be deleted and you can tear down the pulumi state:
pulumi state delete urn:pulumi:cackalacky25::cacklacky::kubernetes:helm.sh/v3:Chart::signoz-release -y --target-dependents
pulumi state delete urn:pulumi:cackalacky25::cacklacky::kubernetes:core/v1:Namespace::platform -y --target-dependents
# unsure if you need to do this:
pulumi state delete urn:pulumi:cackalacky25::cacklacky::kubernetes:helm.sh/v3:Release::signoz-k8s-infra-release
kubectl apply -f yamls\otel\basic.yaml
OTEL Collector is used to collect metrics from our hand-rolled services and applications, and export them to Prometheus.
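A minimal collector config for that pattern looks roughly like this (endpoints and ports are illustrative, not our actual `basic.yaml`):

```yaml
# Hypothetical collector config: receive OTLP from services and
# expose the metrics on a Prometheus scrape endpoint.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
exporters:
  prometheus:
    endpoint: 0.0.0.0:8889   # scraped by Prometheus
service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheus]
```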
If Pulumi gets wedged during any of this, cancel the in-flight operation and surgically delete the stuck resources from state:
pulumi cancel
pulumi state delete urn:pulumi:cackalacky25::cacklacky::kubernetes:core/v1:Namespace::<resource>
pulumi state delete urn:pulumi:cackalacky25::cacklacky::kubernetes:batch/v1:Job::platform/signoz-release-schema-migrator-async-init