lockfale/cackalacky25-infrastructure

Cackalackycon 2025 Infrastructure

I'll nuke it and rebuild it again and again. ... but it's not fun to rebuild again 5 days before the con...

What this doesn't include is spinning up the k8s cluster itself. All of these commands require an existing k8s cluster.

Pulumi commands

pulumi stack init ckc25
# make sure you select the cackalacky doppler project
doppler run -- pulumi up --stack ckc25 -y

pulumi cancel

Helm Charts:

helm repo add cloudflare https://cloudflare.github.io/helm-charts 
helm repo add longhorn https://charts.longhorn.io 
helm repo add emberstack https://emberstack.github.io/helm-charts 
helm repo add argo https://argoproj.github.io/argo-helm
helm repo add external-secrets https://charts.external-secrets.io
helm repo add grafana https://grafana.github.io/helm-charts
helm repo add emqx https://repos.emqx.io/charts
helm repo add signoz https://charts.signoz.io

Cloudflare

helm repo add cloudflare https://cloudflare.github.io/helm-charts

We use Cloudflare for DNS and tunneling. This lets us expose services to the internet without jumping through any networking hoops. Most of the exposed services are not for the con itself, but for monitoring/observing the cluster.
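A minimal sketch of standing up the tunnel with the chart added above. The release name, namespace, and value keys are illustrative — check the chart's values.yaml for your version, and pull the tunnel ID/secret from your Cloudflare dashboard.

```shell
# Sketch: run a Cloudflare Tunnel (cloudflared) in-cluster.
# All --set keys below are assumptions about the chart's values layout.
helm upgrade --install cloudflare-tunnel cloudflare/cloudflare-tunnel \
  --namespace cloudflare --create-namespace \
  --set cloudflare.account=<ACCOUNT_ID> \
  --set cloudflare.tunnelName=ckc25 \
  --set cloudflare.tunnelId=<TUNNEL_ID> \
  --set cloudflare.secret=<TUNNEL_SECRET>
```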

Longhorn

helm repo add longhorn https://charts.longhorn.io

Longhorn is our storage solution. It's a CSI driver that uses local storage on the nodes to back PVCs. It works well for a homelab because it's easy to manage and doesn't require any external storage.

Next year, instead of having USB-C external drives, I'd like to try out a NAS with Longhorn.
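The install itself is the standard one from the Longhorn docs:

```shell
# Install Longhorn into its own namespace and wait for the CSI pods.
helm install longhorn longhorn/longhorn \
  --namespace longhorn-system --create-namespace

# Verify the driver and managers came up before creating PVCs.
kubectl -n longhorn-system get pods
```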

Kubereflector (emberstack/reflector)

helm repo add emberstack https://emberstack.github.io/helm-charts

Kubereflector is used to reflect secrets from one namespace to another. This is useful for secrets that are created in one namespace, but need to be used in another. For example, the Doppler operator creates secrets in the doppler-operator-system namespace, but we need to use those secrets in the cackalacky namespace. Kubereflector allows us to reflect those secrets to the cackalacky namespace and enables applications/services to use them.
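Reflection is driven by annotations on the source secret. The secret name below is illustrative; the annotation keys are Reflector's documented ones.

```shell
# Allow the source secret to be reflected, and have Reflector
# auto-create the mirror in the cackalacky namespace.
kubectl -n doppler-operator-system annotate secret my-doppler-secret \
  reflector.v1.k8s.emberstack.com/reflection-allowed="true" \
  reflector.v1.k8s.emberstack.com/reflection-allowed-namespaces="cackalacky" \
  reflector.v1.k8s.emberstack.com/reflection-auto-enabled="true" \
  reflector.v1.k8s.emberstack.com/reflection-auto-namespaces="cackalacky"
```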

Cert Manager

helm repo add cert-manager https://charts.jetstack.io

Cert manager is used to manage TLS certificates for our services. It's a bit of a pain to get working, but it's worth it in the long run.

You'll also need to understand the create-challenge-resolve flow that the CRDs expect.
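The create-challenge-resolve flow starts from an issuer. A minimal ACME ClusterIssuer with an HTTP-01 solver looks like this (email and names are placeholders):

```shell
kubectl apply -f - <<'EOF'
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com            # placeholder
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    solvers:
      - http01:
          ingress:
            class: nginx                # must match your ingress controller
EOF
```

Certificates referencing this issuer trigger Order and Challenge resources; watching `kubectl get challenges -A` is the fastest way to see where the flow is stuck.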

External Secrets Operator

helm repo add external-secrets https://charts.external-secrets.io

External secrets operator is used to manage secrets from external sources. In our case, we use it to store an access key + secret that is used to retrieve images from ECR.

There is a K8S Cron Job that refreshes the secret every 4 hours.
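The core of that refresh job is recreating a dockerconfigjson pull secret with a fresh ECR token (ECR tokens expire after 12 hours, so a 4-hour cadence leaves headroom). Secret name, namespace, and region below are illustrative.

```shell
# Fetch a fresh ECR token and rebuild the image pull secret.
TOKEN=$(aws ecr get-login-password --region us-east-1)
kubectl -n cackalacky delete secret ecr-pull-secret --ignore-not-found
kubectl -n cackalacky create secret docker-registry ecr-pull-secret \
  --docker-server=<ACCOUNT_ID>.dkr.ecr.us-east-1.amazonaws.com \
  --docker-username=AWS \
  --docker-password="$TOKEN"
```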

Argo CD

helm repo add argo https://argoproj.github.io/argo-helm

Argo CD is our continuous deployment solution. Once you get past two services that need to be deployed to k8s, it's nice to have something that creates automatic rolling releases based on git commits.

NOTE: We use circleci to build the images and push them to ECR.
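Each service then just needs an Application resource pointing at its manifests. The repo URL and names here are illustrative:

```shell
kubectl apply -f - <<'EOF'
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: example-service          # illustrative
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/lockfale/example-service   # illustrative
    targetRevision: main
    path: k8s
  destination:
    server: https://kubernetes.default.svc
    namespace: cackalacky
  syncPolicy:
    automated:
      prune: true      # delete resources removed from git
      selfHeal: true   # revert manual drift
EOF
```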

Doppler Operator

helm repo add doppler https://helm.doppler.com

Doppler is our secret management solution. It's fantastic for development locally and implementing this CRD was a breeze. Highly recommend checking it out.
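The CRD is a DopplerSecret that maps a Doppler service token to a managed Kubernetes secret. Names below are illustrative:

```shell
kubectl apply -f - <<'EOF'
apiVersion: secrets.doppler.com/v1alpha1
kind: DopplerSecret
metadata:
  name: cackalacky-secrets               # illustrative
  namespace: doppler-operator-system
spec:
  tokenSecret:
    name: doppler-token-secret           # holds the Doppler service token
  managedSecret:
    name: cackalacky-managed-secret      # secret the operator keeps in sync
    namespace: cackalacky
EOF
```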

EMQX MQTT

helm repo add emqx https://repos.emqx.io/charts

EMQX is our MQTT broker. We're using the open source version; it's super easy to get up and running.

This acted as the gateway from the badges to our internal services.

There are some configs that I'm missing for a proper stand up, but this deployment gets us 90% of the way there.

What we're missing is a config connecting our user database in PG to the MQTT broker.
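For reference, a sketch of what that missing piece generally looks like in EMQX 5's config: a password_based authenticator with a PostgreSQL backend. Every value here (server, table, columns, hash settings) is an assumption about the user database, not a tested config.

```
authentication = [
  {
    mechanism = password_based
    backend = postgresql
    server = "postgres.cackalacky.svc:5432"   # illustrative
    database = "users"                        # illustrative
    username = "emqx"
    password = "..."
    password_hash_algorithm {
      name = sha256
      salt_position = suffix
    }
    # Table/column names are assumptions about the PG schema.
    query = "SELECT password_hash, salt FROM mqtt_user WHERE username = ${username} LIMIT 1"
  }
]
```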

EMQX Exporter

kubectl --kubeconfig C:\cackalackycon\rke2-direct.yaml apply -f yamls\emqx\exporter.yaml

EMQX Exporter is used to export metrics from EMQX to Prometheus.

This is dependent on EMQX + Prometheus.

MetalLB

helm repo add metallb https://metallb.github.io/metallb

MetalLB is our load balancer for bare-metal clusters. Running in layer 2 mode, it lets us expose services without having to deal with cloud provider load balancers.

They've made ease of use much better since 0.12.1.
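Since 0.13, configuration is done with CRDs instead of a ConfigMap. The address range below is an illustrative LAN range:

```shell
helm install metallb metallb/metallb \
  --namespace metallb-system --create-namespace

kubectl apply -f - <<'EOF'
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: default-pool
  namespace: metallb-system
spec:
  addresses:
    - 192.168.1.240-192.168.1.250   # illustrative LAN range
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: default-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - default-pool
EOF
```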

NGINX

helm repo add nginx https://helm.nginx.com/stable

NGINX is our reverse proxy. It's used to expose services to the internet and handles TLS termination (if we had remembered to enable it).

I may have been able to use MetalLB for this, but I knew how to create a streaming/pass-through backend proxy with NGINX to allow devices to talk to EMQX.
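For reference, a pass-through proxy for MQTT is a plain NGINX `stream` block rather than an `http` one, so TLS (if any) terminates at the broker. The upstream service name and ports are illustrative:

```
# Sketch: raw TCP pass-through for MQTT; nothing is inspected or terminated here.
stream {
    upstream emqx_mqtt {
        server emqx.cackalacky.svc.cluster.local:1883;   # illustrative
    }
    server {
        listen 1883;
        proxy_pass emqx_mqtt;
    }
}
```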

Redis

helm repo add bitnami https://charts.bitnami.com/bitnami

Redis is our game data authority. This was a critical component that stored all cyberpartner data.

It's exceptionally easy to deploy and manage.
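A minimal install with the Bitnami chart (release name and namespace are illustrative):

```shell
# Standalone Redis with auth enabled; the chart generates a password
# and stores it in a secret unless you supply one.
helm install redis bitnami/redis \
  --namespace cackalacky --create-namespace \
  --set architecture=standalone \
  --set auth.enabled=true
```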

Prometheus

kubectl --kubeconfig C:\cackalackycon\rke2-direct.yaml apply -f C:\...\cackalacky25-infra\yamls\prometheus\manifests\setup\ --server-side
kubectl --kubeconfig C:\cackalackycon\rke2-direct.yaml apply -f C:\...\cackalacky25-infra\yamls\prometheus\manifests\ --server-side

kubectl --kubeconfig C:\cackalackycon\rke2-direct.yaml delete -f C:\...\cackalacky25-infra\yamls\prometheus\manifests\setup\
kubectl --kubeconfig C:\cackalackycon\rke2-direct.yaml delete -f C:\...\cackalacky25-infra\yamls\prometheus\manifests\

Prometheus is our application/service monitoring solution. I was having issues with the signoz otel collector, so I ran a separate prometheus instance.

The git repo that generated all 100 yamls is here: https://github.com/prometheus-operator/kube-prometheus

I created my own image and options for this specific deployment.

Kafka

# Create the CRDs
kubectl --kubeconfig C:\cackalackycon\rke2-direct.yaml create -f 'https://strimzi.io/install/latest?namespace=kafka' -n kafka

# Deploy the cluster
kubectl --kubeconfig C:\cackalackycon\rke2-direct.yaml apply -f C:\...\cackalacky25-infra\yamls\kafka\ -n kafka

Kafka is our event streaming solution for all internal routing. It's used to pass messages between services.
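With the Strimzi operator installed, the cluster itself is a single Kafka custom resource. This is a hedged minimal sketch, not the deployment's actual spec — names, sizes, and replica counts are illustrative:

```shell
kubectl apply -n kafka -f - <<'EOF'
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: ckc-cluster                 # illustrative
  namespace: kafka
spec:
  kafka:
    replicas: 3
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
    storage:
      type: persistent-claim
      size: 10Gi
  zookeeper:
    replicas: 3
    storage:
      type: persistent-claim
      size: 5Gi
  entityOperator:
    topicOperator: {}
    userOperator: {}
EOF
```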

Grafana

helm repo add grafana https://grafana.github.io/helm-charts

# https://grafana.com/docs/grafana/latest/setup-grafana/installation/helm/
helm install my-grafana grafana/grafana --namespace monitoring
helm upgrade my-grafana grafana/grafana -f yamls/grafana/values.yaml -n monitoring

Grafana is our dashboarding solution. It's used to visualize app/cluster metrics from Prometheus and serves as a query layer for ClickHouse.
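The chart supports provisioning datasources straight from values. A sketch of what the Prometheus entry in yamls/grafana/values.yaml typically looks like — the service URL assumes the kube-prometheus deployment above:

```yaml
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
      - name: Prometheus
        type: prometheus
        # Assumes kube-prometheus's prometheus-k8s service in monitoring.
        url: http://prometheus-k8s.monitoring.svc:9090
        access: proxy
        isDefault: true
```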

Signoz

helm search repo signoz

Signoz is our single pane of glass logging and query solution.

Due to flakiness with pulumi and signoz, just deploy via helm...

helm install signoz signoz/signoz `
   --namespace platform --create-namespace `
   --wait `
   --timeout 1h `
   -f yamls\signoz\values.yaml

signoz k8s infra

helm install signoz-k8s-infra signoz/k8s-infra `
   --namespace platform `
   --wait `
   --timeout 1h `
   -f yamls\signoz\k8s-infra-values.yaml

OTEL Service in platform:

kubectl apply -f yamls\signoz\otel-stuff.yaml

Notes:

Not easily updatable from pulumi because the resources it creates are locked and changes to stateful sets are "forbidden"

To update it, basically have to tear everything down, then bring it all back up...

First, delete the "platform" namespace. Then check for leftover resources:

kubectl get all --all-namespaces | Select-String "signoz"
kubectl get all --all-namespaces | Select-String "click"
kubectl get all --all-namespaces | Select-String "Click"

There will be one resource that hangs on for dear life:

  • clickhouseinstallation signoz-release-clickhouse

Force delete it:

kubectl patch clickhouseinstallation signoz-release-clickhouse -n platform --type=json -p '[{"op": "remove", "path": "/metadata/finalizers"}]'

After that, the namespace will be deleted and you can tear down the pulumi state

pulumi state delete urn:pulumi:cackalacky25::cacklacky::kubernetes:helm.sh/v3:Chart::signoz-release -y --target-dependents
pulumi state delete urn:pulumi:cackalacky25::cacklacky::kubernetes:core/v1:Namespace::platform -y --target-dependents

# unsure if you need to do this:
pulumi state delete urn:pulumi:cackalacky25::cacklacky::kubernetes:helm.sh/v3:Release::signoz-k8s-infra-release

Open Telemetry Collector

kubectl apply -f yamls\otel\basic.yaml

The OTEL Collector is used to collect metrics from our hand-rolled services and applications and export them to Prometheus.
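The shape of a basic collector config for that pipeline: apps push OTLP in, and the Prometheus exporter exposes a scrape endpoint. A sketch with illustrative ports, not the contents of basic.yaml:

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317   # apps send OTLP here
exporters:
  prometheus:
    endpoint: 0.0.0.0:8889       # Prometheus scrapes this
service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheus]
```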

Helpful commands

pulumi cancel
pulumi state delete urn:pulumi:cackalacky25::cacklacky::kubernetes:core/v1:Namespace::<resource>
pulumi state delete urn:pulumi:cackalacky25::cacklacky::kubernetes:batch/v1:Job::platform/signoz-release-schema-migrator-async-init

About

Infrastructure used for cackalackycon 2025
