[CI] Add Terraform resources for daily CronJob that processes LLVM commits #495

jriv01 · 2025-07-11T21:33:55Z

These resources are for a CronJob that runs the container at ghcr.io/llvm/operations-metrics:latest daily at 07:00 UTC. The container scrapes daily metrics on LLVM's commit volume and uploads them for visualization in Grafana.
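
For illustration, a daily 07:00 schedule for this image could be sketched roughly as follows. This is a minimal sketch, not the exact resource in this PR: the resource name, object name, container name, and policy settings are assumptions. The namespace reference reuses kubernetes_namespace.operational_metrics from the existing Terraform files, and the schedule is assumed to be evaluated in UTC by the cluster.

```hcl
# Hypothetical sketch of a daily CronJob; names and policies are assumptions.
resource "kubernetes_cron_job_v1" "operational_metrics_cronjob" {
  metadata {
    name      = "operational-metrics-cronjob" # assumed name
    namespace = kubernetes_namespace.operational_metrics.metadata[0].name
  }

  spec {
    # Standard cron syntax: minute 0, hour 7, every day.
    # Assumes the cluster interprets the schedule in UTC.
    schedule = "0 7 * * *"

    # Assumption: skip a run if the previous one is still in progress.
    concurrency_policy = "Forbid"

    job_template {
      metadata {}
      spec {
        template {
          metadata {}
          spec {
            restart_policy = "OnFailure" # assumed policy

            container {
              name  = "operational-metrics" # assumed container name
              image = "ghcr.io/llvm/operations-metrics:latest"
            }
          }
        }
      }
    }
  }
}
```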

Changes were made to the existing Terraform files, since many of the same resources are being reused anyway. This keeps all relevant changes in one place instead of having two separate Terraform directories that access and modify shared resources.

Since the container needs access to the BigQuery Google Cloud API, IAM and K8S service accounts were used to grant that access via Workload Identity Federation for GKE. More details at https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity
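
As a rough sketch of the Workload Identity Federation wiring described above (not necessarily the exact resources in this PR): an IAM service account is granted a BigQuery role, a Kubernetes service account is annotated to point at it, and an IAM binding lets the Kubernetes service account impersonate the IAM one. The variable project_id, all resource and account names, and the specific BigQuery role are assumptions made for illustration.

```hcl
# Hypothetical variable; the real configuration may source the project ID differently.
variable "project_id" {
  type = string
}

# IAM (Google) service account that the workload acts as. Name is an assumption.
resource "google_service_account" "operational_metrics_gsa" {
  project      = var.project_id
  account_id   = "operational-metrics-gsa"
  display_name = "Operational metrics"
}

# Grants BigQuery access to the IAM service account.
# The role shown is an assumption; pick the narrowest role that works.
resource "google_project_iam_member" "operational_metrics_bigquery" {
  project = var.project_id
  role    = "roles/bigquery.dataEditor"
  member  = "serviceAccount:${google_service_account.operational_metrics_gsa.email}"
}

# Kubernetes service account, annotated with the IAM service account it maps to.
resource "kubernetes_service_account" "operational_metrics_ksa" {
  metadata {
    name      = "operational-metrics-ksa" # assumed name
    namespace = kubernetes_namespace.operational_metrics.metadata[0].name
    annotations = {
      "iam.gke.io/gcp-service-account" = google_service_account.operational_metrics_gsa.email
    }
  }
}

# Allows the Kubernetes service account to impersonate the IAM service account.
resource "google_service_account_iam_member" "operational_metrics_wi_binding" {
  service_account_id = google_service_account.operational_metrics_gsa.name
  role               = "roles/iam.workloadIdentityUser"
  member             = "serviceAccount:${var.project_id}.svc.id.goog[${kubernetes_namespace.operational_metrics.metadata[0].name}/${kubernetes_service_account.operational_metrics_ksa.metadata[0].name}]"
}
```

The pod spec would then need to run under that Kubernetes service account for the mapping to take effect.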

jriv01 · 2025-07-11T21:35:13Z

@boomanaiden154 @lnihlen

boomanaiden154 · 2025-07-11T21:37:36Z

premerge/main.tf

+}
+
+# The container for scraping LLVM commits needs persistent storage
+# for a local check-out of llvm/llvm-project


Why does this need to be stored persistently? It's pretty cheap to clone LLVM, and I think a PVC adds unnecessary complexity, since it makes the job stateful.

I neglected to mention this, but there's also a persistent file that keeps track of the last commits we've seen. Originally, the script was meant to run at a more frequent cadence, so we wanted to keep track of commits we'd already seen to avoid reprocessing them.

Now that the script only scrapes a day's worth of data at a time, maybe we don't need persistent state to keep track of commits we've seen. It might still be valuable for ensuring the quality of the commit data between iterations, though.

boomanaiden154 · 2025-07-11T21:38:10Z

premerge/main.tf

+  depends_on = [kubernetes_namespace.operational_metrics]
+}
+
+resource "kubernetes_secret" "operational_metrics_secrets" {


Why does this need a separate GitHub token instead of reusing one of the existing ones?

It's the same GitHub token, just under a separate secrets object to keep the premerge metrics and operational metrics separate.

That said, I'm not opposed to scrapping this and just reusing the metrics secrets if that's more appropriate.

boomanaiden154 · 2025-07-11T21:39:52Z

premerge/gke_cluster/main.tf

@@ -12,6 +12,10 @@ resource "google_container_cluster" "llvm_premerge" {
  # for adding windows nodes to the cluster.
  networking_mode = "VPC_NATIVE"
  ip_allocation_policy {}
+
+  workload_identity_config {


At least according to the non-TF docs, changing this would cause changes in new node pools. Does this change any of the defaults for node pools created through TF?

For existing node pools created through TF, they should keep their original default values.

Based on the Workload Identity Federation docs, new node pools created through TF will have workload identity enabled since the cluster has it enabled. However, it seems we can explicitly add workload_metadata_config { mode = "GCE_METADATA" } to disable it on node pools where it is unwanted.
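
Roughly, opting a node pool out might look like the sketch below. The pool name, node count, and machine type are placeholders I made up; only the workload_metadata_config block is the relevant part, and it assumes the pool attaches to the existing google_container_cluster.llvm_premerge resource.

```hcl
# Hypothetical node pool; name, size, and machine type are placeholders.
resource "google_container_node_pool" "example_pool" {
  name       = "example-pool"
  cluster    = google_container_cluster.llvm_premerge.name
  location   = google_container_cluster.llvm_premerge.location
  node_count = 1

  node_config {
    machine_type = "e2-standard-4" # placeholder

    # Keeps the plain Compute Engine metadata server on these nodes, so pods
    # continue to see the node's service account instead of going through
    # Workload Identity Federation.
    workload_metadata_config {
      mode = "GCE_METADATA"
    }
  }
}
```

A pool that should use Workload Identity Federation would set mode = "GKE_METADATA" instead.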

Although, looking back through the docs now, there appears to be some risk with updating the existing service node pools:

Caution: Modifying the node pool immediately enables Workload Identity Federation for GKE for any workloads running in the node pool. This prevents the workloads from using the service account that your nodes use and might result in disruptions.

I'm not too familiar with what existing workloads are running on these nodes, but they may break if they're using the node's service account. Perhaps we want a separate node pool for this after all?

[CI] Add Terraform resources for daily CronJob that processes LLVM commits

b159efe

boomanaiden154 reviewed Jul 11, 2025

boomanaiden154 requested a review from lnihlen July 11, 2025 21:39
