Skip to content

Conversation

ytsarev
Copy link
Member

@ytsarev ytsarev commented Sep 25, 2025

Description of your changes

Key Changes:

  • KCL Function: Created functions/xgke/main.k with proper status reference handling using
    _ocds?.resourceName?.Resource?.status?.atProvider?.field pattern
  • Null Value Handling: Fixed initial deployment by using conditional field assignment **({field = value} if
    condition else {}) to avoid null values
  • Comprehensive Testing: Added composition tests with observed resources and extended e2e tests with network
    dependencies
  • Provider Config Readiness: Added "krm.kcl.dev/ready" = "True" annotation for ProviderConfig resources

Migration Benefits:

  • Modern DevEx development experience
  • Simplified composition maintenance
  • Better status data handling between GCP resources
  • Improved test coverage for status operations

I have:

  • Read and followed Upbound's contribution process.
  • Run make reviewable to ensure this PR is ready for review.
  • Added backport release-x.y labels to auto-backport this PR, as appropriate.

How has this code been tested

Testing:

  • ✅ All composition tests passing (up test run tests/*)
  • ✅ E2E deployment successful with proper GCP permissions
  • ✅ Status references working correctly for cross-resource dependencies
up test run tests/*
  ✓   Parsing tests
  ✓   Collecting resources
  ✓   Generating language schemas
  ✓   Checking dependencies
  ✓   Building functions
  ✓   Building configuration package
  ✓   Pushing embedded functions to local daemon
  ✓   Assert test-gke-cluster
SUCCESS:
SUCCESS: Tests Summary:
SUCCESS: ------------------
SUCCESS: Total Tests Executed: 1
SUCCESS: Passed tests:         1
SUCCESS: Failed tests:         0
up test run tests/* --e2e
       === COMMAND
        /bin/sh -c ${KUBECTL} wait --for=delete xgke.gcp.platform.upbound.io/e2e-gke-cluster --timeout 1h15m0s
    | 16:24:57 | delete | Assert Deletion  | SCRIPT    | DONE  |
    | 16:24:57 | delete | Assert Deletion  | TRY       | END   |
    | 16:24:57 | delete | @chainsaw        | CLEANUP   | SKIP  |
  ✓   Collecting resources
  ✓   Generating language schemas
  ✓   Checking dependencies
  ✓   Building functions
  ✓   Building configuration package
  ✓   Creating development control plane in Spaces
  ✓   Ensuring repository exists
  ✓   Pushing function package xpkg.upbound.io/solutions/configuration-gcp-gke_xgke
  ✓   Pushing configuration image xpkg.upbound.io/solutions/configuration-gcp-gke:v0.0.0-1758806328
  ✓   Installing package on development control plane
  ✓   Waiting for package to be ready
  ✓   Finding test resources
Cleanup summary: 0 deleted, 0 remaining after 0 attempts
Tearing down test control plane...

Test control plane deleted

SUCCESS:
SUCCESS: Tests Summary:
SUCCESS: ------------------
SUCCESS: Total Tests Executed: 1
SUCCESS: Passed tests:         1
SUCCESS: Failed tests:         0

Signed-off-by: Yury Tsarev <yury@upbound.io>
Signed-off-by: Yury Tsarev <yury@upbound.io>
@sivabalan19
Copy link

Hi @ytsarev ,

  1. Cluster creation is successful. But, sync fails with below error. Leaving the cluster only with default pool.

async update failed: refuse to update the external resource because the following update requires replacing it: cannot change the value of the argument "node_config.0.service_account" from "default" to "pr-configuration-gcp-gke@squad-sales-playground.iam.gserviceaccount.com.

  1. The needed node-pool fails while creating with the below error.

us-west2-a: Failed to create nodes; us-west2-c: Failed to create nodes; us-west2-b: Failed to create nodes

I am not sure if 1 and 2 are connected. But, i was never able to stand-up my cluster. Would you please check ?

@ytsarev
Copy link
Member Author

ytsarev commented Sep 26, 2025

@sivabalan19, thank you very much for testing. Can you please clarify which MRs this message belongs to and show the full crossplane beta trace? Are you using up test run tests/* --e2e or a different testing setup?

@sivabalan19
Copy link

hey @ytsarev , Like we discussed here is the

  1. Here is the trace :
NAME                                                         SYNCED   READY   STATUS
XGKE/configuration-gcp-gke                                   True     True    Available
├─ ProjectIAMMember/configuration-gcp-gke-ljk4n              True     True    Available
├─ ServiceAccountKey/configuration-gcp-gke-kx2z7             True     True    Available
├─ ServiceAccount/configuration-gcp-gke-zt44g                True     True    Available
├─ Cluster/configuration-gcp-gke-bg2cv                       False    True    ReconcileError: ...guration-gcp-gke@squad-sales-playground.iam.gserviceaccount.com"
├─ NodePool/configuration-gcp-gke-p99l2                      True     True    Available
├─ ProviderConfig/configuration-gcp-gke                      -        -       
├─ ProviderConfig/configuration-gcp-gke                      -        -       
└─ Object/configuration-gcp-gke-workload-identity-settings   True     True    Available
  1. i create development controlplane and run down the examples to make sure all is in good shape.

… update conflicts

- Split resources into base_items (ServiceAccount, ServiceAccountKey) and gke_items (all GKE resources)
- Only create GKE cluster after ServiceAccount is ready with email and conditions
- Simplify conditional logic since GKE resources now have guaranteed service account data
- Move ProjectIAMMember to conditional section as it depends on service account
- Prevents immutable field update errors when service account changes from "default"

Signed-off-by: Yury Tsarev <yury@upbound.io>
@ytsarev
Copy link
Member Author

ytsarev commented Sep 29, 2025

@sivabalan19 thanks a lot for the catch!

I traced the issue and it appears to be happening when cluster is created with default service account and the attempt to patch it with newly created service account email fails because it is immutable field

I solved it in fca07d3 by explicitly ordering the service account and GKE cluster resources.

Full acceptance tests below:

crossplane beta trace xgke.gcp.platform.upbound.io/configuration-gcp-gke
NAME                                                         SYNCED   READY   STATUS
XGKE/configuration-gcp-gke                                   True     True    Available
├─ ProjectIAMMember/configuration-gcp-gke-r7dpq              True     True    Available
├─ ServiceAccountKey/configuration-gcp-gke-gr262             True     True    Available
├─ ServiceAccount/configuration-gcp-gke-qcsqs                True     True    Available
├─ Cluster/configuration-gcp-gke-wwzsv                       True     True    Available
├─ NodePool/configuration-gcp-gke-ln9rl                      True     True    Available
├─ ProviderConfig/configuration-gcp-gke                      -        -
├─ ProviderConfig/configuration-gcp-gke                      -        -
└─ Object/configuration-gcp-gke-workload-identity-settings   True     True    Available

@ytsarev ytsarev requested a review from sivabalan19 September 29, 2025 14:45
Signed-off-by: Yury Tsarev <yury@upbound.io>
Copy link

@sivabalan19 sivabalan19 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me now. e2e test went well and deployed that as a seperat project too.

@ytsarev ytsarev merged commit 394eb9b into upbound:main Sep 30, 2025
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants