MachinePool dataSecretName can't be an empty string. #5589

Open
johnthompson-ybor opened this issue Apr 25, 2025 · 10 comments
Assignees
mboersma
Labels
priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
Milestone
v1.20

Comments

@johnthompson-ybor

All the docs for setting up MachinePools set dataSecretName to an empty string, but the latest version of Cluster API now validates that it isn't empty -- what's the correct thing to do now?
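For context, the docs-style MachinePool snippet looks roughly like this (a sketch, trimmed to the relevant bootstrap block), and it's the empty dataSecretName that the new validation rejects with the error below:

spec:
  template:
    spec:
      bootstrap:
        dataSecretName: ""   # rejected by CAPI v1.10: must be at least 1 character long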

"MachinePool.cluster.x-k8s.io \"...\" is invalid: spec.template.spec.bootstrap.dataSecretName: Invalid value: \"\": spec.template.spec.bootstrap.dataSecretName in body should be at least 1 chars long", reason: "Invalid", code: 422 })
@mboersma
Contributor

If this is with Cluster API v1.10, it sounds like that's a new validation that's been added. (Although it's not mentioned in the migration doc: https://github.com/kubernetes-sigs/cluster-api/blob/main/docs/book/src/developer/providers/migrations/v1.9-to-v1.10.md).

We will probably need to take this into account in #5517.
/cc @nawazkh @tamalsaha

@jaredthivener

@johnthompson-ybor - I'm facing a similar issue. The logic needs to be refactored a bit; the bootstrap data secret shouldn't be a requirement for MachinePools whose infrastructure is of kind AzureManagedMachinePool.

https://github.com/kubernetes-sigs/cluster-api/blob/main/exp/internal/webhooks/machinepool.go#L172-L180

cluster-api-provider-azure needs something like this to validate against its own CRD.

// Based on the upstream validate() in exp/internal/webhooks/machinepool.go linked above;
// imports and the MachinePool webhook type are as in that file.
func (webhook *MachinePool) validate(oldObj, newObj *expv1.MachinePool) error {
    // NOTE: MachinePool is behind the MachinePool feature gate flag; the webhook
    // must prevent creating newObj objects when the feature flag is disabled.
    specPath := field.NewPath("spec")
    if !feature.Gates.Enabled(feature.MachinePool) {
        return field.Forbidden(
            specPath,
            "can be set only if the MachinePool feature flag is enabled",
        )
    }
    var allErrs field.ErrorList

    // Check whether this is a managed infrastructure provider that doesn't require bootstrap data.
    isManagedProvider := newObj.Spec.Template.Spec.InfrastructureRef.Kind == "AzureManagedMachinePool"

    // Only require a bootstrap source (configRef or dataSecretName) when the provider is not managed.
    if !isManagedProvider && newObj.Spec.Template.Spec.Bootstrap.ConfigRef == nil && newObj.Spec.Template.Spec.Bootstrap.DataSecretName == nil {
        allErrs = append(
            allErrs,
            field.Required(
                specPath.Child("template", "spec", "bootstrap", "data"),
                "expected either spec.bootstrap.dataSecretName or spec.bootstrap.configRef to be populated",
            ),
        )
    }

    if len(allErrs) == 0 {
        return nil
    }
    return apierrors.NewInvalid(clusterv1.GroupVersion.WithKind("MachinePool").GroupKind(), newObj.Name, allErrs)
}

@mboersma mboersma added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Apr 28, 2025
@mboersma mboersma added this to the v1.20 milestone Apr 28, 2025
@mboersma
Contributor

kubernetes-sigs/cluster-api#12107 seems similar.

@mboersma
Contributor

@johnthompson-ybor could you help me reproduce this by clarifying what template you're deploying with and any other details that might be relevant? CAPI v1.10.0, CAPZ v1.9.6...?

(Also we released CAPI v1.10.1 yesterday with a fix for the similar issue I mentioned above.)

@mboersma mboersma self-assigned this Apr 30, 2025
@jaredthivener

@mboersma Looks like you can get past the error by giving it an arbitrary string value. Here's my setup below that worked.

apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: aks-demo-01
  namespace: default
spec:
  clusterNetwork:
    pods:
      cidrBlocks: ["10.0.0.0/16"]
    services:
      cidrBlocks: ["10.128.0.0/12"]
    serviceDomain: "cluster.local"
  controlPlaneRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: AzureManagedControlPlane
    name: aks-demo-01
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: AzureManagedCluster
    name: aks-demo-01
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AzureManagedCluster
metadata:
  name: aks-demo-01
  namespace: default
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AzureManagedControlPlane
metadata:
  name: aks-demo-01
  namespace: default
spec:
  resourceGroupName: rg-aks-demo-01
  location: eastus
  subscriptionID: ***redacted***
  version: 1.31.7
  networkPolicy: cilium
  networkPlugin: azure
  networkPluginMode: overlay
  networkDataplane: cilium
  loadBalancerSKU: Standard
  identityRef:
    kind: AzureClusterIdentity
    name: cluster-identity
  enablePreviewFeatures: false
  oidcIssuerProfile:
    enabled: true
  securityProfile:
    workloadIdentity:
      enabled: true
    imageCleaner:
      enabled: true
      intervalHours: 24
  sku:
    tier: Free
  addonProfiles:
  - name: azureKeyvaultSecretsProvider
    enabled: true
  - name: azurepolicy
    enabled: true
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachinePool
metadata:
  name: system
  namespace: default
spec:
  clusterName: aks-demo-01
  replicas: 2
  template:
    metadata: {}
    spec:
      bootstrap:
        dataSecretName: "azure-managed"
      clusterName: aks-demo-01
      version: 1.31.7
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: AzureManagedMachinePool
        name: system
        namespace: default
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AzureManagedMachinePool
metadata:
  name: system
  namespace: default
spec:
  mode: System
  osDiskSizeGB: 30
  osDiskType: Ephemeral
  sku: Standard_D2s_v3
  availabilityZones:
    - "1"
    - "2"
    - "3"
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachinePool
metadata:
  name: app
  namespace: default
spec:
  clusterName: aks-demo-01
  replicas: 2
  template:
    metadata: {}
    spec:
      bootstrap:
        dataSecretName: "azure-managed"
      clusterName: aks-demo-01
      version: 1.31.7
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: AzureManagedMachinePool
        name: app
        namespace: default
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AzureManagedMachinePool
metadata:
  name: app
  namespace: default
spec:
  mode: User
  osDiskSizeGB: 40
  sku: Standard_D2s_v3
  availabilityZones:
    - "1"
    - "2"
    - "3"

Versions

cluster-api version="v1.10.1"
infrastructure-azure version="v1.19.2"
kubectl get AzureManagedCluster
NAME          CLUSTER       READY   AGE
aks-demo-01   aks-demo-01   true    23m
kubectl describe AzureManagedCluster aks-demo-01
Name:         aks-demo-01
Namespace:    default
Labels:       cluster.x-k8s.io/cluster-name=aks-demo-01
              kustomize.toolkit.fluxcd.io/name=flux-system
              kustomize.toolkit.fluxcd.io/namespace=flux-system
Annotations:  <none>
API Version:  infrastructure.cluster.x-k8s.io/v1beta1
Kind:         AzureManagedCluster
Metadata:
  Creation Timestamp:  2025-04-30T23:32:09Z
  Generation:          2
  Owner References:
    API Version:           cluster.x-k8s.io/v1beta1
    Block Owner Deletion:  true
    Controller:            true
    Kind:                  Cluster
    Name:                  aks-demo-01
    UID:                   c5fe459b-c7be-4840-ac14-018e15593058
  Resource Version:        9524
  UID:                     a19da3bf-213d-4636-9fd0-4ef686e4e605
Spec:
  Control Plane Endpoint:
    Host:  aks-demo-01-81pc7x1l.hcp.eastus.azmk8s.io
    Port:  443
Status:
  Ready:  true
Events:   <none>
kubectl get AzureManagedMachinePools
NAME     CLUSTER       READY   SEVERITY   REASON   AGE   MODE
app      aks-demo-01   True                        25m   User
system   aks-demo-01   True                        25m   System
kubectl describe AzureManagedMachinePools app
Name:         app
Namespace:    default
Labels:       azuremanagedmachinepool.infrastructure.cluster.x-k8s.io/agentpoolmode=User
              cluster.x-k8s.io/cluster-name=aks-demo-01
              kustomize.toolkit.fluxcd.io/name=flux-system
              kustomize.toolkit.fluxcd.io/namespace=flux-system
Annotations:  clusterctl.cluster.x-k8s.io/block-move: true
API Version:  infrastructure.cluster.x-k8s.io/v1beta1
Kind:         AzureManagedMachinePool
Metadata:
  Creation Timestamp:  2025-04-30T23:32:09Z
  Finalizers:
    azurecluster.infrastructure.cluster.x-k8s.io
  Generation:  3
  Owner References:
    API Version:           cluster.x-k8s.io/v1beta1
    Block Owner Deletion:  true
    Controller:            true
    Kind:                  MachinePool
    Name:                  app
    UID:                   866e609a-c450-4347-8293-8a6dc98992b7
  Resource Version:        9887
  UID:                     5bf0fe2c-339d-4108-acdd-8b9e125d875d
Spec:
  Availability Zones:
    1
    2
    3
  Mode:             User
  Name:             app
  Os Disk Size GB:  40
  Os Disk Type:     Managed
  Os Type:          Linux
  Provider ID List:
    azure:///subscriptions/***redacted***/resourceGroups/mc_rg-aks-demo-01_aks-demo-01_eastus/providers/Microsoft.Compute/virtualMachineScaleSets/aks-app-21262445-vmss/virtualMachines/0
    azure:///subscriptions/***redacted***/resourceGroups/mc_rg-aks-demo-01_aks-demo-01_eastus/providers/Microsoft.Compute/virtualMachineScaleSets/aks-app-21262445-vmss/virtualMachines/1
  Scale Down Mode:  Delete
  Sku:              Standard_D2s_v3
  Subnet Name:      aks-demo-01
Status:
  Conditions:
    Last Transition Time:  2025-04-30T23:45:26Z
    Status:                True
    Type:                  Ready
    Last Transition Time:  2025-04-30T23:45:26Z
    Status:                True
    Type:                  AgentPoolsReady
  Ready:                   true
  Replicas:                2
Events:                    <none>

@mboersma mboersma mentioned this issue May 7, 2025
@mboersma
Contributor

mboersma commented May 7, 2025

This regression seems to have been introduced by the MinLength validation added in kubernetes-sigs/cluster-api#11949.
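Concretely, that validation shows up in the generated CRD as an OpenAPI minLength constraint on the field, which is what produces the 422 above. A sketch of the assumed schema fragment (not copied verbatim from the PR):

properties:
  dataSecretName:
    type: string
    minLength: 1   # "" now fails with "should be at least 1 chars long"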

@tamalsaha

tamalsaha commented May 7, 2025

So, this is all because upstream has to have a linter? kubernetes-sigs/cluster-api#11834

And they don't talk to/consult their major implementations before adding/changing APIs? 😮

Maybe they should revert the linter restriction.

@mboersma
Contributor

mboersma commented May 7, 2025

Yes, I think the API linter push led to this change. It's disappointing that this wasn't anticipated to be a breaking change. Ideally we in CAPZ could have found this earlier when integrating CAPI v1.10 alpha, but there were a lot of other things to fix first (as you know).

I'm trying out a workaround of setting dataSecretName to "unused" in these cases, but I'll also open an issue in CAPI to see if it's worth reverting.

Edit: see kubernetes-sigs/cluster-api#12161
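
For anyone else hitting this in the meantime, the workaround amounts to putting a non-empty placeholder on each affected MachinePool (the value is arbitrary and appears to be unused for AzureManagedMachinePool-backed pools):

spec:
  template:
    spec:
      bootstrap:
        dataSecretName: "unused"   # any non-empty string satisfies the new check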

@mbrow137
Contributor

Hey, so is it safe to close this issue now that kubernetes-sigs/cluster-api#12161 is closed?

@mboersma
Contributor

@mbrow137 the fix isn't yet in a released version of CAPI (v1.10.2, due next Tuesday, will include it), and we haven't yet succeeded in integrating CAPI v1.10.x, so this should probably stay open until both of those things are true.
