Description
Currently, validation for our APIs is done, for the most part, in our webhooks. OpenAPI validation could be added (possibly using kubebuilder tags) to make this validation part of our OpenAPI definition. Example validations are max / min, regex matching and enum enforcement, but kubebuilder tags enable many other possibilities.
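As a rough illustration of the max / min, regex and enum validations mentioned above, here is a minimal sketch of a type annotated with kubebuilder markers. `ExampleSpec` and its fields are hypothetical and are not taken from any Cluster API type; the markers are only comments that controller-gen reads when it generates the CRD schema.

```go
package main

import "fmt"

// ExampleSpec is a hypothetical API type used only for illustration.
// The +kubebuilder markers are read by controller-gen when generating the
// CRD's OpenAPI schema; they have no effect on the Go code itself.
type ExampleSpec struct {
	// Replicas must be between 0 and 100.
	// +kubebuilder:validation:Minimum=0
	// +kubebuilder:validation:Maximum=100
	Replicas int32 `json:"replicas"`

	// Name must look like a lowercase DNS label.
	// +kubebuilder:validation:MaxLength=63
	// +kubebuilder:validation:Pattern=`^[a-z0-9]([-a-z0-9]*[a-z0-9])?$`
	Name string `json:"name"`

	// Strategy must be one of the listed values.
	// +kubebuilder:validation:Enum=RollingUpdate;OnDelete
	Strategy string `json:"strategy"`
}

func main() {
	// Go itself does not enforce the markers; enforcement would happen in the
	// generated schema, on the Kubernetes side.
	s := ExampleSpec{Replicas: 3, Name: "workers", Strategy: "RollingUpdate"}
	fmt.Printf("%+v\n", s)
}
```

With markers like these, the constraints would show up in the generated CRD schema rather than only in webhook code.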
We currently don't use this type of validation across our API types, but we do have some examples in MachineHealthCheck, MachinePool, ClusterResourceSet and our Bootstrap types.
Putting validation in our OpenAPI specs would move implementation of validation from our webhooks to the Kubernetes tooling (kubectl and/or kube-apiserver). We would still require webhooks for a number of validations which cannot be done with OpenAPI.
Impact
From a UX perspective the change would be, in an ideal scenario, small. We wouldn't be able to customize error messages, but error messages would be more consistent. Users would be able to understand most of our validation by reading the OpenAPI spec instead of reading godoc comments (which may be out of sync with the actual implementation in the webhook).
In reality, the UX impact could be large and negative because of how defaulting currently works. In our current webhook implementation, defaulting happens before validation. Users are able to leave fields blank, even though our validation requires them, and have sane, valid defaults applied.
Currently we have the DefaultValidate test used across our API types which validates this process:
cluster-api/util/defaulting/defaulting.go, line 37 in 09d0824
This is used to test: Machines, MachineSets, MachineHealthChecks, MachineDeployments, ClusterResourceSets, KubeadmConfigs, KubeadmConfigTemplates, MachinePools, KubeadmControlPlane, KubeadmControlPlaneTemplate.
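The default-then-validate ordering described above can be sketched as follows. This is a simplified illustration, not the actual webhook or `DefaultValidate` code: `ExampleMachineSet`, its `Default` and `Validate` methods, and the error strings are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// ExampleMachineSet is a hypothetical, simplified type used only for illustration.
type ExampleMachineSet struct {
	Replicas *int32
}

// Default fills in unset fields, as a defaulting webhook would.
func (m *ExampleMachineSet) Default() {
	if m.Replicas == nil {
		one := int32(1)
		m.Replicas = &one
	}
}

// Validate rejects objects with missing or invalid fields,
// as a validating webhook would.
func (m *ExampleMachineSet) Validate() error {
	if m.Replicas == nil {
		return errors.New("spec.replicas: required value")
	}
	if *m.Replicas < 0 {
		return errors.New("spec.replicas: must be greater than or equal to 0")
	}
	return nil
}

func main() {
	m := &ExampleMachineSet{} // the user left replicas blank

	// Validating before defaulting rejects the object.
	fmt.Println("validate before defaulting:", m.Validate())

	// Defaulting first makes the same object valid.
	m.Default()
	fmt.Println("validate after defaulting:", m.Validate())
}
```

If the required-field check ran before `Default`, the blank object would be rejected, which is the UX risk described above.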
If we move validation to OpenAPI, validation will happen before webhooks are called, which would require changes to how we do defaulting and to which fields we expect users to define.
Defaulting could alternatively, in some cases, be done in the OpenAPI schema, but the process doesn't have a good UX right now. There's an open Kubernetes issue about improving this: kubernetes/kubernetes#108768.
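For reference, schema-level defaulting is usually expressed with the `+kubebuilder:default` marker. The sketch below uses a hypothetical `ExampleDefaultedSpec` type; note that the marker only influences the generated CRD schema, so a zero-value Go struct still has the field unset.

```go
package main

import "fmt"

// ExampleDefaultedSpec is a hypothetical type used only for illustration.
type ExampleDefaultedSpec struct {
	// Replicas would be set to 1 by the API server when omitted, based on the
	// default recorded in the generated CRD schema.
	// +optional
	// +kubebuilder:default=1
	Replicas *int32 `json:"replicas,omitempty"`
}

func main() {
	var s ExampleDefaultedSpec
	// The marker is just a comment to the Go compiler, so nothing is
	// defaulted here; the field stays nil until the API server applies it.
	fmt.Println("replicas set in Go zero value:", s.Replicas != nil)
}
```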
This can also cause rollouts when fields that previously weren't defaulted start being defaulted (#6095, h/t @sbueringer), meaning that a large implementation and testing effort could be needed to ensure the changes don't cause new rollouts.
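A simplified sketch of why a newly defaulted field can trigger a rollout: if a controller compares the desired spec against the spec that existing machines were created from, filling in a previously empty field makes the two differ. The type, the `Format` field, the default value and the comparison below are all hypothetical, and the real Cluster API rollout logic is more involved.

```go
package main

import "fmt"

// ExampleTemplateSpec is a hypothetical, simplified template spec.
type ExampleTemplateSpec struct {
	Version string
	Format  string // hypothetical field that previously was not defaulted
}

// applyNewDefault mimics a newly introduced default for Format.
func applyNewDefault(s ExampleTemplateSpec) ExampleTemplateSpec {
	if s.Format == "" {
		s.Format = "example-format"
	}
	return s
}

// needsRollout is a naive stand-in for a controller's "has the template
// changed?" check.
func needsRollout(desired, current ExampleTemplateSpec) bool {
	return desired != current
}

func main() {
	// Spec that existing machines were created from (Format left empty).
	current := ExampleTemplateSpec{Version: "v1.23.0"}

	// The same spec after the new default is applied on the next update.
	desired := applyNewDefault(current)

	fmt.Println("rollout without new default:", needsRollout(current, current))
	fmt.Println("rollout with new default:", needsRollout(desired, current))
}
```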
If we end up moving only some validation and defaulting to our OpenAPI spec, we end up with validation in multiple places. That doesn't really improve the explicitness of our API and, in the worst case, could be a large implementation effort with a negative impact on UX.
Original discussion here: #6383 (review)
As the discussion is on an experimental API, we could try implementing OpenAPI validation on that API only, to assess the impact before rolling it out to more parts of the overall CAPI API.
Should we try to move more validation to the OpenAPI spec?
@fabriziopandini @JoelSpeed
/kind feature
/area api