Commit 9ed56b1

✨ topology: implement BeforeClusterUpgrade annotation hook (#11922)
* topology: implement BeforeClusterUpgrade annotation hook
* review fixes
* add e2e test coverage
* review fixes
1 parent 45974cd commit 9ed56b1

6 files changed: +221 -12 lines changed
api/v1beta1/common_types.go

Lines changed: 7 additions & 0 deletions
@@ -200,6 +200,13 @@ const (

 	// CRDMigrationObservedGenerationAnnotation indicates on a CRD for which generation CRD migration is completed.
 	CRDMigrationObservedGenerationAnnotation = "crd-migration.cluster.x-k8s.io/observed-generation"
+
+	// BeforeClusterUpgradeHookAnnotationPrefix annotation specifies the prefix we search each annotation
+	// for during the before-upgrade lifecycle hook to block propagating the new version to the control plane.
+	// This hook can be used to execute pre-upgrade add-on tasks and block upgrades of the ControlPlane and Workers.
+	// Note: While the upgrade is blocked, changes made to the Cluster Topology will be delayed in propagating to the
+	// underlying objects while the object is waiting for upgrade.
+	BeforeClusterUpgradeHookAnnotationPrefix = "before-upgrade.hook.cluster.cluster.x-k8s.io"
 )

 // MachineSetPreflightCheck defines a valid MachineSet preflight check.

docs/book/src/reference/api/labels-and-annotations.md

Lines changed: 5 additions & 4 deletions
@@ -4,7 +4,7 @@
 |:------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:------------|:-------------------------|
 | cluster.x-k8s.io/cluster-name | It is set on machines linked to a cluster and external objects(bootstrap and infrastructure providers). | User | Machines |
 | cluster.x-k8s.io/control-plane | It is set on machines or related objects that are part of a control plane. | Cluster API | Machines |
-| cluster.x-k8s.io/control-plane-name | It is set on machines if they're controlled by a control plane. The value of this label may be a hash if the control plane name is longer than 63 characters. | Cluster API | Machines |
+| cluster.x-k8s.io/control-plane-name | It is set on machines if they're controlled by a control plane. The value of this label may be a hash if the control plane name is longer than 63 characters. | Cluster API | Machines |
 | cluster.x-k8s.io/deployment-name | It is set on machines if they're controlled by a MachineDeployment. | Cluster API | Machines |
 | cluster.x-k8s.io/drain | If set with the value "skip" on a Pod in the workload cluster, the Pod will not be evicted during Node drain. | User | Pods (workload cluster) |
 | cluster.x-k8s.io/interruptible | It is used to mark the nodes that run on interruptible instances. | User | Nodes (workload cluster) |
@@ -20,6 +20,7 @@
2020

2121
| Annotation | Note | Managed By | Applies to |
2222
|:-----------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------|:-----------------------------------------------|
23+
| before-upgrade.hook.cluster.cluster.x-k8s.io | It specifies the prefix we search each annotation for during the before-upgrade lifecycle hook to block propagating the new version to the control plane. These hooks will prevent propagation of changes made to the Cluster Topology to the underlying objects. | User | Clusters |
2324
| cluster.x-k8s.io/cloned-from-groupkind | It is the annotation that stores the group-kind of the template from which the current resource has been cloned from. | Cluster API | All Cluster API objects cloned from a template |
2425
| cluster.x-k8s.io/cloned-from-name | It is the annotation that stores the name of the template from which the current resource has been cloned from. | Cluster API | All Cluster API objects cloned from a template |
2526
| cluster.x-k8s.io/cluster-name | It is set on nodes identifying the name of the cluster the node belongs to. | Cluster API | Nodes (workload cluster) |
@@ -53,9 +54,9 @@
 | machineset.cluster.x-k8s.io/skip-preflight-checks | It can be applied on MachineDeployment and MachineSet resources to specify a comma-separated list of preflight checks that should be skipped during MachineSet reconciliation. Supported preflight checks are: All, KubeadmVersionSkew, KubernetesVersionSkew, ControlPlaneIsStable. | User | MachineDeployments, MachineSets |
 | pre-drain.delete.hook.machine.cluster.x-k8s.io | It specifies the prefix we search each annotation for during the pre-drain.delete lifecycle hook to pause reconciliation of deletion. These hooks will prevent removal of draining the associated node until all are removed. | User | Machines |
 | pre-terminate.delete.hook.machine.cluster.x-k8s.io | It specifies the prefix we search each annotation for during the pre-terminate.delete lifecycle hook to pause reconciliation of deletion. These hooks will prevent removal of an instance from an infrastructure provider until all are removed. | User | Machines |
-| topology.cluster.x-k8s.io/defer-upgrade | It can be used to defer the Kubernetes upgrade of a single MachineDeployment topology. If the annotation is set on a MachineDeployment topology in Cluster.spec.topology.workers, the Kubernetes upgrade for this MachineDeployment topology is deferred. It doesn't affect other MachineDeployment topologies. | Cluster API | MachineDeployments in Cluster.topology |
-| topology.cluster.x-k8s.io/dry-run | It is an annotation that gets set on objects by the topology controller only during a server side dry run apply operation. It is used for validating update webhooks for objects which get updated by template rotation (e.g. InfrastructureMachineTemplate). When the annotation is set and the admission request is a dry run, the webhook should deny validation due to immutability. By that the request will succeed (without any changes to the actual object because it is a dry run) and the topology controller will receive the resulting object. | Cluster API | Template rotation objects |
-| topology.cluster.x-k8s.io/hold-upgrade-sequence | It can be used to hold the entire MachineDeployment upgrade sequence. If the annotation is set on a MachineDeployment topology in Cluster.spec.topology.workers, the Kubernetes upgrade for this MachineDeployment topology and all subsequent ones is deferred. | Cluster API | MachineDeployments in Cluster.topology |
+| topology.cluster.x-k8s.io/defer-upgrade | It can be used to defer the Kubernetes upgrade of a single MachineDeployment topology. If the annotation is set on a MachineDeployment topology in Cluster.spec.topology.workers, the Kubernetes upgrade for this MachineDeployment topology is deferred. It doesn't affect other MachineDeployment topologies. | Cluster API | MachineDeployments in Cluster.topology |
+| topology.cluster.x-k8s.io/dry-run | It is an annotation that gets set on objects by the topology controller only during a server side dry run apply operation. It is used for validating update webhooks for objects which get updated by template rotation (e.g. InfrastructureMachineTemplate). When the annotation is set and the admission request is a dry run, the webhook should deny validation due to immutability. By that the request will succeed (without any changes to the actual object because it is a dry run) and the topology controller will receive the resulting object. | Cluster API | Template rotation objects |
+| topology.cluster.x-k8s.io/hold-upgrade-sequence | It can be used to hold the entire MachineDeployment upgrade sequence. If the annotation is set on a MachineDeployment topology in Cluster.spec.topology.workers, the Kubernetes upgrade for this MachineDeployment topology and all subsequent ones is deferred. | Cluster API | MachineDeployments in Cluster.topology |
 | topology.cluster.x-k8s.io/upgrade-concurrency | It can be used to configure the maximum concurrency while upgrading MachineDeployments of a classy Cluster. It is set as a top level annotation on the Cluster object. The value should be >= 1. If unspecified the upgrade concurrency will default to 1. | Cluster API | Clusters |
 | unsafe.topology.cluster.x-k8s.io/disable-update-class-name-check | It can be used to disable the webhook check on update that disallows a pre-existing Cluster to be populated with Topology information and Class. | User | Clusters |
 | unsafe.topology.cluster.x-k8s.io/disable-update-version-check | It can be used to disable the webhook checks on update that disallows updating the .topology.spec.version on certain conditions. | User | Clusters |
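
The new table row says the annotation is user-managed and matched by prefix, so a caller would typically append its own suffix to the prefix. The following is a minimal sketch of setting and clearing such a key on a plain annotations map. The helper names `hookKey`, `setHook`, and `clearHook` are hypothetical and not part of Cluster API, and the `/<name>` suffix form is an assumption taken from the `"/test"` suffix used in the unit test of this commit. The value is arbitrary here because the controller code in this commit inspects only the key.

```go
package main

import "fmt"

// Prefix value copied from the commit; the constant name here is local to this sketch.
const hookPrefix = "before-upgrade.hook.cluster.cluster.x-k8s.io"

// hookKey builds an annotation key for a named hook (hypothetical helper).
func hookKey(name string) string {
	return hookPrefix + "/" + name
}

// setHook adds a blocking annotation and returns the (possibly newly allocated) map.
func setHook(annotations map[string]string, name string) map[string]string {
	if annotations == nil {
		annotations = map[string]string{}
	}
	annotations[hookKey(name)] = "true"
	return annotations
}

// clearHook removes the blocking annotation; deleting a missing key is a no-op.
func clearHook(annotations map[string]string, name string) {
	delete(annotations, hookKey(name))
}

func main() {
	annotations := setHook(nil, "backup")
	fmt.Println(len(annotations)) // one blocking annotation is present
	clearHook(annotations, "backup")
	fmt.Println(len(annotations)) // no blocking annotations remain
}
```

In a real cluster the map would be the `metadata.annotations` of the Cluster object, updated through whatever client the add-on already uses.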

exp/topology/desiredstate/desired_state.go

Lines changed: 31 additions & 0 deletions
@@ -20,6 +20,8 @@ package desiredstate
 import (
 	"context"
 	"fmt"
+	"slices"
+	"strings"

 	"github.com/pkg/errors"
 	corev1 "k8s.io/api/core/v1"
@@ -531,6 +533,35 @@ func (g *generator) computeControlPlaneVersion(ctx context.Context, s *scope.Sco
 	}

 	if feature.Gates.Enabled(feature.RuntimeSDK) {
+		var hookAnnotations []string
+		for key := range s.Current.Cluster.Annotations {
+			if strings.HasPrefix(key, clusterv1.BeforeClusterUpgradeHookAnnotationPrefix) {
+				hookAnnotations = append(hookAnnotations, key)
+			}
+		}
+		if len(hookAnnotations) > 0 {
+			slices.Sort(hookAnnotations)
+			message := fmt.Sprintf("annotations [%s] are set", strings.Join(hookAnnotations, ", "))
+			if len(hookAnnotations) == 1 {
+				message = fmt.Sprintf("annotation [%s] is set", strings.Join(hookAnnotations, ", "))
+			}
+			// Add the hook with a response to the tracker so we can later update the condition.
+			s.HookResponseTracker.Add(runtimehooksv1.BeforeClusterUpgrade, &runtimehooksv1.BeforeClusterUpgradeResponse{
+				CommonRetryResponse: runtimehooksv1.CommonRetryResponse{
+					// RetryAfterSeconds needs to be set because having only hooks without RetryAfterSeconds
+					// would lead to not updating the condition. We can rely on getting an event when the
+					// annotation gets removed, so we set twice the default sync-period to not cause additional reconciles.
+					RetryAfterSeconds: 20 * 60,
+					CommonResponse: runtimehooksv1.CommonResponse{
+						Message: message,
+					},
+				},
+			})
+
+			log.Info(fmt.Sprintf("Cluster upgrade to version %q is blocked by %q hook (via annotations)", desiredVersion, runtimecatalog.HookName(runtimehooksv1.BeforeClusterUpgrade)), "hooks", strings.Join(hookAnnotations, ","))
+			return *currentVersion, nil
+		}
+
 		// At this point the control plane and the machine deployments are stable and we are almost ready to pick
 		// up the desiredVersion. Call the BeforeClusterUpgrade hook before picking up the desired version.
 		hookRequest := &runtimehooksv1.BeforeClusterUpgradeRequest{
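
The hunk above collects matching annotation keys, sorts them for a deterministic message, and picks singular or plural wording. A standalone sketch of just that step is shown below, using only the standard library. The function names `blockingHookAnnotations` and `blockingMessage` are hypothetical, and the scope, hook response tracker, and logging from the real code are deliberately left out.

```go
package main

import (
	"fmt"
	"slices"
	"strings"
)

// Prefix value copied from the commit; the constant name here is local to this sketch.
const prefix = "before-upgrade.hook.cluster.cluster.x-k8s.io"

// blockingHookAnnotations returns the sorted annotation keys that start with the prefix.
func blockingHookAnnotations(annotations map[string]string) []string {
	var keys []string
	for key := range annotations {
		if strings.HasPrefix(key, prefix) {
			keys = append(keys, key)
		}
	}
	// Map iteration order is random, so sort to keep the message stable.
	slices.Sort(keys)
	return keys
}

// blockingMessage mirrors the singular/plural wording used in the commit.
func blockingMessage(keys []string) string {
	switch len(keys) {
	case 0:
		return ""
	case 1:
		return fmt.Sprintf("annotation [%s] is set", keys[0])
	default:
		return fmt.Sprintf("annotations [%s] are set", strings.Join(keys, ", "))
	}
}

func main() {
	annotations := map[string]string{
		prefix + "/backup":           "true",
		prefix + "/addons":           "true",
		"unrelated.example.com/note": "ignored",
	}
	fmt.Println(blockingMessage(blockingHookAnnotations(annotations)))
}
```

Sorting matters because the message ends up in a condition: without it, the text could change between reconciles even though the set of annotations did not.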

exp/topology/desiredstate/desired_state_test.go

Lines changed: 29 additions & 1 deletion
@@ -758,6 +758,7 @@ func TestComputeControlPlaneVersion(t *testing.T) {
 		name string
 		hookResponse *runtimehooksv1.BeforeClusterUpgradeResponse
 		topologyVersion string
+		clusterModifier func(c *clusterv1.Cluster)
 		controlPlaneObj *unstructured.Unstructured
 		upgradingMachineDeployments []string
 		upgradingMachinePools []string
@@ -868,7 +869,7 @@ func TestComputeControlPlaneVersion(t *testing.T) {
 			expectedVersion: "v1.2.3",
 		},
 		{
-			name: "should return the controlplane.spec.version if the BeforeClusterUpgrade hooks returns a blocking response",
+			name: "should return the controlplane.spec.version if a BeforeClusterUpgradeHook returns a blocking response",
 			hookResponse: blockingBeforeClusterUpgradeResponse,
 			topologyVersion: "v1.2.3",
 			controlPlaneObj: builder.ControlPlane("test1", "cp1").
@@ -906,6 +907,30 @@ func TestComputeControlPlaneVersion(t *testing.T) {
 			expectedVersion: "v1.2.2",
 			wantErr: true,
 		},
+		{
+			name: "should return the controlplane.spec.version if a BeforeClusterUpgradeHook annotation is set",
+			hookResponse: nonBlockingBeforeClusterUpgradeResponse,
+			topologyVersion: "v1.2.3",
+			controlPlaneObj: builder.ControlPlane("test1", "cp1").
+				WithSpecFields(map[string]interface{}{
+					"spec.version": "v1.2.2",
+					"spec.replicas": int64(2),
+				}).
+				WithStatusFields(map[string]interface{}{
+					"status.version": "v1.2.2",
+					"status.replicas": int64(2),
+					"status.updatedReplicas": int64(2),
+					"status.readyReplicas": int64(2),
+					"status.unavailableReplicas": int64(0),
+				}).
+				Build(),
+			clusterModifier: func(c *clusterv1.Cluster) {
+				c.Annotations = map[string]string{
+					clusterv1.BeforeClusterUpgradeHookAnnotationPrefix + "/test": "true",
+				}
+			},
+			expectedVersion: "v1.2.2",
+		},
 	}
 	for _, tt := range tests {
 		t.Run(tt.name, func(t *testing.T) {
@@ -930,6 +955,9 @@ func TestComputeControlPlaneVersion(t *testing.T) {
 				UpgradeTracker: scope.NewUpgradeTracker(),
 				HookResponseTracker: scope.NewHookResponseTracker(),
 			}
+			if tt.clusterModifier != nil {
+				tt.clusterModifier(s.Current.Cluster)
+			}
 			if len(tt.upgradingMachineDeployments) > 0 {
 				s.UpgradeTracker.MachineDeployments.MarkUpgrading(tt.upgradingMachineDeployments...)
 			}
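
The new test case expects the current control plane version (`v1.2.2`) to be kept while a hook annotation is present, even though the runtime hook response is non-blocking. A reduced sketch of that decision, written as a table-driven check, might look like the following. The function `pickVersion` is hypothetical: it models only the annotation check and ignores the feature gate, the control plane and MachineDeployment stability checks, and the runtime extension call that the real `computeControlPlaneVersion` performs.

```go
package main

import (
	"fmt"
	"strings"
)

// Prefix value copied from the commit; the constant name here is local to this sketch.
const hookPrefix = "before-upgrade.hook.cluster.cluster.x-k8s.io"

// pickVersion keeps the current version while any annotation key carries the
// hook prefix, and otherwise moves on to the desired version.
func pickVersion(current, desired string, annotations map[string]string) string {
	for key := range annotations {
		if strings.HasPrefix(key, hookPrefix) {
			return current
		}
	}
	return desired
}

func main() {
	cases := []struct {
		name        string
		annotations map[string]string
		want        string
	}{
		{name: "no annotations", annotations: nil, want: "v1.2.3"},
		{name: "unrelated annotation", annotations: map[string]string{"example.com/a": "b"}, want: "v1.2.3"},
		{name: "hook annotation set", annotations: map[string]string{hookPrefix + "/test": "true"}, want: "v1.2.2"},
	}
	for _, tc := range cases {
		got := pickVersion("v1.2.2", "v1.2.3", tc.annotations)
		fmt.Printf("%s: got %s, want %s\n", tc.name, got, tc.want)
	}
}
```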
