feat: add disruptive reboot test case #692
Merged
Issue #, if available:
Description of changes:
Adds a new test case that reboots the instance, intended to help sanity-check that reboots are graceful. It's quite difficult to determine the status of an instance after a reboot from AWS APIs alone: the `RebootInstances` API itself is asynchronous, and EC2 status checks and SSM agent connectivity status are reconciled infrequently, so they sometimes miss state changes. Ultimately, something must be executed on the node to determine whether or not it's running; this test uses a pod to do so, to keep it as Kubernetes-oriented as possible.

The test is based on the assumption that pod `Exec` requires kubelet responsiveness, and therefore an exit code of `143` (128 + 15, the conventional code for termination by `SIGTERM`) from a command executing within a pod decisively indicates that the node is shutting down. This is a bit of a simplification, since any `SIGTERM` would lead to this state, but given the timing and the presumed clean state of the instance, it's taken to mean the reboot is starting. After this, a second pod is created, and it follows from the prior state that this pod should not start running until after the boot, since kubelet was already non-responsive or evicting existing pods.

Sample happy path output:
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.