feat: Add preflight checks framework #1129

Merged: 21 commits, Jun 17, 2025

@dlipovetsky dlipovetsky commented May 20, 2025

What problem does this PR solve?:
Adds a framework for preflight checks. A preflight check is a type of validation that typically requires access to an infrastructure API.

A validating webhook on the Cluster resource executes all preflight checks and returns failures and warnings to the client.

Which issue(s) this PR fixes:
Fixes #

How Has This Been Tested?:

Special notes for your reviewer:

Previously, Helm variables were used only for the first webhook
configuration.
@dlipovetsky dlipovetsky force-pushed the dlipovetsky/preflight-checks-framework branch from b768947 to 2c69dd0 Compare May 20, 2025 21:06
Let checker decide whether it should run

@dkoshkin dkoshkin left a comment

Thanks for keeping the PR small and targeted!

Remove side effects from the initialization. That is, the checker
initialization still decides which checks apply, but we defer side
effects, and potential errors, to the checks themselves. This allows us
to execute all checks that apply, and get the results to the client.

Previously, if initialization failed, the checker returned no checks.
@dlipovetsky
Contributor Author

@dkoshkin Thanks for reviewing!

As I'm working on #1130, I'm making some changes here.

In hindsight, I should have marked this a draft PR. Sorry that I didn't. I won't force-push any changes here. I hope that will make it easier to review the new changes. 🙏

Address gocritic linter error
Each check returns list of causes
Do not wait for all checkers to initialize before running checks
Derive cause type from name in the check result
Add 'cluster' to webhook path and name
@dkoshkin dkoshkin self-requested a review May 29, 2025 19:24
Deterministically order results, and fix status reporting
Remove unnecessary copying of slice
Use CheckerFactory. This allows the preflight framework to construct a
new Checker for each cluster, and that allows the Checker to store state
specific to that cluster.
@dlipovetsky dlipovetsky requested a review from jimmidyson June 10, 2025 18:41
Recover from and log panic in check
Remove Name from CheckResult, because the name is given by check.Name()
Remove the use of a checker factory.
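
Two of the commits above ("Recover from and log panic in check" and "Deterministically order results, and fix status reporting") describe behavior that can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: `Check`, `Result`, `runSafely`, and `runAll` are hypothetical names, and sorting by check name is only one possible way to get a deterministic order.

```go
package main

import (
	"fmt"
	"log"
	"sort"
	"sync"
)

// Result is a hypothetical per-check result.
type Result struct {
	Name    string
	Allowed bool
	Error   bool
	Message string
}

// Check is a hypothetical check: a name plus the function that runs it.
type Check struct {
	Name string
	Run  func() (allowed bool, message string)
}

// runSafely runs one check and converts a panic into an error result,
// so a single misbehaving check cannot crash the webhook.
func runSafely(c Check) (res Result) {
	res.Name = c.Name
	defer func() {
		if r := recover(); r != nil {
			log.Printf("check %q panicked: %v", c.Name, r)
			res = Result{Name: c.Name, Error: true, Message: fmt.Sprintf("internal error: %v", r)}
		}
	}()
	res.Allowed, res.Message = c.Run()
	return res
}

// runAll runs the checks concurrently, then sorts the results by name so
// the response does not depend on goroutine scheduling.
func runAll(checks []Check) []Result {
	results := make([]Result, len(checks))
	var wg sync.WaitGroup
	for i, c := range checks {
		wg.Add(1)
		go func(i int, c Check) {
			defer wg.Done()
			results[i] = runSafely(c) // Each goroutine writes only its own slot.
		}(i, c)
	}
	wg.Wait()
	sort.Slice(results, func(a, b int) bool { return results[a].Name < results[b].Name })
	return results
}

func main() {
	checks := []Check{
		{Name: "StorageContainer", Run: func() (bool, string) { return true, "" }},
		{Name: "Image", Run: func() (bool, string) { panic("nil client") }},
	}
	for _, r := range runAll(checks) {
		fmt.Printf("%+v\n", r)
	}
}
```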

@dkoshkin dkoshkin left a comment

LGTM, thanks!

@dlipovetsky dlipovetsky merged commit 178ede7 into main Jun 17, 2025
22 checks passed
@dlipovetsky dlipovetsky deleted the dlipovetsky/preflight-checks-framework branch June 17, 2025 15:20
dlipovetsky added a commit that referenced this pull request Jun 17, 2025
**What problem does this PR solve?**:
Implements a preflight check that verifies the Nutanix VM images
referenced by the Cluster spec.

This also includes Nutanix Configuration and Credentials checks,
both of which provide dependencies required by the VM images check.

**Which issue(s) this PR fixes**:
Fixes #

**How Has This Been Tested?**:

**Special notes for your reviewer**:
Stacked on #1129 

I am working on unit tests. The Nutanix client is difficult to mock, so
creating the tests is taking more time than I expected.
supershal added a commit that referenced this pull request Jun 24, 2025
🤖 I have created a release *beep* *boop*
---


## 0.30.0 (2025-06-24)

<!-- Release notes generated using configuration in .github/release.yaml
at main -->

## What's Changed
### Exciting New Features 🎉
* feat: Build with Go 1.24.4 to fix CVEs by @jimmidyson in #1157
* feat: add requests and limits to registry containers by @dkoshkin in #1158
* feat: Add preflight checks framework by @dlipovetsky in #1129
* feat: Preflight check opt-out by @dlipovetsky in #1156
* feat: Nutanix VM image preflight check by @dlipovetsky in #1130
* feat: update addons by @dkoshkin in #1168
* feat: Enforce MD replicas within cluster autoscaler bounds by @jimmidyson in #1169
* feat(preflight): Storage container checks for Nutanix by @thunderboltsid in #1136
* feat: update Nutanix CSI to 3.3.4 by @dkoshkin in #1179
### Fixes 🔧
* fix: update CNCF registry version to 2.3.4, app version 2.8.3 by @dkoshkin in #1150
* fix: registry addon headless service port by @dkoshkin in #1159
* fix: preserve registry addon root CA on move by @dkoshkin in #1155
* fix: Add noderegistration patch to previous handler by @jimmidyson in #1177
### Other Changes
* build: include regclient/regsync image for registry addon by @dkoshkin in #1148
* test: Add update test helpers by @jimmidyson in #1162
* test(e2e): Nutanix 1.33.1 testing by @jimmidyson in #1164
* build: Update all tools by @jimmidyson in #1165
* refactor: add global feature.Gates variable by @dkoshkin in #1167
* ci: new env variable to set --feature-gates by @dkoshkin in #1166
* build: github.com/hashicorp/go-retryablehttp@v0.7.8 to fix CVE by @jimmidyson in #1170
* docs: Update link to default Cilium values in cni.md by @yannickstruyf3 in #1173
* docs: Fix up Cilium config link (again) & icons by @jimmidyson in #1176

## New Contributors
* @yannickstruyf3 made their first contribution in #1173

**Full Changelog**: v0.29.0...v0.30.0

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Shalin Patel <shalin.patel@nutanix.com>
6 participants