Skip to content

Fix operatorSync: false breaking on implicit webhook mutations #299

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
May 7, 2025
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
42 changes: 36 additions & 6 deletions api/v1alpha1/storage_webhook.go
Original file line number Diff line number Diff line change
Expand Up @@ -116,10 +116,6 @@ func (r *StorageDefaulter) Default(ctx context.Context, obj runtime.Object) erro
storage := obj.(*Storage)
storagelog.Info("default", "name", storage.Name)

if !storage.Spec.OperatorSync {
Copy link
Contributor Author

@Jorres Jorres May 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a bug, default webhook should run on every update even if you have operatorSync = false. Otherwise the yaml that gets applied after operatorSync gets switched to false will REMOVE the defaults -- which is definitely not what we want (good example is hosts section, it is autogenerated for test stands)

I did not yet manage to come up with a scenario where this if is necessary

Copy link
Contributor

@kobzonega kobzonega May 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise the yaml that gets applied after operatorSync gets switched to false will REMOVE the defaults -- which is definitely not what we want (good example is hosts section, it is autogenerated for test stands)

Can you describe in more detail the step-by-step scenario for reproducing the problem?
We already have e2e tests for operatorSync field with updating https://github.com/ydb-platform/ydb-kubernetes-operator/blob/master/tests/e2e/smoke_test.go#L204
Maybe we should write additional tests for this use-case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just re-read the test carefully and understood what is happening.

In the test, before reapplying the Storage object with operatorSync: false, instead of modifying just the operatorSync option, we GET the Storage object from k8s cluster, then set operatorSync and only then apply. Of course, if we get an object from k8s api, it has sorted configuration string. Even if mutated webhook does not run, there is zero diff in configuration and the test passes.

Conclusion: currently all our e2e tests look like above (fetching a Storage from k8s api before apply instead of modifying the original Storage object that we create in a test).

We PROBABLY can stop doing that in tests, but it's non-trivial. I tried to modify the original object and got the following error:

  STEP: setting storage operatorSync to false... - /home/jorres/work/nebius/ydb-kubernetes-operator/tests/e2e/smoke_test.go:215 @ 05/07/25 22:08:02.354
  [FAILED] Expected success, but got an error:
      <*errors.StatusError | 0xc00070d540>:
      Operation cannot be fulfilled on storages.ydb.tech "storage": the object has been modified; please apply your changes to the latest version and try again
      {
          ErrStatus: {
          ╎   TypeMeta: {Kind: "", APIVersion: ""},
          ╎   ListMeta: {
          ╎       SelfLink: "",
          ╎       ResourceVersion: "",
          ╎       Continue: "",
          ╎       RemainingItemCount: nil,
          ╎   },
          ╎   Status: "Failure",
          ╎   Message: "Operation cannot be fulfilled on storages.ydb.tech \"storage\": the object has been modified; please apply your changes to the latest version and try again",
          ╎   Reason: "Conflict",
          ╎   Details: {Name: "storage", Group: "ydb.tech", Kind: "storages", UID: "", Causes: nil, RetryAfterSeconds: 0},
          ╎   Code: 409,
          },
      }
  In [It] at: /home/jorres/work/nebius/ydb-kubernetes-operator/tests/e2e/smoke_test.go:223 @ 05/07/25 22:08:04.394

maybe with a little more time I will come up with a solution how to avoid this problem and still modify the original object, but it definitely will be another PR.

@kobzonega do you maybe have any ideas right now? Otherwise I'll put it in backlog

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can use K8S API Patch method instead of Update, that do not rely on resourceVersion field. It would be nice if we add this test, maybe in the separate PR.

return nil
}

if storage.Spec.OperatorConnection != nil {
if storage.Spec.OperatorConnection.Oauth2TokenExchange != nil {
if storage.Spec.OperatorConnection.Oauth2TokenExchange.SignAlg == "" {
Expand Down Expand Up @@ -311,9 +307,43 @@ func hasUpdatesBesidesFrozen(oldStorage, newStorage *Storage) (bool, string) {
oldStorageCopy.Spec.OperatorSync = false
newStorageCopy.Spec.OperatorSync = false

ignoreNonSpecFields := cmpopts.IgnoreFields(Storage{}, "Status", "ObjectMeta", "TypeMeta")
// We will allow configuration diffs if they are limited to
// formatting, order of keys in the map etc. If two maps are
// meaningfully different (not deep-equal), we still disallow
// the diff of course.
configurationCompareOpt := cmp.FilterPath(
func(path cmp.Path) bool {
if sf, ok := path.Last().(cmp.StructField); ok {
return sf.Name() == "Configuration"
}
return false
},
cmp.Comparer(func(a, b string) bool {
var o1, o2 any

if err := yaml.Unmarshal([]byte(a), &o1); err != nil {
return false
}
if err := yaml.Unmarshal([]byte(b), &o2); err != nil {
return false
}

diff := cmp.Diff(o1, o2)
if diff != "" {
storagelog.Info(fmt.Sprintf("Configurations are different:\n%v\n%v", o1, o2))
}

return diff == ""
}),
)

diff := cmp.Diff(oldStorageCopy, newStorageCopy, ignoreNonSpecFields)
ignoreNonSpecFields := cmpopts.IgnoreFields(Storage{}, "Status", "ObjectMeta", "TypeMeta")
diff := cmp.Diff(
oldStorageCopy,
newStorageCopy,
ignoreNonSpecFields,
configurationCompareOpt,
)
return diff != "", diff
}

Expand Down