OCPBUGS-16728: Add admission policy to deny changing an AWS LB type on an existing service #362

Open
wants to merge 1 commit into
base: main
from

Conversation

JoelSpeed
Contributor

The AWS CCM does not account for changes to the type of the load balancer. The recommended and supportable approach is for users to delete and recreate the service, rather than changing the annotation live.

This PR adds a policy that prevents adding, removing, or changing the annotation that controls which kind of load balancer a service represents. As a result, the annotation must be set at creation time for NLB load balancers.

This should prevent any leaking of NLB/CLB resources when users mistakenly change the load balancer type.
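
The rule described above can be read as a small decision function over the old and new annotations of a service. The following is an illustrative Python sketch of that intent, not the policy added by this PR: the function name `lb_type_update_allowed` and the dict-based representation of annotations are assumptions made for the example, and only the annotation key is taken from the PR.

```python
from typing import Mapping, Optional

# Annotation key taken from the PR; everything else here is hypothetical.
LB_TYPE_ANNOTATION = "service.beta.kubernetes.io/aws-load-balancer-type"


def lb_type_update_allowed(
    old: Optional[Mapping[str, str]],
    new: Optional[Mapping[str, str]],
) -> bool:
    """Return True when an update leaves the load balancer type untouched.

    A missing annotation map and an empty one are treated the same way.
    Annotation values are strings, so None unambiguously means "absent".
    """
    old_value = (old or {}).get(LB_TYPE_ANNOTATION)
    new_value = (new or {}).get(LB_TYPE_ANNOTATION)
    return old_value == new_value


nlb = {LB_TYPE_ANNOTATION: "nlb"}
print(lb_type_update_allowed(nlb, nlb))  # True: annotation unchanged
print(lb_type_update_allowed({}, nlb))   # False: annotation added after create
print(lb_type_update_allowed(nlb, {}))   # False: annotation removed
```

Under this model, the only way to end up with a different load balancer type is to delete the service and create it again with the desired annotation, which matches the approach recommended in the description.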

@openshift-ci-robot openshift-ci-robot added jira/severity-critical Referenced Jira bug's severity is critical for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Aug 19, 2024
@openshift-ci-robot

@JoelSpeed: This pull request references Jira Issue OCPBUGS-16728, which is invalid:

  • expected the bug to target the "4.18.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@JoelSpeed
Contributor Author

/jira refresh

@openshift-ci-robot openshift-ci-robot added the jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. label Aug 19, 2024
@openshift-ci-robot

@JoelSpeed: This pull request references Jira Issue OCPBUGS-16728, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.18.0) matches configured target version for branch (4.18.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @lihongan

@openshift-ci-robot openshift-ci-robot removed the jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. label Aug 19, 2024
Contributor

openshift-ci bot commented Aug 19, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from joelspeed. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

- expression: |
    (has(object.objectMeta.annotations) && 'service.beta.kubernetes.io/aws-load-balancer-type' in object.objectMeta.annotations) ==
    (has(oldObject.objectMeta.annotations) && 'service.beta.kubernetes.io/aws-load-balancer-type' in oldObject.objectMeta.annotations) &&
    object.objectMeta.annotations['service.beta.kubernetes.io/aws-load-balancer-type'] == oldObject.objectMeta.annotations['service.beta.kubernetes.io/aws-load-balancer-type']
Contributor

@theobarberbany theobarberbany Aug 19, 2024

I'm not 100% following how this works, so this is more for my understanding - shout if I'm wrong:

We assert that our old object and our new object have both got the annotation matching service.beta.kubernetes.io/aws-load-balancer-type. We need this because otherwise the controller would not be able to set the value initially?

Then we check the value of the annotation remains equal:

object.objectMeta.annotations['service.beta.kubernetes.io/aws-load-balancer-type'] == oldObject.objectMeta.annotations['service.beta.kubernetes.io/aws-load-balancer-type']

If this annotation were being changed, we would fail the validation - so we would fail that update, disallowing changing its value.

Or am I missing something?

Contributor Author

> We assert that our old object and our new object have both got the annotation matching service.beta.kubernetes.io/aws-load-balancer-type. We need this because otherwise the controller would not be able to set the value initially?

You're close but missing one nuance: we check that "has the annotation or not" is the same for the old and new objects. If one has the annotation and the other does not, then the user changed something, and we aren't allowing that here.
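
To make the nuance concrete, the two clauses of the expression can be modelled separately. This is a hedged Python sketch of the reasoning in this thread, not a translation of how the policy is evaluated: the helper names (`has_lb_type`, `presence_matches`, `value_matches`, `update_passes`) are hypothetical. The guard inside `value_matches` is an addition for the Python model, since a plain dict lookup on a missing key raises `KeyError`, whereas the quoted expression indexes the annotation map directly.

```python
from typing import Mapping, Optional

# Annotation key taken from the expression under review.
LB_TYPE = "service.beta.kubernetes.io/aws-load-balancer-type"
Annotations = Optional[Mapping[str, str]]


def has_lb_type(annotations: Annotations) -> bool:
    """Model of: has(annotations) && key in annotations."""
    return annotations is not None and LB_TYPE in annotations


def presence_matches(old: Annotations, new: Annotations) -> bool:
    """First clause: both objects have the annotation, or neither does."""
    return has_lb_type(old) == has_lb_type(new)


def value_matches(old: Annotations, new: Annotations) -> bool:
    """Second clause: when both have the annotation, the values are equal."""
    if not (has_lb_type(old) and has_lb_type(new)):
        # Nothing to compare; presence_matches decides this case.
        return True
    return old[LB_TYPE] == new[LB_TYPE]


def update_passes(old: Annotations, new: Annotations) -> bool:
    """Both clauses combined, as in the discussion above."""
    return presence_matches(old, new) and value_matches(old, new)
```

With this split, a service that never had the annotation still passes on update (presence is equal on both sides), while adding, removing, or changing the annotation fails one of the two clauses.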

Contributor

Ahh gotcha! Makes sense!

@theobarberbany
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Aug 19, 2024
@JoelSpeed
Contributor Author

JoelSpeed commented Aug 19, 2024

/hold

This needs testing

CC @sunzhaohua2 I'd be interested to see if any QE cases are affected by this change as well, I'll give you a ping when it's ready to test

@openshift-ci openshift-ci bot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. and removed lgtm Indicates that a PR is ready to be merged. labels Aug 19, 2024
Contributor

openshift-ci bot commented Aug 19, 2024

New changes are detected. LGTM label has been removed.

@JoelSpeed JoelSpeed force-pushed the prevent-nlb-clb-switch branch 8 times, most recently from 051246e to 3cded47 Compare August 22, 2024 13:41
@JoelSpeed JoelSpeed force-pushed the prevent-nlb-clb-switch branch from 3cded47 to a21723c Compare September 3, 2024 09:21
@openshift-bot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 3, 2024
@JoelSpeed
Contributor Author

/remove-lifecycle stale

@openshift-ci openshift-ci bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 10, 2024
@openshift-bot

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 11, 2025
@JoelSpeed
Contributor Author

/remove-lifecycle stale

@openshift-ci openshift-ci bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 11, 2025
Contributor

openshift-ci bot commented Mar 26, 2025

@JoelSpeed: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-openstack-ovn | a21723c | link | false | /test e2e-openstack-ovn |
| ci/prow/e2e-vsphere-ovn | a21723c | link | false | /test e2e-vsphere-ovn |
| ci/prow/e2e-azure-ovn-upgrade | a21723c | link | false | /test e2e-azure-ovn-upgrade |
| ci/prow/e2e-azure-ovn | a21723c | link | false | /test e2e-azure-ovn |
| ci/prow/level0-clusterinfra-azure-ipi-proxy-tests | a21723c | link | false | /test level0-clusterinfra-azure-ipi-proxy-tests |
| ci/prow/unit | a21723c | link | true | /test unit |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-bot

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 25, 2025
@JoelSpeed
Contributor Author

/remove-lifecycle stale

@openshift-ci openshift-ci bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 25, 2025
Labels
do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. jira/severity-critical Referenced Jira bug's severity is critical for the branch this PR is targeting. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.
4 participants