Skip to content

CM-576: Implementation of Network Policy for Cert Manager Operator #283

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

PillaiManish
Copy link
Contributor

@PillaiManish PillaiManish commented Jun 3, 2025

This PR adds the NetworkPolicy for the operator:

  1. Deny all over podSelector
  2. Allow-egress-to-api-server
  3. Allow-ingress-to-metrics

Steps followed:

  • Update the library-go to latest version so that it contains the commit for the NetworkPolicy support
    • update the dependency libraries.
  • adds parameter in the function call for the updated version code
  • run the command make generate
  • add the required NetworkPolicy files and a static controller to watch/reconcile on these resources

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jun 3, 2025
@openshift-ci-robot
Copy link

openshift-ci-robot commented Jun 3, 2025

@PillaiManish: This pull request references CM-576 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.20.0" version, but no target version was set.

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot requested review from bharath-b-rh and TrilokGeer June 3, 2025 14:01
@PillaiManish PillaiManish force-pushed the cm-576-network-policy-operator branch 6 times, most recently from 6628d66 to efb2f8d Compare June 10, 2025 11:47
@PillaiManish PillaiManish force-pushed the cm-576-network-policy-operator branch 2 times, most recently from dcaa292 to 9864f42 Compare June 10, 2025 15:19
@openshift-ci-robot
Copy link

openshift-ci-robot commented Jun 11, 2025

@PillaiManish: This pull request references CM-576 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.20.0" version, but no target version was set.

In response to this:

This PR adds the NetworkPolicy for the operator:

  1. Deny all over podSelector
  2. Allow-egress-to-api-server
  3. Allow-ingress-to-metrics

Steps followed:

  • Update the library-go to latest version so that it contains the commit for the NetworkPolicy support
  • update the dependency libraries.
  • adds parameter in the function call for the updated version code
  • run the command make generate
  • add the required NetworkPolicy files and a static controller to watch/reconcile on these resources

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link
Member

@swghosh swghosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E0610 17:30:08.850137       1 reflector.go:166] "Unhandled Error" err="k8s.io/client-go/informers/factory.go:160: Failed to watch *v1.NetworkPolicy: failed to list *v1.NetworkPolicy: networkpolicies.networking.k8s.io is forbidden: User \"system:serviceaccount:cert-manager-operator:cert-manager-operator-controller-manager\" cannot list resource \"networkpolicies\" in API group \"networking.k8s.io\" in the namespace \"cert-manager-operator\""
W0610 17:30:58.441258       1 reflector.go:569] k8s.io/client-go/informers/factory.go:160: failed to list *v1.NetworkPolicy: networkpolicies.networking.k8s.io is forbidden: User "system:serviceaccount:cert-manager-operator:cert-manager-operator-controller-manager" cannot list resource "networkpolicies" in API group "networking.k8s.io" in the namespace "cert-manager-operator"
E0610 17:30:58.441294       1 reflector.go:166] "Unhandled Error" err="k8s.io/client-go/informers/factory.go:160: Failed to watch *v1.NetworkPolicy: failed to list *v1.NetworkPolicy: networkpolicies.networking.k8s.io is forbidden: User \"system:serviceaccount:cert-manager-operator:cert-manager-operator-controller-manager\" cannot list resource \"networkpolicies\" in API group \"networking.k8s.io\" in the namespace \"cert-manager-operator\""

from the e2e operator pod logs it seems that the NetworkPolicy we intend to apply are not getting added at all. One way for the e2e to observe this as a failure is to add a status condition check for the new static-resource-controller being added, refer #279 for details. However, for this to work would need to check if your controller can effectively apply a NetworkPolicy object via library-go first, and resolve that RBAC issue too.

egress:
- ports:
- protocol: TCP
port: 6443
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This allows (operator pod to) egress ALL to any 6443 port to any pod on the cluster? Is my understanding correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, in the docs they have mentioned this way. They might tighten it later.

@@ -433,7 +433,7 @@ func validateTolerationsConfig(tolerations []corev1.Toleration, fldPath *field.P
func validateResourceRequirements(requirements corev1.ResourceRequirements, fldPath *field.Path) error {
// convert corev1.ResourceRequirements to core.ResourceRequirements, required for validation.
convRequirements := *(*core.ResourceRequirements)(unsafe.Pointer(&requirements))
return corevalidation.ValidateResourceRequirements(&convRequirements, nil, fldPath.Child("resources"), corevalidation.PodValidationOptions{}).ToAggregate()
return corevalidation.ValidateContainerResourceRequirements(&convRequirements, nil, fldPath.Child("resources"), corevalidation.PodValidationOptions{}).ToAggregate()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

coming in from upstream bump?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, in the upstream, that corevalidation

func validatePodResourceRequirements(requirements *core.ResourceRequirements, podClaimNames sets.Set[string], fldPath *field.Path, opts PodValidationOptions) field.ErrorList {
	return validateResourceRequirements(requirements, validatePodResourceName, podClaimNames, fldPath, opts)
}

func ValidateContainerResourceRequirements(requirements *core.ResourceRequirements, podClaimNames sets.Set[string], fldPath *field.Path, opts PodValidationOptions) field.ErrorList {
	return validateResourceRequirements(requirements, validateContainerResourceName, podClaimNames, fldPath, opts)
}

// Validates resource requirement spec.
func validateResourceRequirements(requirements *core.ResourceRequirements, resourceNameFn func(core.ResourceName, *field.Path) field.ErrorList, podClaimNames sets.Set[string], fldPath *field.Path, opts PodValidationOptions) field.ErrorList {

Two function have been added validatePodResourceRequirements, ValidateContainerResourceRequirements.

"fmt"
"net/http"
fmt "fmt"
http "net/http"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what is changing the generated client-gen code.
Auto-generation update?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure, maybe the bump? I will once take a look on it.

@@ -71,10 +71,13 @@ func RunOperator(ctx context.Context, cc *controllercmd.ControllerContext) error
versionRecorder := status.NewVersionGetter()
versionRecorder.SetVersion("operator", status.VersionForOperatorFromEnv())

operatorclient.OperatorNamespace = cc.OperatorNamespace

kubeInformersForNamespaces := v1helpers.NewKubeInformersForNamespaces(kubeClient,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's check library-go implementation for the Static Resources Controller (especially informers),
my guess is that there is no informer/client for NetworkPolicy yet.

Copy link
Contributor

openshift-ci bot commented Jun 19, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: PillaiManish
Once this PR has been reviewed and has the lgtm label, please assign trilokgeer for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s.io/client-go v0.31.7
k8s.io/code-generator v0.31.1
k8s.io/component-base v0.31.7
golang.org/x/tools v0.26.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these bumps to kube 32, can go in as a separate PR (ideally with the upstream bump, but not a must)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean with the PR ?

@swghosh
Copy link
Member

swghosh commented Jun 19, 2025

E0610 17:30:08.850137       1 reflector.go:166] "Unhandled Error" err="k8s.io/client-go/informers/factory.go:160: Failed to watch *v1.NetworkPolicy: failed to list *v1.NetworkPolicy: networkpolicies.networking.k8s.io is forbidden: User \"system:serviceaccount:cert-manager-operator:cert-manager-operator-controller-manager\" cannot list resource \"networkpolicies\" in API group \"networking.k8s.io\" in the namespace \"cert-manager-operator\""
W0610 17:30:58.441258       1 reflector.go:569] k8s.io/client-go/informers/factory.go:160: failed to list *v1.NetworkPolicy: networkpolicies.networking.k8s.io is forbidden: User "system:serviceaccount:cert-manager-operator:cert-manager-operator-controller-manager" cannot list resource "networkpolicies" in API group "networking.k8s.io" in the namespace "cert-manager-operator"
E0610 17:30:58.441294       1 reflector.go:166] "Unhandled Error" err="k8s.io/client-go/informers/factory.go:160: Failed to watch *v1.NetworkPolicy: failed to list *v1.NetworkPolicy: networkpolicies.networking.k8s.io is forbidden: User \"system:serviceaccount:cert-manager-operator:cert-manager-operator-controller-manager\" cannot list resource \"networkpolicies\" in API group \"networking.k8s.io\" in the namespace \"cert-manager-operator\""

from the e2e operator pod logs it seems that the NetworkPolicy we intend to apply are not getting added at all. One way for the e2e to observe this as a failure is to add a status condition check for the new static-resource-controller being added, refer #279 for details. However, for this to work would need to check if your controller can effectively apply a NetworkPolicy object via library-go first, and resolve that RBAC issue too.

looks like apply-ing NetworkPolicy via Static Resource Controller was already added in openshift/library-go#1961: which suffices, so fixing the RBAC should fix your issue.

So additionally, we would need e2e to check that Status.Conditions is: - {status: "False", type: cert-manager-operator-static-resources-Degraded } (context of why we need it is in OCPBUGS-56758 FYR).

@PillaiManish PillaiManish force-pushed the cm-576-network-policy-operator branch from 9864f42 to 42d6de6 Compare June 30, 2025 07:55
@PillaiManish PillaiManish force-pushed the cm-576-network-policy-operator branch from 42d6de6 to 6cbef25 Compare June 30, 2025 09:17
@PillaiManish PillaiManish force-pushed the cm-576-network-policy-operator branch from 6cbef25 to bd6195f Compare June 30, 2025 09:47
@PillaiManish
Copy link
Contributor Author

/retest

1 similar comment
@PillaiManish
Copy link
Contributor Author

/retest

@TrilokGeer
Copy link
Contributor

/hold
Hold for design review.

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 2, 2025
Copy link
Contributor

openshift-ci bot commented Jul 29, 2025

@PillaiManish: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-operator bd6195f link true /test e2e-operator

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 29, 2025
@openshift-merge-robot
Copy link
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants