Skip to content

🌱 Allows to redefine ETCD client logger #12271

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

dmvolod
Copy link
Member

@dmvolod dmvolod commented May 22, 2025

What this PR does / why we need it:
For some reasons, we need to adjust ECTD client logger level to avoid have an unpredictable number of warnings in the log, like

{"level":"warn","ts":"2025-05-22T16:36:08.530926Z","caller":"v3@v3.5.15/retry_interceptor.go:63","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc016cf01e0/etcd-test-control-plane-dx4r5","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"transport: Error while dialing: error upgrading connection: error sending request: Post \\\"https://XXX.XX.XX.XX:6443/api/v1/namespaces/kube-system/pods/etcd-test-control-plane-dx4r5/portforward?timeout=10s\\\": EOF\""}
{"level":"warn","ts":"2025-05-22T16:36:23.642869Z","caller":"v3@v3.5.15/retry_interceptor.go:63","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0041314a0/etcd-test-control-plane-dx4r5","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = received context error while waiting for new LB policy update: context deadline exceeded"}
{"level":"warn","ts":"2025-05-22T16:37:11.809235Z","caller":"v3@v3.5.15/retry_interceptor.go:63","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc01595fa40/etcd-test-control-plane-dx4r5","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"transport: Error while dialing: error upgrading connection: error sending request: Post \\\"https://XXX.XX.XX.XX:6443/api/v1/namespaces/kube-system/pods/etcd-test-control-plane-dx4r5/portforward?timeout=10s\\\": EOF\""}
{"level":"warn","ts":"2025-05-22T16:37:30.158754Z","caller":"v3@v3.5.15/retry_interceptor.go:63","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc009898f00/etcd-test-control-plane-dx4r5","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = received context error while waiting for new LB policy update: context deadline exceeded"}

But now, ETCD log level is hardcoded to the zapcore.InfoLevel. This PR allows to redefine it on init in the embedded controllers. If we need to redefine it globally with env variable or run option for any deployment, please let me know.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label May 22, 2025
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign joelspeed for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the do-not-merge/needs-area PR is missing an area label label May 22, 2025
@k8s-ci-robot k8s-ci-robot requested review from g-gaston and sivchari May 22, 2025 17:07
@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label May 22, 2025
@dmvolod
Copy link
Member Author

dmvolod commented May 22, 2025

/area provider/control-plane-kubeadm

@k8s-ci-robot k8s-ci-robot added area/provider/control-plane-kubeadm Issues or PRs related to KCP and removed do-not-merge/needs-area PR is missing an area label labels May 22, 2025
@sbueringer
Copy link
Member

Wondering if kubernetes has a similar use case and how are they handling it (e.g. in the kube-apiserver)

@dmvolod
Copy link
Member Author

dmvolod commented May 22, 2025

Wondering if kubernetes has a similar use case and how are they handling it (e.g. in the kube-apiserver)

This is a very good point. It doesn't work in kube-apiserver at the moment, because community copied/pasted (https://github.com/kubernetes/kubernetes/blob/b35c5c0a301d326fdfa353943fca077778544ac6/staging/src/k8s.io/apiserver/pkg/storage/storagebackend/factory/etcd3.go#L97-L111) the etcdClientDebugLevel function from etcd, which is no longer actual for now. I will plan to create a similar PR for Kubernetes also.

I requested an export of this function in the PR etcd-io/etcd#20006. But as far as I understand, there are no plans to backport it to etcd 3.5.

The only downside to this approach is that it won't work natively. We can have both approaches at the same time to get a universal solution. That is, set the logger via a function and change the logging level via ETCD_CLIENT_DEBUG variable. The only thing is that at the moment we will also have to copy/paste the function until the release of etcd 3.6 with this fix is ​​released and we will not switch to it. I could create and track this issue.

@sbueringer
Copy link
Member

I was wondering if there is a way to "reuse" our regular log level for the etcd logger

@dmvolod
Copy link
Member Author

dmvolod commented May 23, 2025

I was wondering if there is a way to "reuse" our regular log level for the etcd logger

Seems to with some trick like that, it could be possible, but should be validated

func (r *KubeadmControlPlaneReconciler) SetupWithManager(ctx context.Context, mgr ctrl.Manager, options controller.Options) error {
	logger, _ := logutil.CreateDefaultZapLogger(zapcore.Level(ctrl.LoggerFrom(ctx).GetV()))
	etcd.SetLogger(logger)

But in this case, it would be nice to redefine log destination also in advance to log level. SetLogger will help with it.

@@ -174,6 +175,11 @@ func NewClient(ctx context.Context, config ClientConfiguration) (*Client, error)
return client, nil
}

// SetLogger allows to redefine ETCD client logger.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we move this right after etcdClientLogger is defined

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about moving SetLogger to controllers/alias.go (I would prefer to avoid adding a new package for this)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest I don't think we should address this issue this way at all. 99% of our users can't use this

Copy link
Member Author

@dmvolod dmvolod Jun 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, I agree with you, it's a very rare case, but if someone to change controller log level to minimal or want to disable it, will see these warnings always in the logs.

I will try to inherit log level from controller-runtime logger.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to take closer look and currently not much time due to urgent v1beta2 work, but what do you think about introducing an additional command line flag in KCP that allows to set the etcd logger log level?

This way everyone can use it and not just the very few folks that consume the exported KCP controller code

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, not a problem. In this case, do you mean, that we need separate log levels for controller and etcd client. Now, I converted logr -> zapcore levels and both users (embedded and non embedded) could change it globally for controller and etcd client logging.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have two problems with the current implementation:

  1. I don't want to export a public func that allows setting a different etcd logger. I would like to keep this an implementation detail as much as possible
  2. I think the --v log levels don't map well to the zap log levels. They seem to follow a different philosophy (--v=0-10 vs debug,info, ...)

So I would suggest:

  1. Let's introduce a new command line flag for the KCP controller. Something like "etcd-client-log-level"
  2. Create the etcd logger in main.go and pass it through the KubeadmControlPlaneReconciler structs down all the way to where we use it

The flag should have (at least) the following possible values: 'debug', 'info', 'error', 'panic'. (of course we have to map the strings to the zap log level types)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, it's not a problem to implement additional option, i.e. etcd-client-log-level. I will do it.

But please have a look at this description, it seems to there is a logic to convert logr level to zapcore
https://github.com/go-logr/zapr/blob/6a622587cb9d0540da5cd87a37621cb08eaf8e67/zapr.go#L50-L53

Copy link
Member

@sbueringer sbueringer Jun 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. I thought zap has the following log levels:

  • Debug -1
  • Info 0
  • Warn 1
  • Error 2
  • DPanic 3
  • Panic 4
  • Fatal 5

https://github.com/uber-go/zap/blob/master/zapcore/level.go#L35-L52

What am I misinterpreting? :)

Copy link
Member Author

@dmvolod dmvolod Jun 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, description and code are bit confusing. Now I understand that your original approach is more suitable for our case, without SetLogger exporting.

Will implement it soon.

@dmvolod dmvolod force-pushed the redefine-etcd-client-logger branch from f287221 to 9bbad37 Compare June 9, 2025 08:47
@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jun 9, 2025
@dmvolod dmvolod force-pushed the redefine-etcd-client-logger branch from 9bbad37 to 35ee04e Compare June 9, 2025 09:24
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 10, 2025
@dmvolod dmvolod force-pushed the redefine-etcd-client-logger branch from d82f0f3 to 2e898fa Compare June 10, 2025 06:50
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 10, 2025
@dmvolod
Copy link
Member Author

dmvolod commented Jun 10, 2025

/test pull-cluster-api-test-main

@dmvolod
Copy link
Member Author

dmvolod commented Jun 13, 2025

/retest

@dmvolod dmvolod force-pushed the redefine-etcd-client-logger branch from 8734a21 to a64e1a1 Compare June 13, 2025 08:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/provider/control-plane-kubeadm Issues or PRs related to KCP cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants