-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
⚠️ fix redundant values in log entries #2074
⚠️ fix redundant values in log entries #2074
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: zbindenren The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Hi @zbindenren. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/cc @sbueringer |
/ok-to-test |
@@ -319,7 +318,6 @@ func (blder *Builder) doController(r reconcile.Reconciler) error { | |||
log := log | |||
if req != nil { | |||
log = log.WithValues( | |||
gvk.Kind, klog.KRef(req.Namespace, req.Name), | |||
"namespace", req.Namespace, "name", req.Name, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"namespace", req.Namespace, "name", req.Name, | |
"namespace", req.Namespace, "name", req.Name, |
We should probably remove this line instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vincepri thanks for your time to review this.
If I understand correctly, your proposition would result in something like this:
level=INFO msg="creating certificate request" controller=certificate controllerGroup=example.com controllerKind=Certificate Certificate="{certificate-sample appl-k8s-e2etests1-e1}" reconcileID=8e0379c3-ec61-42a7-9bb1-09b19f57bfb6 rev=1
We would also have to remove this line. Otherwise we would still have redundant information in the log line. This would produce:
level=INFO msg="creating certificate request" controller=certificate controllerGroup=example.com Certificate="{certificate-sample appl-k8s-e2etests1-e1}" reconcileID=8e0379c3-ec61-42a7-9bb1-09b19f57bfb6 rev=1
In comparison my change:
level=INFO msg="creating certificate request" controller=certificate controllerGroup=example.com controllerKind=Certificate namespace=appl-k8s-e2etests1-e1 name=certificate-sample reconcileID=8e0379c3-ec61-42a7-9bb1-09b19f57bfb6 rev=1
If you are using indexing software like Splunk or Elastic Search, it would be harder to write queries for your proposition, since the Kind (Certificate
in our the case above) is now a key in the structured log. If you are reconciling for different kinds, we would have a key, that can change for different log entries.
@@ -319,7 +318,6 @@ func (blder *Builder) doController(r reconcile.Reconciler) error { | |||
log := log | |||
if req != nil { | |||
log = log.WithValues( | |||
gvk.Kind, klog.KRef(req.Namespace, req.Name), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is intentional btw, we wanted to not break existing users who rely on these fields and allow it to just work for ppl who rely on the klog field naming. Is there any issue with the redundancy? Most log shippers allow to drop fields if you are concerned about the size.
This is a breaking change btw since it might break log indexing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, if we consider log output as part of the controller-runtime's API, this is a breaking change. But if we can agree, that for fixing a 'bug', it is allowed to introduce a breaking change, then let me elaborate why I think this is a bug:
Using a value as key is the main issue I have with the above log entry. I think this is an anti pattern in structured logging. Let's take http requests as analogy. To log http requests with status codes we probably would log something like:
request_path="/api/v1/user/list" request_code="404" ...
and usually not:
/api/v1/user/list="404"
The above code would be ok in my opinion, if we rewrite it to something like:
log = log.WithValues(
"kindReference", klog.KRef(req.Namespace, req.Name),
)
But if we change it, we could remove it altogether.
Additionally it is probably easier to create search queries with the fields controllerKind
, namespace
and name
than with the klog field name Certificate
which changes for each kind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with the general sentiment to not put values into keys, but it has the advantage that you can search for logs of related to a given object type + name across controllers: #1826 (comment)
We basically had the same discussion we are having here on the issue before this was introduced
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume you refer to this quote:
By having namespace/name for all of them nested we can simply filter with cluster.name = "" across logs of multiple controllers. Otherwise it would be very hard to take the namespace and name from every controller correlate them to the kind they belong to and then filter across controllers.
The query suggested by @sbueringer is still possible with:
controllerKind=Cluster name=cluster-name namespace=namespace
And for just one controller:
controller=controller controllerKind=Cluster name=cluster-name namespace=namespace
So to summarize pros ✔️ and cons ❌:
If we merge this PR:
- ✔️ we get smaller log entries without redundant information
- ✔️ log entries are easier to read (no nested objects and less noise)
- ✔️ stop using a value as key in structured logging
- ✔️ same filter possibilities in splunk, loki or elastic search
- ❌ breaking change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zbindenren I agree with your reasoning in principle, but the problem is that we want to do this the same way as upstream does so log queries will work for both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@akutz and @sbueringer were the ones who cared about the previous format, lets give them a chance to voice their opinion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, @pohly not Andrew
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the very late response, I have to get my notifications a bit more under control.
As stated above the current format was intended as I described in: #1826
Log "<kind>": "<namespace>/<name>" instead of "namespace" and "name"
- Let's align to the structured logging KEP. Specifically to: https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/1602-structured-logging#object-reference-format-in-logs
- Note: the value is formatted differently for JSON/text format.
- This produces logs consistent with the logs in upstream Kubernetes
- It is a lot easier to correlate logs for a specific object across controllers with this format
Initially we used kind also with a first lower-case character but later on we realized that we can't do this correctly for arbitrary kind's:
- ⚠️ Fix broken kind logging #1954
- tl;dr APIService => aPIService instead of apiService
So yes we are not 100% consistent with upstream, but as far as we can be.
Independent of consistency, my reason for wanting the current format is the following:
In ClusterAPI we have various controllers and they are all reconciling related objects.
With the current loggin we end up with logs like this: (re-ordered keys to make it easier to understand)
MachineSet controller:
{"Cluster":{"name":"quick-start-h0uf1w","namespace":"quick-start-u7zz7r"}, "MachineDeployment":{"name":"quick-start-h0uf1w-md-0","namespace":"quick-start-u7zz7r"}, "MachineSet":{"name":"quick-start-h0uf1w-md-0-6c856bb876","namespace":"quick-start-u7zz7r"},"ts":1673599028821.286,"caller":"machineset/machineset_controller.go:402","msg":"MachineSet is scaling up to 1 replicas by creating 1 machines","v":0,"controller":"machineset","controllerGroup":"cluster.x-k8s.io","controllerKind":"MachineSet","namespace":"quick-start-u7zz7r","name":"quick-start-h0uf1w-md-0-6c856bb876","reconcileID":"7f5cf231-61cb-4cb6-8d28-ac02d709d1ff","replicas":1,"machineCount":0}
Cluster controller
{"Cluster":{"name":"clusterclass-changes-pbb08r","namespace":"clusterclass-changes-xo727h"},"ts":1673599028749.3906,"caller":"cluster/cluster_controller.go:187","msg":"Waiting for the topology to be generated","v":0,"controller":"cluster","controllerGroup":"cluster.x-k8s.io","controllerKind":"Cluster","namespace":"clusterclass-changes-xo727h","name":"clusterclass-changes-pbb08r","reconcileID":"bd128734-0120-4035-964b-e36d0624351a"}
So essentially across controllers we are adding key/value pairs of the objects involved, like {"Cluster":{"name":"clusterclass-changes-pbb08r","namespace":"clusterclass-changes-xo727h"}
, "MachineDeployment":{"name":"quick-start-h0uf1w-md-0","namespace":"quick-start-u7zz7r"}
...
This makes it very easy to query logs across controller, by just querying for log entries with Cluster.name == <cluster-name>
If we just would want to query the logs of a single controller you're format would work as well, but for cross-refencing it makes the queries very complicated in my opinion.
Regarding using values as keys
I don't know where that comes from and what the background is. Is it the same concern about cardinality with labels in metrics? Essentially if you use values as log keys you can end up with very high cardinality?
I'm not sure that this is true in this case, because the amount of kind's that we are reconciling shouldn't be that excessive to become a problem.
(also described here: #1826 (comment))
Regarding redundancy and easier to ready
I agree if folks are reading logs via kubectl less redundancy would indeed help. I think via Kibana / Grafana / .. it doesn't make a difference.
P.S. Just to make sure we're on the same page. We are discussing what the default / out-of-the box behavior should be. It is always possible to use different k/v pairs by setting LogConstructor
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your answer.
Regarding your reason for wanting the current format
Your query could also be created without the redundant key like:
controllerKind=Cluster name=<cluster-name>
The above query would show the logs for multiple controllers. If you want to limit the controller you would need to do the following:
controller=machineset controllerKind=Cluster name=<cluster-name>
or
controller=cluster controllerKind=Cluster name=<cluster-name>
To me this doesn't look to complicated. Or do I misunderstand something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
controllerKind
is different for each controller so I can't use controllerKind=Cluster
to filter for logs related to a Cluster across controllers.
controllerKind=Cluster name=<cluster-name>
would only match the logs from the Cluster controller but not the logs from the MachineSet controller
Essentially in the MachineSet controller the controllerKind is MachineSet and also name and namespace are from a MachineSet. In the Cluster controller the controllerKind is Cluster.
Not sure whether it is relevant here, but in case that you haven't seen it: progress has been made with converting the in-tree controllers to structured, contextual logging. See the table in kubernetes/enhancements#3077. PRs are currently under review for 1.27. |
@zbindenren: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
With the current log configuration we get following reconcile log entries:
level=INFO msg="creating certificate request" controller=certificate controllerGroup=example.com controllerKind=Certificate Certificate="{certificate-sample appl-k8s-e2etests1-e1}" namespace=appl-k8s-e2etests1-e1 name=certificate-sample reconcileID=8e0379c3-ec61-42a7-9bb1-09b19f57bfb6 rev=1
As you can see, we have an entry
Certificate="{certificate-sample appl-k8s-e2etests1-e1}"
, but this information is redundant, because we already have:controllerKind=Certificate namespace=appl-k8s-e2etests1-e1 name=certificate-sample
If you want to search for this entry, it is probably easier and more flexible to filter by
controllerKind
,namespace
andname
.The above commit changes the log entry to:
level=INFO msg="creating certificate request" controller=certificate controllerGroup=example.com controllerKind=Certificate namespace=appl-k8s-e2etests1-e1 name=certificate-sample reconcileID=8e0379c3-ec61-42a7-9bb1-09b19f57bfb6 rev=1