Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add instructions for structured logging migration #4793

Merged
merged 1 commit into from
Jun 5, 2020

Conversation

serathius
Copy link
Contributor

@serathius serathius commented May 20, 2020

Ref kubernetes/enhancements#1602
/cc @44past4 @brancz @DirectXMan12 @thockin

Wrote instructions for structured logging migration.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. area/developer-guide Issues or PRs related to the developer guide sig/instrumentation Categorizes an issue or PR as relevant to SIG Instrumentation. labels May 20, 2020
@serathius serathius changed the title [WIP] Add instructions for structured logging migration Add instructions for structured logging migration May 25, 2020
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 25, 2020
Copy link
Member

@thockin thockin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wrote a lot of comments and then you addressed them later, so I deleted them. Good job :)

minimal design from [logr] thus there is no one-to-one mapping.

Simplified mapping between functions:
* `klog.Infof`, `klog.Info`, `klog.Infoln`, `klog.InfoDepth` -> `klog.InfoS`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "depth" ones are challenging because they presume hat popping up the call-stack is possible. I count about 8 such files that make these calls in k/k. Do you link we need an InfoS variant that includes depth? That would be ugly to add to logr because it tries to abstract multiple implementations, but maybe that's something we should tackle in logr and klog, if it is a missing capability...

Copy link
Contributor

@DirectXMan12 DirectXMan12 May 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, the list is:

directxman12@💻 ~[go:kubernetes] $ git rev-parse HEAD
02637bb25016c5362dc945406df0b2114868ecf2
directxman12@💻 ~[go:kubernetes] $ ag 'klog\.\w+Depth'
staging/src/k8s.io/component-base/logs/logs.go
49:     klog.InfoDepth(1, string(data))

staging/src/k8s.io/kubectl/pkg/util/logs/logs.go
42:     klog.InfoDepth(1, string(data))

staging/src/k8s.io/kubectl/pkg/cmd/util/helpers.go
93:             klog.FatalDepth(2, msg)

staging/src/k8s.io/client-go/tools/clientcmd/config.go
486:            klog.FatalDepth(1, err)

staging/src/k8s.io/apiserver/pkg/storage/etcd3/logger.go
35:     klog.InfoDepth(klogWrapperDepth, args...)
39:     klog.InfoDepth(klogWrapperDepth, fmt.Sprintln(args...))
43:     klog.InfoDepth(klogWrapperDepth, fmt.Sprintf(format, args...))
47:     klog.WarningDepth(klogWrapperDepth, args...)
51:     klog.WarningDepth(klogWrapperDepth, fmt.Sprintln(args...))
55:     klog.WarningDepth(klogWrapperDepth, fmt.Sprintf(format, args...))
59:     klog.ErrorDepth(klogWrapperDepth, args...)
63:     klog.ErrorDepth(klogWrapperDepth, fmt.Sprintln(args...))
67:     klog.ErrorDepth(klogWrapperDepth, fmt.Sprintf(format, args...))
71:     klog.FatalDepth(klogWrapperDepth, args...)
75:     klog.FatalDepth(klogWrapperDepth, fmt.Sprintln(args...))
79:     klog.FatalDepth(klogWrapperDepth, fmt.Sprintf(format, args...))

staging/src/k8s.io/apiserver/pkg/server/httplog/httplog.go
161:                    klog.InfoDepth(1, fmt.Sprintf("verb=%q URI=%q latency=%v resp=%v UserAgent=%q srcIP=%q: %v%v",
168:                    klog.InfoDepth(1, fmt.Sprintf("verb=%q URI=%q latency=%v UserAgent=%q srcIP=%q: hijacked",
198:            klog.InfoDepth(1, fmt.Sprintf("Unable to convert %+v into http.Flusher", rl.w))

staging/src/k8s.io/apimachinery/pkg/util/runtime/runtime.go
114:    klog.ErrorDepth(2, err)

pkg/kubelet/kubeletconfig/util/log/log.go
36:     klog.ErrorDepth(1, fmt.Sprintf(logFmt, s))
48:     klog.InfoDepth(1, fmt.Sprintf(logFmt, s))

Going through those,

  1. component-base is an implementation of io.Writer --> klog that preserves line information. Mainly used to capture stdlib logs to klog.

  2. the first kubectl is a copy of component-base (actually, I think it was vice-versa, but whatever)

  3. Conditionally calls normal logger if the log verbosity is high enough.

  4. GetConfigOrDie, attempting to provide useful info as to where the error came from. Better off calling ErrorS with a sink that produces stack traces.

  5. Extra wrapper for the etcd client to log to klog.

  6. Helper "middleware" for logging http requests. Trying to report caller of the logging helper.

  7. Helper for runtime.HandleError. Just use an implementation that reports stack traces with errors. Take care not to contemplate runtime.HandleError too deeply, or it will make you very, very sad.

  8. Inserts a prefix at the beginning of every line. Really should just go away with proper named loggers.

Several of those are semi-global logging wrappers (1, 2, 5) that are better off just having an implementation that can do caller-skip (e.g. zap). I'm not sure we want the caller-skip at the interface level, though.

3, 4, and 7 and probably just need to go away -- they're better dealt-with with implementations that capture backtraces on error.

8 should be replaced by proper prefix/logger name support, so that a wrapper is no longer needed

That just leaves 6, the middleware logger. @thockin's suggestion @ https://github.com/kubernetes/community/pull/4793/files#r431353200 is a good one, but removes the middleware part (a comment in the file suggests that the middleware-link functionality is probably superfluous). Along the same lines, something like klog.InfoS("http request", httplog.Request(req)) would work too, where httplog.Request is analagous to log.KObj.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Several of those are semi-global logging wrappers (1, 2, 5) that are better off just having an implementation that can do caller-skip (e.g. zap). I'm not sure we want the caller-skip at the interface level, though.

These ones seem like valid use-cases to me. In fact, I wrote the first version of (5), I think. The ability to strip frames from the callstack seems pretty necessary for any sort of logging util helper or adapter.

I don't quite grok your comment. How would an implementation that can skip callframes be utilized under something like logr without a function for it in the interface?

In other words, should logr and klog grow new methods for InfoDepth(depth, msg, string...) and ErrorDepth(depth, err, msg, string...) ?

Copy link
Contributor

@DirectXMan12 DirectXMan12 May 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... let me rephrase a bit. I think I misinterpreted somethings, and miscommunicated others. I've rewritten this comment a few times, so it's possible there's something I'm not quite articulating correctly.

I agree that they're valid use cases, but:

All the mentioned numbers follow a pattern of "here's a set of helper functions where all the functions have a consistent caller skip". The caller skip isn't varying per-function in the file. I was originally (above) suggesting something like

var logWrapper = klog.WithSkip(4)

func Info(msg) {
  logWrapper.InfoS(prefix+msg)
}

However, that only works when we're not redirecting to logr. That's not too big a deal -- it preserves the status quo, punts on putting things into our interface, and should be doable with the way the internals of klog work.

We could consider adding the WithSkip to logr, which makes the caller skip compose and feels a lot more like the rest of logr's interface. However, this is a bit annoying to implement in some logging backends (see below the fold), and still feels a bit brittle (again, below the fold).

[below the fold] Why I'm kinda against adding the extra Depth methods

My objections/uneasiness with this has two parts:

The first is that adding depth functions feels a lot like leaking implementation details through to the user. This kinda paints us into the corner in terms of always explicitly supporting caller skip like this -- I could see a world where we decide to drop caller in favor of logger names + stack traces for errors or a world where we decide to do "capture caller into context".

The second is that that per-function caller skip is an easy interface to accidentally misuse:

  • It's easy to miss/forget the Depth in one of your functions.

  • It doesn't compose (you can't wrap helpers in helpers and still get useful information, which is something I've seen with similar types of functionality in codebases like controller-runtime).

  • It's sufficiently brittle that I'd actively encourage folks not to use it -- it relies on exact code structure not changing, which isn't something we can enforce with the compiler, lint for, etc. In the case of the etcd logger, if the internals of the etcd client change, we suddenly lose the usefulness of our logging information. That's not really an interface I want people writing against.

Moving on to practical matters: how much we care about people actually being able to implement this, or would we rather they just ignore it? Zap doesn't support per-function caller skip OOTB, but it's doable with some hacking (you'd call entry := logger.Check(...); entry.Caller = /* calculate caller yourself */, or maybe could finagle a custom zapcore.Core to do it), logrus doesn't even do caller skip AFAICT, etc. Do we want to continue to support delegating to logr? It's not clear how much of a goal that it from the KEP (I think we should, or at least have some similar functionality).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could consider adding the WithSkip to logr

The problem I have with this is that it doesn't feel like a property ofthe
logger object, but of the call-site. I guess if you are wrapping a Logger in
some other type, you can argue that this IS part of the logger (or as you
characterized - it doesn't change). If there's a one-off logging helper
function, you can always make a temporary object (though that can be
allocation-heav if it is done a lot). E.g.

func logHelper(log logr.Logger, foo Foo) {
  logr.SkipFrames(1).Info("foo message", "address", foo.Addr(), "phone", foo.Phone)
}

per-function caller skip is an easy interface to accidentally misuse:

OK, you convinced me.

how much we care about people actually being able to implement this, or would we rather they just ignore it?

I suspect some log libs will handle it and some will no-op it. I think that's
fine, honestly. There will be pressure or there won't, but at least it is
possible to DTRT

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it can be alloc-heavy if it is done a lot

I would hope escape analysis and friends would prevent a heap allocation here.

minimal design from [logr] thus there is no one-to-one mapping.

Simplified mapping between functions:
* `klog.Infof`, `klog.Info`, `klog.Infoln`, `klog.InfoDepth` -> `klog.InfoS`
Copy link
Contributor

@DirectXMan12 DirectXMan12 May 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, the list is:

directxman12@💻 ~[go:kubernetes] $ git rev-parse HEAD
02637bb25016c5362dc945406df0b2114868ecf2
directxman12@💻 ~[go:kubernetes] $ ag 'klog\.\w+Depth'
staging/src/k8s.io/component-base/logs/logs.go
49:     klog.InfoDepth(1, string(data))

staging/src/k8s.io/kubectl/pkg/util/logs/logs.go
42:     klog.InfoDepth(1, string(data))

staging/src/k8s.io/kubectl/pkg/cmd/util/helpers.go
93:             klog.FatalDepth(2, msg)

staging/src/k8s.io/client-go/tools/clientcmd/config.go
486:            klog.FatalDepth(1, err)

staging/src/k8s.io/apiserver/pkg/storage/etcd3/logger.go
35:     klog.InfoDepth(klogWrapperDepth, args...)
39:     klog.InfoDepth(klogWrapperDepth, fmt.Sprintln(args...))
43:     klog.InfoDepth(klogWrapperDepth, fmt.Sprintf(format, args...))
47:     klog.WarningDepth(klogWrapperDepth, args...)
51:     klog.WarningDepth(klogWrapperDepth, fmt.Sprintln(args...))
55:     klog.WarningDepth(klogWrapperDepth, fmt.Sprintf(format, args...))
59:     klog.ErrorDepth(klogWrapperDepth, args...)
63:     klog.ErrorDepth(klogWrapperDepth, fmt.Sprintln(args...))
67:     klog.ErrorDepth(klogWrapperDepth, fmt.Sprintf(format, args...))
71:     klog.FatalDepth(klogWrapperDepth, args...)
75:     klog.FatalDepth(klogWrapperDepth, fmt.Sprintln(args...))
79:     klog.FatalDepth(klogWrapperDepth, fmt.Sprintf(format, args...))

staging/src/k8s.io/apiserver/pkg/server/httplog/httplog.go
161:                    klog.InfoDepth(1, fmt.Sprintf("verb=%q URI=%q latency=%v resp=%v UserAgent=%q srcIP=%q: %v%v",
168:                    klog.InfoDepth(1, fmt.Sprintf("verb=%q URI=%q latency=%v UserAgent=%q srcIP=%q: hijacked",
198:            klog.InfoDepth(1, fmt.Sprintf("Unable to convert %+v into http.Flusher", rl.w))

staging/src/k8s.io/apimachinery/pkg/util/runtime/runtime.go
114:    klog.ErrorDepth(2, err)

pkg/kubelet/kubeletconfig/util/log/log.go
36:     klog.ErrorDepth(1, fmt.Sprintf(logFmt, s))
48:     klog.InfoDepth(1, fmt.Sprintf(logFmt, s))

Going through those,

  1. component-base is an implementation of io.Writer --> klog that preserves line information. Mainly used to capture stdlib logs to klog.

  2. the first kubectl is a copy of component-base (actually, I think it was vice-versa, but whatever)

  3. Conditionally calls normal logger if the log verbosity is high enough.

  4. GetConfigOrDie, attempting to provide useful info as to where the error came from. Better off calling ErrorS with a sink that produces stack traces.

  5. Extra wrapper for the etcd client to log to klog.

  6. Helper "middleware" for logging http requests. Trying to report caller of the logging helper.

  7. Helper for runtime.HandleError. Just use an implementation that reports stack traces with errors. Take care not to contemplate runtime.HandleError too deeply, or it will make you very, very sad.

  8. Inserts a prefix at the beginning of every line. Really should just go away with proper named loggers.

Several of those are semi-global logging wrappers (1, 2, 5) that are better off just having an implementation that can do caller-skip (e.g. zap). I'm not sure we want the caller-skip at the interface level, though.

3, 4, and 7 and probably just need to go away -- they're better dealt-with with implementations that capture backtraces on error.

8 should be replaced by proper prefix/logger name support, so that a wrapper is no longer needed

That just leaves 6, the middleware logger. @thockin's suggestion @ https://github.com/kubernetes/community/pull/4793/files#r431353200 is a good one, but removes the middleware part (a comment in the file suggests that the middleware-link functionality is probably superfluous). Along the same lines, something like klog.InfoS("http request", httplog.Request(req)) would work too, where httplog.Request is analagous to log.KObj.

because in [logr] Error is used to inform users about errors received from subordinate function calls.
This means that `klog.ErrorS` should only be used when there is a obvious Golang `error` object available.

Creating an error just to call `klog.ErrorS` or passing `nil` is discouraged, please consider using `klog.InfoS` instead.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious as to why that is -- ErrorS in logr has some semantic differences (e.g. it captures stack traces). I think passing nil is semi-acceptable if the case is this is an error condition that deserves a stack trace, but this is the origin point of the error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, I will apply your suggestion.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that the current implementation of klog.ErrorS() calls logr.Info() if the err is nil. If we want to encourage people to pass nil I believe that this should be changed so that logr.Error() would be called.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One note that klog.ErrorS is not guaranteed to generate a stacktrace as by default it still calls klog.print

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, we just kinda note in logr that Error may take special action to deal with errors in logr (see below). For instance, I could see an impl that formatted errors for capture by special error recording tools (e.g. stackdriver error reporting or somesuch), ones that print stack traces (like zap), and so on. It's nice because it leaves us a way to do cool stuff like that in the future, even if we don't do it by default now.


FWIW, logr docs say:

Furthermore, certain implementations may choose to attach additional information (such as stack traces) on calls to Error, so it's preferred to use Error to log errors.

As part of structured logging migration we want to ensure that kubernetes objects references are consistent within the
codebase. Two new utility functions were introduced to klog `klog.KObj` and `klog.KRef`. Any reference
(name, uid, namespace) to Kubernetes Object (Pod, Node, Deployment, CRD) should be rewritten to utilize those functions.
In situations when object `UID` is would be beneficial for log, it should be added as separate field with `_uid` suffix.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not just have UID be a field in the output of KObj or have KObjWithUID?

Copy link
Contributor

@DirectXMan12 DirectXMan12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a couple of minor comments, but I think this looks pretty good.

I'm curious to see how things like UID practically play out -- can always adjust the plan when we see how relevant that is.

Functions with depth (`klog.InfoDepth`, `klog.WarningDepth`, `klog.ErrorDepth`, `klog.FatalDepth`) are used to indicate
that the source of the log (added as metadata in log) is different than the invocation of logging library. This is
usually used when implementing logging util functions. As logr interface doesn't support depth, those functions should
return logging arguments instead of calling `klog` directly.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be worth noting that there are situations where this isn't possible (mainly "adapters" and runtime.HandleError -- see my comment above), even if we decide to punt down the road on how to handle those cases

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe add a todo about expanding this section with the discussion above (we'll need it before beta)?

@serathius
Copy link
Contributor Author

/assign @logicalhan

@serathius
Copy link
Contributor Author

ping @logicalhan

@logicalhan
Copy link
Member

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 5, 2020
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: DirectXMan12, logicalhan, serathius

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 5, 2020
@k8s-ci-robot k8s-ci-robot merged commit 894a89a into kubernetes:master Jun 5, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/developer-guide Issues or PRs related to the developer guide cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/instrumentation Categorizes an issue or PR as relevant to SIG Instrumentation. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants