-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
release-21.1: util/log,server/debug: fixes to logging and the logspy api endpoint #66328
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
tbg
approved these changes
Jun 11, 2021
knz
force-pushed
the
backport21.1-66096-61378
branch
3 times, most recently
from
June 11, 2021 16:00
01fe7d8
to
4688e49
Compare
Release note: None
This simplifies the test and makes it more extensible. Release note: None
…entries Prior to this change, we did not have a public data structure able to represent the various fields available in a `crdb-v2` log entry: the existing `logpb.Entry` was not able to distinguish structured and non-structured entries, and did not have the ability to delimit the message and the stack trace. This patch extends `logpb.Entry` to make it more able to represent `crdb-v2` entries, at least for the purpose of extending `debug merge-log` towards conversion between logging formats. Additionally, the patch adds a *best effort* attempt at populating the new fields in the `crdb-v1` entry parser. This is best effort because the crdb-v1 parser is lossy so we cannot faithfully parse its entries reliably. Release note: None
Release note (cli change): When log entries are written to disk, the first few header lines written at the start of every new file now report the configured logging format.
Release note (bug fix): Previously, `--log='file-defaults: {format: crdb-v1}'` was not handled properly. This has been fixed. This bug existed since v21.1.0.
This increases troubleshootability. Release note: None
Prior to this patch, the logspy mechanism was utterly broken: any time it was running, it would cut off any and all log entries from going to files, stderr, network sinks etc. This was a gross violation of the guarantees we have constructed around structured logging, as well as a security vulnerability (see release note below). Additionally, it was impossible to launch multiple logspy sessions simultaneously on the same node, for example using different grep filters. This commit rectifies the situation. At a technical level, we have two changes: one in the logging Go API and one in the `/debug/logspy` HTTP API. **For the logging changes**, this patch replaces the "interceptor" singleton callback function that takes over `(*loggerT).outputLogEntry()`, by a new *interceptor sink* that lives alongside the other sinks on every channel. Because the interceptor logic is now a regular sink, log entries continue to be delivered to other sinks while an interceptor is active. Reusing the sink abstraction, with its own sink configuration with no severity filter, clarifies that the interceptor accepts all the log entries regardless of which filters are configured on other sinks. Additionally, the interceptor sink now supports multiple concurrent interception sessions. Each log entry is delivered to all current interceptors. The `logspy` logic is adapted to use this facility, so that multiple `logspy` requests can run side-by-side. **For the HTTP API change**, we are changing the `/debug/logspy` semantics. This is explained in the release note below. Release note (security update): All the logging output to files or network sinks was previously disabled temporarily while an operator was using the `/debug/logspy` HTTP API, resulting in lost entries and a breach of auditability guarantees. This behavior has been corrected. Release note (bug fix): Log entries are not lost any more while the `/debug/logspy` HTTP API is being used. This bug had existed since CockroachDB v1.1. Release note (api change): The `/debug/logspy` HTTP API has changed. The endpoint now returns JSON data by default. This change is motivated as follows: - the previous format, `crdb-v1`, cannot be parsed reliably. - using JSON entries guarantees that the text of each entry all fits on a single line of output (newline characters inside the messages are escaped). This makes filtering easier and more reliable. - using JSON enables the user to apply `jq` on the output, for example via `curl -s .../debug/logspy | jq ...` If the previous format is desired, the user can pass the query argument `&flatten=1` to the `logspy` URL to obtain the previous flat text format (`crdb-v1`) instead. Co-authored-by: Yevgeniy Miretskiy <[email protected]>
Prior to this patch, any ongoing `/debug/logspy` query would trigger maximum verbosity in the logging package - i.e. cause all logging API calls under `log.V` to be activated. This was problematic, as it would cause a large amount of logging traffic to be pumped through the interceptor logic, increasing the chance for entries to be dropped (especially when `logspy?grep=...` is not used). Additionally, since the previous change to make the interceptor logic a regular sink, all the entries captured by the interceptors are now also duplicated to the other sinks (this is a feature / bug fix, as explained in the previous commit message). However, this change also meant that any ongoing `logspy` request would cause maximum verbosity to all the sinks with threshold INFO (most logging calls under `log.V` have severity INFO). For example, the DEV channel accepts all entries at level INFO or higher in the default config. This in turn could incur unacceptable disk space or IOPS consumption in certain deployments. In orde to mitigate this new problem, this patch removes the special conditional from the logging package. From this point forward, the verbosity of the entries delivered via `/debug/logspy` are those configured via `vmodule` -- no more and no less. To make this configurable for a running server, including one where the SQL endpoint may not be available yet, this patch also introduces a new `/debug/vmodule` HTTP API and extends `/debug/logspy` with the ability to temporarily change `vmodule` *upon explicit request*. Release note (api change): The `/debug/logspy` API does not any more enable maximum logging verbosity automatically. To change the verbosity, use the new `/debug/vmodule` endpoint or pass the `&vmodule=` query parameter to the `/debug/logspy` endpoint. For example, suppose you wish to run a 20s logspy session: - Before: ``` curl 'https://.../debug/logspy?duration=20s&...' ``` - Now: ``` curl 'https://.../debug/logspy?duration=20s&vmodule=...' ``` OR ``` curl 'https://.../debug/vmodule?duration=22s&vmodule=...' curl 'https://.../debug/logspy?duration=20s' ``` As for the regular `vmodule` command-line flag, the maximum verbosity across all the source code can be selected with the pattern `*=4`. Note: at most one in-flight HTTP API request is allowed to modify the `vmodule` parameter. This maintain the invariant that the configuration restored at the end of each request is the same as when the request started. Release note (api change): The new `/debug/vmodule` API makes it possible for an operator to configure the logging verbosity in a similar way as the SQL built-in function `crdb_internal.set_vmodule()`, or to query the current configuration as in `crdb_internal.get_vmodule()`. Additionally, any configuration change performed via this API can be automatically reverted after a configurable delay. The API forms are: - `/debug/vmodule` - retrieve the current configuration. - `/debug/vmodule?set=[vmodule config]&duration=[duration]` - change the configuration to `[vmodule config]` . The previous configuration at the time the `/debug/vmodule` request started is restored after `[duration]`. This duration, if not specified, defaults to twice the default duration of a `logspy` request (currently, the `logspy` default duration is 5s, so the `vmodule` default duration is 10s). If the duration is zero or negative, the previous configuration is never restored.
knz
force-pushed
the
backport21.1-66096-61378
branch
from
June 16, 2021 11:26
4688e49
to
4483626
Compare
This was referenced Aug 13, 2021
This was referenced Aug 13, 2021
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Backport:
logpb.Entry
with more fine grained fields" (util/log: extendlogpb.Entry
with more fine grained fields #66096)logspy
, new HTTP API to changevmodule
" (log,debug: fixlogspy
, new HTTP API to changevmodule
#61378)Please see individual PRs for details.
/cc @cockroachdb/release