-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
util/log: new output format 'crdb-v2' #59087
Conversation
Remaining work:
|
0d4757c
to
19d82e0
Compare
this is now ready for review. |
133d6a2
to
c164d4a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work! Just one minor question, but this is looking ready to go.
Reviewed 3 of 3 files at r1, 22 of 22 files at r2.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @knz and @thtruo)
docs/generated/logformats.md, line 246 at r2 (raw file):
associated, for example header entries in file sinks, the counter position in the common prefix appears as a sequence of two ASCII space characters, which can be used to reliably identify this situation.
Can we reliably guarantee this? I'm thinking of the interpretation of this being "find the two spaces in a line to identify it as a log entry with no counter", and there being some log entry somewhere in the code that unintentionally writes two spaces as part of the message. Maybe I'm overthinking this.
pkg/util/log/logconfig/config.go, line 27 at r2 (raw file):
// DefaultFileFormat is the entry format for file sinks when not // specified in a configuration. const DefaultFileFormat = `crdb-v2`
🎉
This was not done previously as I had some doubt that maybe some log directive was using a format starting with `"%d ..."` i.e. a log message would start with a number and a space. However, `git grep` confirms this is not the case. So it is safe to extract the entry counter this way. Release note: None
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @itsbilal and @thtruo)
docs/generated/logformats.md, line 246 at r2 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
Can we reliably guarantee this? I'm thinking of the interpretation of this being "find the two spaces in a line to identify it as a log entry with no counter", and there being some log entry somewhere in the code that unintentionally writes two spaces as part of the message. Maybe I'm overthinking this.
Clarified the phrasing: yes, it is reliable. There's always a space before and after, so if there are two spaces in a row that means there was no counters.
This new format intends to address all the known shortcomings with `crdb-v1` while remaining compatible with entry parsers designed for the previous version. See the user-facing release note below for a summary of changes; and the included reference documentation for details. Example for a single-line unstructured entry. Before: ``` I210116 22:17:03.736236 57 cli/start.go:681 ⋮ node startup completed: ``` After: ``` I210116 22:17:03.736236 57 cli/start.go:681 ⋮ [-] 12 node startup completed: tag field now always included ^^^ entry counter now always included ^^^ ``` Example for a multi-line unstructured entry. Before: ``` I210116 22:15:38.105666 452 gossip/gossip.go:567 ⋮ [n1] 74 gossip status (ok, 1 node‹›) gossip client (0/3 cur/max conns) gossip connectivity n1 [sentinel]; ``` (subsequent lines lack a log entry prefix; hard to determine where entries start and end) After: ``` I210116 22:15:38.105666 452 gossip/gossip.go:567 ⋮ [n1] 74 gossip status (ok, 1 node‹›) I210116 22:15:38.105666 452 gossip/gossip.go:567 ⋮ [n1] 74 +gossip client (0/3 cur/max conns) I210116 22:15:38.105666 452 gossip/gossip.go:567 ⋮ [n1] 74 +gossip connectivity I210116 22:15:38.105666 452 gossip/gossip.go:567 ⋮ [n1] 74 + n1 [sentinel]; ^^^ common prefix repeated for each msg line same entry counter for every subsequent line ^^^ continuation marker "+" on susequent lines ^^ ``` Example for a structured entry. Before: ``` I210116 22:14:38.175829 469 util/log/event_log.go:32 ⋮ [n1] Structured entry: {...} ``` After: ``` I210116 22:14:38.175829 469 util/log/event_log.go:32 ⋮ [n1] 21 ={...} entry counter always present ^^ equal sign "=" denotes structured entry ^^ ``` Release note (cli change): The default output format for `file-group` and `stderr` sinks has been changed to `crdb-v2`. This new format is non-ambiguous and makes it possible to reliably parse log files. Refer to the format's documentation for details. Additionally, it prevents single log lines from exceeding a large size; this problem is inherent to the `crdb-v1` format and can prevent `cockroach debug zip` from retrieving v1 log files. The new format has also been designed so that existinglog file analyzers for the `crdb-v1` format can read entries written the new format. However, this conversion may be imperfect. Again, refer to the new format's documentation for details. In case of incompatibility, users can force the previous format by using `format: crdb-v1` in their logging configuration.
I also added a formatting test so we have side-by-side examples of entries with and without counters, to clarify. TFYR! bors r=itsbilal |
Build succeeded: |
Fixes #50166.
This new format intends to address all the known shortcomings with
crdb-v1
while remaining compatible with entry parsers designed for the previous version.See the user-facing release note below for a summary of changes; and the included reference documentation for details.
Example TTY output with colors:
Example for a single-line unstructured entry.
Before:
After:
Example for a multi-line unstructured entry.
Before:
(subsequent lines lack a log entry prefix; hard to determine where
entries start and end)
After:
Example for a structured entry.
Before:
After:
Release note (cli change): The default output format for
file-group
and
stderr
sinks has been changed tocrdb-v2
.This new format is non-ambiguous and makes it possible to reliably
parse log files. Refer to the format's documentation for
details. Additionally, it prevents single log lines from exceeding a
large size; this problem is inherent to the
crdb-v1
format and canprevent
cockroach debug zip
from retrieving v1 log files.The new format has also been designed so that existinglog file
analyzers for the
crdb-v1
format can read entries written the newformat. However, this conversion may be imperfect. Again, refer to
the new format's documentation for details.
In case of incompatibility, users can force the previous format by
using
format: crdb-v1
in their logging configuration.