Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

util/log: new output format 'crdb-v2' #59087

Merged
merged 2 commits into from
Jan 25, 2021
Merged

Conversation

knz
Copy link
Contributor

@knz knz commented Jan 16, 2021

Fixes #50166.

This new format intends to address all the known shortcomings with crdb-v1 while remaining compatible with entry parsers designed for the previous version.
See the user-facing release note below for a summary of changes; and the included reference documentation for details.

Example TTY output with colors:
image

Example for a single-line unstructured entry.

Before:

I210116 22:17:03.736236 57 cli/start.go:681 ⋮ node startup completed:

After:

I210116 22:17:03.736236 57 cli/start.go:681 ⋮ [-] 12  node startup completed:
              tag field now always included   ^^^
          entry counter now always included       ^^^

Example for a multi-line unstructured entry.

Before:

I210116 22:15:38.105666 452 gossip/gossip.go:567 ⋮ [n1] 74  gossip status (ok, 1 node‹›)
gossip client (0/3 cur/max conns)
gossip connectivity
  n1 [sentinel];

(subsequent lines lack a log entry prefix; hard to determine where
entries start and end)

After:

I210116 22:15:38.105666 452 gossip/gossip.go:567 ⋮ [n1] 74  gossip status (ok, 1 node‹›)
I210116 22:15:38.105666 452 gossip/gossip.go:567 ⋮ [n1] 74 +gossip client (0/3 cur/max conns)
I210116 22:15:38.105666 452 gossip/gossip.go:567 ⋮ [n1] 74 +gossip connectivity
I210116 22:15:38.105666 452 gossip/gossip.go:567 ⋮ [n1] 74 +  n1 [sentinel];
  ^^^ common prefix repeated for each msg line
       same entry counter for every subsequent line    ^^^
               continuation marker "+" on susequent lines ^^

Example for a structured entry.

Before:

I210116 22:14:38.175829 469 util/log/event_log.go:32 ⋮ [n1] Structured entry: {...}

After:

I210116 22:14:38.175829 469 util/log/event_log.go:32 ⋮ [n1] 21 ={...}
                            entry counter always present    ^^
                      equal sign "=" denotes structured entry ^^

Release note (cli change): The default output format for file-group
and stderr sinks has been changed to crdb-v2.

This new format is non-ambiguous and makes it possible to reliably
parse log files. Refer to the format's documentation for
details. Additionally, it prevents single log lines from exceeding a
large size; this problem is inherent to the crdb-v1 format and can
prevent cockroach debug zip from retrieving v1 log files.

The new format has also been designed so that existinglog file
analyzers for the crdb-v1 format can read entries written the new
format. However, this conversion may be imperfect. Again, refer to
the new format's documentation for details.

In case of incompatibility, users can force the previous format by
using format: crdb-v1 in their logging configuration.

@knz knz requested review from itsbilal and thtruo January 16, 2021 22:33
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@knz knz force-pushed the 20210116-fmt-v2 branch from eee6e50 to cd8a856 Compare January 16, 2021 22:34
@knz knz changed the title util/log: new output format 'crdb-v2' [WIP] util/log: new output format 'crdb-v2' Jan 16, 2021
@knz knz marked this pull request as draft January 16, 2021 22:40
@knz
Copy link
Contributor Author

knz commented Jan 16, 2021

Remaining work:

  • fix broken tests in util/log
  • fix broken tests elsewhere
  • (optional) break lines longer than X kilobytes (maybe 4K at a time)
  • (optional) truncate JSON payloads larger than X kilobytes and remove the json prefix for them
  • (best effort) try to extend the existing entry parser to merge subsequent lines that have the same counter
  • (optional) make it possible for merge-logs to output using the new format.

@knz knz force-pushed the 20210116-fmt-v2 branch 3 times, most recently from 0d4757c to 19d82e0 Compare January 22, 2021 15:37
@knz knz changed the title [WIP] util/log: new output format 'crdb-v2' util/log: new output format 'crdb-v2' Jan 22, 2021
@knz knz marked this pull request as ready for review January 22, 2021 15:37
@knz knz requested a review from a team as a code owner January 22, 2021 15:37
@knz
Copy link
Contributor Author

knz commented Jan 22, 2021

this is now ready for review.

@knz knz force-pushed the 20210116-fmt-v2 branch 4 times, most recently from 133d6a2 to c164d4a Compare January 23, 2021 14:20
Copy link
Member

@itsbilal itsbilal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Great work! Just one minor question, but this is looking ready to go.

Reviewed 3 of 3 files at r1, 22 of 22 files at r2.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @knz and @thtruo)


docs/generated/logformats.md, line 246 at r2 (raw file):

associated, for example header entries in file sinks, the counter position
in the common prefix appears as a sequence of two ASCII space characters,
which can be used to reliably identify this situation.

Can we reliably guarantee this? I'm thinking of the interpretation of this being "find the two spaces in a line to identify it as a log entry with no counter", and there being some log entry somewhere in the code that unintentionally writes two spaces as part of the message. Maybe I'm overthinking this.


pkg/util/log/logconfig/config.go, line 27 at r2 (raw file):

// DefaultFileFormat is the entry format for file sinks when not
// specified in a configuration.
const DefaultFileFormat = `crdb-v2`

🎉

This was not done previously as I had some doubt that maybe some log
directive was using a format starting with `"%d ..."` i.e. a log message
would start with a number and a space.

However, `git grep` confirms this is not the case. So it is safe to
extract the entry counter this way.

Release note: None
@knz knz force-pushed the 20210116-fmt-v2 branch from c164d4a to 1e326f0 Compare January 25, 2021 18:17
Copy link
Contributor Author

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @itsbilal and @thtruo)


docs/generated/logformats.md, line 246 at r2 (raw file):

Previously, itsbilal (Bilal Akhtar) wrote…

Can we reliably guarantee this? I'm thinking of the interpretation of this being "find the two spaces in a line to identify it as a log entry with no counter", and there being some log entry somewhere in the code that unintentionally writes two spaces as part of the message. Maybe I'm overthinking this.

Clarified the phrasing: yes, it is reliable. There's always a space before and after, so if there are two spaces in a row that means there was no counters.

This new format intends to address all the known shortcomings with
`crdb-v1` while remaining compatible with entry parsers designed for
the previous version.

See the user-facing release note below for a summary of changes; and
the included reference documentation for details.

Example for a single-line unstructured entry.

Before:
```
I210116 22:17:03.736236 57 cli/start.go:681 ⋮ node startup completed:
```

After:
```
I210116 22:17:03.736236 57 cli/start.go:681 ⋮ [-] 12  node startup completed:
              tag field now always included   ^^^
          entry counter now always included       ^^^
```

Example for a multi-line unstructured entry.

Before:
```
I210116 22:15:38.105666 452 gossip/gossip.go:567 ⋮ [n1] 74  gossip status (ok, 1 node‹›)
gossip client (0/3 cur/max conns)
gossip connectivity
  n1 [sentinel];
```

(subsequent lines lack a log entry prefix; hard to determine where
entries start and end)

After:
```
I210116 22:15:38.105666 452 gossip/gossip.go:567 ⋮ [n1] 74  gossip status (ok, 1 node‹›)
I210116 22:15:38.105666 452 gossip/gossip.go:567 ⋮ [n1] 74 +gossip client (0/3 cur/max conns)
I210116 22:15:38.105666 452 gossip/gossip.go:567 ⋮ [n1] 74 +gossip connectivity
I210116 22:15:38.105666 452 gossip/gossip.go:567 ⋮ [n1] 74 +  n1 [sentinel];
  ^^^ common prefix repeated for each msg line
       same entry counter for every subsequent line    ^^^
               continuation marker "+" on susequent lines ^^
```

Example for a structured entry.

Before:
```
I210116 22:14:38.175829 469 util/log/event_log.go:32 ⋮ [n1] Structured entry: {...}
```

After:
```
I210116 22:14:38.175829 469 util/log/event_log.go:32 ⋮ [n1] 21 ={...}
                            entry counter always present    ^^
                      equal sign "=" denotes structured entry ^^
```

Release note (cli change): The default output format for `file-group`
and `stderr` sinks has been changed to `crdb-v2`.

This new format is non-ambiguous and makes it possible to reliably
parse log files. Refer to the format's documentation for
details. Additionally, it prevents single log lines from exceeding a
large size; this problem is inherent to the `crdb-v1` format and can
prevent `cockroach debug zip` from retrieving v1 log files.

The new format has also been designed so that existinglog file
analyzers for the `crdb-v1` format can read entries written the new
format. However, this conversion may be imperfect. Again, refer to
the new format's documentation for details.

In case of incompatibility, users can force the previous format by
using `format: crdb-v1` in their logging configuration.
@knz knz force-pushed the 20210116-fmt-v2 branch from 1e326f0 to f0377c4 Compare January 25, 2021 18:20
@knz
Copy link
Contributor Author

knz commented Jan 25, 2021

I also added a formatting test so we have side-by-side examples of entries with and without counters, to clarify.

TFYR!

bors r=itsbilal

@craig
Copy link
Contributor

craig bot commented Jan 25, 2021

Build succeeded:

@craig craig bot merged commit 2ec7d92 into cockroachdb:master Jan 25, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

log: log lines collected by debug.zip seem to be truncated to 65K
3 participants