
server: Implement logsink for the system event logs. #46143

Closed · 1 commit

Conversation


@miretskiy miretskiy commented Mar 16, 2020

server: Implement logsink for the system event logs.

Informs #45643

Cockroach server logs important system events into the eventlog table.
These events are exposed on the web UI. However, operators often
want to see those global events while tailing a log file on a single
node.

Implement a mechanism for the server running on each node
to emit those system events into the server log file.

If system log scanning is enabled (via the server.eventlogsink.enabled
setting), each node scans the system log table periodically, at the
interval given by the server.eventlogsink.period setting.

For example, below is a single system event emitted to the regular log file:
I200323 .... [n1] system.eventlog:n=1:'set_cluster_setting':2020-03-23 19:24:29.948279
+0000 UTC '{"SettingName":"server.eventlogsink.max_entries","Value":"101","User":"root"}'
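
To make the mechanism concrete, here is a minimal, self-contained sketch of such a polling loop, with plain variables standing in for the two cluster settings; all names here are illustrative, not taken from this PR's diff:

	package main

	import (
		"context"
		"log"
		"time"
	)

	// Plain stand-ins for the cluster settings named above; the real code
	// would read them from the cluster settings registry on every tick.
	var (
		sinkEnabled = true
		sinkPeriod  = 10 * time.Second // the default discussed in review
	)

	// pollEventLog wakes up every sinkPeriod and, when the sink is enabled,
	// calls scan to read new system.eventlog rows and write them to the log.
	func pollEventLog(ctx context.Context, scan func(context.Context) error) {
		t := time.NewTicker(sinkPeriod)
		defer t.Stop()
		for {
			select {
			case <-ctx.Done():
				return
			case <-t.C:
				if !sinkEnabled {
					continue
				}
				if err := scan(ctx); err != nil {
					log.Printf("eventlog sink: scan failed: %v", err)
				}
			}
		}
	}

	func main() {
		ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
		defer cancel()
		pollEventLog(ctx, func(context.Context) error {
			log.Println("scanning system.eventlog (stub)")
			return nil
		})
	}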

There is no guarantee that all events from the system log will be emitted.
In particular, upon node restart, we only emit events that were generated
from that point on. Also, if for whatever reason we start emitting
too many system log messages, then only up to the
server.eventlogsink.max_entries (default 100) most recent events will be emitted.
However, if we think we have "dropped" some events due to configuration settings,
we will indicate so in the log.
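
The scan implies a query of roughly the following shape; the column list matches system.eventlog, but the watermark and ordering are inferences from this description, not the actual query from the diff:

	package main

	// Hypothetical per-scan query: fetch events newer than the last
	// timestamp this node emitted, capped by max_entries.
	const eventScanQuery = `
	SELECT timestamp, "eventType", "targetID", "reportingID", info
	  FROM system.eventlog
	 WHERE timestamp > $1
	 ORDER BY timestamp
	 LIMIT $2`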

Release notes (feature): Log system-wide events into the cockroach.log
file on every node.

@miretskiy miretskiy requested a review from knz March 16, 2020 12:45
@cockroach-teamcity (Member) commented:

This change is Reviewable

@knz knz (Contributor) left a comment:

Please explain the work in more detail in the commit message:

  • how does it work
  • the list of all cluster settings
  • how to use them
  • an example output
  • what happens if a node gets restarted. Do you then get duplicate entries in the log file? Is there a way to recognize this? (Maybe the very first poll after a node restart should indicate that the logger is "catching up" and thus there may be duplicate entries.)

Reviewed 3 of 3 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @miretskiy)


pkg/server/server_systemlog_logsink.go, line 64 at r1 (raw file):

	p := eventlogSinkPeriod.Get(sv)
	if p < 100*time.Millisecond {

I would recommend setting the minimum to 1 or even 5 seconds. In geo-distributed clusters, we don't want cross-region queries more frequently than that.
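
Sketched in code, the fail-safe clamp under discussion could look like the following; the floor value is exactly what this thread is negotiating:

	package main

	import "time"

	// effectivePeriod clamps the configured scan period to a floor so that
	// a misconfigured (too small) setting cannot turn the periodic scan
	// into a self-inflicted DoS.
	func effectivePeriod(configured, floor time.Duration) time.Duration {
		if configured < floor {
			return floor
		}
		return configured
	}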


pkg/server/server_systemlog_logsink.go, line 148 at r1 (raw file):

		if int64(len(rows)) == maxEntries {
			log.VEventf(ctx, 1, "Some log entries may have been dropped")

  1. the log message is too vague.
  2. you should pass limit+1 to detect an overflow

@miretskiy (Contributor, Author) commented:

@knz have you had a chance to take a look at this PR?

@knz knz (Contributor) commented Mar 19, 2020:

See my review above: #46143 (review)

@miretskiy miretskiy (Contributor, Author) left a comment:

Sorry, I missed it somehow...

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @miretskiy)

@miretskiy miretskiy (Contributor, Author) left a comment:

Description updated.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz)


pkg/server/server_systemlog_logsink.go, line 64 at r1 (raw file):

Previously, knz (kena) wrote…

I would recommend setting the minimum to 1 or even 5 seconds. In geo-distributed clusters, we don't want cross-region queries more frequently than that.

The default is 10 seconds. This check here is really a fail-safe mechanism so that we don't cause a DoS against ourselves.
I guess if you're in some crazy debugging mode, then perhaps setting this to a low value might actually make sense. If we set the min to 5 seconds, then it would be impossible to set this lower even if we wanted to. I think 100ms is appropriate for a fail-safe, but if you feel strongly, perhaps going up to 1 sec is okay.


pkg/server/server_systemlog_logsink.go, line 148 at r1 (raw file):

Previously, knz (kena) wrote…
  1. the log message is too vague.
  2. you should pass limit+1 to detect an overflow
  1. Log messages changed; wdyt?

  2. Are you worried about len() returning wider than int64? It's not clear how/where you see the possibility of an overflow.

@miretskiy (Contributor, Author) commented:

@knz pls take another look?

@knz knz (Contributor) commented Mar 27, 2020:

I have not forgotten! But it's been a bit hectic this week. I will look at this.

@knz knz (Contributor) left a comment:

Ok so progress here is good.
There are a few more points of interest:

  • the include_events setting is not yet explained in the commit message
  • the release note should include 1) an explanation of why it's important and what problem it solves, 2) a compact explanation of how it works, and 3) a concise summary of its limitations

Finally, I am still not very comfortable with the limitations of this implementation.

Our users have requested this work because they want to see their events in log files. The implementation you are providing here can miss events altogether, as well as report duplicate events in the case of a small backward clock jump (because events use the system clock, not the HLC).
I agree the latter is practically unlikely, but the fact these exceptions exist will need to be documented, and it creates special cases for the user to think about.

In the meantime, I do not understand what prevents a user from setting up a changefeed manually on system.eventlog. That way they can get the data out reliably. What were the obstacles to that?

Reviewed 1 of 1 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @miretskiy)


pkg/server/server_systemlog_logsink.go, line 64 at r1 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

The default is 10 seconds. This check here is really a fail-safe mechanism so that we don't cause a DoS against ourselves.
I guess if you're in some crazy debugging mode, then perhaps setting this to a low value might actually make sense. If we set the min to 5 seconds, then it would be impossible to set this lower even if we wanted to. I think 100ms is appropriate for a fail-safe, but if you feel strongly, perhaps going up to 1 sec is okay.

If there are 10 nodes, 100ms will cause 1000 kv requests per second to that range. That's a hot spot.
Meanwhile, 10 nodes is actually pretty common (3 DCs x 3 nodes per DC).

I would move for a minimum of 1 second.


pkg/server/server_systemlog_logsink.go, line 148 at r1 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…
  1. Log messages changed; wdyt?

  2. Are you worried about len() returning wider than int64? It's not clear how/where you see the possibility of an overflow.

Regarding the limit. Your code here wants to detect when the query has hit the limit.
Suppose the caller passed the value maxEntries == 10.

The way you implemented this currently does not distinguish between:
a) there are exactly 10 entries in the event log, and the overflow is not reached;
b) there were more than 10 entries in the event log, so an overflow is reached and should be reported.

My point is that if you were to pass maxEntries+1 in the SELECT queries, then you can check if len(rows) > maxEntries here and get a more accurate test.
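
In code, the suggestion amounts to something like this hypothetical sketch (the row type and function name are illustrative, not from the diff):

	package main

	// eventRow stands in for a decoded system.eventlog row.
	type eventRow struct{ /* timestamp, eventType, info, ... */ }

	// trimAndDetectOverflow assumes the SELECT was issued with LIMIT
	// maxEntries+1: receiving more than maxEntries rows then proves the cap
	// was exceeded, which LIMIT maxEntries alone cannot distinguish from
	// "exactly maxEntries rows exist".
	func trimAndDetectOverflow(rows []eventRow, maxEntries int) ([]eventRow, bool) {
		if len(rows) > maxEntries {
			return rows[:maxEntries], true // overflow: report dropped events
		}
		return rows, false
	}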


pkg/server/server_systemlog_logsink_test.go, line 82 at r1 (raw file):

	// Scan all entries.
	_ = sqlDB.Exec(t, "set cluster setting server.eventlogsink.max_entries=-1")

Cluster settings propagate asynchronously. You need to add some wait after each setting update, otherwise your test will be flaky.
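
One way to implement such a wait, sketched here with CockroachDB's testutils.SucceedsSoon retry helper (the helper below and its caller are illustrative, not from the diff):

	package server_test

	import (
		"testing"

		"github.com/cockroachdb/cockroach/pkg/testutils"
		"github.com/cockroachdb/errors"
	)

	// waitForSettingInt64 retries until the in-memory value of a setting
	// matches the value just written via SET CLUSTER SETTING, so the test
	// does not race with asynchronous settings propagation.
	func waitForSettingInt64(t *testing.T, get func() int64, want int64) {
		t.Helper()
		testutils.SucceedsSoon(t, func() error {
			if have := get(); have != want {
				return errors.Errorf("setting not propagated: have %d, want %d", have, want)
			}
			return nil
		})
	}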

Fixes cockroachdb#45643

Cockroach server logs important system events into the eventlog table.
These events are exposed on the web UI. However, operators often
want to see those global events while tailing a log file on a single
node.

Implement a mechanism for the server running on each node
to emit those system events into the server log file.

If system log scanning is enabled (via the server.eventlogsink.enabled
setting), each node scans the system log table periodically, at the
interval given by the server.eventlogsink.period setting.

For example, below is a single system event emitted to the regular log file:
  I200323 .... [n1] system.eventlog:n=1:'set_cluster_setting':2020-03-23 19:24:29.948279
    +0000 UTC '{"SettingName":"server.eventlogsink.max_entries","Value":"101","User":"root"}'

There is no guarantee that all events from the system log will be emitted.
In particular, upon node restart, we only emit events that were generated
from that point on. Also, if for whatever reason we start emitting
too many system log messages, then only up to the
server.eventlogsink.max_entries (default 100) most recent events will be emitted.
If we think we have "dropped" some events due to configuration settings,
we will indicate so in the log.

Administrators may choose to restrict the set of events emitted
by changing the server.eventlogsink.include_events and/or
server.eventlogsink.exclude_events settings. These settings specify
regular expressions used to include or exclude events with matching
event types.
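
The filtering semantics implied by this description might look like the sketch below; the precedence between the two settings is an assumption, not taken from the diff:

	package main

	import "regexp"

	// shouldEmit reports whether an event type passes the filters: it must
	// match include_events (when set) and must not match exclude_events
	// (when set).
	func shouldEmit(eventType string, include, exclude *regexp.Regexp) bool {
		if include != nil && !include.MatchString(eventType) {
			return false
		}
		if exclude != nil && exclude.MatchString(eventType) {
			return false
		}
		return true
	}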

Release notes (feature): Log system-wide events into the cockroach.log
file on every node.

This feature allows an administrator logged in to one of the
nodes to monitor that node's log file and see important "system" events,
such as table/index creation, schema change jobs, etc.

To use this feature, the server.eventlogsink.enabled setting must
be set to true.
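
As a usage illustration, the setting can be toggled from any SQL session; below is a hedged Go sketch using database/sql and the lib/pq driver (the connection string is a placeholder for a local insecure node):

	package main

	import (
		"database/sql"
		"log"

		_ "github.com/lib/pq" // CockroachDB speaks the Postgres wire protocol
	)

	func main() {
		db, err := sql.Open("postgres",
			"postgresql://root@localhost:26257/?sslmode=disable")
		if err != nil {
			log.Fatal(err)
		}
		defer db.Close()
		// Setting name comes from the commit message above.
		if _, err := db.Exec(
			`SET CLUSTER SETTING server.eventlogsink.enabled = true`); err != nil {
			log.Fatal(err)
		}
	}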
@miretskiy miretskiy (Contributor, Author) left a comment:

Commit message and release note updated.

Re your other concerns: yes, there are limitations, though for the system events those limitations are not
super terrible -- after all, you can always query the syslog table if you suspect that some events have been missed
(and of course, I do try to emit a warning message if some events were missed).

Regarding changefeeds -- I tried that. There are a few things currently in the way:

  1. you cannot start a changefeed on system tables (see the sketch after this list)
  2. you cannot start a changefeed on tables with more than 1 locality group.
    It's pretty strange to me, but the syslog table has 5(!) locality groups, for no reason that I can see.
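
For concreteness, the statement such a changefeed would use is sketched below; per point 1 above, it was rejected on system tables at the time (the sink URL is a placeholder):

	package main

	// The changefeed route described above, as one would attempt it.
	const changefeedOnEventLog = `
	CREATE CHANGEFEED FOR TABLE system.eventlog INTO 'kafka://host:9092'`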

All of those limitations will likely be solved in time, and it would have been my preference to use a changefeed.
This quick-and-dirty implementation is just that -- a quick and dirty implementation that can show these log messages right now.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz)


pkg/server/server_systemlog_logsink.go, line 64 at r1 (raw file):

Previously, knz (kena) wrote…

If there are 10 nodes, 100ms will cause 1000 kv requests per second to that range. That's a hot spot.
Meanwhile, 10 nodes is actually pretty common (3 DCs x 3 nodes per DC).

I would move for a minimum of 1 second.

Done.


pkg/server/server_systemlog_logsink.go, line 148 at r1 (raw file):

Previously, knz (kena) wrote…

Regarding the limit. Your code here wants to detect when the query has hit the limit.
Suppose the caller passed the value maxEntries == 10.

The way you implemented this currently does not distinguish between:
a) there are exactly 10 entries in the event log, and the overflow is not reached;
b) there were more than 10 entries in the event log, so an overflow is reached and should be reported.

My point is that if you were to pass maxEntries+1 in the SELECT queries, then you can check if len(rows) > maxEntries here and get a more accurate test.

Good point. Done


pkg/server/server_systemlog_logsink_test.go, line 82 at r1 (raw file):

Previously, knz (kena) wrote…

Cluster settings propagate asynchronously. You need to add some wait after each setting update, otherwise your test will be flaky.

Hmm... I ran the test 1000 times -- 0 flakes.
Regardless, it's a good point -- I changed it to set the settings directly.

@knz knz (Contributor) left a comment:

Code changes LGTM.
However, regarding the discussion about desirability, please look at my last comment here:

#45643 (comment)

I'd like to kindly request that you discuss this again with your team in light of Peter's latest input, and only then decide where you want to go with this.

Reviewed 3 of 3 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

@miretskiy miretskiy closed this Apr 1, 2020
@miretskiy miretskiy deleted the eventlog branch April 27, 2020 14:21