
server: Implement logsink for the system event logs. #46143

Closed · 1 commit

Conversation


@miretskiy miretskiy commented Mar 16, 2020

server: Implement logsink for the system event logs.

Informs #45643

Cockroach server logs important system events into the eventlog table.
These events are exposed on the web UI. However, operators often
want to see those global events while tailing a log file on a single
node.

Implement a mechanism for the server running on each node
to emit those system events into the server log file.

If system log scanning is enabled (via the server.eventlogsink.enabled
setting), each node scans the system log table periodically, at the
interval given by the server.eventlogsink.period setting.

For example, below is a single system event emitted to the regular log file:
I200323 .... [n1] system.eventlog:n=1:'set_cluster_setting':2020-03-23 19:24:29.948279
+0000 UTC '{"SettingName":"server.eventlogsink.max_entries","Value":"101","User":"root"}'
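
To make the mechanism concrete, here is a minimal, self-contained sketch of such a polling loop, with plain variables standing in for the two cluster settings; all names here are illustrative, not taken from this PR's diff:

	package main

	import (
		"context"
		"log"
		"time"
	)

	// Plain stand-ins for the cluster settings named above; the real code
	// would read them from the cluster settings registry on every tick.
	var (
		sinkEnabled = true
		sinkPeriod  = 10 * time.Second // the default discussed in review
	)

	// pollEventLog wakes up every sinkPeriod and, when the sink is enabled,
	// calls scan to read new system.eventlog rows and write them to the log.
	func pollEventLog(ctx context.Context, scan func(context.Context) error) {
		t := time.NewTicker(sinkPeriod)
		defer t.Stop()
		for {
			select {
			case <-ctx.Done():
				return
			case <-t.C:
				if !sinkEnabled {
					continue
				}
				if err := scan(ctx); err != nil {
					log.Printf("eventlog sink: scan failed: %v", err)
				}
			}
		}
	}

	func main() {
		ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
		defer cancel()
		pollEventLog(ctx, func(context.Context) error {
			log.Println("scanning system.eventlog (stub)")
			return nil
		})
	}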

There is no guarantee that all events from the system log will be emitted.
In particular, upon node restart, we only emit events that were generated
from that point on. Also, if for whatever reason we start emitting
too many system log messages, then only up to the
server.eventlogsink.max_entries (default 100) most recent events will be emitted.
However, if we think we have "dropped" some events due to configuration settings,
we will indicate so in the log.
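
The scan implies a query of roughly the following shape; the column list matches system.eventlog, but the watermark and ordering are inferences from this description, not the actual query from the diff:

	package main

	// Hypothetical per-scan query: fetch events newer than the last
	// timestamp this node emitted, capped by max_entries.
	const eventScanQuery = `
	SELECT timestamp, "eventType", "targetID", "reportingID", info
	  FROM system.eventlog
	 WHERE timestamp > $1
	 ORDER BY timestamp
	 LIMIT $2`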

Release notes (feature): Log system-wide events into the cockroach.log
file on every node.

@miretskiy miretskiy requested a review from knz March 16, 2020 12:45
@cockroach-teamcity (Member) commented:

This change is Reviewable

@knz knz (Contributor) left a comment:

Please explain the work in more detail in the commit message:

  • how does it work
  • the list of all cluster settings
  • how to use them
  • an example output
  • what happens if a node gets restarted. Do you then get duplicate entries in the log file? Is there a way to recognize this? (Maybe the very first poll after a node restart should indicate that the logger is "catching up" and thus there may be duplicate entries.)

Reviewed 3 of 3 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @miretskiy)


pkg/server/server_systemlog_logsink.go, line 64 at r1 (raw file):

	p := eventlogSinkPeriod.Get(sv)
	if p < 100*time.Millisecond {

I would recommend setting the minimum to 1 or even 5 seconds. In geo-distributed clusters, we don't want cross-region queries more frequently than that.
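
Sketched in code, the fail-safe clamp under discussion could look like the following; the floor value is exactly what this thread is negotiating:

	package main

	import "time"

	// effectivePeriod clamps the configured scan period to a floor so that
	// a misconfigured (too small) setting cannot turn the periodic scan
	// into a self-inflicted DoS.
	func effectivePeriod(configured, floor time.Duration) time.Duration {
		if configured < floor {
			return floor
		}
		return configured
	}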


pkg/server/server_systemlog_logsink.go, line 148 at r1 (raw file):

		if int64(len(rows)) == maxEntries {
			log.VEventf(ctx, 1, "Some log entries may have been dropped")

  1. the log message is too vague.
  2. you should pass limit+1 to detect an overflow

@miretskiy (Contributor, Author) commented:

@knz have you had a chance to take a look at this PR?

@knz knz (Contributor) commented Mar 19, 2020:

See my review above: #46143 (review)

@miretskiy miretskiy (Contributor, Author) left a comment:

Sorry, I missed it somehow...

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @miretskiy)

@miretskiy miretskiy (Contributor, Author) left a comment:

Description updated.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz)


pkg/server/server_systemlog_logsink.go, line 64 at r1 (raw file):

Previously, knz (kena) wrote…

I would recommend setting the minimum to 1 or even 5 seconds. In geo-distributed clusters, we don't want cross-region queries more frequently than that.

The default is 10 seconds. This check here is really a fail-safe mechanism so that we don't cause a DoS against ourselves.
I guess if you're in some crazy debugging mode, then perhaps setting this to a low value might actually make sense. If we set the min to 5 seconds, then it would be impossible to set this lower even if we wanted to. I think 100ms is appropriate for a fail-safe, but if you feel strongly, perhaps going up to 1 sec is okay.


pkg/server/server_systemlog_logsink.go, line 148 at r1 (raw file):

Previously, knz (kena) wrote…
  1. the log message is too vague.
  2. you should pass limit+1 to detect an overflow
  1. Log messages changed; wdyt?

  2. Are you worried about len() returning wider than int64? It's not clear how/where you see the possibility of an overflow.

@miretskiy (Contributor, Author) commented:

@knz pls take another look?

@knz knz (Contributor) commented Mar 27, 2020:

I have not forgotten! But it's been a bit hectic this week. I will look at this.

@knz knz (Contributor) left a comment:

Ok so progress here is good.
There are a few more points of interest:

  • the include_events setting is not yet explained in the commit message
  • the release note should include 1) an explanation of why it's important and what problem it solves, 2) a compact explanation of how it works, and 3) a concise summary of its limitations

Finally, I am still not very comfortable with the limitations of this implementation.

Our users have requested this work because they want to see their events in log files. The implementation you are providing here can miss events altogether, as well as report duplicate events in the case of a small backward clock jump (because events use the system clock, not the HLC).
I agree the latter is practically unlikely, but the fact these exceptions exist will need to be documented, and it creates special cases for the user to think about.

In the meantime, I do not understand what prevents a user from setting up a changefeed manually on system.eventlog. That way they can get the data out reliably. What were the obstacles to that?

Reviewed 1 of 1 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @miretskiy)


pkg/server/server_systemlog_logsink.go, line 64 at r1 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

The default is 10 seconds. This check here is really a fail-safe mechanism so that we don't cause a DoS against ourselves.
I guess if you're in some crazy debugging mode, then perhaps setting this to a low value might actually make sense. If we set the min to 5 seconds, then it would be impossible to set this lower even if we wanted to. I think 100ms is appropriate for a fail-safe, but if you feel strongly, perhaps going up to 1 sec is okay.

If there are 10 nodes, 100ms will cause 1000 kv requests per second to that range. That's a hot spot.
Meanwhile, 10 nodes is actually pretty common (3 DCs x 3 nodes per DC).

I would move for a minimum of 1 second.


pkg/server/server_systemlog_logsink.go, line 148 at r1 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…
  1. Log messages changed; wdyt?

  2. Are you worried about len() returning wider than int64? It's not clear how/where you see the possibility of an overflow.

Regarding the limit. Your code here wants to detect when the query has hit the limit.
Suppose the caller passed the value maxEntries == 10.

The way you implemented this currently does not distinguish between:
a) there are exactly 10 entries in the event log, and the overflow is not reached;
b) there were more than 10 entries in the event log, so an overflow is reached and should be reported.

My point is that if you were to pass maxEntries+1 in the SELECT queries, then you can check if len(rows) > maxEntries here and get a more accurate test.
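
In code, the suggestion amounts to something like this hypothetical sketch (the row type and function name are illustrative, not from the diff):

	package main

	// eventRow stands in for a decoded system.eventlog row.
	type eventRow struct{ /* timestamp, eventType, info, ... */ }

	// trimAndDetectOverflow assumes the SELECT was issued with LIMIT
	// maxEntries+1: receiving more than maxEntries rows then proves the cap
	// was exceeded, which LIMIT maxEntries alone cannot distinguish from
	// "exactly maxEntries rows exist".
	func trimAndDetectOverflow(rows []eventRow, maxEntries int) ([]eventRow, bool) {
		if len(rows) > maxEntries {
			return rows[:maxEntries], true // overflow: report dropped events
		}
		return rows, false
	}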


pkg/server/server_systemlog_logsink_test.go, line 82 at r1 (raw file):

	// Scan all entries.
	_ = sqlDB.Exec(t, "set cluster setting server.eventlogsink.max_entries=-1")

Cluster settings propagate asynchronously. You need to add some wait after each setting update, otherwise your test will be flaky.
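
One way to implement such a wait, sketched here with CockroachDB's testutils.SucceedsSoon retry helper (the helper below and its caller are illustrative, not from the diff):

	package server_test

	import (
		"testing"

		"github.com/cockroachdb/cockroach/pkg/testutils"
		"github.com/cockroachdb/errors"
	)

	// waitForSettingInt64 retries until the in-memory value of a setting
	// matches the value just written via SET CLUSTER SETTING, so the test
	// does not race with asynchronous settings propagation.
	func waitForSettingInt64(t *testing.T, get func() int64, want int64) {
		t.Helper()
		testutils.SucceedsSoon(t, func() error {
			if have := get(); have != want {
				return errors.Errorf("setting not propagated: have %d, want %d", have, want)
			}
			return nil
		})
	}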

Fixes cockroachdb#45643

Cockroach server logs important system events into the eventlog table.
These events are exposed on the web UI. However, operators often
want to see those global events while tailing a log file on a single
node.

Implement a mechanism for the server running on each node
to emit those system events into the server log file.

If system log scanning is enabled (via the server.eventlogsink.enabled
setting), each node scans the system log table periodically, at the
interval given by the server.eventlogsink.period setting.

For example, below is a single system event emitted to the regular log file:
  I200323 .... [n1] system.eventlog:n=1:'set_cluster_setting':2020-03-23 19:24:29.948279
    +0000 UTC '{"SettingName":"server.eventlogsink.max_entries","Value":"101","User":"root"}'

There is no guarantee that all events from the system log will be emitted.
In particular, upon node restart, we only emit events that were generated
from that point on. Also, if for whatever reason we start emitting
too many system log messages, then only up to the
server.eventlogsink.max_entries (default 100) most recent events will be emitted.
If we think we have "dropped" some events due to configuration settings,
we will indicate so in the log.

Administrators may choose to restrict the set of events emitted
by changing the server.eventlogsink.include_events and/or
server.eventlogsink.exclude_events settings. These settings specify
regular expressions used to include or exclude events with matching
event types.
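
The filtering semantics implied by this description might look like the sketch below; the precedence between the two settings is an assumption, not taken from the diff:

	package main

	import "regexp"

	// shouldEmit reports whether an event type passes the filters: it must
	// match include_events (when set) and must not match exclude_events
	// (when set).
	func shouldEmit(eventType string, include, exclude *regexp.Regexp) bool {
		if include != nil && !include.MatchString(eventType) {
			return false
		}
		if exclude != nil && exclude.MatchString(eventType) {
			return false
		}
		return true
	}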

Release notes (feature): Log system-wide events into the cockroach.log
file on every node.

This feature allows an administrator logged in to one of the
nodes to monitor that node's log file and see important "system" events,
such as table/index creation, schema change jobs, etc.

To use this feature, the server.eventlogsink.enabled setting must
be set to true.
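
As a usage illustration, the setting can be toggled from any SQL session; below is a hedged Go sketch using database/sql and the lib/pq driver (the connection string is a placeholder for a local insecure node):

	package main

	import (
		"database/sql"
		"log"

		_ "github.com/lib/pq" // CockroachDB speaks the Postgres wire protocol
	)

	func main() {
		db, err := sql.Open("postgres",
			"postgresql://root@localhost:26257/?sslmode=disable")
		if err != nil {
			log.Fatal(err)
		}
		defer db.Close()
		// Setting name comes from the commit message above.
		if _, err := db.Exec(
			`SET CLUSTER SETTING server.eventlogsink.enabled = true`); err != nil {
			log.Fatal(err)
		}
	}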
@miretskiy miretskiy (Contributor, Author) left a comment:

Commit message and release note updated.

Re your other concerns: yes, there are limitations, though for the system events those limitations are not
super terrible -- after all, you can always query the syslog table if you suspect that some events have been missed
(and of course, I do try to emit a warning message if some events were missed).

Regarding changefeeds -- I tried that. There are a few things currently in the way:

  1. you cannot start a changefeed on system tables (see the sketch after this list)
  2. you cannot start a changefeed on tables with more than 1 locality group.
    It's pretty strange to me, but the syslog table has 5(!) locality groups, for no reason that I can see.
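
For concreteness, the statement such a changefeed would use is sketched below; per point 1 above, it was rejected on system tables at the time (the sink URL is a placeholder):

	package main

	// The changefeed route described above, as one would attempt it.
	const changefeedOnEventLog = `
	CREATE CHANGEFEED FOR TABLE system.eventlog INTO 'kafka://host:9092'`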

All of those limitations will likely be solved in time, and it would have been my preference to use a changefeed.
This quick-and-dirty implementation is just that -- a quick and dirty implementation that can show these log messages right now.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz)


pkg/server/server_systemlog_logsink.go, line 64 at r1 (raw file):

Previously, knz (kena) wrote…

If there are 10 nodes, 100ms will cause 1000 kv requests per second to that range. That's a hot spot.
Meanwhile, 10 nodes is actually pretty common (3 DCs x 3 nodes per DC).

I would move for a minimum of 1 second.

Done.


pkg/server/server_systemlog_logsink.go, line 148 at r1 (raw file):

Previously, knz (kena) wrote…

Regarding the limit. Your code here wants to detect when the query has hit the limit.
Suppose the caller passed the value maxEntries == 10.

The way you implemented this currently does not distinguish between:
a) there are exactly 10 entries in the event log, and the overflow is not reached;
b) there were more than 10 entries in the event log, so an overflow is reached and should be reported.

My point is that if you were to pass maxEntries+1 in the SELECT queries, then you can check if len(rows) > maxEntries here and get a more accurate test.

Good point. Done


pkg/server/server_systemlog_logsink_test.go, line 82 at r1 (raw file):

Previously, knz (kena) wrote…

Cluster settings propagate asynchronously. You need to add some wait after each setting update, otherwise your test will be flaky.

Hmm... I ran the test 1000 times -- 0 flakes.
Regardless, it's a good point -- I changed it to set the settings directly.

@knz knz (Contributor) left a comment:

Code changes LGTM.
However, regarding the discussion about desirability, please look at my last comment here:

#45643 (comment)

I'd like to kindly request that you discuss this again with your team in light of Peter's latest input, and only then decide where you want to go with this.

Reviewed 3 of 3 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

@miretskiy miretskiy closed this Apr 1, 2020
@miretskiy miretskiy deleted the eventlog branch April 27, 2020 14:21