bulkio: provide visibility for events in debug logs #45643
Comments
Interesting. We'd essentially need to... poll the eventlog table and log changes? I was thinking about this from a slightly different angle recently, which was that when bulk jobs schedule workers (distsql flow procs), those workers should log at startup and defer a log at exit, so you'd have an idea of why a given node was running a given piece of code.
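The "log at startup and defer a log at exit" idea could look roughly like the sketch below. It is only an illustration: `runWorker`, its parameters, and the message formats are hypothetical names chosen for this example and are not taken from the CockroachDB code base.

```go
package main

import (
	"errors"
	"log"
)

// runWorker is a hypothetical stand-in for a worker scheduled by a bulk job.
// It logs once when it starts and once, via defer, when it exits, so the
// node's log shows why this node was running this piece of code.
func runWorker(
	logf func(format string, args ...interface{}),
	nodeID int,
	jobID int64,
	work func() error,
) (err error) {
	logf("n%d: starting worker for job %d", nodeID, jobID)
	// The deferred closure runs after work() has set the named result err,
	// so the exit line reports the final outcome.
	defer func() {
		logf("n%d: worker for job %d exiting, err=%v", nodeID, jobID, err)
	}()
	return work()
}

func main() {
	// The node and job IDs are made up for the example.
	_ = runWorker(log.Printf, 1, 1001, func() error { return nil })
	_ = runWorker(log.Printf, 2, 1001, func() error {
		return errors.New("simulated failure")
	})
}
```

Using a named result together with `defer` means the exit line is written on every return path, including error returns.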
That's the obvious implementation. Perhaps we can do something more clever. I'm sure @ajwerner would suggest something with range feeds.
Indeed I would.
Can we also consider a network log sink - have our logs push to a network service.
Definitely. Is there a standard one to use?
There are a couple actually. Aaron and I were talking about making that part of the security roadmap, since there's a "problems to solve" section already for this kind of work.
(My technical proposal would be to start an experiment using syslog - which has its own standard protocol and distributed network sinks as plug-ins - and see where that brings us.)
Fixes cockroachdb#45643

The Cockroach server logs important system events into the eventlog table. These events are exposed on the web UI. However, operators often want to see those global events while tailing a log file on a single node.

Implement a mechanism for the server running on each node to emit those system events into the server log file.

Release notes (feature): Log system wide events into cockroach.log file on every node.
Fixes cockroachdb#45643

The Cockroach server logs important system events into the eventlog table. These events are exposed on the web UI. However, operators often want to see those global events while tailing a log file on a single node.

Implement a mechanism for the server running on each node to emit those system events into the server log file.

If system log scanning is enabled (via the server.eventlogsink.enabled setting), then each node scans the system log table periodically, once every server.eventlogsink.period.

For example, below is a single system event emitted to the regular log file:

I200323 .... [n1] system.eventlog:n=1:'set_cluster_setting':2020-03-23 19:24:29.948279 +0000 UTC '{"SettingName":"server.eventlogsink.max_entries","Value":"101","User":"root"}'

There is no guarantee that all events from the system log will be emitted. In particular, upon node restart, we only emit events that were generated from that point on. Also, if for whatever reason we start emitting too many system log messages, then only up to server.eventlogsink.max_entries (default 100) recent events will be emitted. However, if we think we have "dropped" some events due to configuration settings, we will indicate so in the log.

Release notes (feature): Log system wide events into cockroach.log file on every node.
Fixes cockroachdb#45643

The Cockroach server logs important system events into the eventlog table. These events are exposed on the web UI. However, operators often want to see those global events while tailing a log file on a single node.

Implement a mechanism for the server running on each node to emit those system events into the server log file.

If system log scanning is enabled (via the server.eventlogsink.enabled setting), then each node scans the system log table periodically, once every server.eventlogsink.period.

For example, below is a single system event emitted to the regular log file:

I200323 .... [n1] system.eventlog:n=1:'set_cluster_setting':2020-03-23 19:24:29.948279 +0000 UTC '{"SettingName":"server.eventlogsink.max_entries","Value":"101","User":"root"}'

There is no guarantee that all events from the system log will be emitted. In particular, upon node restart, we only emit events that were generated from that point on. Also, if for whatever reason we start emitting too many system log messages, then only up to server.eventlogsink.max_entries (default 100) recent events will be emitted. If we think we have "dropped" some events due to configuration settings, we will indicate so in the log.

Administrators may choose to restrict the set of events emitted by changing the server.eventlogsink.include_events and/or server.eventlogsink.exclude_events settings. These settings specify regular expressions to include or exclude events with matching event types.

Release notes (feature): Log system wide events into cockroach.log file on every node. This feature allows an administrator logged in to one of the nodes to monitor that node's log file and see important "system" events, such as table/index creation, schema change jobs, etc. To use this feature, the server.eventlogsink.enabled setting needs to be set to true.
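The filtering and capping behaviour described in the commit message above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the change itself: the `Event` type, `compileOptional`, and `selectEvents` are hypothetical names, events are assumed to arrive ordered oldest first, and an empty pattern is assumed to mean "no filter".

```go
package main

import (
	"fmt"
	"regexp"
)

// Event is a hypothetical, simplified stand-in for one event log row.
type Event struct {
	Type string
	Info string
}

// compileOptional treats an empty pattern as "no filter" (an assumption).
func compileOptional(pattern string) (*regexp.Regexp, error) {
	if pattern == "" {
		return nil, nil
	}
	return regexp.Compile(pattern)
}

// selectEvents keeps events whose type matches includePattern and does not
// match excludePattern, then keeps only the most recent maxEntries of them.
// It returns the kept events and how many were dropped because of the cap.
func selectEvents(
	events []Event, includePattern, excludePattern string, maxEntries int,
) ([]Event, int, error) {
	include, err := compileOptional(includePattern)
	if err != nil {
		return nil, 0, err
	}
	exclude, err := compileOptional(excludePattern)
	if err != nil {
		return nil, 0, err
	}
	var kept []Event
	for _, ev := range events {
		if include != nil && !include.MatchString(ev.Type) {
			continue
		}
		if exclude != nil && exclude.MatchString(ev.Type) {
			continue
		}
		kept = append(kept, ev)
	}
	dropped := 0
	if maxEntries >= 0 && len(kept) > maxEntries {
		dropped = len(kept) - maxEntries
		kept = kept[dropped:] // drop the oldest entries, keep the newest
	}
	return kept, dropped, nil
}

func main() {
	events := []Event{
		{Type: "create_table", Info: "t1"},
		{Type: "set_cluster_setting", Info: "server.eventlogsink.max_entries"},
		{Type: "create_index", Info: "t1_idx"},
		{Type: "drop_table", Info: "t0"},
	}
	kept, dropped, err := selectEvents(events, `^(create|drop)_`, `_index$`, 1)
	if err != nil {
		fmt.Println("bad pattern:", err)
		return
	}
	for _, ev := range kept {
		fmt.Printf("system.eventlog: %q %s\n", ev.Type, ev.Info)
	}
	if dropped > 0 {
		// Mirrors the idea of indicating dropped events in the log.
		fmt.Printf("system.eventlog: dropped %d older event(s)\n", dropped)
	}
}
```

Reporting the dropped count alongside the kept events lets the caller write a single "dropped" notice to the log, so a reader tailing the file knows the output is incomplete.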
I have discussed this with @petermattis today.
Customer request: provide visibility for cluster level events in the logs. We're already recording these cluster level events in the system.events table, and exposing them in the UI, but the customer wants to tail logs. Their ideal is for the system.events rows to be replicated in every node's logs in case one or more of the nodes are down.

This doesn't really fall under any team's ownership, but I'm taking Bulk I/O because jobs are one of the more frequent creators of events.