Logging changes to support Beats Elasticsearch module (including logging to JSON) #32850
Pinging @elastic/es-core-infra
Pinging @ruflin; let me know if I missed anything!
In some scenarios, when an exception occurs, a very large volume of logs is printed, tens of MB per second. It would help if this could be optimized so that repeated exceptions of the same type are suppressed and only a summary is printed.
The large number of exception log writes seriously affects disk write performance.
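Not something decided in this thread, but as an illustration of built-in throttling: Log4j2 ships a BurstFilter that can rate-limit this kind of flood. A minimal sketch of how it could be attached to the rolling file appender in log4j2.properties (appender name and thresholds are assumptions, and events above the configured level always pass through):

```properties
# Hypothetical example: throttle bursts of log events on the rolling appender.
# Events at or below WARN are limited to ~20/second on average, with a queue of
# at most 100 burst events; anything beyond that is dropped. ERROR always passes.
appender.rolling.filter.burst.type = BurstFilter
appender.rolling.filter.burst.level = WARN
appender.rolling.filter.burst.rate = 20
appender.rolling.filter.burst.maxBurst = 100
```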
Now that we have a Logging UI in Kibana, we were hoping to show ES deprecation (and potentially other types of) logs in the UI. The user would discover said logs via the Monitoring UI, filtered down to the context they are in: a cluster, a node within a cluster, etc. In order for this to work correctly, item 2 in the checklist above needs to be implemented:
Any chance this specific item could be prioritized to enable the Monitoring UI use case described above?
We have been meaning to follow up on this requirement, because as written we can’t meet it. We need to understand more how important and how strong this requirement is. It is simply not possible to include the node ID and cluster ID on every log line. We emit log lines before these data are available.
Is this true for all types of logs? For example, would it be possible to emit the node ID and cluster ID on every log line for deprecation logs? Are there any guarantees about when the data becomes available or, conversely, what types of log messages might not have this data in them since it's not available yet?
Yes, it could be true for the deprecation log too. Also note that we are considering combining all the log files into a single log file for all instances, not only for the Docker images.
There aren't guarantees per se. Broadly: all startup log messages up to the point that we recover the node state will not have the node ID, and all startup log messages up to the point that we recover the cluster state will not have the cluster ID (a superset of those without the node ID).
Thanks Jason. Roughly how many log entries are we talking about without cluster UUID or node ID? The reason I ask is: thinking a bit out of the box, would it be silly to re-emit those messages with the IDs in them, once the IDs become available? Just want to consider all possibilities before we say it's okay to "drop" certain messages that don't have IDs.
I cannot bound it because we can always add more in the future, and if debug or trace logging is enabled (or even only for certain components) it can be quite a lot of log messages.
In that case, I'm in favor of doing a best effort here. That is, I'm in favor of ES including the cluster UUID and node ID on every log line as and when that information becomes available. It would be an improvement over what we can do today in the UI with the logs in terms of correlating them with other information about the node or cluster. And we'd still have the un-correlatable logs (but fewer in number now) around in filebeat indices, in case we later think of some way to display these meaningfully to the user.
I opened a draft PR to discuss the technical details of the implementation. Feel free to review/comment. Regarding the format itself, the idea would be to change the log line from:
To:
Before the first ClusterStateUpdate is received, the cluster_id and node_id would not be present in log lines:
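The original before/after examples are not preserved above. Purely as an illustration (all values are made up), a JSON log line carrying the fields later listed in the PR description might look like this:

```json
{"type": "server", "timestamp": "2019-01-07T10:32:01,234+0000", "level": "INFO",
 "component": "o.e.n.Node", "cluster.name": "es-docs-cluster", "node.name": "node-1",
 "cluster.uuid": "4kwJFjVbTXWo-WiLDS80Qg", "node.id": "zN1S5OjFTd-2p-kkEP_wlQ",
 "message": "started"}
```

and an early startup line, before the first cluster state update, would simply omit `cluster.uuid` and `node.id`:

```json
{"type": "server", "timestamp": "2019-01-07T10:31:45,012+0000", "level": "INFO",
 "component": "o.e.e.NodeEnvironment", "cluster.name": "es-docs-cluster",
 "node.name": "node-1", "message": "heap size [1gb], compressed ordinary object pointers [true]"}
```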
Or can they be null? Like:

In order to support multiple output streams from Docker we will have a
I am still not sure how to format stacktraces in the log lines. Should they also contain the fields mentioned in the first post by Jason?
Also, should we allow users to change the pattern we create now in a JSON form?
Hi @pgomulka this is looking great!
You mean because they are multiline? If so, what about escaping the newlines in the
I have a request for one more logging change to support the Beats Elasticsearch module. Please let me know if I should add it as a checklist item in this issue's description OR if you'd prefer that I make a new issue OR if this comment is sufficient. Currently we see lines like these in the Elasticsearch logs:
Notice the durations in these lines,
@ycombinator regarding this duration units inconsistency, I think that might be a bug. Can you please raise a separate issue?
@ycombinator
To make a stacktrace human readable in JSON format, I have enclosed each line of the stacktrace in a separate array element.
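Illustration only (not the actual PR output, all values made up): with that approach an error line stays on a single line while the stacktrace remains readable as an array, one element per frame:

```json
{"type": "server", "timestamp": "2019-01-07T10:35:12,345+0000", "level": "ERROR",
 "component": "o.e.b.Bootstrap", "cluster.name": "es-docs-cluster", "node.name": "node-1",
 "message": "node validation exception",
 "stacktrace": ["org.elasticsearch.bootstrap.BootstrapException: bootstrap checks failed",
   "at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java) ~[elasticsearch]",
   "at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java) ~[elasticsearch]"]}
```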
Also, I am wondering if we should add a new schema to ECS. We are, after all, integrating Beats with ES.
Another problem to consider is how we should handle exception messages written directly to the console. That is not a problem when running Elasticsearch from a binary - the log files will be parsed - but it might be an issue when running from Docker. For instance, if Elasticsearch fails bootstrap checks, we write the exception message directly to org.elasticsearch.cli.Terminal.
In Bootstrap.java we remove the ConsoleAppender, expecting this to be printed from the Terminal again later,
but since Docker is only using the ConsoleAppender, it will miss the message in JSON format and will later only print out the non-JSON message on a Terminal.
Logs from docker:
Logs from log file:
Should we aim to format Terminal messages as well?
Hey folks, I'm hoping to get some traction on this issue. We're interested in pushing forward the deprecation-logs-in-the-Monitoring-UI initiative, and we've discussed what we need for it.
Any update here?
@chrisronline the PR #36833 with the ES changes is on track to be merged into 7.0. It will have cluster.uuid but, as discussed earlier, only once ES has started up. See the comment - you can see sample messages there.
@pgomulka Thanks! That's helpful for sure! We're hoping to get our side done before 7.0 so users can take advantage of the UI for the 7.0 upgrade. Is it possible to tackle this work in phases, where the final phase is the PR you have linked, but we can have an initial phase where we add the
@chrisronline in my view it would be better to push that all together. The majority of the work is related to propagating cluster.uuid and node.id, and testing. Most of the tests I wrote already assume the JSON format, so it would actually be more work now to isolate that change. @danielmitterdorfer @nik9000 any views on this?
Just wanted to point out that the audit log is not following this convention currently: plaintext logs are being sent to
I am a bit skeptical about this for the following reasons:
For the record, with the introduction of JSON logging, at least the Docker use case looks really nice and easy to read, especially piping it to
One concern though is the possible amount of frustration for new users who have not prepared themselves for this switch and are frantically looking for how to change back to the normal log output. I chatted with @pgomulka earlier and suggested whether it makes sense to ship the
For example: Suggestion for log4j2.properties (docker)
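The suggested snippet itself is not preserved above. Purely as a sketch of the idea (appender names, the ESJsonLayout usage, and the pattern are assumptions based on later comments), a Docker log4j2.properties could ship both console appenders with one of them commented out, so switching back is an uncomment-and-restart rather than a documentation hunt:

```properties
# Active by default: JSON console output for docker
appender.console.type = Console
appender.console.name = console
appender.console.layout.type = ESJsonLayout
appender.console.layout.type_name = console

# Commented-out alternative: old plaintext pattern, uncomment to switch back
#appender.console.type = Console
#appender.console.name = console
#appender.console.layout.type = PatternLayout
#appender.console.layout.pattern = [%d{ISO8601}][%-5p][%-25c{1.}] [%node_name]%marker %m%n

rootLogger.level = info
rootLogger.appenderRef.console.ref = console
```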
WDYT?
Un-commenting the section requires a node restart. In the case of the audit log, we went with both formats enabled out-of-the-box, with two independent loggers, and we strongly advised disabling the logger for the old format by using the cluster settings API.
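For readers unfamiliar with that mechanism: log levels can be changed dynamically through the `logger.*` cluster settings, so the old-format audit logger can be silenced without a restart. The logger name below is a placeholder for illustration, not the real audit logger class:

```
PUT /_cluster/settings
{
  "persistent": {
    "logger.org.example.security.audit.OldFormatAuditTrail": "off"
  }
}
```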
This is correct.
We named the new audit log
The JSON format will be the default one in the long run.
We can consider this; I am not convinced that it's a change that we need to make.
I think it's worth mentioning, on top of @pgomulka's notes, that going forward, logs on the console for the Docker image will be in
If we go down the path of keeping both flavors of logs - plaintext and JSON - I'd like to know what globs/extensions each flavor would use. That way Filebeat can look in the right places and know how to parse what it sees.
@ycombinator we will produce the following log files with the new JSON format:
and the following log files with the old pattern (not containing the newly added IDs like cluster.uuid):
This is perfect, thanks @pgomulka!
@pgomulka Looking closer at the log file names, I have a couple of questions:
Thanks!
I do not think we should change the name of the existing logs; that is a breaking change with little benefit to us.
I have renamed the
I am however wondering if audit logs should not be in
GC logs stay untouched.
There is no direct benefit for Elasticsearch, but I think we should change this for Filebeat to make collection easier. Right now on the Filebeat side, we define what a server log is by excluding all the other log types:
If a new log type is added, this config breaks, which I don't think it should. Having
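To make that concrete (paths and file names here are illustrative, not the actual module config): today the server fileset effectively has to enumerate everything it is not, whereas a predictable name or suffix would let it use a single include glob:

```yaml
# Illustrative only - exclusion-based matching breaks when a new log type appears
- type: log
  paths:
    - /var/log/elasticsearch/*.log
  exclude_files:
    - '_deprecation\.log$'
    - '_index_search_slowlog\.log$'
    - '_index_indexing_slowlog\.log$'
    - '_audit\.log$'

# Illustrative only - a predictable suffix (assumed here to be _server.json)
# needs just one include pattern
- type: log
  paths:
    - /var/log/elasticsearch/*_server.json
```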
I just had a sync-up with @pgomulka. Here are the conclusions:
@ycombinator The last point has some effects on the Filebeat module. For Filebeat 7 I think the default should be to only have the JSON logs in the path patterns. If someone wants to ingest the old logs from 7, some manual config changes for the paths are needed by the user. We need docs for this. @pgomulka Is there a way that the Elasticsearch user can disable the "double" logging and only have JSON logs?
@ruflin Users are free to opt out from either log by deleting the unwanted appenders from
We are aware of the additional log size, but during the core-infra meeting we agreed that this is not a major concern. We can reduce the log size if needed by changing the rolling parameter
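As an illustration of that knob (names and values are typical log4j2 rollover settings and should be treated as assumptions, not the exact shipped config): the DefaultRolloverStrategy's Delete action caps how much rolled-over log is kept, so lowering the size condition shrinks the on-disk footprint:

```properties
# Hypothetical excerpt: delete rolled-over server logs once they accumulate past 1GB
appender.rolling.strategy.type = DefaultRolloverStrategy
appender.rolling.strategy.action.type = Delete
appender.rolling.strategy.action.basepath = ${sys:es.logs.base_path}
appender.rolling.strategy.action.condition.type = IfFileName
appender.rolling.strategy.action.condition.glob = ${sys:es.logs.cluster_name}-*
appender.rolling.strategy.action.condition.nested_condition.type = IfAccumulatedFileSize
appender.rolling.strategy.action.condition.nested_condition.exceeds = 1GB
```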
In order to support the JSON log format, a custom pattern layout was used and its configuration is enclosed in ESJsonLayout. Users are free to use their own patterns, but if smooth Beats integration is needed, they should use ESJsonLayout. EvilLoggerTests are left intact to make sure users' custom log patterns work fine.
To populate the additional fields node.id and cluster.uuid, which are not available at start time, a cluster state update has to be received and the values passed to a log4j pattern converter. A ClusterStateObserver.Listener is used to receive only one ClusterStateUpdate. Once the update is received, the node ID and cluster UUID are set in a static field in NodeAndClusterIdConverter.
The following fields are expected in JSON log lines: type, timestamp, level, component, cluster.name, node.name, node.id, cluster.uuid, message, stacktrace. See ESJsonLayout.java for more details and field descriptions.
The Docker log4j2 configuration is now almost the same as the one used for the ES binary. The only difference is that Docker uses console appenders, whereas ES uses file appenders.
relates: #32850
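A rough sketch of the converter idea described above (not the actual PR code; the class shape and method names are simplified assumptions):

```java
import org.apache.logging.log4j.core.LogEvent;
import org.apache.logging.log4j.core.config.plugins.Plugin;
import org.apache.logging.log4j.core.pattern.ConverterKeys;
import org.apache.logging.log4j.core.pattern.LogEventPatternConverter;
import org.apache.logging.log4j.core.pattern.PatternConverter;
import org.apache.lucene.util.SetOnce;

/**
 * Sketch of a pattern converter that appends node.id and cluster.uuid to a log
 * line once they are known. Before the first cluster state update it emits
 * nothing, so those fields are simply absent from early startup lines.
 */
@Plugin(category = PatternConverter.CATEGORY, name = "NodeAndClusterIdConverter")
@ConverterKeys({"node_and_cluster_id"})
public final class NodeAndClusterIdConverter extends LogEventPatternConverter {

    // written once by the one-shot ClusterStateObserver.Listener, read on every event;
    // a second set(...) throws SetOnce.AlreadySetException (hence the MockNode override
    // mentioned in the follow-up change)
    private static final SetOnce<String> NODE_AND_CLUSTER_ID = new SetOnce<>();

    // factory method looked up by log4j2 for pattern converter plugins
    public static NodeAndClusterIdConverter newInstance(final String[] options) {
        return new NodeAndClusterIdConverter();
    }

    private NodeAndClusterIdConverter() {
        super("node_and_cluster_id", "node_and_cluster_id");
    }

    /** Called once the cluster state with the local node id and cluster uuid is observed. */
    public static void setNodeIdAndClusterId(final String nodeId, final String clusterUuid) {
        NODE_AND_CLUSTER_ID.set("\"cluster.uuid\": \"" + clusterUuid + "\", \"node.id\": \"" + nodeId + "\"");
    }

    @Override
    public void format(final LogEvent event, final StringBuilder toAppendTo) {
        final String ids = NODE_AND_CLUSTER_ID.get();
        if (ids != null) {
            toAppendTo.append(ids);
        }
    }
}
```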
In order to keep the JSON logs consistent, the security audit logs are renamed from .log to .json. relates #32850
When tests extending ESIntegTestCase are run on the same JVM, the static field in NodeAndClusterIdConverter will throw an AlreadySet exception. Overriding the configuration method from Node.configureNodeAndClusterIdStateListener in the MockNode will prevent the listener registration from happening. relates #32850
Since the name of the audit log file was changed, it should be documented in a migration doc. relates elastic#32850
Migration documentation for the audit logging changes: removal of the plaintext logs and rename of the JSON log file. relates #32850
This relates to the effort in Beats (filebeat) to build a module for Elasticsearch. To support this effort, we want to make some changes to the logs in Elasticsearch including removing some existing pain points. The main highlight of these changes will be shipping with JSON logging.
- Rename the main log file to `server.log`. Today the main log file is `<clustername>.log` and this creates complexity for Filebeat; it would be simpler for them if they could rely on an include pattern based on a predictable log file name (another possibility here is `server-<clustername>.log`). Note that this will be a breaking change.
- Log in JSON with structured fields (`timestamp`, `log_level`, `component`, `node_name`, `index_name`, `shard_id`, `message`).