Metadata overwrites event data #7959
Comments
There are a few fields in Beats that are "special" as you found out. The main problem with your data is that it would conflict with our template, meaning the template defines It would say it is ok if we allow to overwrite
The thing here is that I'm sending the output to Kafka, so the template basically doesn't come into play in my case, and I could afford to have this mismatch. What do you think about a flag If I'm not mistaken, the new flag should alter the behaviour of the default branch in https://github.com/elastic/beats/blob/6.4/libbeat/common/mapstr.go#L74 and https://github.com/elastic/beats/blob/6.4/libbeat/common/mapstr.go#L93. If you're ok with that, I can prepare a PR for the change and we can discuss it there. What do you think, @ruflin?
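To make the proposal concrete, here is a minimal Python sketch of a recursive map merge with an opt-out for overwriting. It is not the Go code in `mapstr.go`; the function name `deep_update`, the `overwrite` parameter, and the sample host names are all hypothetical and only illustrate the behaviour such a flag could toggle.

```python
import copy
from typing import Any, Dict


def deep_update(dst: Dict[str, Any], src: Dict[str, Any], overwrite: bool = True) -> None:
    """Recursively merge src into dst in place.

    Nested dicts are merged key by key. For any other value, a key that
    already exists in dst is replaced only when overwrite is True.
    """
    for key, value in src.items():
        existing = dst.get(key)
        if isinstance(value, dict) and isinstance(existing, dict):
            # Both sides are maps: descend instead of replacing wholesale.
            deep_update(existing, value, overwrite)
        elif key not in dst or overwrite:
            # Copy so later changes to src do not leak into dst.
            dst[key] = copy.deepcopy(value)


# Illustrative data only: an event whose own "host" is a plain string,
# and metadata that wants to set "host" to an object.
event = {"host": "nginx-frontend-01", "message": "GET /"}
metadata = {"host": {"name": "filebeat-box"}, "beat": {"version": "6.4.0"}}

kept = copy.deepcopy(event)
deep_update(kept, metadata, overwrite=False)      # event data wins

replaced = copy.deepcopy(event)
deep_update(replaced, metadata, overwrite=True)   # metadata wins
```

With `overwrite=False` the event keeps its own `host` value and still gains the `beat` fields; with `overwrite=True` the metadata replaces it, which is the behaviour reported in this issue.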
What about putting your
This is a workaround I tested, basically (for those interested):

    - type: log
      enabled: true
      paths:
        - /var/log/nginx/json-access.log
      fields:
        beat.type: nginx_access
        beat._transform: json
      fields_under_root: true

I don't process json in the input, but I flag the events with and in filebeat.yml I'm adding a processor that converts to json if

    processors:
      - decode_json_fields:
          fields:
            - message
          target: ""
          overwrite_keys: true
          when:
            equals:
              beat._transform: json
      - drop_fields:
          fields:
            - beat._transform
            - message
          when:
            has_fields:
              - beat._transform

This works because global processors are applied last, so the
Logstash is one of several systems that pulls the data, that's why I'm so adamant of changing the events. I'll stay in 6.2.4 for now until I can update each of the components to use the
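For readers who want to see why the workaround above restores the event's own fields, here is a rough Python simulation of the two global processors. It is only a sketch: the function names mirror the processor names but are not Filebeat APIs, the event is modelled as a flat dict with the dotted key `beat._transform` (a simplification), and the sample values are made up.

```python
import json
from typing import Any, Dict, List

Event = Dict[str, Any]


def decode_json_fields(event: Event, field: str = "message") -> Event:
    """Decode event[field] as JSON into the event root, overwriting keys.

    Roughly models target: "" together with overwrite_keys: true.
    """
    try:
        decoded = json.loads(event[field])
    except (KeyError, TypeError, ValueError):
        return event  # missing field or invalid JSON: leave the event alone
    if isinstance(decoded, dict):
        event.update(decoded)
    return event


def drop_fields(event: Event, fields: List[str]) -> Event:
    """Remove the listed keys if they are present."""
    for name in fields:
        event.pop(name, None)
    return event


def global_processors(event: Event) -> Event:
    """Apply both processors with the same conditions as the config."""
    if event.get("beat._transform") == "json":      # when.equals
        decode_json_fields(event)
    if "beat._transform" in event:                  # when.has_fields
        drop_fields(event, ["beat._transform", "message"])
    return event


# State of the event after metadata was added but before global processors:
# the raw JSON line is still in "message" and "host" holds the metadata.
event = {
    "message": '{"host": "nginx-frontend-01", "status": 200}',
    "host": {"name": "filebeat-box"},
    "beat._transform": "json",
}
result = global_processors(event)
```

Because the decoding happens after the metadata has been added, the `host` value from the log line replaces the metadata value, and the helper flag and the raw `message` are dropped afterwards.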
Thanks for sharing the workaround. Are there some cases where your above workaround will not work?
For my use case it works perfectly; it just pollutes the config a little and makes it a bit more "complicated", but it does the job. Also, in terms of performance it doesn't seem to have an effect, as the json decoding & dropping I think it's ok to close this issue, thanks @ruflin for your help!
Let's close this issue for now. We can still reopen in case of other issues popping up. Also it's good to have the issue here for other people stumbling over the same issue.
Prepare a log file with:
Filebeat input configuration:
Resulting output:
The problem here is that host was overwritten by host.name from this PR: #7051, and the documentation states that: https://www.elastic.co/guide/en/beats/filebeat/master/filebeat-input-log.html#filebeat-input-log-config-json
I've been digging a little around the code and the issue seems to be in the order in which these actions are performed: https://github.com/tsg/beats/blob/6.3/libbeat/publisher/pipeline/processor.go#L32-L41
Beat and host metadata are added at the end, right before the global processors. Maybe they should be added before the client processors, or the step that adds them should avoid overwriting fields that are already in the event. The purpose was to prevent any processor from removing Beats data (#5149 (comment)), but it has the undesired effect of overwriting event data.
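The effect of the ordering can be shown with a toy pipeline in Python. This is a sketch under assumptions, not a model of `processor.go`: the stage names `apply_input_json` and `add_host_metadata` are hypothetical, and the field values are invented for illustration.

```python
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]
Stage = Callable[[Event], Event]


def apply_input_json(event: Event) -> Event:
    """Stand-in for the input's JSON decoding: sets the event's own fields."""
    event.update({"host": "nginx-frontend-01", "status": 200})
    return event


def add_host_metadata(event: Event) -> Event:
    """Stand-in for the metadata step: unconditionally sets host."""
    event["host"] = {"name": "filebeat-box"}
    return event


def run(stages: List[Stage]) -> Event:
    """Run the stages in order on a fresh, empty event."""
    event: Event = {}
    for stage in stages:
        event = stage(event)
    return event


# Order described in the issue: metadata is added after the event fields.
current_order = run([apply_input_json, add_host_metadata])

# First alternative: add the metadata earlier so event data is applied last.
proposed_order = run([add_host_metadata, apply_input_json])
```

In `current_order` the metadata value ends up in `host`; in `proposed_order` the value from the log line survives. The second alternative, not overwriting existing keys, would give the same result without reordering.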
This was also discussed here: https://discuss.elastic.co/t/logstash-errors-after-upgrading-to-filebeat-6-3-0/135984/19
Btw, if we agree on one of the two alternatives (add the beat fields a few steps earlier, or avoid overwriting), I can work on a PR.