Dissect tag on parsing error #8751
Previously, when a parsing error occurred, the event was returned untouched and an error was logged; if you didn't look at your logs, you had no idea that the tokenizer was not able to match your string. Instead, when a parsing error occurs in the Dissect processor, we will now add a tag named 'dissect_parsing_error' to the 'log.flags' field. With that information, you are now able to reprocess your data or filter on it in the UI. Fixes: elastic#8123
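To make the described behavior concrete, here is a minimal sketch of appending a flag to a dotted field such as 'log.flags'. It is an illustration under stated assumptions, not the code from this PR: the event is modeled as a plain nested map[string]interface{} rather than a libbeat type, and appendFlag is a hypothetical helper name.

```go
package main

import "strings"

// Names taken from the PR description; the constants themselves are
// illustrative and not copied from the libbeat source.
const (
	flagField        = "log.flags"
	flagParsingError = "dissect_parsing_error"
)

// appendFlag is a hypothetical helper (not the libbeat API). It walks a
// dotted key through nested maps, creating intermediate maps as needed
// (a non-map value found on the way is replaced), and appends flag to the
// string slice stored at the final key unless it is already present.
func appendFlag(event map[string]interface{}, key, flag string) {
	parts := strings.Split(key, ".")
	node := event
	for _, p := range parts[:len(parts)-1] {
		child, ok := node[p].(map[string]interface{})
		if !ok {
			child = map[string]interface{}{}
			node[p] = child
		}
		node = child
	}
	last := parts[len(parts)-1]
	// A missing or non-[]string value is treated as an empty flag list.
	flags, _ := node[last].([]string)
	for _, f := range flags {
		if f == flag {
			return // already flagged; keep the list free of duplicates
		}
	}
	node[last] = append(flags, flag)
}
```

A processor following this idea would call appendFlag(event, flagField, flagParsingError) when the tokenizer fails to match and then return the event otherwise unchanged, so the failure is visible on the event itself as well as in the logs.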
Let's be consistent and call it flag everywhere in the code and docs.
@@ -27,6 +27,8 @@ import (
	"github.com/elastic/beats/libbeat/processors"
)

const tagParsingError = "dissect_parsing_error"
Can we also call it flag?
@@ -176,3 +176,59 @@ func TestFieldAlreadyExist(t *testing.T) {
		})
	}
}

func TestErrorTagging(t *testing.T) {
Flag
@ruflin Now with more flags (tm) :)
@@ -24,6 +24,9 @@ import (
	"github.com/elastic/beats/libbeat/common"
)

// FlagField fields used to keep information or errors when events are parsed.
const FlagField = "log.flags"
I'm wondering if dissect should write its flags into log.flags or rather event.flags? The reason is that dissect is not only for logs but more generic.
Should have spotted this earlier.
@webmat I think we need event.flags in the future in ECS.
@ruflin Would it be the same for when an event is truncated?
I did create elastic/ecs#100 a while ago for the log tag field. There is no issue yet for a more generic set of flags.
I agree with @ruflin that the dissect error should not be set on log.flags. event.flags is a bit better. But I think this approach still mixes up pipeline and processing metadata with userland data (like the error discussion we had last week, @ruflin). The following idea hasn't been fleshed out yet, but I've been thinking we should introduce a section that's clearly about stuff that happened in the processing pipeline, e.g. pipeline.error or pipeline.tags (or flags); if someone wants to note down timings of each step in their pipeline, they'd do it under pipeline. as well, etc. However, this will have to come after ECS 1.0/GA, so don't wait on this being defined for what needs to happen in Beats.
In the meantime, what I would suggest instead is to do what we've been doing for years, and add this dissect tag to tags directly, like Logstash does with _grokparsefailure.
And @ph, to answer your more recent question, I would consider the truncation to be userland information about the log itself. So I do think having truncated right on log.flags makes sense.
This is the new field where the multiline tag is also being added, correct? (Sorry, I haven't been following these developments very closely.)
I believe it's the same field, correct.
OK, thanks for confirming. So my opinion for now is that flags that are descriptive of the log itself or the log entry, such as multiline and truncated, should be added to log.flags, as they are now.
Parsing flags like dissect_parsing_error, on the other hand, should be added to tags until we define a more general place to put pipeline errors and details.
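The split proposed in this comment can be sketched as a small routing function. This illustrates the suggestion only, not what the PR merged (the thread goes on to keep log.flags for now). The function name destinationFor and the choice to treat every flag not listed as descriptive as processing metadata are assumptions made for the example.

```go
package main

// Field names as discussed in this thread.
const (
	fieldLogFlags = "log.flags"
	fieldTags     = "tags"
)

// descriptiveFlags lists flags that describe the log entry itself.
// Only the two flags named in the discussion are included.
var descriptiveFlags = map[string]bool{
	"multiline": true,
	"truncated": true,
}

// destinationFor is a hypothetical helper that returns the field a flag
// would be written to under the proposal: descriptive flags stay on
// log.flags, and anything else (for example dissect_parsing_error) is
// treated as processing metadata and goes to tags. Defaulting unknown
// flags to tags is an assumption of this sketch.
func destinationFor(flag string) string {
	if descriptiveFlags[flag] {
		return fieldLogFlags
	}
	return fieldTags
}
```

Under this sketch, destinationFor("truncated") yields log.flags while destinationFor("dissect_parsing_error") yields tags, which mirrors how Logstash records parse failures in tags.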
To not block this PR, let's go with log.flags for now. Let's open a more general discussion about where information from processing should go.
For tags in LS: we should probably also tackle this.
Previously, when a parsing error occurred, the event was returned untouched and an error was logged; if you didn't look at your logs, you had no idea that the tokenizer was not able to match your string. Instead, when a parsing error occurs in the Dissect processor, we will now add a tag named 'dissect_parsing_error' to the 'log.flags' field. With that information, you are now able to reprocess your data or filter on it in the UI. Fixes: elastic#8123 (cherry picked from commit 8dbfed2)