INGEST: Document Processor Conditional #33388
Pinging @elastic/es-core-infra
[source,js]
--------------------------------------------------
{
"bytes": {
Can we use the set processor for this example ?
Sure that seems less confusing :)
then the processor will be executed for the given document otherwise it will be skipped.
The `if` field takes a map with the script settings used defined in <<script-processor, script-options>>
and accesses a read only version of the document via the same `ctx` variable used by scripts in the
<<script-processor>>.
Should we link to the painless doc ?
Do you think it would be useful to also have additional examples (not necessarily the full processor) of how to implement Logstash's `=~`? It seemed to be the most common operator to use for this type of check.
Hmm yea it could make sense to add more examples (I agree, it's way too hard to figure out how to do that in Painless from what we currently have in the docs).
WDYT @rjernst ?
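For reference, a rough sketch of how a Logstash-style regex check might translate. It assumes Painless's find operator (`=~`) with a regex literal, which, as far as I know, needs `script.painless.regex.enabled: true` on the node at the time of this PR. The field names `message` and `severity` and the pattern are hypothetical, picked only for illustration:

```json
{
  "set": {
    "if": "ctx.message != null && ctx.message =~ /error/",
    "field": "severity",
    "value": "high"
  }
}
```

The explicit null check is there so that documents without a `message` field skip the processor instead of failing the script.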
Looks good, my only concern is making sure the example works in docs tests.
@@ -721,12 +721,29 @@ All processors are defined in the following way within a pipeline definition:
// NOTCONSOLE

Each processor defines its own configuration parameters, but all processors have
-the ability to declare `tag` and `on_failure` fields. These fields are optional.
+the ability to declare `tag` ,`on_failure` and `if` fields. These fields are optional.
nit: space before comma should be after

A `tag` is simply a string identifier of the specific instantiation of a certain
processor in a pipeline. The `tag` field does not affect the processor's behavior,
but is very useful for bookkeeping and tracing errors to specific processors.

The `if` field must contain a script that returns a boolean value. If the script evaluates to `true`
then the processor will be executed for the given document otherwise it will be skipped.
The `if` field takes a map with the script settings used defined in <<script-processor, script-options>>
`if` doesn't take a map, right? I think you mean "object" (since we are talking about JSON), and that is optional (the example you have below here has `if` as the script source directly). You would only need an object if you wanted to pass params (unlikely for ingest, I think) or use a different scripting language (not really possible since expressions don't support this, but theoretically a native script could be written as a custom script engine).
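To make the two forms concrete, here is a sketch of the object variant. It assumes the object follows the usual script syntax (`source`, `lang`, `params`); the parameter name `expected` and the field names are hypothetical:

```json
{
  "set": {
    "if": {
      "lang": "painless",
      "source": "ctx.bar == params.expected",
      "params": {
        "expected": "expectedValue"
      }
    },
    "field": "foo",
    "value": "bar"
  }
}
```

When no params or alternate language are needed, the string shorthand `"if": "ctx.bar == 'expectedValue'"` should express the same condition.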
Right .. "object" wins :)
Hmm on that note and kinda off topic but: maybe we should/could enable expressions here the same way we do for bucket selector aggregation (interpret `1.0` as `true`) to become consistent there?
I think it would be difficult and not very useful, since expressions only have access to numerics. Expressions operate on doc values, but ingest does not have doc values, so it would require a lot of hacking.
ok makes sense :)
}
}
--------------------------------------------------
// NOTCONSOLE
Why is this omitted from tests? Can we make an example that will work?
I figured just with the example config we don't have enough to go by for running anything, we'd need to actually index a document against a concrete pipeline here to make a test out of it right?
(that may be a little confusing given we only want an example of the `if` field here? Idk, you decide :))
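One possible shape for a runnable example, sketched under the assumption that the simulate endpoint (`POST _ingest/pipeline/_simulate`) is used, so neither a stored pipeline nor an index is needed. The `set` processor follows the suggestion made earlier in this review; the field names, values and the two sample documents are made up for illustration. The request body might look like:

```json
{
  "pipeline": {
    "description": "hypothetical pipeline exercising the `if` field",
    "processors": [
      {
        "set": {
          "if": "ctx.bar == 'expectedValue'",
          "field": "foo",
          "value": "bar"
        }
      }
    ]
  },
  "docs": [
    { "_source": { "bar": "expectedValue" } },
    { "_source": { "bar": "somethingElse" } }
  ]
}
```

Going by the semantics described in the docs text, only the first document should come back with `foo` set; the second should pass through unchanged because the condition evaluates to `false`.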
IIRC we can have extra setup that is hidden from the generated documentation. I think having the examples always "work" is key to keeping the documentation up to date as apis change.
Ok :) Will look into that tomorrow morning :)
I can't add a test for a snippet like this, it seems. I get this error:
Execution failed for task ':docs:buildRestTests'.
> reference/ingest/ingest-node.asciidoc[737:745](js)// CONSOLE: Didn't match (?:(?<comment>#.+)|(?:(?:(?<method>GET|PUT|POST|HEAD|OPTIONS|DELETE)\s+(?<pathAndQuery>[^\n]+)(?<body>(?:\n(?!GET|PUT|POST|HEAD|OPTIONS|DELETE|startyaml|#)[^\n]+)+)?)|(?:startyaml(?s)(?<yaml>.+?)(?-s)endyaml)))\n+: {
"set": {
"if": "ctx.bar == 'expectedValue'",
"field": "foo",
"value": "bar"
}
}
if I try anything here. => Seems I at least need a snippet that includes an actual request. That would break with the style of the following docs that also just show a quick outline of the configuration (without tests) for each processor?
... but:
@nik9000 Can you provide any suggestions on making the snippet test discussed here work?
@original-brownbear @rjernst - I think a standalone page (in addition to what is here), something like "Conditional Execution", would be beneficial, with specific mentions of the regex requirements, conditional dropping of events, and conditional pipelines. I can take the TODO to create that content (and implement the tests there). Hopefully that will unblock this PR. Thoughts?
@jakelandis I think that (standalone page) would make sense. Putting the example scripts and the how-to for the conditional processor there could be pretty helpful for people trying to convert a LS config (or just their problem) into our code without having to jump between the Painless docs and pipeline docs :)
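As a starting point for that page, a sketch of conditional dropping and conditional pipelines. It assumes the `drop` and `pipeline` processors are available in the target release and that the `pipeline` processor takes the target pipeline in a `name` parameter; the field names and the pipeline name `httpd_pipeline` are hypothetical:

```json
{
  "processors": [
    {
      "drop": {
        "if": "ctx.network_name == 'Guest'"
      }
    },
    {
      "pipeline": {
        "if": "ctx.service_name == 'apache_httpd'",
        "name": "httpd_pipeline"
      }
    }
  ]
}
```

The first entry discards matching documents outright; the second hands matching documents to another pipeline and leaves everything else untouched.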
This looks fine to get in since we need conditional processor docs, but I would still like suggestions from @nik9000 on ways to test this so we don't need NOTCONSOLE.
@rjernst alright thanks! Merged master into this PR since it's been a while => will merge once green then.
* master: (24 commits)
  ingest: better support for conditionals with simulate?verbose (elastic#34155)
  [Rollup] Job deletion should be invoked on the allocated task (elastic#34574)
  [DOCS] .Security index is never auto created (elastic#34589)
  CCR: Requires soft-deletes on the follower (elastic#34725)
  re-enable bwc tests (elastic#34743)
  Empty GetAliases authorization fix (elastic#34444)
  INGEST: Document Processor Conditional (elastic#33388)
  [CCR] Add total fetch time leader stat (elastic#34577)
  SQL: Support pattern against compatible indices (elastic#34718)
  [CCR] Auto follow pattern APIs adjustments (elastic#34518)
  [Test] Remove dead code from ExceptionSerializationTests (elastic#34713)
  A small typo in migration-assistance doc (elastic#34704)
  ingest: processor stats (elastic#34724)
  SQL: Implement IN(value1, value2, ...) expression. (elastic#34581)
  Tests: Add checks to GeoDistanceQueryBuilderTests (elastic#34273)
  INGEST: Rename Pipeline Processor Param. (elastic#34733)
  Core: Move IndexNameExpressionResolver to java time (elastic#34507)
  [DOCS] Force Merge: clarify execution and storage requirements (elastic#33882)
  TESTING.asciidoc fix examples using forbidden annotation (elastic#34515)
  SQL: Implement `CONVERT`, an alternative to `CAST` (elastic#34660)
  ...
* INGEST: Document Processor Conditional Relates elastic#33188
6.5 backport: fcad4e7