Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add copy_from parameter for set ingest processor #63540

Merged
merged 11 commits into from
Nov 2, 2020

Conversation

gaobinlong
Copy link
Contributor

Relates to #55682 and #51046, and refractor the code of #56985.

The main point of this PR is adding a copy_from parameter for set ingest processor to support copy the value of one field to another. We can copy boolean, number, array, object and other data types by the parameter copy_from instead of using template snippet which only produces string value.

Why adding a new parameter to implement the copy function but not modifying the template snippet so that it can produce object or array value ? I think we can make it compatible with older versions, users can still use template snippet to copy the produced string value to other field, but also can use copy_from to copy object or array value from one field to another.

The main changes are:

  1. Add copy_from parameter for set ingest processor.
  2. Add unit test and yml test for the change.
  3. Update the set processor's document.

@danhermann
Copy link
Contributor

Why adding a new parameter to implement the copy function but not modifying the template snippet so that it can produce object or array value ? I think we can make it compatible with older versions, users can still use template snippet to copy the produced string value to other field, but also can use copy_from to copy object or array value from one field to another.

Would it be possible to modify the template snippet code to produce object and array values? Our team discussed this and couldn't think of any good reason for producing string values. We would release this feature in the next major version of ES so maintaining compatibility with older versions is not required.

@gaobinlong
Copy link
Contributor Author

@danhermann, the usage of template snippet is flexible, for example, "{{field_a}} {{field_b}}" is used to concatenate the value of field_a and field_b, but if field_a and field_b are not string, what's the result after executing the template snippet if template snippet can produce object or array value? I'm not clear about that. Another example is that {{.}} is valid currently which will output the whole document. So maybe specifying a field by the parameter copy_from can simply copy the value from one filed to another without handing some complex situation. In addition, maybe it's not easy to modify the template snippet code to produce object and array values, from the source code: MustacheScriptEngine, I found that the MustacheScriptEngine only produces string value. Expect your opinion on this.

@danhermann
Copy link
Contributor

@danhermann, the usage of template snippet is flexible, for example, "{{field_a}} {{field_b}}" is used to concatenate the value of field_a and field_b, but if field_a and field_b are not string, what's the result after executing the template snippet if template snippet can produce object or array value? I'm not clear about that. Another example is that {{.}} is valid currently which will output the whole document. So maybe specifying a field by the parameter copy_from can simply copy the value from one filed to another without handing some complex situation. In addition, maybe it's not easy to modify the template snippet code to produce object and array values, from the source code: MustacheScriptEngine, I found that the MustacheScriptEngine only produces string value. Expect your opinion on this.

Thanks, @gaobinlong. Your explanation makes sense and we'd like to move forward with the copy_from parameter as you've implemented it. I'll review this shortly and we'll work on getting it merged.

@danhermann
Copy link
Contributor

@elasticmachine update branch

@danhermann
Copy link
Contributor

@elasticmachine ok to test

Copy link
Contributor

@danhermann danhermann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gaobinlong, from my testing, this looks good and works well. I left some minor wording changes below as well as a request to split up one of the unit tests. I think we can get it merged after that.

@gaobinlong
Copy link
Contributor Author

@elasticmachine test this please.

@danhermann
Copy link
Contributor

Thanks, @gaobinlong. This looks good. The last failing test is due to the fact that line 120 in SetProcessor.java is now longer than the limit of 140 characters. If you can split the line to keep it under the length limit, I believe that all tests will pass.

@danhermann
Copy link
Contributor

@elasticmachine update branch

@danhermann
Copy link
Contributor

Thanks again for this contribution, @gaobinlong! I'll get it merged and backported.

@danhermann danhermann merged commit b17ce85 into elastic:master Nov 2, 2020
@danhermann danhermann added :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP >enhancement backport pending v7.11.0 v8.0.0 labels Nov 2, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-features (:Core/Features/Ingest)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP >enhancement Team:Data Management Meta label for data/management team v7.11.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants