RemoveProcessor updated to support fieldsToKeep #83665
- Enhancement related to issue 83010 [elastic#83010]
Pinging @elastic/es-data-management (Team:Data Management)
@elasticmachine ok to test
@zembrzuski, thank you for working on this. In thinking a bit about it, I think the fields to keep should always include the metadata fields on the document. These are necessary for ingestion to work properly and their removal causes a variety of problems. I'll probably have a few more suggestions after a more thorough review, but I do know this change will be necessary.
Hi @danhermann, thanks so much for reviewing my PR. This is one of my first PRs, and I am very excited about contributing to Elasticsearch.
@elasticmachine update branch
Hi @zembrzuski, thank you for working on this. It looks pretty good. There are just a few changes to be made below and then I think we'll be able to get it merged.
RemoveProcessor(String tag, String description, List<TemplateScript.Factory> fieldsToRemove, boolean ignoreMissing) {
    super(tag, description);
    this.fieldsToRemove = new ArrayList<>(fieldsToRemove);
    this.fieldsToKeep = new ArrayList<>();
    this.ignoreMissing = ignoreMissing;
As far as I can see, this constructor has only one use in a test class. I'd suggest removing it and updating that one reference.
ok
        .filter(documentField -> IngestDocument.Metadata.isMetadata(documentField) == false)
        .filter(documentField -> shouldKeep(documentField, fieldsToKeep, document) == false)
        .forEach(documentField -> removeWhenPresent(document, documentField));
}
Let's add an else block here. I know that one of either fieldsToKeep or fieldsToRemove must be empty, but I prefer to skip the entire block rather than iterate over an empty list in the latter case.
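A minimal sketch of the if/else shape being asked for, not the code from this PR. It works on a plain Map with top-level fields only, and the class name, method name, and the METADATA_FIELDS set are hypothetical stand-ins for the processor, its execute method, and the real metadata check.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

final class KeepOrRemoveSketch {
    // Assumption: a small stand-in for the ingest metadata field names.
    static final Set<String> METADATA_FIELDS = Set.of("_index", "_id", "_routing", "_version");

    // Mutates and returns the document. Exactly one of the two lists is expected to be non-empty.
    static Map<String, Object> apply(Map<String, Object> document, List<String> fieldsToRemove, List<String> fieldsToKeep) {
        if (fieldsToKeep.isEmpty() == false) {
            // Keep mode: drop every top-level field that is neither metadata nor explicitly kept.
            document.keySet().removeIf(f -> METADATA_FIELDS.contains(f) == false && fieldsToKeep.contains(f) == false);
        } else {
            // Remove mode: only reached when no keep list was configured.
            fieldsToRemove.forEach(document::remove);
        }
        return document;
    }
}
```

With the else branch, the remove loop is skipped entirely whenever a keep list is present, instead of iterating over an empty list.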
ok
public static boolean isMetadata(String field) {
    return Arrays.stream(Metadata.values()).anyMatch(e -> e.fieldName.equals(field));
}
Let's statically initialize a list or set of metadata field names so we're not creating a new stream every time the processor is executed.
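One way this suggestion could look, as a sketch under assumptions: the enum below is a reduced, hypothetical version of the metadata enum (only three constants), and FIELD_NAMES is an illustrative name. The set is built once when the enum class is initialized, so each lookup is a plain contains call.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.stream.Collectors;

final class MetadataLookupSketch {
    // Assumption: a cut-down enum; the real one has more constants.
    enum Metadata {
        INDEX("_index"),
        ID("_id"),
        ROUTING("_routing");

        private final String fieldName;

        Metadata(String fieldName) {
            this.fieldName = fieldName;
        }

        // Built once, after the enum constants exist, and reused on every call.
        private static final Set<String> FIELD_NAMES = Arrays.stream(values())
            .map(m -> m.fieldName)
            .collect(Collectors.toUnmodifiableSet());

        static boolean isMetadata(String field) {
            // Note: an unmodifiable set rejects null lookups, so callers must pass a non-null name.
            return FIELD_NAMES.contains(field);
        }
    }
}
```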
ok
if (v instanceof Map) {
    allFields.addAll(getAllFields((Map) v, prefix + k + "."));
}
Minor: let's use instanceof pattern matching here
- if (v instanceof Map) {
-     allFields.addAll(getAllFields((Map) v, prefix + k + "."));
- }
+ if (v instanceof Map mapValue) {
+     allFields.addAll(getAllFields(mapValue, prefix + k + "."));
+ }
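For context, a self-contained sketch of a recursive field-path collector using the pattern-matching form (Java 16 or later). Only the instanceof lines come from the diff above; the rest of getAllFields is an assumption about its shape, including the choice to list a parent field alongside its nested children.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class FieldPathsSketch {
    // Collects dotted paths for every field in a nested map, e.g. "a" and "a.b".
    static List<String> getAllFields(Map<?, ?> source, String prefix) {
        List<String> allFields = new ArrayList<>();
        for (Map.Entry<?, ?> entry : source.entrySet()) {
            Object k = entry.getKey();
            Object v = entry.getValue();
            allFields.add(prefix + k);
            // The pattern variable replaces the explicit (Map) cast.
            if (v instanceof Map<?, ?> mapValue) {
                allFields.addAll(getAllFields(mapValue, prefix + k + "."));
            }
        }
        return allFields;
    }
}
```

Using Map<?, ?> in the pattern also avoids the raw-type warning that the cast version produces.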
ok
if (compiledTemplatesToRemove.isEmpty() && compiledTemplatesToKeep.isEmpty()) {
    throw new IllegalArgumentException(
        "missing field [processors.remove.keep] or [processors.remove.field]. Please specify one of them."
    );
}

if (compiledTemplatesToRemove.isEmpty() == false && compiledTemplatesToKeep.isEmpty() == false) {
    throw new IllegalArgumentException(
        "Too many fields specified. Please specify either [processors.remove.keep] or [processors.remove.field]."
    );
}
For the sake of consistency, let's use the same exception and message style as other processors with mutually exclusive configuration options. See the network direction processor for an example.
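A rough sketch of validating two mutually exclusive options with one dedicated exception type. Everything here is hypothetical: ConfigException stands in for whatever exception the other processors throw, and the message wording is illustrative rather than copied from the network direction processor.

```java
import java.util.List;

final class RemoveConfigSketch {
    // Hypothetical stand-in for the exception type shared by other processors.
    static final class ConfigException extends RuntimeException {
        ConfigException(String property, String reason) {
            super("[" + property + "] " + reason);
        }
    }

    // Requires exactly one of the two lists to be non-empty.
    static void validate(List<String> fieldsToRemove, List<String> fieldsToKeep) {
        boolean hasRemove = fieldsToRemove.isEmpty() == false;
        boolean hasKeep = fieldsToKeep.isEmpty() == false;
        if (hasRemove == false && hasKeep == false) {
            throw new ConfigException("keep", "or [field] must be specified");
        }
        if (hasRemove && hasKeep) {
            throw new ConfigException("keep", "and [field] cannot both be specified");
        }
    }
}
```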
ok
docs/changelog/83665.yaml
Outdated
@@ -0,0 +1,5 @@
pr: 83665
summary: "RemoveProcessor updated to support fieldsToKeep"
area: Infra/Scripting
Technically, this falls into the "ingest" category:
- area: Infra/Scripting
+ area: Ingest
ok
23e71f2 to 42f610a
Hi @danhermann
@zembrzuski, I think you just need to update the error message in the RemoveProcessorFactory tests so they pass.
Ohh, my bad. I've just fixed these tests.
@elasticmachine update branch
@elasticmachine run elasticsearch-ci/rest-compatibility
@zembrzuski, everything here looks good. One minor change needed and then we can get this merged: On the following two pairs of lines, That will change the error messages a bit, so a couple of the unit tests will likely need to be updated accordingly.
Hi @danhermann
Thanks, @zembrzuski. It looks like the formatting check is complaining about a couple lines now. You should be able to fix that by running
Ohh, I'm sorry. I've read about checkstyle in the contribution guideline but I forgot to check it this time.
Looks great, @zembrzuski! I've merged this in and it will be available in the next release. Thanks again for your contribution!
Hey guys, so this landed in 8.2 right? Is it possible that the documentation was forgotten? I can't find anything related to the new I'll give it a try anyway :)
Relates to #83010