Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow metadata fields in the _source #61590

Merged
merged 15 commits into from
Sep 18, 2020
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -391,7 +391,7 @@ private static void innerParseObject(ParseContext context, ObjectMapper mapper,
if (token == XContentParser.Token.FIELD_NAME) {
currentFieldName = parser.currentName();
paths = splitAndValidatePath(currentFieldName);
if (context.mapperService().isMetadataField(context.path().pathAsText(currentFieldName))) {
if (context.mapperService().isFieldAllowedInSource(context.path().pathAsText(currentFieldName)) == false) {
throw new MapperParsingException("Field [" + currentFieldName + "] is a metadata field and cannot be added inside"
+ " a document. Use the index API request parameters.");
} else if (containsDisabledObjectMapper(mapper, paths)) {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -495,6 +495,18 @@ public boolean isMetadataField(String field) {
return mapperRegistry.isMetadataField(indexVersionCreated, field);
}

/**
* @return Whether a field should be included in the document _source.
*/
public boolean isFieldAllowedInSource(String field) {
if(isMetadataField(field)) {
// Metadata fields may not be allowed in the document _source.
return mapperRegistry.isAllowedInSource(indexVersionCreated, field);
} else {
return true;
}
}

/** An analyzer wrapper that can lookup fields within the index mappings */
final class MapperAnalyzerWrapper extends DelegatingAnalyzerWrapper {

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -45,6 +45,14 @@ MetadataFieldMapper.Builder parse(String name, Map<String, Object> node,
* @param parserContext context that may be useful to build the field like analyzers
*/
MetadataFieldMapper getDefault(ParserContext parserContext);

/**
* @return Whether a metadata field can be included in the document _source.
*/
default boolean isAllowedInSource() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if we really need this new flag isAllowedInSource. Maybe we could avoid adding a new flag and instead do the following:

  • For metadata fields that are not allowed in _source, make sure that MetadataFieldMapper#parse throws an descriptive error.
  • Many metadata fields currently have logic in parse that's unrelated to _source parsing. We could make sure to move it into the special methods preParse or postParse.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jtibshirani I agree with you that delegating this to MetadataFieldMapper#parse() would be a simpler solution. However I see two caveats:

  • The approach with isAllowedInSource() method detects that a metadata field is being parsed when parsing the field name. This is early in the parsing process. An exception is thrown and the field value parsing is skipped.

  • As you mention, I have seen classes (such as IgnoredFieldMapper and VersionFieldMapper) that override parse() method to do nothing on one hand, while calling super.parse() from preParse()/postParse() methods. This approach is too complicated imho, but I am not sure that refactoring this is very simple. WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is early in the parsing process. An exception is thrown and the field value parsing is skipped.

I don't think we need to optimize performance in this case because it's an error condition (and should be a rare one too).

This approach is too complicated imho, but I am not sure that refactoring this is very simple.

I agree that the logic is complex/ hard to read! I haven't deeply looked into the refactoring myself -- maybe you could try it out and we can rediscuss the approach if you see a roadblock ?

// By default return false for metadata fields.
return false;
}
}

/**
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -78,6 +78,18 @@ public boolean isMetadataField(Version indexCreatedVersion, String field) {
return getMetadataMapperParsers(indexCreatedVersion).containsKey(field);
}

/**
* Returns true if the provided field can be included in the document _source.
*/
public boolean isAllowedInSource(Version indexCreatedVersion, String field) {
MetadataFieldMapper.TypeParser parser = getMetadataMapperParsers(indexCreatedVersion).get(field);
if (parser != null) {
return parser.isAllowedInSource();
}
// Non metadata fields should alway be allowed in document _source.
return true;
}

/**
* Returns a function that given an index name, returns a predicate that fields must match in order to be returned by get mappings,
* get index, get field mappings and field capabilities API. Useful to filter the fields that such API return.
Expand Down