Skip to content

Commit

Permalink
Use mappings to format doc-value fields by default. (#30831)
Browse files Browse the repository at this point in the history
Doc-value fields now return a value that is based on the mappings rather than
the script implementation by default.

This deprecates the special `use_field_mapping` docvalue format which was added
in #29639 only to ease the transition to 7.x and it is not necessary anymore in
7.0.
  • Loading branch information
jpountz authored Jan 30, 2019
1 parent b63b50b commit c8af0f4
Show file tree
Hide file tree
Showing 29 changed files with 105 additions and 152 deletions.
10 changes: 10 additions & 0 deletions docs/reference/migration/migrate_7_0/search.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -122,6 +122,16 @@ using the "all fields" mode ("default_field": "*") or other fieldname expansions
Search requests with extra content after the main object will no longer be accepted
by the `_search` endpoint. A parsing exception will be thrown instead.

[float]
==== Doc-value fields default format

The format of doc-value fields is changing to be the same as what could be
obtained in 6.x with the special `use_field_mapping` format. This is mostly a
change for date fields, which are now formatted based on the format that is
configured in the mappings by default. This behavior can be changed by
specifying a <<search-request-docvalue-fields,`format`>> within the doc-value
field.

[float]
==== Context Completion Suggester

Expand Down
23 changes: 11 additions & 12 deletions docs/reference/search/request/docvalue-fields.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -12,9 +12,9 @@ GET /_search
"match_all": {}
},
"docvalue_fields" : [
"my_ip_field", <1>
{
"field": "my_ip_field", <1>
"format": "use_field_mapping" <2>
"field": "my_keyword_field" <2>
},
{
"field": "my_date_field",
Expand All @@ -25,10 +25,10 @@ GET /_search
--------------------------------------------------
// CONSOLE
<1> the name of the field
<2> the special `use_field_mapping` format tells Elasticsearch to use the format from the mapping
<3> date fields may use a custom format
<2> an object notation is supported as well
<3> the object notation allows to specify a custom format

Doc value fields can work on fields that are not stored.
Doc value fields can work on fields that have doc-values enabled, regardless of whether they are stored

`*` can be used as a wild card, for example:

Expand All @@ -41,8 +41,8 @@ GET /_search
},
"docvalue_fields" : [
{
"field": "*field", <1>
"format": "use_field_mapping" <2>
"field": "*_date_field", <1>
"format": "epoch_millis" <2>
}
]
}
Expand All @@ -62,9 +62,8 @@ While most fields do not support custom formats, some of them do:
- <<date,Date>> fields can take any <<mapping-date-format,date format>>.
- <<number,Numeric>> fields accept a https://docs.oracle.com/javase/8/docs/api/java/text/DecimalFormat.html[DecimalFormat pattern].

All fields support the special `use_field_mapping` format, which tells
Elasticsearch to use the mappings to figure out a default format.
By default fields are formatted based on a sensible configuration that depends
on their mappings: `long`, `double` and other numeric fields are formatted as
numbers, `keyword` fields are formatted as strings, `date` fields are formatted
with the configured `date` format, etc.

NOTE: The default is currently to return the same output as
<<search-request-script-fields,script fields>>. However it will change in 7.0
to behave as if the `use_field_mapping` format was provided.
5 changes: 1 addition & 4 deletions docs/reference/search/request/inner-hits.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -246,10 +246,7 @@ POST test/_search
"inner_hits": {
"_source" : false,
"docvalue_fields" : [
{
"field": "comments.text.keyword",
"format": "use_field_mapping"
}
"comments.text.keyword"
]
}
}
Expand Down
3 changes: 1 addition & 2 deletions docs/reference/sql/endpoints/translate.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -27,8 +27,7 @@ Which returns:
"size" : 10,
"docvalue_fields" : [
{
"field": "page_count",
"format": "use_field_mapping"
"field": "page_count"
},
{
"field": "release_date",
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -201,7 +201,7 @@ public void testQueryBuilderBWC() throws Exception {
QueryBuilder expectedQueryBuilder = (QueryBuilder) CANDIDATES.get(i)[1];
Request request = new Request("GET", "/" + index + "/_search");
request.setJsonEntity("{\"query\": {\"ids\": {\"values\": [\"" + Integer.toString(i) + "\"]}}, " +
"\"docvalue_fields\": [{\"field\":\"query.query_builder_field\", \"format\":\"use_field_mapping\"}]}");
"\"docvalue_fields\": [{\"field\":\"query.query_builder_field\"}]}");
Response rsp = client().performRequest(request);
assertEquals(200, rsp.getStatusLine().getStatusCode());
Map<?, ?> hitRsp = (Map<?, ?>) ((List<?>) ((Map<?, ?>)toMap(rsp).get("hits")).get("hits")).get(0);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -46,8 +46,8 @@ setup:
"Nested doc version and seqIDs":

- skip:
version: " - 6.3.99"
reason: "object notation for docvalue_fields was introduced in 6.4"
version: " - 6.99.99"
reason: "Triggers warnings before 7.0"

- do:
index:
Expand All @@ -62,7 +62,7 @@ setup:
- do:
search:
rest_total_hits_as_int: true
body: { "query" : { "nested" : { "path" : "nested_field", "query" : { "match_all" : {} }, "inner_hits" : { version: true, "docvalue_fields": [ { "field": "_seq_no", "format": "use_field_mapping" } ]} }}, "version": true, "docvalue_fields" : [ { "field": "_seq_no", "format": "use_field_mapping" } ] }
body: { "query" : { "nested" : { "path" : "nested_field", "query" : { "match_all" : {} }, "inner_hits" : { version: true, "docvalue_fields": [ "_seq_no" ]} }}, "version": true, "docvalue_fields" : [ "_seq_no" ] }

- match: { hits.total: 1 }
- match: { hits.hits.0._index: "test" }
Expand All @@ -86,7 +86,7 @@ setup:
- do:
search:
rest_total_hits_as_int: true
body: { "query" : { "nested" : { "path" : "nested_field", "query" : { "match_all" : {} }, "inner_hits" : { version: true, "docvalue_fields": [ { "field": "_seq_no", "format": "use_field_mapping" } ]} }}, "version": true, "docvalue_fields" : [ { "field": "_seq_no", "format": "use_field_mapping" } ] }
body: { "query" : { "nested" : { "path" : "nested_field", "query" : { "match_all" : {} }, "inner_hits" : { version: true, "docvalue_fields": [ "_seq_no" ]} }}, "version": true, "docvalue_fields" : [ "_seq_no" ] }

- match: { hits.total: 1 }
- match: { hits.hits.0._index: "test" }
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -144,12 +144,9 @@ setup:
---
"docvalue_fields":
- skip:
version: " - 6.4.0"
reason: format option was added in 6.4 and the deprecation message changed in 6.4.1
features: warnings
version: " - 6.9.99"
reason: Triggers a deprecation warning before 7.0
- do:
warnings:
- 'There are doc-value fields which are not using a format. The output will change in 7.0 when doc value fields get formatted based on mappings by default. It is recommended to pass [format=use_field_mapping] with a doc value field in order to opt in for the future behaviour and ease the migration to 7.0: [count]'
search:
body:
docvalue_fields: [ "count" ]
Expand All @@ -158,12 +155,9 @@ setup:
---
"multiple docvalue_fields":
- skip:
version: " - 6.4.0"
reason: format option was added in 6.4 and the deprecation message changed in 6.4.1
features: warnings
version: " - 6.9.99"
reason: Triggered a deprecation warning before 7.0
- do:
warnings:
- 'There are doc-value fields which are not using a format. The output will change in 7.0 when doc value fields get formatted based on mappings by default. It is recommended to pass [format=use_field_mapping] with a doc value field in order to opt in for the future behaviour and ease the migration to 7.0: [count, include.field1.keyword]'
search:
body:
docvalue_fields: [ "count", "include.field1.keyword" ]
Expand All @@ -172,22 +166,22 @@ setup:
---
"docvalue_fields as url param":
- skip:
version: " - 6.4.0"
reason: format option was added in 6.4 and the deprecation message changed in 6.4.1
features: warnings
version: " - 6.99.99"
reason: Triggered a deprecation warning before 7.0
- do:
warnings:
- 'There are doc-value fields which are not using a format. The output will change in 7.0 when doc value fields get formatted based on mappings by default. It is recommended to pass [format=use_field_mapping] with a doc value field in order to opt in for the future behaviour and ease the migration to 7.0: [count]'
search:
docvalue_fields: [ "count" ]
- match: { hits.hits.0.fields.count: [1] }

---
"docvalue_fields with default format":
- skip:
version: " - 6.3.99"
reason: format option was added in 6.4
version: " - 6.99.99"
reason: Only triggers warnings on 7.0+
features: warnings
- do:
warnings:
- "[use_field_mapping] is a special format that was only used to ease the transition to 7.x. It has become the default and shouldn't be set explicitly anymore."
search:
body:
docvalue_fields:
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -67,8 +67,8 @@ setup:
"Docvalues_fields size limit":

- skip:
version: " - 6.3.99"
reason: "The object notation for docvalue_fields is only supported on 6.4+"
version: " - 6.99.99"
reason: "Triggers warnings before 7.0"
- do:
catch: /Trying to retrieve too many docvalue_fields\. Must be less than or equal to[:] \[2\] but was \[3\]\. This limit can be set by changing the \[index.max_docvalue_fields_search\] index level setting\./
search:
Expand All @@ -78,12 +78,9 @@ setup:
query:
match_all: {}
docvalue_fields:
- field: "one"
format: "use_field_mapping"
- field: "two"
format: "use_field_mapping"
- field: "three"
format: "use_field_mapping"
- "one"
- "two"
- "three"

---
"Script_fields size limit":
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -38,8 +38,6 @@
*/
public class DocValueFieldsContext {

public static final String USE_DEFAULT_FORMAT = "use_field_mapping";

/**
* Wrapper around a field name and the format that should be used to
* display values of this field.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,6 @@
import org.elasticsearch.index.fielddata.AtomicNumericFieldData;
import org.elasticsearch.index.fielddata.IndexFieldData;
import org.elasticsearch.index.fielddata.IndexNumericFieldData;
import org.elasticsearch.index.fielddata.ScriptDocValues;
import org.elasticsearch.index.fielddata.SortedBinaryDocValues;
import org.elasticsearch.index.fielddata.SortedNumericDoubleValues;
import org.elasticsearch.index.mapper.MappedFieldType;
Expand All @@ -46,7 +45,6 @@
import java.util.HashMap;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

/**
* Query sub phase which pulls data from doc values
Expand All @@ -55,7 +53,8 @@
*/
public final class DocValueFieldsFetchSubPhase implements FetchSubPhase {

private static final DeprecationLogger deprecationLogger = new DeprecationLogger(
private static final String USE_DEFAULT_FORMAT = "use_field_mapping";
private static final DeprecationLogger DEPRECATION_LOGGER = new DeprecationLogger(
LogManager.getLogger(DocValueFieldsFetchSubPhase.class));

@Override
Expand All @@ -66,9 +65,9 @@ public void hitsExecute(SearchContext context, SearchHit[] hits) throws IOExcept
String name = context.collapse().getFieldName();
if (context.docValueFieldsContext() == null) {
context.docValueFieldsContext(new DocValueFieldsContext(
Collections.singletonList(new FieldAndFormat(name, DocValueFieldsContext.USE_DEFAULT_FORMAT))));
Collections.singletonList(new FieldAndFormat(name, null))));
} else if (context.docValueFieldsContext().fields().stream().map(ff -> ff.field).anyMatch(name::equals) == false) {
context.docValueFieldsContext().fields().add(new FieldAndFormat(name, DocValueFieldsContext.USE_DEFAULT_FORMAT));
context.docValueFieldsContext().fields().add(new FieldAndFormat(name, null));
}
}

Expand All @@ -79,33 +78,28 @@ public void hitsExecute(SearchContext context, SearchHit[] hits) throws IOExcept
hits = hits.clone(); // don't modify the incoming hits
Arrays.sort(hits, Comparator.comparingInt(SearchHit::docId));

List<String> noFormatFields = context.docValueFieldsContext().fields().stream().filter(f -> f.format == null).map(f -> f.field)
.collect(Collectors.toList());
if (noFormatFields.isEmpty() == false) {
deprecationLogger.deprecated("There are doc-value fields which are not using a format. The output will "
+ "change in 7.0 when doc value fields get formatted based on mappings by default. It is recommended to pass "
+ "[format={}] with a doc value field in order to opt in for the future behaviour and ease the migration to "
+ "7.0: {}", DocValueFieldsContext.USE_DEFAULT_FORMAT, noFormatFields);
if (context.docValueFieldsContext().fields().stream()
.map(f -> f.format)
.filter(USE_DEFAULT_FORMAT::equals)
.findAny()
.isPresent()) {
DEPRECATION_LOGGER.deprecated("[" + USE_DEFAULT_FORMAT + "] is a special format that was only used to " +
"ease the transition to 7.x. It has become the default and shouldn't be set explicitly anymore.");
}

for (FieldAndFormat fieldAndFormat : context.docValueFieldsContext().fields()) {
String field = fieldAndFormat.field;
MappedFieldType fieldType = context.mapperService().fullName(field);
if (fieldType != null) {
final IndexFieldData<?> indexFieldData = context.getForField(fieldType);
final DocValueFormat format;
if (fieldAndFormat.format == null) {
format = null;
} else {
String formatDesc = fieldAndFormat.format;
if (Objects.equals(formatDesc, DocValueFieldsContext.USE_DEFAULT_FORMAT)) {
formatDesc = null;
}
format = fieldType.docValueFormat(formatDesc, null);
String formatDesc = fieldAndFormat.format;
if (Objects.equals(formatDesc, USE_DEFAULT_FORMAT)) {
// TODO: Remove in 8.x
formatDesc = null;
}
final DocValueFormat format = fieldType.docValueFormat(formatDesc, null);
LeafReaderContext subReaderContext = null;
AtomicFieldData data = null;
ScriptDocValues<?> scriptValues = null; // legacy
SortedBinaryDocValues binaryValues = null; // binary / string / ip fields
SortedNumericDocValues longValues = null; // int / date fields
SortedNumericDoubleValues doubleValues = null; // floating-point fields
Expand All @@ -115,9 +109,7 @@ public void hitsExecute(SearchContext context, SearchHit[] hits) throws IOExcept
int readerIndex = ReaderUtil.subIndex(hit.docId(), context.searcher().getIndexReader().leaves());
subReaderContext = context.searcher().getIndexReader().leaves().get(readerIndex);
data = indexFieldData.load(subReaderContext);
if (format == null) {
scriptValues = data.getLegacyFieldValues();
} else if (indexFieldData instanceof IndexNumericFieldData) {
if (indexFieldData instanceof IndexNumericFieldData) {
if (((IndexNumericFieldData) indexFieldData).getNumericType().isFloatingPoint()) {
doubleValues = ((AtomicNumericFieldData) data).getDoubleValues();
} else {
Expand All @@ -138,10 +130,7 @@ public void hitsExecute(SearchContext context, SearchHit[] hits) throws IOExcept
final List<Object> values = hitField.getValues();

int subDocId = hit.docId() - subReaderContext.docBase;
if (scriptValues != null) {
scriptValues.setNextDocId(subDocId);
values.addAll(scriptValues);
} else if (binaryValues != null) {
if (binaryValues != null) {
if (binaryValues.advanceExact(subDocId)) {
for (int i = 0, count = binaryValues.docValueCount(); i < count; ++i) {
values.add(format.format(binaryValues.nextValue()));
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -32,7 +32,6 @@
import org.elasticsearch.script.ScriptType;
import org.elasticsearch.search.SearchModule;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.fetch.subphase.DocValueFieldsContext;
import org.elasticsearch.search.fetch.subphase.DocValueFieldsContext.FieldAndFormat;
import org.elasticsearch.search.fetch.subphase.FetchSourceContext;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilderTests;
Expand Down Expand Up @@ -158,8 +157,7 @@ public static InnerHitBuilder randomInnerHits() {
innerHits.setStoredFieldNames(randomListStuff(16, () -> randomAlphaOfLengthBetween(1, 16)));
}
innerHits.setDocValueFields(randomListStuff(16,
() -> new FieldAndFormat(randomAlphaOfLengthBetween(1, 16),
randomBoolean() ? null : DocValueFieldsContext.USE_DEFAULT_FORMAT)));
() -> new FieldAndFormat(randomAlphaOfLengthBetween(1, 16), null)));
// Random script fields deduped on their field name.
Map<String, SearchSourceBuilder.ScriptField> scriptFields = new HashMap<>();
for (SearchSourceBuilder.ScriptField field: randomListStuff(16, InnerHitBuilderTests::randomScript)) {
Expand Down Expand Up @@ -201,8 +199,7 @@ static InnerHitBuilder mutate(InnerHitBuilder original) throws IOException {
modifiers.add(() -> {
if (randomBoolean()) {
copy.setDocValueFields(randomValueOtherThan(copy.getDocValueFields(),
() -> randomListStuff(16, () -> new FieldAndFormat(randomAlphaOfLengthBetween(1, 16),
randomBoolean() ? null : DocValueFieldsContext.USE_DEFAULT_FORMAT))));
() -> randomListStuff(16, () -> new FieldAndFormat(randomAlphaOfLengthBetween(1, 16), null))));
} else {
copy.addDocValueField(randomAlphaOfLengthBetween(1, 16));
}
Expand Down
Loading

0 comments on commit c8af0f4

Please sign in to comment.