Formatted sort values for search_after #69192
Comments
Pinging @elastic/es-search (Team:Search)
@jimczi Thanks for the proposal, Jim. This proposal is also relevant and useful for I am also wondering about having
{
"query": {
"match_all": {}
},
"formats": [
{
"field": "timestamp",
"format": "strict_date_optional_time_nanos"
},
{
"field": "my_unsigned_long",
"format": "string"
}
],
"sort": ["timestamp", "my_unsigned_long"],
"fields": ["timestamp", "my_unsigned_long"]
}
@mayya-sharipova While your proposal avoids specifying the format multiple times, I think the implementation is a bit more complicated, and backward compatibility (BWC) is also an issue. Also, if a user wants the highest resolution for sort but a lower resolution for fields, that is not possible with a single format section. I've opened #70357 for this. It would be great if you could take a look.
This commit updates the default format of `date_nanos` fields on existing and new indices to use `strict_date_optional_time_nanos` instead of `strict_date_optional_time`. Using `strict_date_optional_time` as the default format for `date_nanos` doesn't make sense because it accepts and parses dates with nanosecond precision but drops the nanoseconds when it formats them. The change should be transparent for users because both formats accept the same input.
Relates elastic#69192
Closes elastic#67063
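The loss of precision described in this commit can be illustrated with a small standalone sketch. It does not call Elasticsearch; `format_epoch_nanos` is a hypothetical helper that only mimics the shape of a millisecond-resolution and a nanosecond-resolution ISO-8601 string, assuming the stored value is a count of nanoseconds since the epoch.

```python
from datetime import datetime, timezone


def format_epoch_nanos(epoch_nanos: int, fraction_digits: int) -> str:
    """Render nanoseconds since the epoch as an ISO-8601 UTC string.

    Hypothetical helper (not Elasticsearch's formatter): it keeps only
    `fraction_digits` digits of the fractional second and truncates the rest.
    """
    seconds, nanos = divmod(epoch_nanos, 1_000_000_000)
    base = datetime.fromtimestamp(seconds, tz=timezone.utc).strftime("%Y-%m-%dT%H:%M:%S")
    fraction = f"{nanos:09d}"[:fraction_digits]
    return f"{base}.{fraction}Z"


# The instant used in the examples of this thread, with nanosecond precision.
instant = datetime(2015, 1, 1, 12, 10, 30, tzinfo=timezone.utc)
epoch_nanos = int(instant.timestamp()) * 1_000_000_000 + 123_456_789

millis_style = format_epoch_nanos(epoch_nanos, 3)  # nanoseconds are dropped
nanos_style = format_epoch_nanos(epoch_nanos, 9)   # full precision is kept

print(millis_style)
print(nanos_style)
```

The two strings differ only in the fractional second, which is why a value that was parsed with nanosecond precision cannot be recovered from the millisecond-style output.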
If a `search_after` request targets multiple indices and one of its sort fields has type `date` in one index but `date_nanos` in other indices, then Elasticsearch won't interpret the `search_after` parameter correctly in every target index. By default, the sort value of a `date` field is a long of milliseconds since the epoch, while the sort value of a `date_nanos` field is a long of nanoseconds.

This commit introduces the `format` parameter in the sort field so that a sort value of a `date` or `date_nanos` field is formatted using a date format in a search response.

The example below illustrates how to use this new parameter.

```js
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "timestamp": {
        "order": "asc",
        "format": "strict_date_optional_time_nanos"
      }
    }
  ]
}
```

```js
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "timestamp": {
        "order": "asc",
        "format": "strict_date_optional_time_nanos"
      }
    }
  ],
  "search_after": [
    "2015-01-01T12:10:30.123456789Z" // in `strict_date_optional_time_nanos` format
  ]
}
```

Closes #69192
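To make the mismatch concrete, here is a minimal sketch in plain Python (no Elasticsearch client involved). It assumes only what the commit message states: a `date` sort value is milliseconds since the epoch and a `date_nanos` sort value is nanoseconds. The helper `read_as_nanos` is hypothetical and stands in for an index that expects nanoseconds.

```python
from datetime import datetime, timezone

# One instant, as it would be exposed by the two field types.
instant = datetime(2015, 1, 1, 12, 10, 30, tzinfo=timezone.utc)
epoch_seconds = int(instant.timestamp())

raw_date_sort = epoch_seconds * 1_000                # `date`: milliseconds
raw_date_nanos_sort = epoch_seconds * 1_000_000_000  # `date_nanos`: nanoseconds


def read_as_nanos(raw: int) -> datetime:
    """Hypothetical reader that assumes the raw long counts nanoseconds."""
    return datetime.fromtimestamp(raw // 1_000_000_000, tz=timezone.utc)


# The nanosecond value round-trips, but the millisecond value is misread
# as an instant shortly after the epoch.
correct = read_as_nanos(raw_date_nanos_sort)
misread = read_as_nanos(raw_date_sort)

print(correct.isoformat())
print(misread.isoformat())
```

A formatted string such as `2015-01-01T12:10:30.123456789Z` carries its own unit, so each index can parse it into its own internal representation.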
Today the sort values used to rank each hit in the response are exposed as raw values in an array (`response.hits.hit.0.sort`). These values are meant to be copied into a `search_after` request in order to paginate efficiently over a set of results.

By default, the sort value for a `date` or `date_nanos` field is represented as a long, which is the internal representation that we use for these fields. This leaking of the internal representation is problematic because the returned value cannot be interpreted without context: `date` returns the number of milliseconds since the epoch while `date_nanos` returns the number of nanoseconds. In order to fix this discrepancy we'd like to gradually introduce formatted sort values.

At first we'd like to add a `format` option to any sort value in a search request. Setting a format there would ensure that the sort values in the response are formatted accordingly. The same format would also be used to parse the `search_after` value, so that copying the `sort` values directly into `search_after` continues to work.

It would be nice to also apply the formatter of the field by default if no format is specified. That would solve the leaking of internal representation entirely but would have more impact on users.
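The copy-and-paste pagination flow described above could be sketched as follows. This is an illustration under stated assumptions, not the proposed implementation: `next_page_body` is a hypothetical helper, the response is a hand-written stand-in that only contains the `hits.hits[].sort` array, and no request is actually sent.

```python
import copy
from typing import Optional


def next_page_body(request_body: dict, response: dict) -> Optional[dict]:
    """Build the next search body by copying the last hit's sort values.

    Hypothetical helper: returns None when the response has no hits,
    which signals that pagination is finished.
    """
    hits = response.get("hits", {}).get("hits", [])
    if not hits:
        return None
    body = copy.deepcopy(request_body)  # leave the caller's request untouched
    body["search_after"] = list(hits[-1]["sort"])
    return body


first_request = {
    "query": {"match_all": {}},
    "sort": [
        {"timestamp": {"order": "asc", "format": "strict_date_optional_time_nanos"}}
    ],
}

# Hand-written stand-in for a search response; only the fields used here.
fake_response = {
    "hits": {
        "hits": [
            {"_id": "1", "sort": ["2015-01-01T12:10:30.123456789Z"]},
        ]
    }
}

second_request = next_page_body(first_request, fake_response)
print(second_request)
```

Because the sort clause, including its `format`, is carried over unchanged, the formatted value returned in the response is parsed with the same format on the next request.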