Add support for runtime fields #59332
Labels
>feature
Meta
release highlight
:Search/Search
Search-related issues that do not fall into other categories
Team:Search
Meta label for search team
Comments
javanna added the >feature, :Search/Search, and Meta labels on Jul 9, 2020
Pinging @elastic/es-search (:Search/Search)
This was referenced Jul 9, 2020
This was referenced Jul 13, 2020
javanna added a commit that referenced this issue on Jul 14, 2020
javanna added a commit that referenced this issue on Jul 17, 2020
This addresses a TODO around using the script params, which are now parsed from the mappings. It also expands existing tests to verify that params are carried around and accessible in scripts for both fielddata and queries. Relates to #59332
This was referenced Jul 20, 2020
javanna added a commit that referenced this issue on Jul 21, 2020
This was referenced Sep 15, 2020
nik9000 added a commit that referenced this issue on Sep 15, 2020
We were checking for loops in queries before, but we had an "off by one" error where we wouldn't notice the "top level" runtime field when detecting a loop. So the error message would be wrong. I also caught a few bugs with query generation caused by missing `@Override` annotations and fixed a few of them. There is a bug with `regexp` queries with match options that I'm not fixing in this PR but will get to later. Relates to #59332
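The loop check described in this commit message can be sketched roughly as follows. This is a minimal illustration in Python, not the Elasticsearch implementation: the function name `resolve_order`, the `deps` map, and the error text are all hypothetical. The point it shows is that the top-level field is put on the chain before its dependencies are visited, so a loop that leads back to it is reported with the full path.

```python
from typing import Dict, List, Set


def resolve_order(field: str, deps: Dict[str, Set[str]]) -> List[str]:
    """Return the fields in the order they must be computed for `field`.

    `deps` maps a runtime field name to the fields its script reads
    (hypothetical structure). Names missing from `deps` are treated as
    ordinary fields with no dependencies.
    """
    order: List[str] = []
    done: Set[str] = set()

    def visit(name: str, chain: List[str]) -> None:
        if name in chain:
            # Report the whole loop, including the field we started from.
            loop = chain[chain.index(name):] + [name]
            raise ValueError("Cyclic dependency on runtime field: " + " -> ".join(loop))
        if name in done:
            return
        chain.append(name)  # the top-level field is on the chain too
        for dep in sorted(deps.get(name, ())):
            visit(dep, chain)
        chain.pop()
        done.add(name)
        order.append(name)

    visit(field, [])
    return order


print(resolve_order("a", {"a": {"b"}, "b": {"c"}}))
```

Starting the chain as an empty list and appending the field inside `visit` is what avoids the "off by one": a variant that only tracks dependencies, and not the field being resolved, would miss a loop that ends at the top-level field.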
nik9000 added a commit that referenced this issue on Sep 16, 2020
This implements the `fields` API in `_search` for runtime fields using doc values. Most of that implementation is stolen from the `docvalue_fields` fetch sub-phase, just moved into the same API that the `fields` API uses. At this point the `docvalue_fields` fetch phase looks like a special case of the `fields` API. While I was at it I moved the "which doc values sub-implementation should I use for fetching?" question from a bunch of `instanceof`s to a method on `LeafFieldData` so we can be much more flexible with what is returned and we're not forced to extend certain classes just to make the fetch phase happy. Relates to #59332
nik9000 added a commit that referenced this issue on Sep 16, 2020
This was referenced Nov 5, 2020
nik9000 added a commit that referenced this issue on Nov 10, 2020
This adds a way to specify the `runtime_mappings` on a search request which are always "runtime" fields. It looks like:

```
curl -XDELETE -uelastic:password -HContent-Type:application/json localhost:9200/test
curl -XPOST -uelastic:password -HContent-Type:application/json 'localhost:9200/test/_bulk?pretty&refresh' -d'
{"index": {}}
{"animal": "cat", "sound": "meow"}
{"index": {}}
{"animal": "dog", "sound": "woof"}
{"index": {}}
{"animal": "snake", "sound": "hisssssssssssssssss"}
'
curl -XPOST -uelastic:password -HContent-Type:application/json 'localhost:9200/test/_search?pretty' -d'
{
  "runtime_mappings": {
    "animal.upper": {
      "type": "keyword",
      "script": "for (String s : doc[\"animal.keyword\"]) {emit(s.toUpperCase())}"
    }
  },
  "query": {
    "match": {
      "animal.upper": "DOG"
    }
  }
}'
```

NOTE: If we have to send a search request with runtime mappings to a node that doesn't support runtime mappings at all then we'll fail the search request entirely. The alternative would be to not send those runtime mappings and let the node fail the search request with an "unknown field" error. I believe this would be surprising because you defined the field in the search request.

NOTE: It isn't obvious but you can also use `runtime_mappings` to override fields inside objects by naming the runtime fields with `.` in them. Like this:

```
curl -XDELETE -uelastic:password -HContent-Type:application/json localhost:9200/test
curl -uelastic:password -XPOST -HContent-Type:application/json 'localhost:9200/test/_bulk?refresh' -d'
{"index":{}}
{"name": {"first": "Andrew", "last": "Wiggin"}}
{"index":{}}
{"name": {"first": "Julian", "last": "Delphiki", "suffix": "II"}}
'
curl -uelastic:password -XPOST -HContent-Type:application/json 'localhost:9200/test/_search?pretty' -d'{
  "runtime_mappings": {
    "name.first": {
      "type": "keyword",
      "script": "if (\"Wiggin\".equals(doc[\"name.last.keyword\"].value)) {emit(\"Ender\");} else if (\"Delphiki\".equals(doc[\"name.last.keyword\"].value)) {emit(\"Bean\");}"
    }
  },
  "query": {
    "match": {
      "name.first": "Bean"
    }
  }
}'
```

Relates to #59332
nik9000 added a commit to nik9000/elasticsearch that referenced this issue on Nov 10, 2020
nik9000 added a commit that referenced this issue on Nov 11, 2020
* Add `runtime_mappings` to search request (backport of #64374). Relates to #59332
javanna added a commit that referenced this issue on Nov 12, 2020
The runtime section is at the same level as the existing properties section. Its purpose is to hold runtime fields only. With the introduction of the runtime section, a runtime field can be defined by specifying its type (previously called runtime_type) and script.

```
PUT /my-index/_mappings
{
  "runtime" : {
    "day_of_week" : {
      "type" : "keyword",
      "script" : {
        "source" : "emit(doc['timestamp'].value.dayOfWeekEnum.getDisplayName(TextStyle.FULL, Locale.ROOT))"
      }
    }
  },
  "properties" : {
    "timestamp" : {
      "type" : "date"
    }
  }
}
```

Fields defined in the runtime section can be updated at any time as they are not present in the Lucene index. They get replaced entirely when they get updated. Thanks to the introduction of the runtime section, runtime fields override existing mapped fields defined with the same name, similarly to runtime fields defined in the search request. Relates to #59332
javanna added a commit to javanna/elasticsearch that referenced this issue on Nov 12, 2020
javanna added a commit that referenced this issue on Nov 13, 2020
javanna added a commit that referenced this issue on Dec 8, 2020
javanna added a commit to javanna/elasticsearch that referenced this issue on Dec 8, 2020
javanna added a commit that referenced this issue on Dec 8, 2020
All of the tasks for the first phase of Runtime Fields are completed. Runtime fields will ship with 7.11. Closing this.
Runtime fields
We would like to increase the flexibility of the search API by introducing support for runtime fields.
Runtime fields are not indexed and do not have doc_values, meaning Lucene is completely unaware of them, but they are consumed through the field capabilities API and the search API like any ordinary field. It is possible to retrieve them as well as query them, aggregate and sort on them.
Runtime fields make searches slower, as computing their values for each document (that may match the query) is costly, depending on how they are calculated; it is highly recommended that the Async Search API is used to run searches that use runtime fields.
One limitation of runtime fields compared to ordinary fields is that they don’t support scoring as they are not indexed and we are not going to compute the document frequency for them, which is required for scoring.
Runtime fields are not part of the _source, hence they are not returned by default as part of the search hits. They can be specifically requested through the field retrieval API (#55363).
A runtime field is defined by its data type and the script that computes its values. As of today, each search section already supports scripting, but the contexts are different, as well as the required syntax. We want to unify this to a single place where a script can be specified. Such a script always has access to `_source`, any other stored fields as well as doc_values.
A runtime field can be defined in the mappings by adding its definition to a new `runtime` section at the same level as `properties`, where fields that exist in _source are defined.
The data types supported for runtime fields are initially `keyword`, `long`, `double`, `date`, `ip`, `boolean`, `geo_point`. For example, a runtime field can extract the day of week (e.g. Monday) from another field called `timestamp`, which is defined as a `date`. The script can refer to other fields, including other runtime fields: we need to implement a mechanism to resolve field dependencies in the correct order, and prevent cyclic dependencies.
The defined field can then be used like any other field in the different sections of the search API.
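As a rough illustration of the mapping and search usage described above, the sketch below builds the two request bodies as Python dictionaries and prints them as JSON. The `day_of_week` definition follows the mapping example quoted elsewhere in this issue. The aggregation name `days`, the match value, and the particular combination of `query`, `aggs`, `sort` and `fields` are assumptions chosen for illustration, and nothing is sent to a cluster.

```python
import json

# Mapping with a `runtime` section at the same level as `properties`.
mappings = {
    "runtime": {
        "day_of_week": {
            "type": "keyword",
            "script": {
                "source": "emit(doc['timestamp'].value.dayOfWeekEnum"
                          ".getDisplayName(TextStyle.FULL, Locale.ROOT))"
            },
        }
    },
    "properties": {"timestamp": {"type": "date"}},
}

# Search body that treats the runtime field like any other field:
# it is queried, aggregated on, sorted on, and explicitly retrieved.
search = {
    "query": {"match": {"day_of_week": "Monday"}},
    "aggs": {"days": {"terms": {"field": "day_of_week"}}},
    "sort": [{"day_of_week": "asc"}],
    "fields": ["day_of_week"],  # runtime fields are not part of _source
}

print(json.dumps(mappings, indent=2))
print(json.dumps(search, indent=2))
```

The `fields` entry is there because runtime fields are not returned as part of `_source` and have to be requested explicitly.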
Each runtime field type will consist of a `MappedFieldType` that exposes a runtime fielddata implementation that generates doc_values on the fly for the needed data type. Additionally, all the basic Lucene queries for each runtime field type need to be written to query the corresponding fielddata/doc_values implementation.
Support for runtime fields in Elasticsearch will be released under the Elastic license.
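The idea of doc values generated on the fly, with queries written against them, can be pictured with a small Python model. This is only a conceptual sketch: `RuntimeKeywordField`, `doc_values` and `term_query` are hypothetical names, not Elasticsearch or Lucene APIs, and plain dictionaries stand in for documents. The data reuses the `animal.upper` example from the `runtime_mappings` commit message.

```python
from typing import Callable, Dict, List

Doc = Dict[str, List[str]]            # stand-in for a document's stored values
Script = Callable[[Doc], List[str]]   # returns the values the script emits


class RuntimeKeywordField:
    """Hypothetical stand-in for a keyword runtime field type."""

    def __init__(self, name: str, script: Script) -> None:
        self.name = name
        self.script = script

    def doc_values(self, doc: Doc) -> List[str]:
        # Nothing is read from an index: the script runs for each document
        # that is asked for values, which is where the extra cost comes from.
        # Sorting and de-duplicating mimics keyword doc values (assumption).
        return sorted(set(self.script(doc)))


def term_query(field: RuntimeKeywordField, docs: List[Doc], term: str) -> List[int]:
    """Return ids of documents whose computed values contain `term`."""
    return [doc_id for doc_id, doc in enumerate(docs) if term in field.doc_values(doc)]


docs = [{"animal": ["cat"]}, {"animal": ["dog"]}, {"animal": ["snake"]}]
upper = RuntimeKeywordField("animal.upper", lambda doc: [s.upper() for s in doc["animal"]])
print(term_query(upper, docs, "DOG"))  # prints [1]
```

Because the query only checks computed values document by document, there is no document frequency to draw on, which matches the scoring limitation noted earlier in the issue.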
The following is a high-level list of tasks required to develop the initial support for runtime fields, which will lay the foundations for the next phases:
Mappers and field types
- `ScriptService` available when parsing field mappers (Add the ScriptService to the field parser config #60933)
- `runtime` field mapper with dynamic mapped field type, one per supported runtime_type (Scripted keyword field #58939, Make ScriptFieldMapper a parameterized mapper #59391, Runtime fields: rework script service injection #59659, Render script params in script field xcontent #59813, Standardize script field's rejection error #60029, Replace script unit tests with integration tests #60027, Runtime script field mapper to reject copy_to and fields #60580)
- `keyword` field type (Add term query for keyword script fields #59372, Two queries for keyword script field #59527, Add tests for keyword script field's fielddata #59523, Remaining queries for script keyword fields #59630, Scripted keyword field type: update family type and test field caps output #59672, Error on bad shape relations in runtime fields #60463)
- `long` (Add long flavored script field #59721, Remaining queries for long script field #59816, ScriptLongFieldData to extend LeafLongFieldData #59884)
- `double` (Add double script fields #59933)
- `date` field type (Add runtime_script date field #60092, Format support for script doc fields #60465, Add a consistent way to parse dates #61105)
- `ip` field type (Implement runtime script ips #60533 + Fix `value` method on ip scripts #61230)
- `boolean` field type (Add boolean values script fields #60830)
API
Scripting
Infrastructure
- `SearchLookup` available to `MappedFieldType#fielddataBuilder` (Pass a SearchLookup supplier through to fielddataBuilder #60224, Pass SearchLookup supplier through to fielddataBuilder #61430)
Security
Telemetry
Docs
- `runtime` section and corresponding field types ([DOCS] Add docs for runtime fields #62653)