diff --git a/docs/changelog/109501.yaml b/docs/changelog/109501.yaml new file mode 100644 index 0000000000000..6e81f98816cbf --- /dev/null +++ b/docs/changelog/109501.yaml @@ -0,0 +1,14 @@ +pr: 109501 +summary: Reflect latest changes in synthetic source documentation +area: Mapping +type: enhancement +issues: [] +highlight: + title: Synthetic `_source` improvements + body: |- + There are multiple improvements to synthetic `_source` functionality: + + * Synthetic `_source` is now supported for all field types including `nested` and `object`. `object` fields are supported with `enabled` set to `false`. + + * Synthetic `_source` can be enabled together with `ignore_malformed` and `ignore_above` parameters for all field types that support them. + notable: false diff --git a/docs/reference/data-streams/tsds.asciidoc b/docs/reference/data-streams/tsds.asciidoc index 460048d8ccbc9..de89fa1ca3f31 100644 --- a/docs/reference/data-streams/tsds.asciidoc +++ b/docs/reference/data-streams/tsds.asciidoc @@ -53,8 +53,9 @@ shard segments by `_tsid` and `@timestamp`. documents, the document `_id` is a hash of the document's dimensions and `@timestamp`. A TSDS doesn't support custom document `_id` values. + * A TSDS uses <>, and as a result is -subject to a number of <>. +subject to some <> and <> applied to the `_source` field. NOTE: A time series index can contain fields other than dimensions or metrics. diff --git a/docs/reference/mapping/fields/source-field.asciidoc b/docs/reference/mapping/fields/source-field.asciidoc index ec824e421e015..903b301ab1a96 100644 --- a/docs/reference/mapping/fields/source-field.asciidoc +++ b/docs/reference/mapping/fields/source-field.asciidoc @@ -6,11 +6,11 @@ at index time. The `_source` field itself is not indexed (and thus is not searchable), but it is stored so that it can be returned when executing _fetch_ requests, like <> or <>. -If disk usage is important to you then have a look at -<> which shrinks disk usage at the cost of -only supporting a subset of mappings and slower fetches or (not recommended) -<> which also shrinks disk -usage but disables many features. +If disk usage is important to you, then consider the following options: + +- Using <>, which reconstructs source content at the time of retrieval instead of storing it on disk. This shrinks disk usage, at the cost of slower access to `_source` in <> and <> queries. +- <>. This shrinks disk +usage but disables features that rely on `_source`. include::synthetic-source.asciidoc[] @@ -43,7 +43,7 @@ available then a number of features are not supported: * The <>, <>, and <> APIs. -* In the {kib} link:{kibana-ref}/discover.html[Discover] application, field data will not be displayed. +* In the {kib} link:{kibana-ref}/discover.html[Discover] application, field data will not be displayed. * On the fly <>. diff --git a/docs/reference/mapping/fields/synthetic-source.asciidoc b/docs/reference/mapping/fields/synthetic-source.asciidoc index a0e7aed177a9c..ccea38cf602da 100644 --- a/docs/reference/mapping/fields/synthetic-source.asciidoc +++ b/docs/reference/mapping/fields/synthetic-source.asciidoc @@ -28,45 +28,22 @@ PUT idx While this on the fly reconstruction is *generally* slower than saving the source documents verbatim and loading them at query time, it saves a lot of storage -space. +space. Additional latency can be avoided by not loading `_source` field in queries when it is not needed. + +[[synthetic-source-fields]] +===== Supported fields +Synthetic `_source` is supported by all field types. Depending on implementation details, field types have different properties when used with synthetic `_source`. + +<> construct synthetic `_source` using existing data, most commonly <> and <>. For these field types, no additional space is needed to store the contents of `_source` field. Due to the storage layout of <>, the generated `_source` field undergoes <> compared to original document. + +For all other field types, the original value of the field is stored as is, in the same way as the `_source` field in non-synthetic mode. In this case there are no modifications and field data in `_source` is the same as in the original document. Similarly, malformed values of fields that use <> or <> need to be stored as is. This approach is less storage efficient since data needed for `_source` reconstruction is stored in addition to other data required to index the field (like `doc_values`). [[synthetic-source-restrictions]] ===== Synthetic `_source` restrictions -There are a couple of restrictions to be aware of: +Synthetic `_source` cannot be used together with field mappings that use <>. -* When you retrieve synthetic `_source` content it undergoes minor -<> compared to the original JSON. -* Synthetic `_source` can be used with indices that contain only these field -types: - -** <> -** {plugins}/mapper-annotated-text-usage.html#annotated-text-synthetic-source[`annotated-text`] -** <> -** <> -** <> -** <> -** <> -** <> -** <> -** <> -** <> -** <> -** <> -** <> -** <> -** <> -** <> -** <> -** <> -** <> -** <> -** <> -** <> -** <> -** <> -** <> -** <> +Some field types have additional restrictions. These restrictions are documented in the **synthetic `_source`** section of the field type's <>. [[synthetic-source-modifications]] ===== Synthetic `_source` modifications @@ -178,4 +155,40 @@ that ordering. [[synthetic-source-modifications-ranges]] ====== Representation of ranges -Range field vales (e.g. `long_range`) are always represented as inclusive on both sides with bounds adjusted accordingly. See <>. +Range field values (e.g. `long_range`) are always represented as inclusive on both sides with bounds adjusted accordingly. See <>. + +[[synthetic-source-precision-loss-for-point-types]] +====== Reduced precision of `geo_point` values +Values of `geo_point` fields are represented in synthetic `_source` with reduced precision. See <>. + + +[[synthetic-source-fields-native-list]] +===== Field types that support synthetic source with no storage overhead +The following field types support synthetic source using data from <> or <>, and require no additional storage space to construct the `_source` field. + +NOTE: If you enable the <> or <> settings, then additional storage is required to store ignored field values for these types. + +** <> +** {plugins}/mapper-annotated-text-usage.html#annotated-text-synthetic-source[`annotated-text`] +** <> +** <> +** <> +** <> +** <> +** <> +** <> +** <> +** <> +** <> +** <> +** <> +** <> +** <> +** <> +** <> +** <> +** <> +** <> +** <> +** <> +** <>