Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Synthetic _source automatically sets store to true on text fields #97039

Closed
martijnvg opened this issue Jun 23, 2023 · 6 comments · Fixed by #106338
Closed

Synthetic _source automatically sets store to true on text fields #97039

martijnvg opened this issue Jun 23, 2023 · 6 comments · Fixed by #106338

Comments

@martijnvg
Copy link
Member

martijnvg commented Jun 23, 2023

There are at least some metric related integrations that have templates that use the text field type.
When these integrations are migrated to tsdb, then often this fails initially complaining with the error that text field either needs a subfield of type keyword or should be stored (by setting store=true, enabling stored fields for this field specifically).
This is a bit confusing sometimes, because the error doesn't come from ES directly and takes time to figure out what needs to be changed.

There are two options here, remove the text field from template when upgrading to tsdb or make suggested changes to the field of type text. What often happens is that store is set to true. However if this is always the intended approach for integrations, should Elasticsearch change the default of store to true for tsdb?

Relates to #96254

@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Jun 23, 2023
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

@agithomas
Copy link

haproxy integration has datastream stats which has field having type text. Since there exist a text type field, on doing TSDB enablement (stack : 8.8.2), error mentioned below appears. This will be a blocker for TSDB enablement of HAProxy.

Install the package Error: can't install the package: can't install the package: could not install package; API status code = 500; response body = {"statusCode":500,"error":"Internal Server Error","message":"illegal_argument_exception\n\tRoot causes:\n\t\tillegal_argument_exception: field [haproxy.stat.source.address] of type [text] doesn't support synthetic source unless it is stored or has a sub-field of type [keyword] with doc values or stored and without a normalizer"}

@martijnvg
Copy link
Member Author

We discussed this and we should default the store option to true in case synthetic source is used and there are no sub fields of type keyword. This way synthetic source can synthesise the field from the stored field. If there is a keyword sub field then synthetic source can synthesise from the sub field.

This better than failing with an error. If store is specifically configured to false on text fields then creating a template or index with synthetic source will return the ... doesn't support synthetic source unless it is stored or has a sub-field of type [keyword] with doc values or stored and without a normalizer error.

@agithomas
Copy link

agithomas commented Aug 10, 2023

We discussed this and we should default the store option to true

I assume, based on the discussion here, what is stopping from having this option used is not having the support of package-spec. Correct?

Please let me know from which version of Elasticsearch, the support of store : (true | false) is present? 8.8.0 and above ?

@martijnvg
Copy link
Member Author

Please let me know from which version of Elasticsearch, the support of store : (true | false) is present? 8.8.0 and above ?

Setting store to true has been option since tsdb was GA-ed iirc. So I think from 8.7.0 and later.

I assume, based on the discussion #96254 (comment),

Note that the `error: building package failed: invalid content found in built zip package: found 1 validation error:

  1. file "fields.yml" is invalid: field 0.fields.3.fields.7: Additional property store is not allowed` error isn't an elasticsearch error.

@wchaparro wchaparro changed the title Should text fields be stored by default if synthetic source is enabled? Synthetic _source automatically sets store to true on text fields Oct 27, 2023
@elasticsearchmachine elasticsearchmachine added Team:StorageEngine and removed Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) labels Mar 11, 2024
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

lkts added a commit to lkts/elasticsearch that referenced this issue Mar 14, 2024
Synthetic source requires text fields to be stored or have keyword
sub-field that supports synthetic source. If there are no keyword fields
 users currently have to explicitly set 'store' to 'true' or get a
validation exception. This is not the best experience. It is quite
likely that setting `store` to `true` is  the correct thing to do but
users still get an error and need to investigate it. With this change if
 `store` setting is not specified in such context it  will be set to
 `true` by default. Setting it explicitly to `false` results in the
 exception.

Closes elastic#97039
lkts added a commit that referenced this issue Mar 26, 2024
* Text fields are stored by default with synthetic source

Synthetic source requires text fields to be stored or have keyword
sub-field that supports synthetic source. If there are no keyword fields
 users currently have to explicitly set 'store' to 'true' or get a
validation exception. This is not the best experience. It is quite
likely that setting `store` to `true` is  the correct thing to do but
users still get an error and need to investigate it. With this change if
 `store` setting is not specified in such context it  will be set to
 `true` by default. Setting it explicitly to `false` results in the
 exception.

Closes #97039
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants