Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Elasticsearch accepts and stores floats in keyword mapped field #34222

Closed
bmcconaghy opened this issue Oct 2, 2018 · 7 comments
Closed

Elasticsearch accepts and stores floats in keyword mapped field #34222

bmcconaghy opened this issue Oct 2, 2018 · 7 comments
Labels
>enhancement :Search Foundations/Mapping Index mappings, including merging and defining field types Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch

Comments

@bmcconaghy
Copy link

Elasticsearch version (bin/elasticsearch --version): 7.0.0-alpha1 (also 6.4)

Plugins installed: []

JVM version (java -version):

OS version (uname -a if on a Unix-like system):

Description of the problem including expected versus actual behavior:
Related to: elastic/kibana#23530

Steps to reproduce:

PUT debug-index

PUT debug-index/doc/_mapping
{
  "properties": {
    "id": {
      "type": "keyword"
    }
  }
}

PUT debug-index/doc/1
{
  "id": 10.0
}

GET debug-index/doc/1

Last command returns:

{
  "_index": "debug-index",
  "_type": "doc",
  "_id": "1",
  "_version": 1,
  "found": true,
  "_source": {
    "id": 10.0
  }
}

Note that a keyword field has accepted a float and preserved it as such in returning it. I think ES should either reject the PUT of this document or coerce the float into a string.

Provide logs (if relevant):

@javanna
Copy link
Member

javanna commented Oct 2, 2018

Note that what gets returned is the _source which is what was submitted at index time, unchanged, regardless of how each field gets mapped and indexed. So the fact that the field is returned as a float has little to do with how the field itself was mapped internally. I assume it's mapped as a string (keyword, like stated in the mappings) and we may indeed want to throw an error instead.

@javanna javanna added the :Search Foundations/Mapping Index mappings, including merging and defining field types label Oct 2, 2018
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search-aggs

@jimczi
Copy link
Contributor

jimczi commented Oct 2, 2018

I think ES should either reject the PUT of this document or coerce the float into a string.

As @javanna explained we always coerce numerics into strings if the field is mapped as a text/keyword. We also preserve the original _source so the field still appear as a numeric when you retrieve the _source. However for numeric fields we have a coerce option (defaults to true) which when disabled throws an error if the json type is not a numeric. Are you saying that we should have the same option on text/keyword field to disallow automatic coercion ? This would be consistent with the other field types but I wonder if it would be useful or not for users.

@bmcconaghy
Copy link
Author

Seems to me like the default of coerce=true has the potential of hiding bugs in how people are mapping their fields (or how they are sending the data). Seems like either the mapping or the data is wrong if a user is sending 10.0 as a value for a field intended to be a keyword. Maybe I'm not thinking of some valid use case where someone would legitimately be doing this though.

@colings86
Copy link
Contributor

This is one of the many cases where there is a balance between the getting started experience (being able to get data into Elasticsearch quickly without having to jump through many hoops) and production protection. In the past we have erred on the side of the getting started experience with this and thats why the default is coerce-true for numeric fields.

Noted that there is currently no coerce option for non-numeric field types and this is what Jim is suggesting we might want to add depending on feedback from the community. We can certain talk about what we feel the default should be if we decide to extend the coerce option to other types but we should weigh up the production benefits against the getting started experience and we should land on a default that is the same across all types otherwise it will be confusing for users.

@jtibshirani
Copy link
Contributor

jtibshirani commented May 20, 2019

In #25861, we decided to remove the coerce option on numeric fields and always accept strings. To be consistent with that decision, it seems we should not add coerce to keyword fields, and continue to accept numbers. For this reason, I'm wondering if it makes sense to close this issue -- I'll solicit some opinions from the rest of the team.

@jtibshirani
Copy link
Contributor

We've decided to close this issue and keep the current behavior. If we receive more feedback that the behavior is confusing, we can revisit this choice (keeping in mind consistency across different field types as @colings86 mentioned).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>enhancement :Search Foundations/Mapping Index mappings, including merging and defining field types Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch
Projects
None yet
Development

No branches or pull requests

6 participants