Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dynamic templates should be deduplicated #28988

Closed
kostasb opened this issue Mar 12, 2018 · 9 comments
Closed

Dynamic templates should be deduplicated #28988

kostasb opened this issue Mar 12, 2018 · 9 comments
Labels
>bug :Search Foundations/Mapping Index mappings, including merging and defining field types Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch

Comments

@kostasb
Copy link

kostasb commented Mar 12, 2018

Bug Report / Feature request:

Check for identical entries in the list of dynamic templates and deduplicate or reject them to avoid bloating.

Describe the issue:

It is currently possible to have the same dynamic template definition appended multiple times in the dynamic templates list of a mapping or a template, which can result to significant performance degradation.

See example template of an actual problematic 300.000 lines+ configuration installed in a cluster due to a repetition when putting the dynamic templates. This eventually resulted in a slowdown, tasks piling up, causing rejected bulks etc.

Elasticsearch version: issue exists up to current - 6.2.2

Steps to reproduce:

PUT the same dynamic template definition multiple times when defining a mapping or a template:

PUT my_index
{
  "mappings": {
    "_doc": {
      "dynamic_templates": [
        {
          "docker.container.labels": {
            "mapping": {
              "type": "keyword"
            },
            "match_mapping_type": "string",
            "path_match": "docker.container.labels.*"
          }
        },
        {
          "docker.container.labels": {
            "mapping": {
              "type": "keyword"
            },
            "match_mapping_type": "string",
            "path_match": "docker.container.labels.*"
          }
        }
      ]
    }
  }
}
@kostasb kostasb added the >bug label Mar 12, 2018
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search-aggs

@colings86 colings86 added the :Search Foundations/Mapping Index mappings, including merging and defining field types label Apr 24, 2018
@jtibshirani jtibshirani added the help wanted adoptme label May 1, 2019
@jtibshirani
Copy link
Contributor

Perhaps we should actually throw an error in the case of duplicate templates, as this would likely not be intentional and could indicate a mistake on the part of the user.

cbuescher pushed a commit to cbuescher/elasticsearch that referenced this issue Feb 3, 2020
Currently the same `dynamic_templates` name can be used more than once in the
same mappings file. This most certainly is unintentional and can create issues like
those reported in elastic#28988. Furthermore, when templates are matched later in
`RootObjectMapper#findTemplate`, it looks like we simply pick the first matching
one we see in the input array, so using the same template name twice sounds like
an error prone idea anyway. This change checks for duplicate names on parsing
and throws an error if we encounter the same name twice.

Closes elastic#28988
@cbuescher
Copy link
Member

As pointed out in the PR above, simply disallowing duplicate names - although it might still be desireable - is not as straight forward since users like APM (see #29200) might rely on this, although it also creates issues on their side. Just mentioning here to cross-link for the next time this issue is visited.

@cbuescher
Copy link
Member

Simply erroring on duplicate template names seems to be to fragile, but looking at DynamicTemplate it should be possible to detect similar templates, e.g. by implementing some equals method on the properties and disallow similar definitions. The problem described here is caused by accidentally repeating the same definition many times. Guarding against that should be possible without disallowing duplicate names altogether so usages like the one in #29200 are not affected.

cbuescher pushed a commit to cbuescher/elasticsearch that referenced this issue Feb 7, 2020
Currently the same dynamic_templates can be defined more than once by error in
the same mappings file. This most certainly is unintentional and can create
issues like those reported in elastic#28988. This change checks for duplicates while
parsing and throws an error if we encounter the same definition twice.

Closes elastic#28988
@cbuescher
Copy link
Member

We discussed this with the team and the problems create when there are multiple duplicate keys in the dynamic_template mapping
lead us to think that this should be disallowed altogether.
Apart from the issue presented here there are problems when merging templates that contain duplicate keys (the duplicates don't have to be present in all the merged templates, one seems to be enough). In this case only one is picked, which creates problems as described in elastic/apm-server#2693 (comment).
We'd like to dissallow this kind of template definition altogether but need to do this in a backwards-compatible way that allows for existing templates like the ones mentioned in #29200 (comment) to emit warnings and potentially be updated by appending unique prefixes to the duplicated names. I will open a new issue around this to discuss the details.

@cbuescher
Copy link
Member

I opened #53326 with some more detail about out discussion around deprecating and removing the possibility of having the same template names in dynamic_template.

@cbuescher
Copy link
Member

Since the plan for the full removal of duplicate dynamic template names spans multiple versions, maybe we could add #52062 still as a short-term solution for at least the problem described here (accidentally programmatically adding the same section over and over again?). @jtibshirani wdyt?

@jtibshirani jtibshirani removed the help wanted adoptme label Mar 10, 2020
@jtibshirani
Copy link
Contributor

@cbuescher my intuition is that we should just focus on #53326 (and close #28988 and #52062 in favor of that issue). I could see some users accidentally having duplicate entries in their templates, especially with long templates or those that are created programmatically. So if we added #52062, I think we would also want a deprecation period before we disallow duplicate templates entirely. It seems simplest for both us and users if we only performed the combined change described in #53326.

@jtibshirani
Copy link
Contributor

Closing in favor of #53326.

@javanna javanna added Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch and removed Team:Search Meta label for search team labels Jul 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Search Foundations/Mapping Index mappings, including merging and defining field types Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch
Projects
None yet
7 participants