Normalized/Shared Mapping #12817

Closed
pickypg opened this issue Aug 11, 2015 · 4 comments
Labels
discuss :Search Foundations/Mapping Index mappings, including merging and defining field types Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch

Comments

@pickypg
Member

pickypg commented Aug 11, 2015

Problem

Currently, anyone who uses time-based indices ends up repeating their mapping (ideally via a template) for each time-based split of the index.

This can be wasteful, and it is hard to manage from ORM tools and tools like Kibana, where a mapping spread across 300 daily indices with 100 fields each becomes 30,000 field definitions. Displaying content generated from these kinds of indices requires somehow looping across them and combining their mappings.

Proposed Solution

I think there are two different solutions to this problem where the first is easier than the second:

  1. Create a Normalized Mapping API that returns a single, combined index mapping that represents the merged fields from every index. If there are any additions, then they are appended. If there are any differences/conflicts, then it should simply fail. This simply avoids the network hop for any service that is already doing this now.
  2. Enable indices to use a new type of shared mapping. It could behave similarly to the snapshot/restore API, where the mapping is pointed to by the index. If any change is ever made to the shared mapping, then it creates a new version of it and only new indices point to it. Naturally the shards will have to store the mapping, but any operation against them could use the single, shared mapping. This should also help to reduce the cluster state size, because a new index that shares an existing mapping would not need to add another copy of it.
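
The merge behaviour described in option 1 can be sketched client-side as follows. This is only an illustration of the "append additions, fail on conflicts" rule, not an Elasticsearch API: the `merge_mappings` function, the `MappingConflict` exception, the index names, and the flat field-to-definition layout are all assumptions made for the example.

```python
from typing import Any, Dict


class MappingConflict(Exception):
    """Hypothetical error: two indices define the same field differently."""


def merge_mappings(per_index: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Combine per-index field definitions into one normalized mapping.

    `per_index` maps an index name to a flat {field name: definition} dict.
    Real mappings are nested; the flat shape is a simplification.
    """
    merged: Dict[str, Any] = {}
    origin: Dict[str, str] = {}  # remembers which index first defined a field
    for index_name, fields in per_index.items():
        for field, definition in fields.items():
            if field not in merged:
                # A field not seen before is simply appended.
                merged[field] = definition
                origin[field] = index_name
            elif merged[field] != definition:
                # Any difference is treated as a conflict and fails the merge.
                raise MappingConflict(
                    f"field {field!r} differs between "
                    f"{origin[field]!r} and {index_name!r}"
                )
    return merged


# Hypothetical daily indices with overlapping fields.
mappings = {
    "logs-2015.08.10": {"message": {"type": "string"}, "bytes": {"type": "long"}},
    "logs-2015.08.11": {"message": {"type": "string"}, "host": {"type": "string"}},
}
normalized = merge_mappings(mappings)
print(sorted(normalized))
```

A server-side version would do the same work without the client first fetching every index's mapping, which is the network hop the proposal aims to avoid.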
@pickypg added the discuss and :Search Foundations/Mapping labels on Aug 11, 2015
@rashidkpc

Just popping in to say this would be crazy useful for us. We're planning to deprecate our storage of a pre-normalized copy of the mapping, as it was simply a workaround for not having structured exceptions in Elasticsearch and it caused a ton of big problems for users who update their field lists.

Having this done in Elasticsearch would be a huge performance and reliability boost!

@rjernst
Member

rjernst commented Jan 18, 2016

Where would the performance or reliability boost come in? The cluster state is already compressed, so similarities in mappings across indexes shouldn't take up extra space there. As for sharing in memory, I think that might be a very large change (we don't really have anything shared across indexes right now, afaik), and I'm not sure the memory savings add up to anything significant compared to the cost of having that many indexes (in which case, if that is a bottleneck, the user should probably create new indexes less often). Instead, something like what @s1monw has suggested before, having the ability to collapse multiple indices into one for archiving, would be much better for performance.

@clintongormley
Contributor

Another thought that came up in FixItFriday: HTTP compression should greatly reduce the amount of data being sent over the wire (given that there is so much repetition in the mappings). Unfortunately, HTTP compression is disabled by default (see #1482). @kimchy can you remember the details?

I tested this out with 10 indices containing the same mapping of twenty fields, and it reduced a GET _mapping from 5589 bytes to 209 bytes... Sounds like this could be worth doing.
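
A rough way to see the effect offline is to gzip a synthetic response of the same shape. This is a sketch under stated assumptions, not a reproduction of the test above: the `build_mapping_response` helper, the index and field names, and the `doc` type name are made up, and the byte counts it prints will differ from the 5589 and 209 figures reported.

```python
import gzip
import json


def build_mapping_response(num_indices: int = 10, num_fields: int = 20) -> dict:
    """Build a synthetic body shaped like a GET _mapping response.

    Every index gets an identical mapping, mimicking time-based indices
    created from one template. All names here are hypothetical.
    """
    properties = {f"field_{i}": {"type": "string"} for i in range(num_fields)}
    return {
        f"index_{n}": {"mappings": {"doc": {"properties": properties}}}
        for n in range(num_indices)
    }


raw = json.dumps(build_mapping_response()).encode("utf-8")
compressed = gzip.compress(raw)

# Repetition across indices is what makes the body compress so well.
print(f"raw: {len(raw)} bytes, gzip: {len(compressed)} bytes")
```

Because the ten mappings are identical, most of the body is redundant, which is the property the comment above relies on.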

@clintongormley
Contributor

Closing in favour of #15728


5 participants