Normalized/Shared Mapping #12817
Comments
Just popping in to say this would be crazy useful for us. We're planning to deprecate our storage of a pre-normalized copy of the mapping, as it was simply a workaround for not having structured exceptions in Elasticsearch, and it caused a number of problems for users who update their field lists. Having this done in Elasticsearch would be a huge performance and reliability boost!
Where would the performance or reliability boost come in? The cluster state is already compressed, so similarities in mappings across indices shouldn't take up extra space there. As for sharing in memory, I think that would be a very large change (as far as I know, we don't really have anything shared across indices right now), and I'm not sure the memory savings would amount to anything significant compared to the cost of having that many indices. If that is a bottleneck, the user should probably create new indices less often. Instead, something @s1monw has suggested before, the ability to collapse multiple indices into one for archiving, would be much better for performance.
Another thought that came up in FixItFriday: HTTP compression should greatly reduce the amount of data being sent over the wire, given that there is so much repetition in the mappings. Unfortunately, HTTP compression is disabled by default (see #1482). @kimchy, can you remember the details? I tested this out with 10 indices containing the same mapping of twenty fields, and it reduced a GET _mapping response from 5589 bytes to 209 bytes... Sounds like this could be worth doing.
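The effect described above is easy to reproduce outside Elasticsearch. The following is a minimal sketch (not the actual test from the comment; the index and field names are invented) that builds ten identical twenty-field mappings, serializes them to JSON, and gzips the result, showing how well the repeated mapping bodies compress:

```python
import gzip
import json

# Hypothetical setup: 10 indices that all share the same mapping
# of 20 fields. Names are made up purely for illustration.
properties = {"field_%d" % i: {"type": "string"} for i in range(20)}
mappings = {
    "index-%02d" % day: {"mappings": {"doc": {"properties": properties}}}
    for day in range(10)
}

raw = json.dumps(mappings).encode("utf-8")
compressed = gzip.compress(raw)

# Because the same mapping body repeats ten times, gzip collapses the
# repetition; the compressed payload is a small fraction of the raw one.
print("raw: %d bytes, compressed: %d bytes" % (len(raw), len(compressed)))
```

This is the same mechanism HTTP compression would exploit on a real GET _mapping response: the per-index mapping bodies are near-identical, so the deflate dictionary absorbs almost all of the repetition.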
Closing in favour of #15728
Problem
Currently, anyone who uses time-based indices ends up repeating their mapping (ideally via a template) for each new index in the series.
This can be wasteful, and it's hard to manage from ORM tools and tools like Kibana, where a mapping of 100 fields repeated across 300 daily indices becomes 30,000 fields. Displaying content generated from these kinds of indices requires somehow looping across them and combining the results.
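To make the repetition concrete: a legacy index template like the one below (the template name, index pattern, and field names are hypothetical), PUT to `_template/logs_template`, applies the same mapping to every matching time-based index, yet each index created from it still carries its own full copy of the mapping in the cluster state:

```json
{
  "template": "logs-*",
  "mappings": {
    "doc": {
      "properties": {
        "@timestamp": { "type": "date" },
        "message":    { "type": "string" }
      }
    }
  }
}
```

With 300 daily `logs-*` indices, the mapping body above would exist 300 times, which is the duplication this issue proposes to normalize or share.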
Proposed Solution
I think there are two different solutions to this problem, the first easier than the second: