Normalized/Shared Mapping #12817
Comments
Just popping in to say this would be crazy useful for us. We're planning to deprecate our storage of a pre-normalized copy of the mapping, as it was simply a workaround for not having structured exceptions in Elasticsearch, and it caused a number of problems for users who update their field lists. Having this done in Elasticsearch would be a huge performance and reliability boost!
Where would the performance or reliability boost come in? The cluster state is already compressed, so similarities in mappings across indices shouldn't take up extra space there. As for sharing in memory, I think that would be a very large change (as far as I know, we don't really have anything shared across indices right now), and I'm not sure the memory savings would amount to anything significant compared to the cost of having that many indices. If that is a bottleneck, the user should probably create new indices less often. Instead, something @s1monw has suggested before, the ability to collapse multiple indices into one for archiving, would be much better for performance.
Another thought that came up in FixItFriday: HTTP compression should greatly reduce the amount of data being sent over the wire, given that there is so much repetition in the mappings. Unfortunately, HTTP compression is disabled by default (see #1482). @kimchy, can you remember the details? I tested this out with 10 indices containing the same mapping of twenty fields, and it reduced a GET _mapping response from 5589 bytes to 209 bytes... Sounds like this could be worth doing.
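The effect described above is easy to reproduce outside Elasticsearch. The following is a minimal sketch (not the actual test from the comment; the index and field names are invented) that builds ten identical twenty-field mappings, serializes them to JSON, and gzips the result, showing how well the repeated mapping bodies compress:

```python
import gzip
import json

# Hypothetical setup: 10 indices that all share the same mapping
# of 20 fields. Names are made up purely for illustration.
properties = {"field_%d" % i: {"type": "string"} for i in range(20)}
mappings = {
    "index-%02d" % day: {"mappings": {"doc": {"properties": properties}}}
    for day in range(10)
}

raw = json.dumps(mappings).encode("utf-8")
compressed = gzip.compress(raw)

# Because the same mapping body repeats ten times, gzip collapses the
# repetition; the compressed payload is a small fraction of the raw one.
print("raw: %d bytes, compressed: %d bytes" % (len(raw), len(compressed)))
```

This is the same mechanism HTTP compression would exploit on a real GET _mapping response: the per-index mapping bodies are near-identical, so the deflate dictionary absorbs almost all of the repetition.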
Closing in favour of #15728
Problem
Currently, anyone who uses time-based indices ends up repeating their mapping (ideally via a template) for each new index in the series.
This can be wasteful, and it's hard to manage from ORM tools and tools like Kibana, where a mapping of 100 fields repeated across 300 daily indices becomes 30,000 fields. Displaying content generated from these kinds of indices requires somehow looping across them and combining the results.
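To make the repetition concrete: a legacy index template like the one below (the template name, index pattern, and field names are hypothetical), PUT to `_template/logs_template`, applies the same mapping to every matching time-based index, yet each index created from it still carries its own full copy of the mapping in the cluster state:

```json
{
  "template": "logs-*",
  "mappings": {
    "doc": {
      "properties": {
        "@timestamp": { "type": "date" },
        "message":    { "type": "string" }
      }
    }
  }
}
```

With 300 daily `logs-*` indices, the mapping body above would exist 300 times, which is the duplication this issue proposes to normalize or share.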
Proposed Solution
I think there are two different solutions to this problem, the first easier than the second: