More efficient encoding of range fields. #26470
Merged: jpountz merged 1 commit into elastic:master from jpountz:enhancement/range_docvalue_encoding on Sep 13, 2017
jpountz added the :Search Foundations/Mapping (Index mappings, including merging and defining field types), >enhancement, and v6.0.0 labels on Sep 1, 2017
martijnvg approved these changes on Sep 1, 2017
Great improvement!
@@ -161,4 +170,42 @@ boolean matches(BytesRef from, BytesRef to, BytesRef otherFrom, BytesRef otherTo

    }

    public enum LengthType {
This looks much better than having the length logic in several places!
@elasticmachine test it please
This PR removes the vInt that preceded every value to record its length. Instead, the query takes an enum that tells it how to compute the length of each value: for fixed-length data (IP addresses, doubles, floats) the length is a constant, while longs and integers use a variable-length representation whose length can be computed from the encoded bytes themselves.
The encoding of ints/longs was also made slightly more efficient so that it no longer wastes 3 bits in the header. As a consequence, values between -8 and 7 can now be encoded in 1 byte, and values between -2048 and 2047 in 2 bytes or fewer.
Closes elastic#26443
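To make the idea of a length that is computable from the encoded value concrete, here is a minimal Java sketch. It is not the code from this PR: the class name `RangeLengthSketch`, the constant names `FIXED_4`, `FIXED_8`, `FIXED_16` and `VARIABLE`, the byte widths assigned to each fixed type, and the exact header split (1 sign bit, 4 bits for the number of extra bytes, 3 value bits) are all assumptions, chosen because they reproduce the 1-byte and 2-byte ranges quoted in the description.

```java
// Illustrative sketch only; names and bit layout are assumptions, not the PR's code.
final class RangeLengthSketch {

    /** Hypothetical enum: tells a reader how many bytes the value at `offset` occupies. */
    enum LengthType {
        FIXED_4(4),    // assumed width for float
        FIXED_8(8),    // assumed width for double
        FIXED_16(16),  // assumed width for an IP address
        VARIABLE(-1);  // int/long: length is derived from the header byte

        private final int fixedLength;

        LengthType(int fixedLength) {
            this.fixedLength = fixedLength;
        }

        int readLength(byte[] bytes, int offset) {
            if (this != VARIABLE) {
                return fixedLength;
            }
            int header = Byte.toUnsignedInt(bytes[offset]);
            int extraBytes = (header >>> 3) & 0x0F;
            if ((header & 0x80) == 0) {
                // Negative values store the count inverted so that
                // unsigned byte order matches numeric order.
                extraBytes = 15 - extraBytes;
            }
            return 1 + extraBytes;
        }
    }

    /** Header byte: 1 sign bit, 4 bits of extra-byte count, 3 value bits. */
    static byte[] encodeLong(long value) {
        boolean negative = value < 0;
        // Map negatives onto 0..Long.MAX_VALUE without overflow.
        long magnitude = negative ? -1 - value : value;
        int numBits = 64 - Long.numberOfLeadingZeros(magnitude);
        // The first 3 bits fit in the header; the rest need whole bytes.
        int extraBytes = (numBits + 4) / 8;
        byte[] out = new byte[1 + extraBytes];
        long rest = magnitude;
        for (int i = extraBytes; i >= 1; i--) {
            out[i] = (byte) (negative ? ~rest : rest);
            rest >>>= 8;
        }
        int top3 = (int) (negative ? ~rest : rest) & 0x07;
        int lengthBits = negative ? 15 - extraBytes : extraBytes;
        out[0] = (byte) ((negative ? 0x00 : 0x80) | (lengthBits << 3) | top3);
        return out;
    }

    /** Inverse of encodeLong; reads exactly readLength(bytes, offset) bytes. */
    static long decodeLong(byte[] bytes, int offset) {
        int length = LengthType.VARIABLE.readLength(bytes, offset);
        boolean negative = (bytes[offset] & 0x80) == 0;
        long magnitude = 0;
        for (int i = 0; i < length; i++) {
            int b = Byte.toUnsignedInt(bytes[offset + i]);
            if (negative) {
                b = ~b & 0xFF;
            }
            if (i == 0) {
                b &= 0x07; // drop the sign and length bits
            }
            magnitude = (magnitude << 8) | b;
        }
        return negative ? -1 - magnitude : magnitude;
    }
}
```

Under these assumptions, `encodeLong(7)` and `encodeLong(-8)` each produce 1 byte, and `encodeLong(2047)` and `encodeLong(-2048)` each produce 2 bytes. A range stored as two concatenated values can be split without any length prefix by calling `readLength` at offset 0 and then again at the offset it returns.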
jpountz force-pushed the enhancement/range_docvalue_encoding branch from 0669b4e to 175426a on September 13, 2017 11:10
jpountz added a commit that referenced this pull request on Sep 13, 2017
jasontedor added a commit to synhershko/elasticsearch that referenced this pull request on Sep 14, 2017
* master: (39 commits)
  [Docs] Correct typo in removal_of_types.asciidoc (elastic#26646)
  [Docs] "The the" is a great band, but ... (elastic#26644)
  Move all repository-azure classes under one single package (elastic#26624)
  [Docs] Update link in removal_of_types.asciidoc (elastic#26614)
  Fix percolator highlight sub fetch phase to not highlight query twice (elastic#26622)
  Refactor bootstrap check results and error messages
  Add BootstrapContext to expose settings and recovered state to bootstrap checks (elastic#26628)
  [Tests] Removing skipping tests in search rest tests
  Initialize checkpoint tracker with allocation ID
  Move non-core mappers to a module. (elastic#26549)
  [Docs] Clarify size parameter in Completion Suggester doc (elastic#26617)
  Add soft limit on allowed number of script fields in request (elastic#26598)
  Remove MapperService#dynamic. (elastic#26603)
  Fix incomplete sentences in parent-join docs (elastic#26623)
  More efficient encoding of range fields. (elastic#26470)
  Ensure module is bundled before installing in tests
  Add boolean similarity to built in similarity types (elastic#26613)
  [Tests] Remove skip tests in search/30_limits.yml
  Let search phases override max concurrent requests
  Add a soft limit for the number of requested doc-value fields (elastic#26574)
  ...
Labels: >enhancement, :Search Foundations/Mapping (Index mappings, including merging and defining field types), v6.0.0-rc1