Better sizing BytesRef for Strings in Queries #115655
@@ -0,0 +1,5 @@
+pr: 115655
+summary: Better sizing `BytesRef` for Strings in Queries
+area: Search
+type: enhancement
+issues: []
@@ -14,6 +14,7 @@
 import org.apache.lucene.search.NamedMatches;
 import org.apache.lucene.search.Query;
 import org.apache.lucene.util.BytesRef;
+import org.apache.lucene.util.UnicodeUtil;
 import org.elasticsearch.common.ParsingException;
 import org.elasticsearch.common.Strings;
 import org.elasticsearch.common.io.stream.StreamInput;
@@ -216,12 +217,14 @@ public final int hashCode() {
      * @return the same input object or a {@link BytesRef} representation if input was of type string
      */
     static Object maybeConvertToBytesRef(Object obj) {
-        if (obj instanceof String) {
-            return BytesRefs.checkIndexableLength(BytesRefs.toBytesRef(obj));
-        } else if (obj instanceof CharBuffer) {
-            return BytesRefs.checkIndexableLength(new BytesRef((CharBuffer) obj));
-        } else if (obj instanceof BigInteger) {
-            return BytesRefs.toBytesRef(obj);
+        if (obj instanceof String v) {
+            byte[] b = new byte[UnicodeUtil.calcUTF16toUTF8Length(v, 0, v.length())];
+            UnicodeUtil.UTF16toUTF8(v, 0, v.length(), b);
+            return BytesRefs.checkIndexableLength(new BytesRef(b, 0, b.length));
+        } else if (obj instanceof CharBuffer v) {
+            return BytesRefs.checkIndexableLength(new BytesRef(v));
+        } else if (obj instanceof BigInteger v) {
+            return BytesRefs.toBytesRef(v);
         }
         return obj;
     }

Can we test the change?

I thought we could not because of Java, but actually we can, and I implemented it here.

Could we extract these 3 lines to a separate utility method (maybe on a more appropriate class)? :) This would be very useful for saving non-trivial amounts of heap in other places!
Done, moved to `BytesRefs` and added Javadoc 😄
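
For readers following the thread, here is a minimal, JDK-only sketch of the two-pass idea behind those three lines: first count the exact number of UTF-8 bytes, then encode into an array of exactly that size, instead of reserving the worst case of 3 bytes per UTF-16 char. The class `ExactUtf8` and its methods are hypothetical names made up for illustration; this is not the helper that was added to `BytesRefs`, and it does not use Lucene's `UnicodeUtil`. One assumption to note: lone surrogates are replaced by `?`, which matches `String.getBytes(UTF_8)` but may differ from what Lucene does.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helper (not the PR's code): exact-size UTF-16 to UTF-8 conversion. */
final class ExactUtf8 {

    /** First pass: count the UTF-8 bytes needed, without encoding anything. */
    static int utf8Length(CharSequence s) {
        int len = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                len += 1;
            } else if (c < 0x800) {
                len += 2;
            } else if (Character.isHighSurrogate(c) && i + 1 < s.length()
                    && Character.isLowSurrogate(s.charAt(i + 1))) {
                len += 4; // valid surrogate pair: one supplementary code point
                i++;      // skip the low surrogate
            } else if (Character.isSurrogate(c)) {
                len += 1; // lone surrogate: replaced by '?' (assumption of this sketch)
            } else {
                len += 3;
            }
        }
        return len;
    }

    /** Second pass: encode into an array of exactly utf8Length(s) bytes. */
    static byte[] toExactUtf8(CharSequence s) {
        byte[] out = new byte[utf8Length(s)];
        int p = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                out[p++] = (byte) c;
            } else if (c < 0x800) {
                out[p++] = (byte) (0xC0 | (c >> 6));
                out[p++] = (byte) (0x80 | (c & 0x3F));
            } else if (Character.isHighSurrogate(c) && i + 1 < s.length()
                    && Character.isLowSurrogate(s.charAt(i + 1))) {
                int cp = Character.toCodePoint(c, s.charAt(++i));
                out[p++] = (byte) (0xF0 | (cp >> 18));
                out[p++] = (byte) (0x80 | ((cp >> 12) & 0x3F));
                out[p++] = (byte) (0x80 | ((cp >> 6) & 0x3F));
                out[p++] = (byte) (0x80 | (cp & 0x3F));
            } else if (Character.isSurrogate(c)) {
                out[p++] = (byte) '?'; // same replacement as String.getBytes(UTF_8)
            } else {
                out[p++] = (byte) (0xE0 | (c >> 12));
                out[p++] = (byte) (0x80 | ((c >> 6) & 0x3F));
                out[p++] = (byte) (0x80 | (c & 0x3F));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String term = "héllo €";
        byte[] exact = toExactUtf8(term);
        // Compare the exact size with a worst-case reservation of 3 bytes per char.
        System.out.println("exact=" + exact.length + " worstCase=" + 3 * term.length());
        // Sanity check against the JDK's own encoder.
        System.out.println(Arrays.equals(exact, term.getBytes(StandardCharsets.UTF_8)));
    }
}
```

The trade-off is an extra pass over the string in exchange for a right-sized array, which is where the heap saving mentioned above would come from for mostly-ASCII query terms.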