[opt](inverted index) Optimize the compression of inverted index position information #242
zzzxl1993 commented Oct 12, 2024 (edited)
- Optimize position information in the inverted index
run buildall

size_t P4DEC(unsigned char *__restrict in, size_t n, uint32_t *__restrict out);
size_t P4NZDEC(unsigned char *__restrict in, size_t n, uint32_t *__restrict out);
size_t P4ENC(uint32_t *__restrict in, size_t n, unsigned char *__restrict out);
size_t P4NZENC(uint32_t *__restrict in, size_t n, unsigned char *__restrict out);

class PforUtil {
public:
    static constexpr size_t blockSize = 128;
Why change the block size from 512 to 128?
blockSize is only used in position compression
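
To make the effect of the constant concrete, here is a minimal sketch of how a position list might be cut into fixed-size blocks before each block is handed to an encoder. It assumes positions are encoded in chunks of `blockSize` values with a shorter tail block; `kBlockSize`, `BlockSpan`, and `split_into_blocks` are hypothetical names for illustration and are not taken from the PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for PforUtil::blockSize from the diff.
constexpr std::size_t kBlockSize = 128;

// One chunk of a position list: where it starts and how many values it holds.
struct BlockSpan {
    std::size_t offset;
    std::size_t length;
};

// Split `count` values into full blocks of `block_size` plus one shorter tail block.
std::vector<BlockSpan> split_into_blocks(std::size_t count,
                                         std::size_t block_size = kBlockSize) {
    std::vector<BlockSpan> spans;
    for (std::size_t offset = 0; offset < count; offset += block_size) {
        spans.push_back({offset, std::min(block_size, count - offset)});
    }
    return spans;
}
```

Under this assumption, a list of 300 positions yields two full blocks and a 44-value tail with a block size of 128, but a single partial block with a block size of 512.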
{
    if (!readers.empty()) {
        auto release_readers = [this]() {
Why do we release the readers here?
To release memory on the try -> catch -> finally path.
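
C++ has no `finally`, so one common way to get that behavior is to call the same cleanup lambda on both the error path and the success path. The sketch below illustrates the pattern under stated assumptions: `Reader`, `Merger`, `add_reader`, and `merge` are hypothetical names, and only the `readers` member and the `release_readers` lambda mirror the snippet in the diff.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <vector>

// Hypothetical reader type; the real reader classes are not shown in the diff.
// It tracks how many readers are alive through an external counter.
struct Reader {
    explicit Reader(int* open_count) : open_count_(open_count) { ++*open_count_; }
    ~Reader() { --*open_count_; }
    int* open_count_;
};

class Merger {
public:
    void add_reader(int* open_count) {
        readers.push_back(std::make_unique<Reader>(open_count));
    }

    // Runs a (simulated) merge and releases every reader whether or not it throws.
    void merge(bool fail) {
        if (!readers.empty()) {
            auto release_readers = [this]() {
                readers.clear(); // destroys each Reader and frees its memory
            };
            try {
                if (fail) throw std::runtime_error("merge failed");
            } catch (...) {
                release_readers(); // cleanup on the error path
                throw;             // rethrow so the caller still sees the failure
            }
            release_readers(); // cleanup on the success path
        }
    }

    std::vector<std::unique_ptr<Reader>> readers;
};
```

An RAII scope guard would give the same guarantee with a single call site; the explicit form is shown here because it matches the lambda in the diff more closely.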
@@ -86,6 +87,62 @@ class TermDocsBuffer {
    IndexVersion indexVersion_ = IndexVersion::kV0;
};

class TermPostingsBuffer {
Why do we need this buffer? The index input already has an internal buffer.
7a42b2c to dd08cae
run buildall
This needs a CLucene unit test, for example one that sets IndexVersion to v1 and v2.
dd8bcb2 to 1761933
run buildall
LGTM