-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce IOContext.LOAD #11930
Introduce IOContext.LOAD #11930
Conversation
The default codec has a number of small and hot files, that actually used to be fully loaded in memory before we moved them off-heap. In the general case, these files are expected to fully fit into the page cache for things to work well. Should we give control over preloading to codecs? This is what this commit does for the following files: - Terms index (`tip`) - Points index (`kdi`) - Stored fields index (`fdx`) - Terms vector index (`tvx`) This only has an effect on `MMapDirectory`.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine.
Maybe we should still add the the BiPredicate<String,IOContext>
constant to MMapDirectory in this PR (but not eanble it by default). This would require that #11929 is merged first.
* choose to load such files into physical memory (e.g. Java heap) as a way to provide stronger | ||
* guarantees on query latency. | ||
*/ | ||
public final boolean load; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This description looks good.
@@ -82,7 +83,7 @@ final class Lucene90DocValuesProducer extends DocValuesProducer { | |||
merging = false; | |||
|
|||
// read in the entries from the metadata file. | |||
try (ChecksumIndexInput in = state.directory.openChecksumInput(metaName, state.context)) { | |||
try (ChecksumIndexInput in = state.directory.openChecksumInput(metaName, IOContext.READONCE)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering about this earlier, but it is really nice you fixed this,too!
As we started over here: My suggestion would be to let this go in without changing the default in MMapDirectory. If we want to change the default we can make a followup PR on MMapDirectory. But at moment due to disagreement if preload is enough, we should just add the option to enable this in MMapDir. If we really want to stick the small files and load them to heap, maybe a NRTCachingDirectory like approach would be best. |
Agreed Uwe. |
The default codec has a number of small and hot files, that actually used to be fully loaded in memory before we moved them off-heap. In the general case, these files are expected to fully fit into the page cache for things to work well. Should we give control over preloading to codecs? This is what this commit does for the following files: - Terms index (`tip`) - Points index (`kdi`) - Stored fields index (`fdx`) - Terms vector index (`tvx`) This only has an effect on `MMapDirectory`.
I think a CHANGES.txt entry would be good. |
This PR introduces a new
LOAD
IOContext
for files that are a small fraction of the total index size and are expected to be accessed in a random-access fashion. This may be leveraged by someDirectory
implementations to load such files into physical memory (e.g. Java heap) in order to provide stronger guarantees on query latency.