Use mmapfs as default store type #38157
Conversation
With this commit we switch the default store type from `hybridfs` to `mmapfs`. While `hybridfs` is beneficial for random access workloads (think: updates and queries) when the index size is much larger than the available page cache, it incurs a performance penalty on smaller indices that fit into the page cache (or are not much larger than that).

This performance penalty shows up not only for bulk updates or queries but also for bulk indexing (without *any* conflicts) when an external document id is provided by the client. For example, in the `geonames` benchmark this results in a throughput reduction of roughly 17% compared to `mmapfs`. This reduction is caused by document id lookups, which show up as the top contributor in the profile when `hybridfs` is enabled. Below is an example stack trace captured by async-profiler during a benchmarking trial, where we can see that the overhead is caused by additional `read` system calls for document id lookups:

```
__GI_pread64
sun.nio.ch.FileDispatcherImpl.pread0
sun.nio.ch.FileDispatcherImpl.pread
sun.nio.ch.IOUtil.readIntoNativeBuffer
sun.nio.ch.IOUtil.read
sun.nio.ch.FileChannelImpl.readInternal
sun.nio.ch.FileChannelImpl.read
org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal
org.apache.lucene.store.BufferedIndexInput.refill
org.apache.lucene.store.BufferedIndexInput.readByte
org.apache.lucene.store.DataInput.readVInt
org.apache.lucene.store.BufferedIndexInput.readVInt
org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadBlock
org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekExact
org.elasticsearch.common.lucene.uid.PerThreadIDVersionAndSeqNoLookup.getDocID
org.elasticsearch.common.lucene.uid.PerThreadIDVersionAndSeqNoLookup.lookupVersion
org.elasticsearch.common.lucene.uid.VersionsAndSeqNoResolver.loadDocIdAndVersion
org.elasticsearch.index.engine.InternalEngine.resolveDocVersion
org.elasticsearch.index.engine.InternalEngine.planIndexingAsPrimary
org.elasticsearch.index.engine.InternalEngine.indexingStrategyForOperation
org.elasticsearch.index.engine.InternalEngine.index
org.elasticsearch.index.shard.IndexShard.index
org.elasticsearch.index.shard.IndexShard.applyIndexOperation
org.elasticsearch.index.shard.IndexShard.applyIndexOperationOnPrimary
[...]
```

For these reasons we are restoring `mmapfs` as the default store type.

Relates elastic#36668
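For readers less familiar with the store types involved, the following is a minimal, illustrative Java sketch (not code from this PR) of the two Lucene directory implementations underlying `mmapfs` and the read path shown in the profile. The class names are real Lucene APIs; the index path is hypothetical:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

import org.apache.lucene.store.Directory;
import org.apache.lucene.store.MMapDirectory;
import org.apache.lucene.store.NIOFSDirectory;

public class StoreTypeSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical index location, for illustration only.
        Path indexPath = Paths.get("/tmp/example-index");

        // mmapfs maps index files into virtual memory, so reads are served
        // directly from the page cache without a system call per buffer refill.
        try (Directory mmap = new MMapDirectory(indexPath)) {
            System.out.println("mmapfs-style directory: " + mmap);
        }

        // niofs (used by hybridfs for most files) reads through a FileChannel;
        // each BufferedIndexInput refill issues a pread(2), which is the
        // overhead visible in the stack trace above.
        try (Directory nio = new NIOFSDirectory(indexPath)) {
            System.out.println("niofs-style directory: " + nio);
        }
    }
}
```

In Elasticsearch itself the choice is exposed via the `index.store.type` setting (`mmapfs`, `niofs`, `hybridfs`, ...) rather than by constructing directories directly.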
Pinging @elastic/es-distributed
I am confused why […]
After further investigation it turns out that this is due to the compound format of Lucene ([…]). We could add […].

We expect that this approach would provide good performance for both small and large indices, but we do not have experimental evidence yet to back up this hypothesis. Also, there might be other side effects (apart from the increased number of file handles) that we need to consider first. As we first need to decide on the way forward, I have marked this PR as […]
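To make the hybrid idea discussed above concrete, here is a conceptual sketch using Lucene's `FileSwitchDirectory`, which routes files to one of two delegate directories by extension. This is not Elasticsearch's actual `hybridfs` implementation, and the extension list is purely illustrative:

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FileSwitchDirectory;
import org.apache.lucene.store.MMapDirectory;
import org.apache.lucene.store.NIOFSDirectory;

public class HybridSketch {
    public static void main(String[] args) throws Exception {
        // Illustrative extension list; the real set used by hybridfs is an
        // implementation detail and differs from this.
        Set<String> mmapExtensions = new HashSet<>(Arrays.asList(
                "dvd",  // doc values data
                "tim",  // terms dictionary
                "cfs"   // compound file, per the discussion above
        ));

        Path indexPath = Paths.get("/tmp/example-index"); // hypothetical path

        // Files whose extension is in mmapExtensions are served by the
        // mmap-backed primary directory; everything else falls through to
        // the pread-based NIOFSDirectory.
        try (Directory hybrid = new FileSwitchDirectory(
                mmapExtensions,
                new MMapDirectory(indexPath),
                new NIOFSDirectory(indexPath),
                true /* close both delegates on close() */)) {
            System.out.println("hybrid-style directory: " + hybrid);
        }
    }
}
```

The trade-off described above applies here as well: routing more extensions to the mmap-backed directory avoids the `pread` overhead but comes with the side effects mentioned, such as an increased number of file handles.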
I have now run further experiments. Adding […]
I have opened #38940 instead, where I also present benchmark results.