Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for disabling BKDReader packedIndex off heap #2657

Open
travisbenedict opened this issue Mar 29, 2022 · 11 comments
Open

Support for disabling BKDReader packedIndex off heap #2657

travisbenedict opened this issue Mar 29, 2022 · 11 comments
Labels
enhancement Enhancement or improvement to existing feature or request Indexing & Search

Comments

@travisbenedict
Copy link

Is your feature request related to a problem? Please describe.
In Lucene 8.4 the BKDReader packedIndex was moved off heap (LUCENE-8932). In some cases this can cause the BKDReader to consume more CPU cycles and increase request latency

Describe the solution you'd like
Add a setting to allow the BKDReader packedIndex to be moved on heap.

Describe alternatives you've considered
Move the BKDReader packedIndex on heap by default, but this would likely increase memory consumption for users.

Additional context
Similar issue - #825

Flame graphs are included for Elasticsearch versions 7.9 and 7.1

ES 7.9 Lucene 8.6.2
79FlameGraph

ES 7.1 Lucene 8.0.0
71FlameGraph

@travisbenedict travisbenedict added enhancement Enhancement or improvement to existing feature or request untriaged labels Mar 29, 2022
@superawesome
Copy link

I think I'm hitting this too... 8-10x search latency increase on 7.9 vs 7.1. Do certain types of queries / operations cause this more than others?

Also, from the Lucene bug I see this:

"I have added a second constructor that takes both the IndexInput and a boolean for on/off-heap. The first constructor will select off-heap iff the IndexInput extends ByteBufferIndexInput."

I'm wondering if there's any way to make IndexInput not an instance of ByteBufferIndexInput. I don't know the code, so I don't know what else it possibly could be, or how to get there, or what (if any) ramifications that might have.

@jainankitk
Copy link
Collaborator

It was made default in LUCENE-9398 from 8.6. We need support in lucene that can be exposed as setting in Opensearch for moving it on/off heap as per the use case

@manojfaria
Copy link

manojfaria commented Mar 26, 2023

+1 regarding the feature to enable/disable BKDReader on-heap/off-heap, for select use cases. Few teams upgrading from ES 7.1 to ES 7.9 are reporting increased SearchLatencies (as tracked via this issue).

My limited understanding is that BKDReader data structure is used for searching and indexing high-dimensional data. BKDReader off heap was introduced as an optimization to reduce memory usage and improve overall scalability of Elasticsearch.

May I request additional insights and guidance to below mentioned topics - as part of this github issue (Or guide me if the below mentioned topics should be discussed and addressed separately)?

  • List the query patterns that exercise BKDReader. In my limited experience, BKDReader gets exercised when queries are executed on numeric fields.
  • What are the "access patterns", "use cases" or "scenarios" wherein BKDReader on-heap performs better than BKDReader off-heap (and vice-versa)? It appears that BKDReader on-heap may be better suited when query latency/performance is of utmost importance, and dimensionality of underlying data is not very high. Thoughts?

@nknize
Copy link
Collaborator

nknize commented Mar 28, 2023

More context is needed here. Direct memory (mmaped files) is subject to swapping. Was this tested w/ swapping enabled (which can seriously impact a nodes performance) or disabled? e.g.,:

  1. Was swapping enabled? Try rerunning w/ and w/o swap enabled and report findings.

  2. Try memory locking the process bootstrap.memory_lock true

  3. We're ulimits set to unlimited?

opensearch soft memlock unlimited
opensearch hard memlock unlimited

@jainankitk
Copy link
Collaborator

jainankitk commented Jun 6, 2023

Was swapping enabled?

No, swapping is disabled due to performance degradation concerns

We're ulimits set to unlimited?

Yes they are:

% cat '/etc/security/limits.d/10-memlock.conf'
*         hard    memlock           unlimited
*         soft    memlock           unlimited

@nknize
Copy link
Collaborator

nknize commented Jun 7, 2023

In some cases this can cause the BKDReader to consume more CPU cycles and increase request latency

Can we get more information on the "some cases"? The flamegraph is pretty but we need more context. Can we get a query profile before/after? Field mappings? Segment geometry?

More specifically, this change was made in Lucene which impacts not just OpenSearch but Solr and Elasticsearch. I have yet to see any concern from the latter two. It's possible they haven't hit it yet, but any changes to upstream Lucene will need more context with supporting benchmarks to clearly show this regression results from the BKD on/off heap changes.

With supporting query information it will also be easier to repro this using the lucene benchmark tool to support a change in the upstream. It's unfortunate no benchmarks were provided with the original change.

@mikemccand
Copy link

As long as the index is fully hot (is it here?), moving the BKD index off-heap should not cause anything near the performance regression that the o.p. flame charts seem to indicate. The BKD index is still in RAM in both cases, just accessed via slightly different APIs. Also, traversing the BKD index is not the hot spot over the overall BKDReader.intersect -- visiting the leaf blocks (which are off-heap in both cases) is where the heavy work happens.

Could you attach a delta flame-chart as well? That would show us precisely what's slower and what's faster. And, any way to attach or link to a fully interactive flame chart? We need the equivalent of Pastebin but slightly interactive so I can drill down on the flame chart.

Also note that in Lucene's nightly benchmarks, the IntNRQ query tasks tests BKD performance, and it does not show any regression in the June 20, 2020 time-frame:

https://home.apache.org/~mikemccand/lucenebench/IntNRQ.html

Aside: it's nice to see the GC reduction from old to new Lucene versions in the o.p. flame charts.

@nknize
Copy link
Collaborator

nknize commented Jun 8, 2023

As long as the index is fully hot (is it here?)

Good question. I assumed it was during the run but the description doesn't give any specifics on the configuration when the the flame graph was obtained. Was this obtained from a single node in a cluster? Was this a single node setup? What else was going on during the query (e.g., concurrent indexing) or is this isolated to the query only? Is there a hot threads dump?

The more details to help here the better.

@nknize
Copy link
Collaborator

nknize commented Jun 8, 2023

For the offending queries you might also try warming the page cache on those indexes using the "index.store.preload": ["*"] index setting to make sure the index is hot. Although if this regression is seen in a production cluster I think that could be evicted based on the 30second Shard Idle default. That might be another area to look at, increase index.search.idle.after to try and prevent shard idle (e.g., file cache evictions). I think if you set this to -1 it will disable? (double check that)

@mikemccand
Copy link

Which Directory implementation is in use here @travisbenedict? If it's a buffered implementation (SimpleFSDirectory or NIOFSDirectory) can you try switching to MMapDirectory instead? The buffered reads are sometimes costly, e.g. see apache/lucene#12356.

@mikemccand
Copy link

Also, if possible, please render flame charts to SVG so they remain interactive after attaching to GitHub issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Enhancement or improvement to existing feature or request Indexing & Search
Projects
Status: 🆕 New
Development

No branches or pull requests

7 participants