Move caching of the store to IndexShard. #30817
In spite of the existing caching, I have seen a number of hot threads reports from nodes where one thread was spending all its CPU on computing the size of a directory. I am proposing to move the caching of the store size to `IndexShard` so that it has access to the existing logic that tracks whether a shard is active, which lets us cache the store size more aggressively. The tricky bit is that an inactive shard might still be merged, which may have a significant impact on the store size. This should be especially useful for time-based data, since most indices are typically inactive.
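To make the tricky bit concrete, here is a minimal sketch of a cache keyed on shard activity alone. It is not code from this PR: `ActivityAwareSizeCache`, `shardActive` and `computeSize` are hypothetical names standing in for the shard's active flag and the expensive directory walk.

```java
import java.util.function.BooleanSupplier;
import java.util.function.LongSupplier;

// Hypothetical sketch: none of these names are Elasticsearch APIs.
final class ActivityAwareSizeCache {
    private final BooleanSupplier shardActive; // assumed stand-in for the shard's active flag
    private final LongSupplier computeSize;    // assumed stand-in for the expensive directory walk
    private long cachedSize;
    private boolean hasValue;

    ActivityAwareSizeCache(BooleanSupplier shardActive, LongSupplier computeSize) {
        this.shardActive = shardActive;
        this.computeSize = computeSize;
    }

    synchronized long sizeInBytes() {
        // Recompute on first use and while the shard is active;
        // once the shard is inactive, keep serving the cached value.
        if (!hasValue || shardActive.getAsBoolean()) {
            cachedSize = computeSize.getAsLong();
            hasValue = true;
        }
        return cachedSize;
    }
}
```

With this sketch, a merge that rewrites segments after the shard went inactive is never observed: the cache keeps returning the pre-merge size until the shard becomes active again. That is the case the activity check alone does not cover.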
Pinging @elastic/es-distributed
WIP because of the lack of tests.
I like that this is much simpler, yet I think it's incorrect or has too many corner cases. For example, if a reader gets closed we don't refresh the stats, since we don't see the deletes. It's not good enough. Sorry for pushing you down that route.
private final MeanMetric totalMerges = new MeanMetric();
private final CounterMetric totalMergesNumDocs = new CounterMetric();
private final CounterMetric totalMergesSizeInBytes = new CounterMetric();
private final CounterMetric currentMerges = new CounterMetric();
private final AtomicLong currentMerges = new AtomicLong();
why did this change?
@@ -66,11 +69,14 @@
private final Set<OnGoingMerge> onGoingMerges = ConcurrentCollections.newConcurrentSet();
private final Set<OnGoingMerge> readOnlyOnGoingMerges = Collections.unmodifiableSet(onGoingMerges);
private final MergeSchedulerConfig config;
private volatile long lastMergeMillis;
I don't think we need these changes. I guess it would be enough to check `Engine#getMergeStats()` and then do: `mergeStats.current > 0 || mergeStats.total != previousStats.total`?
}

@Override
protected boolean needsRefresh() {
I am not sure why this is so complex. Wouldn't it be enough to override `needsRefresh()`?
    private MergeStats previousStats = new MergeStats();
    if (super.needsRefresh()) {
        boolean refresh = false;
        if (isActive()) {
            MergeStats mergeStats = getEngine().getMergeStats();
            refresh = mergeStats.current > 0 || mergeStats.total != previousStats.total;
            previousStats = mergeStats;
        }
        return refresh;
    }
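For readers who want to try the idea in isolation, the following is a self-contained sketch of one possible reading of this suggestion. It is not the reviewer's code and not the Elasticsearch classes: `MergeSnapshot`, `TimedCache` and `MergeAwareSizeCache` are hypothetical stand-ins, and the rule used here (once the interval has elapsed, refresh if the shard is active or if merge activity was seen since the last check) is an assumption about the intent.

```java
import java.util.function.BooleanSupplier;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical stand-in for merge statistics; the real MergeStats class differs.
final class MergeSnapshot {
    final long total;   // merges completed so far
    final long current; // merges running right now
    MergeSnapshot(long total, long current) { this.total = total; this.current = current; }
}

// Hypothetical time-based cache exposing a needsRefresh() hook.
abstract class TimedCache {
    private final long intervalMillis;
    private long lastRefreshMillis;
    private long value;
    private boolean hasValue;

    TimedCache(long intervalMillis) { this.intervalMillis = intervalMillis; }

    protected abstract long load();
    protected abstract long nowMillis();

    protected boolean needsRefresh() {
        return nowMillis() - lastRefreshMillis >= intervalMillis;
    }

    final synchronized long get() {
        if (!hasValue || needsRefresh()) {
            value = load();
            lastRefreshMillis = nowMillis();
            hasValue = true;
        }
        return value;
    }
}

final class MergeAwareSizeCache extends TimedCache {
    private final BooleanSupplier shardActive;
    private final Supplier<MergeSnapshot> mergeStats;
    private final LongSupplier sizeLoader;
    private final LongSupplier clock;
    private MergeSnapshot previous = new MergeSnapshot(0, 0);

    MergeAwareSizeCache(long intervalMillis, BooleanSupplier shardActive,
                        Supplier<MergeSnapshot> mergeStats,
                        LongSupplier sizeLoader, LongSupplier clock) {
        super(intervalMillis);
        this.shardActive = shardActive;
        this.mergeStats = mergeStats;
        this.sizeLoader = sizeLoader;
        this.clock = clock;
    }

    @Override protected long load() { return sizeLoader.getAsLong(); }
    @Override protected long nowMillis() { return clock.getAsLong(); }

    @Override
    protected boolean needsRefresh() {
        if (!super.needsRefresh()) {
            return false; // refresh interval has not elapsed yet
        }
        MergeSnapshot stats = mergeStats.get();
        // A merge is running, or at least one merge completed since the last check.
        boolean merged = stats.current > 0 || stats.total != previous.total;
        previous = stats;
        return shardActive.getAsBoolean() || merged;
    }
}
```

In this sketch an inactive shard with no merge activity never triggers the expensive `load()` again, while a completed or running merge forces one recomputation the next time the value is requested after the interval.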
if (active.get() == false) {
    // We refresh when transitioning to an inactive state to make
    // it easier to cache the store size.
    refresh("transition to inactive");
-1. We should not refresh anything except the internal reader. Visibility guarantees are important.
Closed in favor of the original PR.