Wait for prewarm when relocating searchable snapshot shards #65531

original-brownbear · 2020-11-26T08:49:17Z

Add hooks to enable waiting for a condition before completing the clean files step for relocating searchable snapshot shards and use them to wait for pre-warm before responding to the clean files request.

Add hooks to enable waiting for a condition before relocation handoff. Make timeout on relocation unbounded and add ability to disable recovery liveness checker temporarily while running prewarm.
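As a rough illustration of the idea in the description, the sketch below collects asynchronous conditions and only sends the clean files response once all of them have completed. It is a minimal sketch and not the code from this PR: `CleanFilesConditions`, `addCondition`, and `completeCleanFiles` are hypothetical names, and `CompletableFuture` stands in for whatever listener type the real implementation uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-in for the hook described above; not the Elasticsearch API.
final class CleanFilesConditions {
    private final List<CompletableFuture<Void>> conditions = new ArrayList<>();

    // Register a condition (for example "pre-warm finished") that must
    // complete before the clean files response is sent.
    synchronized void addCondition(CompletableFuture<Void> condition) {
        conditions.add(condition);
    }

    // Runs the clean files work immediately, then invokes sendResponse once
    // every registered condition has completed. No thread blocks while waiting.
    synchronized CompletableFuture<Void> completeCleanFiles(Runnable cleanFiles, Runnable sendResponse) {
        cleanFiles.run();
        CompletableFuture<Void> all =
            CompletableFuture.allOf(conditions.toArray(new CompletableFuture[0]));
        return all.thenRun(sendResponse);
    }
}
```

With no registered conditions the response is sent right away, so shards that do not need pre-warming would be unaffected under this sketch.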

elasticmachine · 2020-11-26T08:49:20Z

Pinging @elastic/es-distributed (Team:Distributed)

original-brownbear · 2020-11-26T10:01:25Z

server/src/main/java/org/elasticsearch/indices/recovery/PeerRecoveryTargetService.java

+                // Due to relocation conditions on the shard it could take a while for the hand-off to complete so we disable the recovery
+                // monitor since we don't expect any transport messages from master for the duration of the handoff and activate it again
+                // after the handoff.
+                final Releasable disabledMonitor = recoveryRef.target().disableRecoveryMonitor();


This is a bit of a BwC issue, I guess. If the hand-off request comes from a 7.10 node that doesn't wait indefinitely yet, it could time out on the primary. But maybe we can just ignore that since it's so fringe?

original-brownbear · 2020-11-26T10:08:15Z

server/src/main/java/org/elasticsearch/indices/recovery/RecoveryTarget.java

+        recoveryMonitorEnabled = false;
+        return () -> {
+            setLastAccessTime();
+            recoveryMonitorEnabled = true;


This is a little low-tech relative to the tricky ref-counting in the IndexShard. I figured this was OK here since the hand-off request only comes in once (at least judging by the assertions we have in IndexShard). The other API has a more "feel" to it, and there are no hard guarantees that the index shard state listener is only invoked once (though the "loaded" flag on the directory effectively guarantees we only add one condition for now). It also wasn't that much extra effort, since the API was supposed to be non-blocking anyway.
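For readers without the full diff, here is a minimal sketch of the flag-based approach in the snippet above. It assumes a single caller at a time (no ref-counting), which is the trade-off the comment describes. `MonitoredRecovery`, `isStale`, the injected clock, and the local `Releasable` interface are hypothetical stand-ins for illustration, not the actual `RecoveryTarget` code.

```java
import java.util.function.LongSupplier;

// Local stand-in for a "close to undo" handle; defined here only for this sketch.
interface Releasable extends AutoCloseable {
    @Override
    void close();
}

// Hypothetical simplification of a recovery target with a liveness monitor.
final class MonitoredRecovery {
    private final LongSupplier nanoClock; // injected so the behaviour is testable
    private volatile long lastAccessTime;
    private volatile boolean recoveryMonitorEnabled = true;

    MonitoredRecovery(LongSupplier nanoClock) {
        this.nanoClock = nanoClock;
        this.lastAccessTime = nanoClock.getAsLong();
    }

    void setLastAccessTime() {
        lastAccessTime = nanoClock.getAsLong();
    }

    // Plain flag, not ref-counted: nested or concurrent calls would re-enable
    // the monitor as soon as the first handle is closed.
    Releasable disableRecoveryMonitor() {
        recoveryMonitorEnabled = false;
        return () -> {
            setLastAccessTime(); // restart the idle clock before re-enabling
            recoveryMonitorEnabled = true;
        };
    }

    // What a periodic liveness check might ask: idle for longer than the
    // timeout while the monitor is enabled?
    boolean isStale(long timeoutNanos) {
        return recoveryMonitorEnabled && nanoClock.getAsLong() - lastAccessTime > timeoutNanos;
    }
}
```

Resetting the access time on release matters here: without it, the first check after re-enabling would see the whole hand-off duration as idle time.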

henningandersen

I did an initial read of the production code. I have doubts about the approach taken, since we block all operations on the source during the relocation (in IndexShard.relocate). I think this will prevent other recoveries from starting until the request is done. For searchable snapshots with replicas this is unfortunate. Also, any operation trying to acquire the primary permit will be queued up.

It may not be too important for searchable snapshots, but it seems counterintuitive to wait inside this specific critical operation rather than outside it. Maybe we could hook this into finalizeRecovery instead?

server/src/main/java/org/elasticsearch/index/shard/IndexShard.java

henningandersen · 2020-11-30T10:48:01Z

server/src/main/java/org/elasticsearch/indices/recovery/RecoveriesCollection.java

+                }
+                lastSeenAccessTime = accessTime;
+            } else {
+                lastSeenAccessTime = System.nanoTime();


I think it might be simpler to just fake the progress inside RecoveryTarget by returning System.nanoTime()?
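One way to read this suggestion, sketched under assumptions: the target reports the current time as its last access time while the monitor is disabled, so the checker always sees progress and needs no special case. `ProgressFakingTarget`, `StallDetector`, and the injected clock are hypothetical names for illustration only.

```java
import java.util.function.LongSupplier;

// Hypothetical target that "fakes" progress while its monitor is disabled.
final class ProgressFakingTarget {
    private final LongSupplier nanoClock; // stands in for System::nanoTime
    private volatile long lastAccessTime;
    private volatile boolean monitorEnabled = true;

    ProgressFakingTarget(LongSupplier nanoClock) {
        this.nanoClock = nanoClock;
        this.lastAccessTime = nanoClock.getAsLong();
    }

    void setMonitorEnabled(boolean enabled) {
        monitorEnabled = enabled;
    }

    // While disabled, report "now" so every check observes a newer access time.
    long lastAccessTime() {
        return monitorEnabled ? lastAccessTime : nanoClock.getAsLong();
    }
}

// Hypothetical checker: it only compares access times between two checks.
final class StallDetector {
    private final ProgressFakingTarget target;
    private long lastSeenAccessTime;

    StallDetector(ProgressFakingTarget target) {
        this.target = target;
        this.lastSeenAccessTime = target.lastAccessTime();
    }

    // Returns true when the access time did not change since the previous check.
    boolean checkStalled() {
        long accessTime = target.lastAccessTime();
        boolean stalled = accessTime == lastSeenAccessTime;
        lastSeenAccessTime = accessTime;
        return stalled;
    }
}
```

This keeps the branching out of the checker, at the cost of relying on the clock advancing between two checks.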

original-brownbear · 2020-12-02T11:53:20Z

Thanks @henningandersen I adjusted this PR now to work via the clean files handler like we discussed. Let me know what you think :)

henningandersen

LGTM.

henningandersen · 2020-12-09T07:26:11Z

...ava/org/elasticsearch/xpack/searchablesnapshots/SearchableSnapshotsRelocationIntegTests.java

+        });
+
+        logger.info("--> sleep for 5s to ensure we are actually stuck at the FINALIZE stage and that the primary has not yet relocated");
+        TimeUnit.SECONDS.sleep(5L);


Could we instead find the shard using internalCluster().getInstance(IndicesService.class, node) and assertBusy that it has a pending after cleanup action?

++ thanks, you actually prevented a likely test failure here as well :) I moved the check for the translog stage to a busy assert and then added the check for one clean files condition after it. Otherwise we'd only have had 5s to arrive at TRANSLOG; now at least we have 10, which should be a little safer.
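To illustrate why a busy assert is preferable to a fixed sleep, here is a minimal polling helper. It is a sketch under assumptions: `BusyAssert` is a hypothetical stand-in written for this note, not the test framework's own `assertBusy`, and the back-off values are arbitrary.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical polling helper, written only to illustrate the technique.
final class BusyAssert {
    // Re-runs the assertion until it passes or maxWait elapses, then rethrows
    // the last failure. Returns as soon as the condition holds.
    static void assertBusy(Runnable assertion, long maxWait, TimeUnit unit) {
        final long deadline = System.nanoTime() + unit.toNanos(maxWait);
        long sleepMillis = 1;
        while (true) {
            try {
                assertion.run();
                return;
            } catch (AssertionError e) {
                if (System.nanoTime() - deadline >= 0) {
                    throw e; // out of time: surface the last failure
                }
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                throw new AssertionError("interrupted while waiting", ie);
            }
            sleepMillis = Math.min(sleepMillis * 2, 100); // capped exponential back-off
        }
    }
}
```

Unlike a 5s sleep, this returns as soon as the expected state is reached and fails with the underlying assertion message if it never is.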

...ts/src/test/java/org/elasticsearch/index/store/cache/CachedBlobContainerIndexInputTests.java

original-brownbear · 2020-12-09T10:45:26Z

Thanks Henning!

…65531) Add hooks to enable waiting for a condition before completing the clean files step for relocating searchable snapshot shards and use them to wait for pre-warm before responding to the clean files request.

…66096) Add hooks to enable waiting for a condition before completing the clean files step for relocating searchable snapshot shards and use them to wait for pre-warm before responding to the clean files request.

This commit introduces a change where searchable snapshots skip the RecoveryState TRANSLOG stage. Since #65531 was introduced, the cleanFiles peer recovery phase is blocked until the prewarming completes (this is done to avoid search latency spikes due to a cold cache). In that phase, the RecoveryState stage is TRANSLOG which can be confusing as we don't replay any ops during searchable snapshots recoveries. In order to avoid that confusion we transition directly to FINALIZE stage.

This commit introduces a change where searchable snapshots skip the RecoveryState TRANSLOG stage. Since #65531 was introduced, the cleanFiles peer recovery phase is blocked until the prewarming completes (this is done to avoid search latency spikes due to a cold cache). In that phase, the RecoveryState stage is TRANSLOG which can be confusing as we don't replay any ops during searchable snapshots recoveries. In order to avoid that confusion we transition directly to FINALIZE stage. Backport of #70311
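The stage skip described in these follow-up commit messages can be pictured roughly as below. This is a simplified sketch, not the code from #70311: the local `Stage` enum and `StageAdvance.next` are hypothetical, and the stages before TRANSLOG are assumptions added only to make the example complete.

```java
// Simplified, hypothetical model of recovery stages for illustration only.
enum Stage { INIT, INDEX, VERIFY_INDEX, TRANSLOG, FINALIZE, DONE }

final class StageAdvance {
    // Returns the stage that follows `current`. Searchable snapshot recoveries
    // replay no operations, so they jump over TRANSLOG straight to FINALIZE.
    static Stage next(Stage current, boolean searchableSnapshot) {
        if (current == Stage.DONE) {
            throw new IllegalStateException("recovery is already done");
        }
        Stage next = Stage.values()[current.ordinal() + 1];
        if (next == Stage.TRANSLOG && searchableSnapshot) {
            return Stage.FINALIZE;
        }
        return next;
    }
}
```

Under this model, a recovery blocked on pre-warming would report FINALIZE rather than TRANSLOG, which matches the motivation given in the commit message.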

Wait for Prewarm when Relocating Searchable Snapshot Shards

74eae5c

Add hooks to enable waiting for a condition before relocation handoff. Make timeout on relocation unbounded and add ability to disable recovery liveness checker temporarily while running prewarm.

original-brownbear added the >enhancement, :Distributed Coordination/Snapshot/Restore, v8.0.0, and v7.11.0 labels Nov 26, 2020

elasticmachine added the Team:Distributed (Obsolete) label Nov 26, 2020

original-brownbear commented Nov 26, 2020

original-brownbear added 6 commits November 29, 2020 21:36

Merge remote-tracking branch 'elastic/master' into wait-for-prewarm

e688e74

Merge remote-tracking branch 'elastic/master' into wait-for-prewarm

e57db07

add test

f489058

better test

95dadb2

way better test

d5ee3f3

reformat nicer

fa8dea6

original-brownbear requested review from tlrx and henningandersen November 30, 2020 06:31

henningandersen reviewed Nov 30, 2020

original-brownbear added 12 commits November 30, 2020 13:03

Merge remote-tracking branch 'elastic/master' into wait-for-prewarm

eccb1b4

start

a428a8b

Merge remote-tracking branch 'elastic/master' into wait-for-prewarm-o…

472fa1e

…n-finalize

Merge remote-tracking branch 'elastic/master' into wait-for-prewarm-o…

fc01ad4

…n-finalize

bck

48c55f5

works nicely

40e18cd

Merge remote-tracking branch 'elastic/master' into wait-for-prewarm-o…

9427179

…n-finalize

fixes

ed4e8c9

fix liveness check disabling

59ab566

fix comment

566884e

much simpler

3b41f93

cs

43816f5

original-brownbear requested a review from henningandersen December 2, 2020 11:52

henningandersen approved these changes Dec 9, 2020

original-brownbear added 2 commits December 9, 2020 09:51

Merge remote-tracking branch 'elastic/master' into wait-for-prewarm

ef5032e

adjust tests

b9063b5

original-brownbear merged commit e189a20 into elastic:master Dec 9, 2020

original-brownbear deleted the wait-for-prewarm branch December 9, 2020 10:45

original-brownbear mentioned this pull request Dec 9, 2020

Wait for Prewarm when Relocating Searchable Snapshot Shards (#65531) #66096

Merged

original-brownbear restored the wait-for-prewarm branch January 4, 2021 01:09

pugnascotia changed the title ~~Wait for Prewarm when Relocating Searchable Snapshot Shards~~ Wait for prewarm when relocating searchable snapshot shards Jan 5, 2021

fcofdez mentioned this pull request Jan 19, 2021

Skip translog phase on recovery state for searchable snapshots #67641

Closed

fcofdez mentioned this pull request Jan 19, 2021

Skip TRANSLOG stage for searchable snapshots recovery stage #67697

Closed

fcofdez mentioned this pull request Mar 18, 2021

Skip TRANSLOG stage for searchable snapshots recovery stage #70311

Merged

fcofdez mentioned this pull request Mar 22, 2021

[7.x] Skip TRANSLOG stage for searchable snapshots recovery state #70620

Merged

jakelandis added v8.0.0-alpha1 and removed v8.0.0 labels Jul 26, 2021

DaveCTurner mentioned this pull request Feb 7, 2023

Do we need the RecoveryMonitor? #93544

Open
