
Skip TRANSLOG stage for searchable snapshots recoveries #67697

Closed

Conversation

@fcofdez (Contributor) commented Jan 19, 2021

This commit changes searchable snapshot recoveries to skip the
RecoveryState TRANSLOG stage. Since #65531 was introduced, the
cleanFiles peer recovery step blocks until prewarming completes
(this is done to avoid search latency spikes due to a cold cache).
During that phase the RecoveryState stage is TRANSLOG, which can be
confusing since we don't replay any operations during searchable
snapshot recoveries. To avoid that confusion, we transition directly
to the FINALIZE stage.
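
For readers unfamiliar with the recovery machinery, here is a minimal, self-contained sketch of the idea. The enum values mirror RecoveryState.Stage, but the class and method names are simplified illustrations, not the actual Elasticsearch peer-recovery API:

```java
// Sketch only: models the stage progression and the proposed shortcut for
// searchable snapshot recoveries, which never replay translog operations.
enum Stage { INIT, INDEX, VERIFY_INDEX, TRANSLOG, FINALIZE, DONE }

class RecoveryStateSketch {
    private Stage stage = Stage.INIT;
    private final boolean searchableSnapshot;

    RecoveryStateSketch(boolean searchableSnapshot) {
        this.searchableSnapshot = searchableSnapshot;
    }

    // Called once file restore has finished and cleanFiles is about to block
    // on prewarming. Regular recoveries enter TRANSLOG to replay operations;
    // searchable snapshot recoveries have nothing to replay, so reporting
    // TRANSLOG during the (potentially long) prewarming wait is misleading.
    void onFilesRestored() {
        stage = searchableSnapshot ? Stage.FINALIZE : Stage.TRANSLOG;
    }

    Stage getStage() {
        return stage;
    }
}
```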

@fcofdez added labels :Distributed Coordination/Snapshot/Restore, Team:Distributed, v7.12.0, v8.0.0 on Jan 19, 2021
@fcofdez marked this pull request as ready for review January 19, 2021 15:05
@elasticmachine (Collaborator) commented:

Pinging @elastic/es-distributed (Team:Distributed)

// filter for relocations that are not in stage FINALIZE (they could end up in this stage without progress for good if the
// target node does not have enough cache space available to hold the primary completely)
.filter(recoveryState -> recoveryState.getSourceNode() != null && recoveryState.getStage() != RecoveryState.Stage.FINALIZE)
.filter(recoveryState -> recoveryState.getSourceNode() != null)
@fcofdez (Contributor, Author) commented on the diff above:

Maybe we could check at the end that the pre-warming phase ended?

A Member replied:

Yea, we'd have to do something in that direction, otherwise https://github.com/elastic/elasticsearch/pull/67697/files#diff-be91049c27fa011e281944d790285ad97bbd72466aeeed55525c5b99061c9e13L113 will trip for the small cache sizes that are randomly chosen in the parent test suite, because we will get stuck in FINALIZE forever, right?
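
A sketch of what such a check could look like on the test side; this is only an assumption about the direction being discussed, and isPrewarmingComplete is a hypothetical hook, not a real API:

```java
// Hypothetical sketch: instead of filtering FINALIZE recoveries out, the test
// could wait until prewarming has finished before asserting on the stage.
class PrewarmCheckSketch {
    static void awaitPrewarmed(java.util.function.BooleanSupplier isPrewarmingComplete)
            throws InterruptedException {
        long deadline = System.nanoTime() + java.util.concurrent.TimeUnit.SECONDS.toNanos(30);
        while (isPrewarmingComplete.getAsBoolean() == false) {
            if (System.nanoTime() > deadline) {
                throw new AssertionError("prewarming did not complete in time");
            }
            Thread.sleep(100); // poll until the cache is fully warmed
        }
    }
}
```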

@original-brownbear (Member) left a comment:

I think this is a neat change in general. Outside of the assertion issues you fixed, I couldn't find any side effects of hacking the stage like this, and the behavior makes logical sense to me. I'd like someone else to have a look here as well, though.

Maybe @henningandersen has a sec to double check that we can go this route in general?

@@ -101,6 +101,11 @@ public static Stage fromId(byte id) {
        }
    }

+    public enum IndexType {
+        REGULAR,
+        SEARCHABLE_SNAPSHOT;
+    }
A Member commented on this hunk:
This is a little nasty, I'm not 100% sure what our policy is here but I think we should avoid leaking searchable snapshots this explicitly in here. Maybe we could call this "NO_TRANSLOG" or so?
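
A sketch of the suggested rename, naming the value after the behavior (no translog replay) rather than the feature; this reflects the reviewer's hypothetical suggestion, not merged code:

```java
public enum IndexType {
    REGULAR,     // normal indices: peer recovery replays translog operations
    NO_TRANSLOG; // e.g. searchable snapshots: nothing to replay, skip TRANSLOG
}
```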

@original-brownbear (Member) commented:

BTW: I also wonder if we really need to be this invasive here. Maybe we could simply add a filter for the stage to render in APIs (e.g. getDisplayStage) to RecoveryState, which by default would just delegate to getStage but would be overridden to return FINALIZE instead of TRANSLOG for relocation + translog stage + "not prewarmed". That seems a lot less invasive and would fix the APIs without functional impact on the rest of the functionality (+ it would maximally limit leaking searchable snapshots logic into core).

WDYT?
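
A self-contained sketch of this display-only approach, reusing the Stage enum from the sketch in the description above; getDisplayStage and the prewarmed flag are names taken from the comment, not the real RecoveryState API:

```java
// Sketch: leave the internal stage machinery untouched and only adjust the
// stage reported through the recovery APIs.
class DisplayStageSketch {
    private Stage stage = Stage.TRANSLOG; // actual internal stage
    private volatile boolean prewarmed;   // set once cache prewarming completes

    Stage getStage() {
        return stage; // internal consumers keep seeing the real stage
    }

    Stage getDisplayStage() {
        // While the recovery waits on prewarming in TRANSLOG, report FINALIZE
        // instead, since no translog operations will be replayed anyway.
        if (stage == Stage.TRANSLOG && prewarmed == false) {
            return Stage.FINALIZE;
        }
        return stage;
    }
}
```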

@henningandersen (Contributor) commented:

I wonder if we could instead move setting the stage to TRANSLOG into the after-clean-files runnable?

@original-brownbear (Member) commented:

> I wonder if we could instead move setting the stage to TRANSLOG into the after-clean-files runnable?

++ that seems pretty side effect free and quick.
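
A minimal sketch of that ordering, with illustrative names only (the real cleanFiles step and its listener plumbing are more involved):

```java
// Sketch: enter TRANSLOG only after the clean-files step, which blocks on
// prewarming, has completed, instead of transitioning before the wait.
class CleanFilesSketch {
    void cleanFiles(Runnable waitForPrewarming, Runnable enterTranslogStage) {
        waitForPrewarming.run();  // recovery stays in its previous stage here
        enterTranslogStage.run(); // only now: e.g. recoveryState.setStage(TRANSLOG)
    }
}
```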

@fcofdez (Contributor, Author) commented Jan 28, 2021:

> I wonder if we could instead move setting the stage to TRANSLOG into the after-clean-files runnable?

But in that case the recovery stage would remain in VERIFY_INDEX, right?

> Maybe we could simply add a filter for the stage to render in APIs

I like this approach, I'll go ahead with that.

@henningandersen (Contributor) commented:

I think we could stay in INDEX with reasonable effort. I wonder if that would be more accurate, since the shard is not serving requests. But the display filter approach could also work out.

@mark-vieira (Contributor) commented:

@elasticmachine update branch

@fcofdez (Contributor, Author) commented Feb 9, 2021:

Sorry, this fell through the cracks. I've opened a new PR with the suggested approach (#68680)
