Advance max_seq_no before add operation to Lucene #38879
Conversation
Pinging @elastic/es-distributed
Great find @dnhatn.
@@ -868,6 +868,7 @@ public IndexResult index(Index index) throws IOException {
     indexResult = plan.earlyResultOnPreFlightError.get();
     assert indexResult.getResultType() == Result.Type.FAILURE : indexResult.getResultType();
 } else if (plan.indexIntoLucene || plan.addStaleOpToLucene) {
+    localCheckpointTracker.advanceMaxSeqNo(plan.seqNoForIndexing);
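For context, advanceMaxSeqNo raises the tracker's max sequence number without marking the operation as completed. A minimal sketch of that behavior (a hypothetical simplified class, not the actual LocalCheckpointTracker):

```java
import java.util.concurrent.atomic.AtomicLong;

// Minimal sketch: a tracker that records the highest sequence number ever
// seen, independently of which operations have completed. Hypothetical
// simplification of the idea behind LocalCheckpointTracker.advanceMaxSeqNo.
public class SeqNoTrackerSketch {
    public static final long NO_OPS_PERFORMED = -1L;
    private final AtomicLong maxSeqNo = new AtomicLong(NO_OPS_PERFORMED);

    /** Raises maxSeqNo to at least seqNo; never lowers it. Thread-safe. */
    public void advanceMaxSeqNo(long seqNo) {
        maxSeqNo.accumulateAndGet(seqNo, Math::max);
    }

    public long getMaxSeqNo() {
        return maxSeqNo.get();
    }

    public static void main(String[] args) {
        SeqNoTrackerSketch tracker = new SeqNoTrackerSketch();
        tracker.advanceMaxSeqNo(7);
        tracker.advanceMaxSeqNo(3); // a lower value does not regress the max
        System.out.println(tracker.getMaxSeqNo()); // prints 7
    }
}
```

The key property is monotonicity: concurrent callers can race, but the recorded max only moves forward.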
Properly setting the max sequence number before indexing might be a useful property not only for Lucene-indexed operations, but in general. I would prefer not to hide it in this sub-condition.
It bugs me that the planning process mutates the state of the LocalCheckpointTracker in this way when being primary, but not when being replica. If we want the flow for the primary and the replica to be the same, we could instead do this as part of the planning process in planIndexingAsNonPrimary. That way both primary and replica manage this property in the planning phase (it will also require adapting FollowingEngine).
As a follow-up, I think we should also investigate whether we can remove this state mutation from the planning process entirely, i.e., not have the planning process assign a sequence number or change the max seq_no, but do it as an explicit step in the index method, given that it is such a fundamental step in processing a request.
try (DirectoryReader reader = DirectoryReader.open(commit)) {
    AtomicLong maxSeqNoFromDocs = new AtomicLong(SequenceNumbers.NO_OPS_PERFORMED);
    Lucene.scanSeqNosInReader(reader, 0, maxNumDocs, n -> maxSeqNoFromDocs.set(Math.max(n, maxSeqNoFromDocs.get())));
    assertThat(Long.parseLong(commit.getUserData().get(SequenceNumbers.MAX_SEQ_NO)),
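The invariant this test checks can be stated abstractly: the max_seq_no stored in a commit's user data must cover the seq_no of every document in that commit. A self-contained sketch of that property (plain Java, a hypothetical helper rather than the real Lucene-based test):

```java
import java.util.List;

// Sketch of the commit invariant checked above: the max_seq_no recorded in
// the commit's user data must be >= the seq_no of every document it contains.
// Hypothetical helper, not part of the Elasticsearch test suite.
public class CommitInvariantSketch {
    public static boolean commitIsConsistent(long maxSeqNoInUserData, List<Long> docSeqNos) {
        return docSeqNos.stream().allMatch(seqNo -> seqNo <= maxSeqNoInUserData);
    }

    public static void main(String[] args) {
        System.out.println(commitIsConsistent(5, List.of(1L, 3L, 5L))); // true
        // The bug: a doc with seq_no 7 lands in a commit whose user data says 5.
        System.out.println(commitIsConsistent(5, List.of(1L, 7L)));     // false
    }
}
```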
Can we check this property across all of our Engine tests?
@ywelsch it's ready again. Can you have another look? Thank you!
I believe the optimization using sequence numbers on the FollowingEngine has hit this bug in #38894.
LGTM
@ywelsch Thanks for reviewing.
Today when processing an operation on a replica engine (or the following engine), we first add it to Lucene, then add it to the translog, then finally mark its seq_no as completed. If a flush occurs after step 1 but before step 3, the max_seq_no in the commit's user data will be smaller than the seq_no of some documents in the Lucene commit.
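The three steps and the race window can be modeled with a small single-threaded simulation (hypothetical model classes; in the real engine the flush runs on another thread):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the ordering bug. A "flush" snapshots the tracker's
// max_seq_no into the commit's user data; if it runs between step 1 and
// step 3, the snapshot misses the just-added document's seq_no.
public class MaxSeqNoRaceModel {
    long trackerMaxSeqNo = -1;                       // model of the checkpoint tracker
    final List<Long> luceneDocs = new ArrayList<>(); // model of the Lucene index

    void addToLucene(long seqNo)            { luceneDocs.add(seqNo); } // step 1
    // step 2 (translog) omitted; it does not affect this invariant
    void markCompleted(long seqNo)          { advance(seqNo); }        // step 3
    void advanceMaxSeqNoUpFront(long seqNo) { advance(seqNo); }        // the fix

    private void advance(long seqNo) { trackerMaxSeqNo = Math.max(trackerMaxSeqNo, seqNo); }

    long flushUserDataMaxSeqNo() { return trackerMaxSeqNo; } // what a flush records

    public static void main(String[] args) {
        // Buggy order: flush lands between step 1 and step 3.
        MaxSeqNoRaceModel buggy = new MaxSeqNoRaceModel();
        buggy.addToLucene(7);
        long committed = buggy.flushUserDataMaxSeqNo(); // -1: misses doc 7
        buggy.markCompleted(7);                         // too late for the commit
        System.out.println(committed < 7);              // prints true: invariant violated

        // Fixed order: advance max_seq_no before adding to Lucene.
        MaxSeqNoRaceModel fixed = new MaxSeqNoRaceModel();
        fixed.advanceMaxSeqNoUpFront(7);
        fixed.addToLucene(7);
        System.out.println(fixed.flushUserDataMaxSeqNo() >= 7); // prints true
    }
}
```

Advancing the max before the Lucene add makes the invariant hold no matter when the flush interleaves, which is the essence of this change.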
* master:
  Address some CCR REST test case flakiness (elastic#38975)
  Edits to text in Completion Suggester doc (elastic#38980)
  SQL: doc polishing
  [DOCS] Fixes broken formatting
  SQL: Polish the rest chapter (elastic#38971)
  Remove `nGram` and `edgeNGram` token filter names (elastic#38911)
  Add an exception throw if waiting on transport port file fails (elastic#37574)
  Improve testcluster distribution artifact handling (elastic#38933)
  Advance max_seq_no before add operation to Lucene (elastic#38879)
  Reduce global checkpoint sync interval in disruption tests (elastic#38931)
  [test] disable packaging tests for suse boxes
  Relax testStressMaybeFlushOrRollTranslogGeneration (elastic#38918)
  [DOCS] Edits warning in put watch API (elastic#38582)
  Fix serialization bug in ShardFollowTask after cutting this class over to extend from ImmutableFollowParameters.
  [DOCS] Updates methods for upgrading machine learning (elastic#38876)
The max_seq_no of a Lucene commit of old indices (before 6.6.2) can be smaller than the seq_no of some documents in the commit (see #38879). Although we fixed this bug in 6.6.2 and 7.0.0, a problematic index commit can still affect newer versions after a rolling upgrade or full cluster restart. In particular, if a FollowingEngine (or an internal engine with MSU enabled) restores from a problematic commit, it can apply the MSU optimization to existing documents. The symptom we see here is a violated local checkpoint tracker assertion. Closes #46311. Relates #38879.
This commit fixes a potential mismatch between the max_seq_no in the commit's user data and the seq_no of some documents in the Lucene commit. The mismatch could arise when processing an operation on a replica engine, as we first added it to Lucene, then to the translog, and finally marked the seq_no as completed. If a flush occurred after step 1 but before the marking, the max_seq_no in the commit's user data would be smaller than the seq_no of some documents in the Lucene commit. Port of elastic/elasticsearch#38879
I found this issue while investigating the test failure in #31629.