
Fix file reading in ccr restore service #38117

Merged · 5 commits · Feb 1, 2019

Conversation

Tim-Brooks
Contributor

Currently we use the raw byte array length when calling the IndexInput
read call to determine how many bytes we want to read. However, due to
how BigArrays works, the array length might be longer than the reference
length. This commit fixes the issue and uses the BytesRef length when
calling read. Additionally, it expands the index follow test to index
many more documents. These documents should potentially lead to large
enough segment files to trigger scenarios where this fix matters.
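The fix amounts to using the reference length rather than the backing array's length when asking IndexInput to read. A minimal plain-Java sketch of the distinction (ByteSlice is a hypothetical stand-in for Lucene's BytesRef, and the sizes are illustrative):

```java
// Hypothetical stand-in for Lucene's BytesRef: a view into a byte array.
final class ByteSlice {
    final byte[] bytes;  // backing array (may be an oversized BigArrays page)
    final int offset;
    final int length;    // logical length of the reference

    ByteSlice(byte[] bytes, int offset, int length) {
        this.bytes = bytes;
        this.offset = offset;
        this.length = length;
    }
}

class ReadLengthDemo {
    public static void main(String[] args) {
        byte[] backing = new byte[16384];                 // page-sized backing array
        ByteSlice ref = new ByteSlice(backing, 0, 1000);  // only 1000 bytes are valid

        int buggyLen = ref.bytes.length;  // 16384: would read past the valid data
        int fixedLen = ref.length;        // 1000: reads exactly the reference

        System.out.println(buggyLen + " vs " + fixedLen);
    }
}
```

The bug only surfaces once a segment file is large enough that the backing array is bigger than the data it carries, which is why the test needed to index many more documents.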

@Tim-Brooks Tim-Brooks added >bug v7.0.0 :Distributed Indexing/CCR Issues around the Cross Cluster State Replication features v6.7.0 labels Jan 31, 2019
@Tim-Brooks Tim-Brooks requested a review from ywelsch January 31, 2019 19:12
@elasticmachine
Collaborator

Pinging @elastic/es-distributed

@Tim-Brooks Tim-Brooks requested a review from dnhatn January 31, 2019 19:12
@Tim-Brooks
Contributor Author

@dnhatn has told me that the assertTotalNumberOfOptimizedIndexing assertion is no longer needed, as it is tested in a different place. This assertion was causing issues as I attempted to increase the number of documents indexed (bootstrap-recovered documents are not "optimized").

No tests were failing prior to this fix because none of the segment files were big enough to trigger BigArrays to use arrays composed of multiple pages. Indexing 800-1200 documents does create segment files big enough to cause failures prior to this fix.
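The reason the backing array can be longer than the reference: a paged allocator like BigArrays hands out whole fixed-size pages, so a request is rounded up to a page boundary. A rough sketch of that rounding (the page size here is illustrative, not the real constant):

```java
class PagedArraySketch {
    // Illustrative page size; BigArrays uses fixed-size pages internally.
    static final int PAGE_SIZE = 16384;

    // Round a requested size up to a whole number of pages, as a paged
    // allocator would when handing out a backing array.
    static int allocatedSize(int requested) {
        int pages = (requested + PAGE_SIZE - 1) / PAGE_SIZE;
        return pages * PAGE_SIZE;
    }

    public static void main(String[] args) {
        int requested = 20000;                  // bytes of real segment data
        int allocated = allocatedSize(requested);
        System.out.println(requested + " requested, " + allocated + " allocated");
        // Reading `allocated` bytes would overshoot by allocated - requested.
    }
}
```

With small segment files the request fits in a single page and the overshoot never bites, which is why the original test with 2-64 documents always passed.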

@Tim-Brooks Tim-Brooks requested a review from martijnvg January 31, 2019 19:39
Contributor

@ywelsch ywelsch left a comment


LGTM

@@ -101,9 +102,24 @@ public void testFollowIndex() throws Exception {
         assertAcked(leaderClient().admin().indices().prepareCreate("index1").setSource(leaderIndexSettings, XContentType.JSON));
         ensureLeaderYellow("index1");

-        final int firstBatchNumDocs = randomIntBetween(2, 64);
+        final int firstBatchNumDocs = randomIntBetween(800, 1200);
Contributor


how long does it take to index this? Perhaps we can only do this rarely?

Contributor Author


The total test takes 1-2 seconds to run with 800-1200 documents. But rarely() seems fine, since we just want this to fail every once in a while if there is a problem.

I changed this to:

        // Sometimes we want to index a lot of documents to ensure that the recovery works with larger files
        if (rarely()) {
            firstBatchNumDocs = randomIntBetween(1800, 2000);
        } else {
            firstBatchNumDocs = randomIntBetween(10, 64);
        }

@Tim-Brooks Tim-Brooks merged commit 291c4e7 into elastic:master Feb 1, 2019
Tim-Brooks added a commit to Tim-Brooks/elasticsearch that referenced this pull request Feb 1, 2019
Tim-Brooks added a commit that referenced this pull request Feb 1, 2019
@Tim-Brooks Tim-Brooks deleted the fix_reader branch December 18, 2019 14:48
Labels
>bug :Distributed Indexing/CCR Issues around the Cross Cluster State Replication features v6.7.0 v7.0.0-beta1