Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LUCENE-9662: CheckIndex should be concurrent - parallelizing index check across segments #128

Merged
Merged
Changes from 1 commit
Commits
Show all changes
36 commits
Select commit Hold shift + click to select a range
bbd0250
LUCENE-9662: Parallelize checking each index part within each segment
zacharymorn May 6, 2021
1ac1305
Support passing in executorService through checkIndex, and awaits shu…
zacharymorn May 7, 2021
9679cea
Callable to supplier wrapper.
dweiss May 7, 2021
e87d891
Tidy.
dweiss May 7, 2021
2009c70
Use callable and function to reduce the repetitive section of code
zacharymorn May 8, 2021
ed04964
Add segment and part ids
zacharymorn May 18, 2021
b7f3c0f
"Simplify" failFast logic a bit as RuntimeException will be thrown re…
zacharymorn May 18, 2021
22c87c2
Make soft deletes check follows the same style with others
zacharymorn May 18, 2021
99cea46
Remove no longer used failFast
zacharymorn May 18, 2021
c5c1e56
Remove nocommit
zacharymorn May 18, 2021
6ab669c
Address comments
zacharymorn May 19, 2021
a97de22
Add happy path test case
zacharymorn May 25, 2021
9883192
LUCENE-9974: The test-framework module should apply the test ruleset …
dweiss May 25, 2021
1bc1f8a
Use RandomizedTest.randomIntBetween for reproducibility
zacharymorn May 26, 2021
dadfe80
Fix bugs for segment file name and -threadCount, and cap default and …
zacharymorn May 26, 2021
57f542f
Use per part buffer for synchronization
zacharymorn May 26, 2021
71b5279
Remove no longer needed constants (they didnt seems to fail precommit)
zacharymorn May 26, 2021
cad119e
Remove unneeded newline
zacharymorn May 27, 2021
9c88f2f
fix typo
zacharymorn May 27, 2021
8f949dc
Move 11 to constant
zacharymorn May 28, 2021
51bcaa2
Revert changes for concurrency within segment
zacharymorn May 29, 2021
b39a856
Small refactoring to prep for method extraction
zacharymorn May 29, 2021
b892b21
IDE automatic method extraction
zacharymorn May 29, 2021
09ae809
Refactor
zacharymorn May 29, 2021
138b72e
Add concurrency across segments, reducing total time from 300+ second…
zacharymorn May 30, 2021
6cd6a46
Add back spacing between segment output
zacharymorn May 30, 2021
d1b19ea
sync on NumberFormat
zacharymorn May 30, 2021
bae7e7a
Fix bug to work with -segment flag
zacharymorn May 31, 2021
70dc71c
Enhance console output so that result from smaller segments get print…
zacharymorn May 31, 2021
817c050
Start larger segment earlier
zacharymorn Jun 1, 2021
2ef3e8d
Remove uneeded assignment
zacharymorn Aug 26, 2021
034f209
Fix typo
zacharymorn Aug 26, 2021
c7e3d4d
Merge branch 'main' into LUCENE-9662-CheckIndex-Concurrent
zacharymorn Aug 27, 2021
68fe4f5
Fix test to use the latest vector format
zacharymorn Aug 27, 2021
afea6a7
Change threadCount handling per feedback
zacharymorn Aug 29, 2021
6400304
Add change entry
zacharymorn Aug 30, 2021
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
196 changes: 184 additions & 12 deletions lucene/core/src/java/org/apache/lucene/index/CheckIndex.java
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,10 @@
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.apache.lucene.codecs.Codec;
import org.apache.lucene.codecs.DocValuesProducer;
import org.apache.lucene.codecs.NormsProducer;
Expand Down Expand Up @@ -60,6 +64,7 @@
import org.apache.lucene.util.FixedBitSet;
import org.apache.lucene.util.IOUtils;
import org.apache.lucene.util.LongBitSet;
import org.apache.lucene.util.NamedThreadFactory;
import org.apache.lucene.util.StringHelper;
import org.apache.lucene.util.SuppressForbidden;
import org.apache.lucene.util.Version;
Expand Down Expand Up @@ -605,6 +610,15 @@ public Status checkIndex(List<String> onlySegments) throws IOException {
result.newSegments.clear();
result.maxSegmentName = -1;

// nocommit number of threads should be set dynamically
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say: set dynamically based on CPU count, but provide an option (command line) to override?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can the checkIndex() method just take ExecutorService as a parameter? This way, e.g. the calling main method could configure the ExecutorService (possibly based on commandline options), unit tests can use a single thread, etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great suggestions! I've updated it to work for both checkIndex invoked from main, as well as invoked from other classes / methods.

ExecutorService executor =
Executors.newFixedThreadPool(10, new NamedThreadFactory("async-check-index"));

// nocommit the msg statements with infoStream inside this loop (as well as inside each
// index part checking methods such as testLiveDocs) may be
// interleaved when this section of code run concurrently. Is it ok to not synchronize for
// these msg statements and instead have each msg content to include segment id information for
// identification?
for (int i = 0; i < numSegments; i++) {
final SegmentCommitInfo info = sis.info(i);
long segmentName = Long.parseLong(info.info.name.substring(1), Character.MAX_RADIX);
Expand Down Expand Up @@ -639,6 +653,8 @@ public Status checkIndex(List<String> onlySegments) throws IOException {
SegmentReader reader = null;

try {
// nocommit these msg statements may require synchronization if printed without segment
// identifier
msg(infoStream, " version=" + (version == null ? "3.0" : version));
msg(infoStream, " id=" + StringHelper.idToString(info.info.getId()));
final Codec codec = info.info.getCodec();
Expand Down Expand Up @@ -731,40 +747,188 @@ public Status checkIndex(List<String> onlySegments) throws IOException {
}

if (checksumsOnly == false) {
// This redundant assignment is done to make compiler happy
SegmentReader finalReader = reader;

// Test Livedocs
segInfoStat.liveDocStatus = testLiveDocs(reader, infoStream, failFast);
CompletableFuture<Void> testliveDocs =
CompletableFuture.supplyAsync(
() -> {
try {
return testLiveDocs(finalReader, infoStream, failFast);
} catch (IOException e) {
throw new CompletionException(e);
}
},
executor)
.thenAccept(
liveDocStatus -> {
segInfoStat.liveDocStatus = liveDocStatus;
});

// Test Fieldinfos
segInfoStat.fieldInfoStatus = testFieldInfos(reader, infoStream, failFast);
CompletableFuture<Void> testFieldInfos =
CompletableFuture.supplyAsync(
() -> {
try {
return testFieldInfos(finalReader, infoStream, failFast);
} catch (IOException e) {
throw new CompletionException(e);
}
},
executor)
.thenAccept(
fieldInfoStatus -> {
segInfoStat.fieldInfoStatus = fieldInfoStatus;
});

// Test Field Norms
segInfoStat.fieldNormStatus = testFieldNorms(reader, infoStream, failFast);
CompletableFuture<Void> testFieldNorms =
CompletableFuture.supplyAsync(
() -> {
try {
return testFieldNorms(finalReader, infoStream, failFast);
} catch (IOException e) {
throw new CompletionException(e);
}
},
executor)
.thenAccept(
fieldNormStatus -> {
segInfoStat.fieldNormStatus = fieldNormStatus;
});

// Test the Term Index
segInfoStat.termIndexStatus =
testPostings(reader, infoStream, verbose, doSlowChecks, failFast);
CompletableFuture<Void> testTermIndex =
CompletableFuture.supplyAsync(
() -> {
try {
return testPostings(
finalReader, infoStream, verbose, doSlowChecks, failFast);
} catch (IOException e) {
throw new CompletionException(e);
}
},
executor)
.thenAccept(
termIndexStatus -> {
segInfoStat.termIndexStatus = termIndexStatus;
});

// Test Stored Fields
segInfoStat.storedFieldStatus = testStoredFields(reader, infoStream, failFast);
CompletableFuture<Void> testStoredFields =
CompletableFuture.supplyAsync(
() -> {
try {
return testStoredFields(finalReader, infoStream, failFast);
} catch (IOException e) {
throw new CompletionException(e);
}
},
executor)
.thenAccept(
storedFieldStatus -> {
segInfoStat.storedFieldStatus = storedFieldStatus;
});

// Test Term Vectors
segInfoStat.termVectorStatus =
testTermVectors(reader, infoStream, verbose, doSlowChecks, failFast);
CompletableFuture<Void> testTermVectors =
CompletableFuture.supplyAsync(
() -> {
try {
return testTermVectors(
finalReader, infoStream, verbose, doSlowChecks, failFast);
} catch (IOException e) {
throw new CompletionException(e);
}
},
executor)
.thenAccept(
termVectorStatus -> {
segInfoStat.termVectorStatus = termVectorStatus;
});

// Test Docvalues
segInfoStat.docValuesStatus = testDocValues(reader, infoStream, failFast);
CompletableFuture<Void> testDocValues =
CompletableFuture.supplyAsync(
() -> {
try {
return testDocValues(finalReader, infoStream, failFast);
} catch (IOException e) {
throw new CompletionException(e);
}
},
executor)
.thenAccept(
docValuesStatus -> {
segInfoStat.docValuesStatus = docValuesStatus;
});

// Test PointValues
segInfoStat.pointsStatus = testPoints(reader, infoStream, failFast);
CompletableFuture<Void> testPointvalues =
CompletableFuture.supplyAsync(
() -> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe add a utility method that accepts a callable and wraps in completion exception? This is a very repetitive block.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also feel this is quite repetitive and cumbersome to look at. I gave it some more tries but it seems to be a bit difficult actually. The main issue here is the testXXX methods all throw checked IOException, and thus any callable I define that wrap around the method will need to handle the IOException there as well, instead of handling it inside the utility method, or the compiler may not be happy:

utilityMethod(() -> {
      try {
           testXXX
      } catch (IOException) {
           handle IOE
      })

Maybe some refactoring in those testXXX methods can work around the issue?

For the code in thenAccept, we should be able to simplify with generics and creating a new CheckIndexStatus that all status classes extend (right now they don't share a super class).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a commit that shows what I meant (and replaces three calls). Sorry for not replacing all of them. Maybe it can be made even less verbose if you fold all those blocks into a single function that accepts executor, first callable, follow-up callable and just returns completable future from the composition of these.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok I see. I think I encountered the issue earlier when I tried to use Supplier directly instead of Callable there. I've updated it according to your suggestion (I used a Function for the follow-up lambda as it takes one argument)

try {
return testPoints(finalReader, infoStream, failFast);
} catch (IOException e) {
throw new CompletionException(e);
}
},
executor)
.thenAccept(
pointsStatus -> {
segInfoStat.pointsStatus = pointsStatus;
});

// Test VectorValues
segInfoStat.vectorValuesStatus = testVectors(reader, infoStream, failFast);
CompletableFuture<Void> testVectors =
CompletableFuture.supplyAsync(
() -> {
try {
return testVectors(finalReader, infoStream, failFast);
} catch (IOException e) {
throw new CompletionException(e);
}
},
executor)
.thenAccept(
vectorValuesStatus -> {
segInfoStat.vectorValuesStatus = vectorValuesStatus;
});

// Test index sort
segInfoStat.indexSortStatus = testSort(reader, indexSort, infoStream, failFast);
CompletableFuture<Void> testSort =
CompletableFuture.supplyAsync(
() -> {
try {
return testSort(finalReader, indexSort, infoStream, failFast);
} catch (IOException e) {
throw new CompletionException(e);
}
},
executor)
.thenAccept(
indexSortStatus -> {
segInfoStat.indexSortStatus = indexSortStatus;
});

// Rethrow the first exception we encountered
// This will cause stats for failed segments to be incremented properly
// nocommit The error != null check ordering below requires sequencing the above async
// calls.
// Does the order really matter here or can be done differently?
CompletableFuture.allOf(
testliveDocs,
testFieldInfos,
testFieldNorms,
testTermIndex,
testStoredFields,
testTermVectors,
testDocValues,
testPointvalues,
testVectors,
testSort)
.join();
if (segInfoStat.liveDocStatus.error != null) {
throw new RuntimeException("Live docs test failed");
} else if (segInfoStat.fieldInfoStatus.error != null) {
Expand All @@ -783,6 +947,8 @@ public Status checkIndex(List<String> onlySegments) throws IOException {
throw new RuntimeException("Points test failed");
}
}

// nocommit parallelize this as well?
final String softDeletesField = reader.getFieldInfos().getSoftDeletesField();
if (softDeletesField != null) {
checkSoftDeletes(softDeletesField, info, reader, infoStream, failFast);
Expand Down Expand Up @@ -810,6 +976,8 @@ public Status checkIndex(List<String> onlySegments) throws IOException {
result.newSegments.add(info.clone());
}

executor.shutdown();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add await; shutdown is non-blocking.

Copy link
Contributor Author

@zacharymorn zacharymorn May 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops sorry. Thought it would be ok with the join call before shutdown, but yeah even the idle threads may take time to shutdown. Added.


if (0 == result.numBadSegments) {
result.clean = true;
} else
Expand Down Expand Up @@ -931,6 +1099,8 @@ public static Status.LiveDocStatus testLiveDocs(
final Status.LiveDocStatus status = new Status.LiveDocStatus();

try {
// nocommit these msg statements may require synchronization if printed without segment
// identifier
if (infoStream != null) infoStream.print(" test: check live docs.....");
final int numDocs = reader.numDocs();
if (reader.hasDeletions()) {
Expand Down Expand Up @@ -969,6 +1139,8 @@ public static Status.LiveDocStatus testLiveDocs(
}
}
}
// nocommit these msg statements may require synchronization if printed without segment
// identifier
msg(
infoStream,
String.format(
Expand Down