Skip to content
This repository has been archived by the owner on Nov 14, 2024. It is now read-only.

[PaxosStateLog] Parallel reads and processing #4758

Merged
merged 6 commits into from
May 12, 2020
Merged

Conversation

gmaretic
Copy link
Contributor

@gmaretic gmaretic commented May 7, 2020

Goals (and why):
Reading entries one by one is going to be too slow for our migration, at least if we wish to proceed with migration on startup. This PR allows us to parallelise the reads.

Implementation Description (bullets):
We use a fixed thread pool for the lifetime of reader, and do the reads in parallel

Testing (What was existing testing like? What have you done to improve it?):
Unit tests

Concerns (what feedback would you like?):
Should we be more defensive, not allow specifying the number of threads? Anything else?

Where should we start reviewing?:
Tests?

Priority (whenever / two weeks / yesterday):
Early next week


private Optional<PaxosRound<V>> singleRead(long sequence) {
try {
return Optional.ofNullable(delegate.readRound(sequence))
Copy link
Contributor

@jkozlowski jkozlowski May 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't #readRound have a exclusive lock in PaxosStateLogImpl? How are you going to rewire this stuff, will there be a different impl just for reading the files that doesn't log?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's something we have to discuss, but it's a solvable problem just outside the scope of this PR

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I think we should be good to change that to a R/W lock (though be careful when reading that class to make sure it's safe!)

Copy link
Contributor

@jeremyk-91 jeremyk-91 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add the read/write lock change.

In that case, this PR should have changelog to reflect (incredibly marginally) improved perf, and also a signpost for a change that, although we believe safe, would be obvious to bail from should a P0 arise.

try (PaxosStateLogBatchReader<PaxosValue> reader = createReader()) {
assertThat(reader.readBatch(START_SEQUENCE, BATCH_SIZE))
.isEqualTo(EXPECTED_ROUNDS.stream()
.filter(round -> round.sequence() % 2 != 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: might be worth extracting a local predicate


private Optional<PaxosRound<V>> singleRead(long sequence) {
try {
return Optional.ofNullable(delegate.readRound(sequence))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I think we should be good to change that to a R/W lock (though be careful when reading that class to make sure it's safe!)

@changelog-app
Copy link

changelog-app bot commented May 11, 2020

Generate changelog in changelog/@unreleased

Type

  • Feature
  • Improvement
  • Fix
  • Break
  • Deprecation
  • Manual task
  • Migration

Description

PaxosStateLogImpl now uses a reentrant read write lock for synchronisation instead of a reentrant lock. This slightly improves performance and allows for higher read throughput by parallelising reads.

Check the box to generate changelog(s)

  • Generate changelog entry

Copy link
Contributor

@jeremyk-91 jeremyk-91 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a weird case for the concurrency where I'd like to be defensive. Otherwise this makes sense.

@@ -50,7 +51,7 @@

public class PaxosStateLogImpl<V extends Persistable & Versionable> implements PaxosStateLog<V> {

private final ReentrantLock lock = new ReentrantLock();
private final ReadWriteLock lock = new ReentrantReadWriteLock();
private final Map<Long, Long> seqToVersionMap = Maps.newHashMap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. Could we put @GuardedBy("lock.writeLock()"), or more defensively and probably better make this a concurrent hash map? Right now it's fine, but if later on a reader tries to use this (e.g. for caching) then memory visibility becomes a bit tricky.

Copy link
Contributor

@jeremyk-91 jeremyk-91 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Thanks!

@gmaretic gmaretic merged commit 4c3ffbe into develop May 12, 2020
@delete-merged-branch delete-merged-branch bot deleted the psl/parallel-reads branch May 12, 2020 10:16
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants