Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved efficiency in DigestManager.verify() #3810

Merged
merged 1 commit into from
Feb 27, 2023

Conversation

merlimat
Copy link
Contributor

@merlimat merlimat commented Feb 26, 2023

Motivation

In DigestManager.verifyDigestAndReturnData() there are a couple of simple improvements:

  1. For the temp buffer where to store the digest, do not acquire a buffer from the pool each time. Instead, use a thread-local buffer.
  2. Instead of taking slices() at each time, extend the interface to take a "offset, length" pair and save the allocations.

Benchmarks:

Before

Benchmark                            (entrySize)   Mode  Cnt   Score   Error   Units
DigestManagerBenchmark.verifyDigest           64  thrpt    3  13.450 ± 3.634  ops/us
DigestManagerBenchmark.verifyDigest         1024  thrpt    3   7.908 ± 2.637  ops/us
DigestManagerBenchmark.verifyDigest         4086  thrpt    3   3.233 ± 0.882  ops/us
DigestManagerBenchmark.verifyDigest         8192  thrpt    3   1.846 ± 0.047  ops/us

Allocations: 230 bytes per operation

After

Benchmark                            (entrySize)   Mode  Cnt   Score   Error   Units
DigestManagerBenchmark.verifyDigest           64  thrpt    3  46.312 ± 7.414  ops/us
DigestManagerBenchmark.verifyDigest         1024  thrpt    3  13.379 ± 1.069  ops/us
DigestManagerBenchmark.verifyDigest         4086  thrpt    3   3.787 ± 0.059  ops/us
DigestManagerBenchmark.verifyDigest         8192  thrpt    3   1.956 ± 0.052  ops/us

Allocations: 64 bytes per operation

Note: these remaining 64 bytes are only due to the Java9 reflection access. It would not be present on the JNI-based CRC32c backend (though that does not work on Mac M1).

image

@merlimat merlimat added this to the 4.16.0 milestone Feb 26, 2023
@merlimat merlimat self-assigned this Feb 26, 2023
@codecov-commenter
Copy link

codecov-commenter commented Feb 26, 2023

Codecov Report

Merging #3810 (526d640) into master (80d3aac) will decrease coverage by 0.07%.
The diff coverage is 65.51%.

@@             Coverage Diff              @@
##             master    #3810      +/-   ##
============================================
- Coverage     60.47%   60.41%   -0.07%     
+ Complexity     5847     5844       -3     
============================================
  Files           473      473              
  Lines         40933    40937       +4     
  Branches       5235     5234       -1     
============================================
- Hits          24756    24731      -25     
- Misses        13976    13991      +15     
- Partials       2201     2215      +14     
Flag Coverage Δ
bookie 39.88% <62.06%> (-0.04%) ⬇️
remaining 29.61% <41.37%> (-0.07%) ⬇️
replication 41.37% <44.82%> (+0.03%) ⬆️
tls 21.03% <37.93%> (+0.04%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...keeper/proto/checksum/DirectMemoryCRC32Digest.java 79.31% <ø> (-1.34%) ⬇️
.../bookkeeper/proto/checksum/DummyDigestManager.java 0.00% <0.00%> (ø)
...bookkeeper/proto/checksum/StandardCRC32Digest.java 0.00% <0.00%> (ø)
...pache/bookkeeper/proto/checksum/DigestManager.java 70.47% <66.66%> (+1.78%) ⬆️
...bookkeeper/proto/checksum/CRC32CDigestManager.java 100.00% <100.00%> (ø)
.../bookkeeper/proto/checksum/CRC32DigestManager.java 81.81% <100.00%> (ø)
...he/bookkeeper/proto/checksum/MacDigestManager.java 81.48% <100.00%> (ø)
...va/org/apache/bookkeeper/bookie/SkipListArena.java 73.01% <0.00%> (-7.94%) ⬇️
...g/apache/bookkeeper/proto/WriteEntryProcessor.java 69.01% <0.00%> (-5.64%) ⬇️
...kkeeper/client/RoundRobinDistributionSchedule.java 52.94% <0.00%> (-5.35%) ⬇️
... and 26 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@hangc0276
Copy link
Contributor

Please rebase the master @merlimat

private static final FastThreadLocal<ByteBuf> DIGEST_BUFFER = new FastThreadLocal<ByteBuf>() {
@Override
protected ByteBuf initialValue() throws Exception {
return PooledByteBufAllocator.DEFAULT.directBuffer(1024);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The required ByteBuf size for each digest manager:
CRC32CDigestManager -> 4
CRC32DigestManager -> 8
DummyDigestManager -> 0
MacDigestManager -> 20
Do we need to allocate 1024 bytes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just in case. Though this is only used for MAC at this point, and one per thread. It shouldn't be any noticeable waste.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually was referring to my next change in the DigestManager :)

@merlimat merlimat merged commit 1f8de8f into apache:master Feb 27, 2023
hangc0276 pushed a commit that referenced this pull request Feb 27, 2023
### Motivation

In `DigestManager` there are several accesses to ThreadLocal variable per each entry processed.

The reason is the mainly due to `DigestManager` API which exposes a stateful `update()` method which can be invoked multiple times and keeps the current checksum as a thread-local variable.

If we exclude MAC digest which is 20 bytes, for other digests we can instead keep the current checksum in a local variable and pass it each time, avoiding all the thread-locals and also the need for writing the checksum result into a buffer.

### Benchmarks

#### Before #3810 

```
Benchmark                            (entrySize)   Mode  Cnt   Score   Error   Units
DigestManagerBenchmark.verifyDigest           64  thrpt    3  13.450 ± 3.634  ops/us
DigestManagerBenchmark.verifyDigest         1024  thrpt    3   7.908 ± 2.637  ops/us
DigestManagerBenchmark.verifyDigest         4086  thrpt    3   3.233 ± 0.882  ops/us
DigestManagerBenchmark.verifyDigest         8192  thrpt    3   1.846 ± 0.047  ops/us
```

#### After #3810 

```
Benchmark                            (entrySize)   Mode  Cnt   Score   Error   Units
DigestManagerBenchmark.verifyDigest           64  thrpt    3  46.312 ± 7.414  ops/us
DigestManagerBenchmark.verifyDigest         1024  thrpt    3  13.379 ± 1.069  ops/us
DigestManagerBenchmark.verifyDigest         4086  thrpt    3   3.787 ± 0.059  ops/us
DigestManagerBenchmark.verifyDigest         8192  thrpt    3   1.956 ± 0.052  ops/us
```

#### After this change



```
Benchmark                            (entrySize)   Mode  Cnt    Score   Error   Units
DigestManagerBenchmark.verifyDigest           64  thrpt    3  130.108 ± 4.854  ops/us
DigestManagerBenchmark.verifyDigest         1024  thrpt    3   17.744 ± 0.238  ops/us
DigestManagerBenchmark.verifyDigest         4086  thrpt    3    4.104 ± 0.181  ops/us
DigestManagerBenchmark.verifyDigest         8192  thrpt    3    2.050 ± 0.066  ops/us
```
hezhangjian pushed a commit that referenced this pull request Mar 1, 2023
### Motivation

In #3810 the signature of `Crc32cIntChecksum.resumeChecksum()` was changed to accept `offset` & `len` in the buffer.

Since this method is also used externally (in Pulsar), we should leave also the old method signature to avoid breaking the API when upgrading BK.
@merlimat merlimat deleted the digest-manager branch December 7, 2023 06:02
Ghatage pushed a commit to sijie/bookkeeper that referenced this pull request Jul 12, 2024
Ghatage pushed a commit to sijie/bookkeeper that referenced this pull request Jul 12, 2024
### Motivation

In `DigestManager` there are several accesses to ThreadLocal variable per each entry processed.

The reason is the mainly due to `DigestManager` API which exposes a stateful `update()` method which can be invoked multiple times and keeps the current checksum as a thread-local variable.

If we exclude MAC digest which is 20 bytes, for other digests we can instead keep the current checksum in a local variable and pass it each time, avoiding all the thread-locals and also the need for writing the checksum result into a buffer.

### Benchmarks

#### Before apache#3810 

```
Benchmark                            (entrySize)   Mode  Cnt   Score   Error   Units
DigestManagerBenchmark.verifyDigest           64  thrpt    3  13.450 ± 3.634  ops/us
DigestManagerBenchmark.verifyDigest         1024  thrpt    3   7.908 ± 2.637  ops/us
DigestManagerBenchmark.verifyDigest         4086  thrpt    3   3.233 ± 0.882  ops/us
DigestManagerBenchmark.verifyDigest         8192  thrpt    3   1.846 ± 0.047  ops/us
```

#### After apache#3810 

```
Benchmark                            (entrySize)   Mode  Cnt   Score   Error   Units
DigestManagerBenchmark.verifyDigest           64  thrpt    3  46.312 ± 7.414  ops/us
DigestManagerBenchmark.verifyDigest         1024  thrpt    3  13.379 ± 1.069  ops/us
DigestManagerBenchmark.verifyDigest         4086  thrpt    3   3.787 ± 0.059  ops/us
DigestManagerBenchmark.verifyDigest         8192  thrpt    3   1.956 ± 0.052  ops/us
```

#### After this change



```
Benchmark                            (entrySize)   Mode  Cnt    Score   Error   Units
DigestManagerBenchmark.verifyDigest           64  thrpt    3  130.108 ± 4.854  ops/us
DigestManagerBenchmark.verifyDigest         1024  thrpt    3   17.744 ± 0.238  ops/us
DigestManagerBenchmark.verifyDigest         4086  thrpt    3    4.104 ± 0.181  ops/us
DigestManagerBenchmark.verifyDigest         8192  thrpt    3    2.050 ± 0.066  ops/us
```
Ghatage pushed a commit to sijie/bookkeeper that referenced this pull request Jul 12, 2024
### Motivation

In apache#3810 the signature of `Crc32cIntChecksum.resumeChecksum()` was changed to accept `offset` & `len` in the buffer.

Since this method is also used externally (in Pulsar), we should leave also the old method signature to avoid breaking the API when upgrading BK.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants