Skip to content
This repository has been archived by the owner on Nov 14, 2024. It is now read-only.

Optimize some cell and value extraction read paths #6073

Merged
merged 16 commits into from
Jun 8, 2022
Merged

Conversation

schlosna
Copy link
Contributor

@schlosna schlosna commented Jun 7, 2022

Goals (and why):
While looking through some JFRs for services using AtlasDB, noticed several read paths that could be optimized to reduce allocations and improve method execution and CPU utilization.

==COMMIT_MSG==
Optimize some cell and value extraction read paths
==COMMIT_MSG==

Implementation Description (bullets):

Testing (What was existing testing like? What have you done to improve it?):

Concerns (what feedback would you like?):

Where should we start reviewing?:

Priority (whenever / two weeks / yesterday):

@changelog-app
Copy link

changelog-app bot commented Jun 7, 2022

Generate changelog in changelog/@unreleased

Type

  • Feature
  • Improvement
  • Fix
  • Break
  • Deprecation
  • Manual task
  • Migration

Description

Optimize some cell and value extraction read paths

Check the box to generate changelog(s)

  • Generate changelog entry

throw e;
private byte[] validateNameValid(byte[] name) {
if (isNameValid(name)) {
return name;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we want to keep this method small so that it gets JITed & inlined for the Cell.create happy path

for (Map.Entry<CassandraServer, List<Cell>> hostAndCells : hostsAndCells.entrySet()) {
if (log.isTraceEnabled()) {
if (isTraceEnabled) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in general case I wouldn't normally pull log.isTraceEnabled() out to a local, but this popped in profile for relatively hot loop this will almost always be false, so pulled it out a local final boolean

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, that makes sense

@schlosna schlosna requested a review from jeremyk-91 June 7, 2022 21:06
@Jolyon-S Jolyon-S self-requested a review June 8, 2022 10:46
Copy link
Contributor

@Jolyon-S Jolyon-S left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, nice changes - appreciate the work. I'm a little concerned about the breaks in Cell, though.

Comment on lines 137 to 146
hashCode = Arrays.hashCode(rowName) ^ Arrays.hashCode(columnName);
hashCode = 31 * Arrays.hashCode(rowName) + Arrays.hashCode(columnName);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per the comment on CellReference:

    /**
     * {@link Cell#hashCode()} implementation has a rather unfortunate case where it is always 0 if the row name and
     * the column name match. We did not want to change it to keep backwards compatibility, but we need a uniform
     * distribution here for all reasonable patterns.
     */

I appreciate that this hash code is crap, but I think we should not break this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, will revert and add comment here & test to ensure we hold that back compact invariant

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated & added test coverage

Comment on lines 133 to 135
return this.hashCode() == other.hashCode()
&& Arrays.equals(rowName, other.rowName)
&& Arrays.equals(columnName, other.columnName);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As below, I think it'd be safer to keep this the same.

for (Map.Entry<CassandraServer, List<Cell>> hostAndCells : hostsAndCells.entrySet()) {
if (log.isTraceEnabled()) {
if (isTraceEnabled) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, that makes sense

@@ -120,9 +121,9 @@ void loadWithTs(
SafeArg.of("totalPartitions", totalPartitions));
}

List<Callable<Void>> tasks = new ArrayList<>();
List<Callable<Void>> tasks = new ArrayList<>(hostsAndCells.size());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while this is probably a better estimate, we're still possibly going to exceed this probably (as each hostsAndCells element may add several elements). Still an improvement, so fine by me

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this isn't fully optimal but should be more reasonable estimate

Comment on lines +163 to +165
List<List<Cell>> batches = batcher.partitionIntoBatches(cells, cassandraServer, tableRef);
List<Callable<Void>> tasks = new ArrayList<>(batches.size());
for (final List<Cell> partition : batches) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

log.trace(
"Requesting {} cells from {} {}starting at timestamp {} on {}",
SafeArg.of("cells", hostsAndCells.values().size()),
SafeArg.of("cells", hostAndCells.getValue().size()),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was a drive by fix as this seemed to log incorrect value

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep - thanks!

@schlosna schlosna requested a review from Jolyon-S June 8, 2022 14:29
Copy link
Contributor

@Jolyon-S Jolyon-S left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks! I really like all the things tested for Cell; I've got a few nits on that test class, but going to approve. Feel free to merge once those are resolved!

Comment on lines 59 to 62
assertThat(Cell.create(bytes("row"), bytes("col")))
.isNotEqualByComparingTo(Cell.create(bytes("row2"), bytes("col")));
assertThat(Cell.create(bytes("row"), bytes("col")))
.isNotEqualByComparingTo(Cell.create(bytes("row2"), bytes("col2")));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're missing the case where row is the same and column is different

Comment on lines 63 to 64
assertThat(Cell.create(bytes("row1"), bytes("col"))).isLessThan(Cell.create(bytes("row2"), bytes("col")));
assertThat(Cell.create(bytes("row1"), bytes("col"))).isGreaterThan(Cell.create(bytes("row0"), bytes("col")));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the compareTo method has two branches, and this only tests the first (where the row differs); perhaps worth testing the compareTo part when row is the same and col is less or greater?

private static byte[] bytes(String value) {
return value.getBytes(StandardCharsets.UTF_8);
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: newline

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spotless is failing on this

assertThat(Cell.create(bytes("row"), bytes("col")))
.describedAs("Cell unfortunately has a non-ideal hashCode where swapped "
+ "row and column values lead to the same hashCode and cannot be changed due "
+ "to backward compatibility. See goodHash")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: See CellReference's goodHash or CellReferenhce#goodHash

public final class CellTest {

@Test
public void create() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we prefix create, isNameValid, and compareTo with test?

log.trace(
"Requesting {} cells from {} {}starting at timestamp {} on {}",
SafeArg.of("cells", hostsAndCells.values().size()),
SafeArg.of("cells", hostAndCells.getValue().size()),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep - thanks!

@bulldozer-bot bulldozer-bot bot merged commit edbf024 into develop Jun 8, 2022
@bulldozer-bot bulldozer-bot bot deleted the ds/perf branch June 8, 2022 20:14
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants