-
Notifications
You must be signed in to change notification settings - Fork 15
Optimize some cell and value extraction read paths #6073
Conversation
Generate changelog in
|
throw e; | ||
private byte[] validateNameValid(byte[] name) { | ||
if (isNameValid(name)) { | ||
return name; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we want to keep this method small so that it gets JITed & inlined for the Cell.create
happy path
for (Map.Entry<CassandraServer, List<Cell>> hostAndCells : hostsAndCells.entrySet()) { | ||
if (log.isTraceEnabled()) { | ||
if (isTraceEnabled) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in general case I wouldn't normally pull log.isTraceEnabled()
out to a local, but this popped in profile for relatively hot loop this will almost always be false, so pulled it out a local final boolean
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, that makes sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, nice changes - appreciate the work. I'm a little concerned about the breaks in Cell
, though.
hashCode = Arrays.hashCode(rowName) ^ Arrays.hashCode(columnName); | ||
hashCode = 31 * Arrays.hashCode(rowName) + Arrays.hashCode(columnName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per the comment on CellReference
:
/**
* {@link Cell#hashCode()} implementation has a rather unfortunate case where it is always 0 if the row name and
* the column name match. We did not want to change it to keep backwards compatibility, but we need a uniform
* distribution here for all reasonable patterns.
*/
I appreciate that this hash code is crap, but I think we should not break this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh, will revert and add comment here & test to ensure we hold that back compact invariant
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated & added test coverage
return this.hashCode() == other.hashCode() | ||
&& Arrays.equals(rowName, other.rowName) | ||
&& Arrays.equals(columnName, other.columnName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As below, I think it'd be safer to keep this the same.
for (Map.Entry<CassandraServer, List<Cell>> hostAndCells : hostsAndCells.entrySet()) { | ||
if (log.isTraceEnabled()) { | ||
if (isTraceEnabled) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, that makes sense
@@ -120,9 +121,9 @@ void loadWithTs( | |||
SafeArg.of("totalPartitions", totalPartitions)); | |||
} | |||
|
|||
List<Callable<Void>> tasks = new ArrayList<>(); | |||
List<Callable<Void>> tasks = new ArrayList<>(hostsAndCells.size()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while this is probably a better estimate, we're still possibly going to exceed this probably (as each hostsAndCells
element may add several elements). Still an improvement, so fine by me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this isn't fully optimal but should be more reasonable estimate
List<List<Cell>> batches = batcher.partitionIntoBatches(cells, cassandraServer, tableRef); | ||
List<Callable<Void>> tasks = new ArrayList<>(batches.size()); | ||
for (final List<Cell> partition : batches) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
log.trace( | ||
"Requesting {} cells from {} {}starting at timestamp {} on {}", | ||
SafeArg.of("cells", hostsAndCells.values().size()), | ||
SafeArg.of("cells", hostAndCells.getValue().size()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was a drive by fix as this seemed to log incorrect value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep - thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, thanks! I really like all the things tested for Cell
; I've got a few nits on that test class, but going to approve. Feel free to merge once those are resolved!
assertThat(Cell.create(bytes("row"), bytes("col"))) | ||
.isNotEqualByComparingTo(Cell.create(bytes("row2"), bytes("col"))); | ||
assertThat(Cell.create(bytes("row"), bytes("col"))) | ||
.isNotEqualByComparingTo(Cell.create(bytes("row2"), bytes("col2"))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you're missing the case where row is the same and column is different
assertThat(Cell.create(bytes("row1"), bytes("col"))).isLessThan(Cell.create(bytes("row2"), bytes("col"))); | ||
assertThat(Cell.create(bytes("row1"), bytes("col"))).isGreaterThan(Cell.create(bytes("row0"), bytes("col"))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, the compareTo
method has two branches, and this only tests the first (where the row differs); perhaps worth testing the compareTo part when row is the same and col is less or greater?
private static byte[] bytes(String value) { | ||
return value.getBytes(StandardCharsets.UTF_8); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: newline
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spotless is failing on this
assertThat(Cell.create(bytes("row"), bytes("col"))) | ||
.describedAs("Cell unfortunately has a non-ideal hashCode where swapped " | ||
+ "row and column values lead to the same hashCode and cannot be changed due " | ||
+ "to backward compatibility. See goodHash") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: See CellReference's goodHash
or CellReferenhce#goodHash
public final class CellTest { | ||
|
||
@Test | ||
public void create() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can we prefix create
, isNameValid
, and compareTo
with test
?
log.trace( | ||
"Requesting {} cells from {} {}starting at timestamp {} on {}", | ||
SafeArg.of("cells", hostsAndCells.values().size()), | ||
SafeArg.of("cells", hostAndCells.getValue().size()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep - thanks!
Goals (and why):
While looking through some JFRs for services using AtlasDB, noticed several read paths that could be optimized to reduce allocations and improve method execution and CPU utilization.
==COMMIT_MSG==
Optimize some cell and value extraction read paths
==COMMIT_MSG==
Implementation Description (bullets):
Testing (What was existing testing like? What have you done to improve it?):
Concerns (what feedback would you like?):
Where should we start reviewing?:
Priority (whenever / two weeks / yesterday):