-
Notifications
You must be signed in to change notification settings - Fork 15
[TEX] Part 1b: TrackingKeyValueService: utilities for byte size (1) #6332
Conversation
Generate changelog in
|
atlasdb-api/src/main/java/com/palantir/atlasdb/keyvalue/api/Value.java
Outdated
Show resolved
Hide resolved
@@ -116,6 +116,10 @@ public byte[] getColumnName() { | |||
return columnName; | |||
} | |||
|
|||
public long sizeInBytes() { | |||
return Long.sum(rowName.length, columnName.length); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just curious, is the Long.sum
method call to implicitly widen and avoid casting to long for addition (e.g. return ((long) rowName.length) + columnName.length;
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes - it also looked more succinct to my eyes
Co-authored-by: David Schlosnagle <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some general nits/styles stuff
atlasdb-api/src/main/java/com/palantir/atlasdb/keyvalue/api/TableReference.java
Outdated
Show resolved
Hide resolved
@@ -68,6 +68,10 @@ public byte[] getRowName() { | |||
return row.clone(); | |||
} | |||
|
|||
public long getRowNameSize() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume this is needed as we capture the row name size separately from the columns?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added only because getRowName
copies data (& also because the parameter type T
in row result won't necessarily implement Measurable
so we can't implement this in-class)
return Collections.nCopies(size, TIMESTAMP); | ||
} | ||
|
||
private static byte[] spawnBytes(int size) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity, why do we need to fill the array with our default byte? I think all we really care about size, no?
private List<Long> MOCK_TIMESTAMPS; | ||
|
||
@Test | ||
public void candidateCellSizeWithLargerTimestampCollectionIsBigger() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think your testing here is very thorough, but I think we can relax it a bit in favor of readability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's also "best practice" to try and test a single concept per unit test. So for this, I'd just make a couple like:
void candidateCellSizeHasCorrectSizeForOneTimestamp
, void candidateCellSizeHasCorrectSizeForMultipleTimestamps
, void candidateCellSizeIsEqualRegardlessOfLatestValueEmpty
.collect(Collectors.toSet()); | ||
private static Cell createCellWithByteSize(int size) { | ||
Preconditions.checkArgument(size > 1, "Size should be at least 2"); | ||
return Cell.create(spawnBytes(size / 2), spawnBytes(size - (size / 2))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I'm being a bit thick, but shouldn't size / 2
== size - (size / 2)
the vast majority of the time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are equal when n is even
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do you want them to not be equal when n
is odd?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think @ergo14 wants the cell to have exactly the right size (e.g., otherwise createCellWithByteSize(7)
returns a cell that has row and column of size 3
, i.e. an overall size of 6
) - it'll be one byte smaller if we just divide when size is odd
Arrays.fill(bytes, BYTE); | ||
return bytes; | ||
private static byte[] spawnBytes() { | ||
return new byte[CELL_NAME_SIZE]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tbh I'd just make this a static variable at this point
} | ||
|
||
private static Value createValue(int contentsSize) { | ||
return Value.create(spawnBytes(contentsSize), Value.INVALID_VALUE_TIMESTAMP); | ||
} | ||
|
||
private static byte[] spawnBytes(int size) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seeing as this method just creates a new byte[size]
, and is only used in one place, consider removing this creation method. although this is a very stylistic nit, so totally feel free to ignore :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: also for this and the others, we should name it createBytes
rather than spawnBytes
to match codebase style
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One key question around the question of how we compute the size. The rest is broadly fine.
@@ -146,6 +148,15 @@ public String toString() { | |||
return getQualifiedName(); | |||
} | |||
|
|||
@Override | |||
public long sizeInBytes() { | |||
return stringSizeInBytes(tableName) + stringSizeInBytes(namespace.getName()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These aren't the only strings that are passed around, though. The size of AbstractKeyValueService.internalTableName()
is probably closer to what you want.
package com.palantir.atlasdb.util; | ||
|
||
public interface Measurable { | ||
long sizeInBytes(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe trivial, but I would add javadocs.
@Mock | ||
private List<Long> MOCK_TIMESTAMPS; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is only used in the overflow test from what I can see; I'd suggest making it a local to that method.
Also, nit: case is wrong: fields use camelCase (so mockTimestamps)
} | ||
|
||
private static Cell createCellWithByteSize(int size) { | ||
Preconditions.checkArgument(size > 1, "Size should be at least 2"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: since the assertion is at least 2, no need to rely on the discrete domain - just write size >= 2
?
|
||
@Test | ||
public void sizeInBytesForTableReferenceWithEmptyNamespaceIsSizeOfAsciiTableName() { | ||
assertThat(TableReference.createWithEmptyNamespace("").sizeInBytes()).isEqualTo(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
non actionable: I'm not sure if this is a realistic case 😬
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Released 0.757.0 |
General
Before this PR:
After this PR:
sizeInBytes methods in existing classes of interest.
==COMMIT_MSG==
==COMMIT_MSG==
Concerns / possible downsides (what feedback would you like?):
Are the tests enough? Pointers on how to test the rest
Is documentation needed?:
No
Compatibility
Does this PR create any API breaks (e.g. at the Java or HTTP layers) - if so, do we have compatibility?:
No
Does this PR change the persisted format of any data - if so, do we have forward and backward compatibility?:
No
The code in this PR may be part of a blue-green deploy. Can upgrades from previous versions safely coexist? (Consider restarts of blue or green nodes.):
Yes
Does this PR rely on statements being true about other products at a deployment - if so, do we have correct product dependencies on these products (or other ways of verifying that these statements are true)?:
No
Does this PR need a schema migration?
No
Testing and Correctness
What, if any, assumptions are made about the current state of the world? If they change over time, how will we find out?:
N/A
What was existing testing like? What have you done to improve it?:
Added tests for added iterators
If this PR contains complex concurrent or asynchronous code, is it correct? The onus is on the PR writer to demonstrate this.:
N/A
If this PR involves acquiring locks or other shared resources, how do we ensure that these are always released?:
N/A
Execution
How would I tell this PR works in production? (Metrics, logs, etc.):
N/A
Has the safety of all log arguments been decided correctly?:
N/A
Will this change significantly affect our spending on metrics or logs?:
N/A
How would I tell that this PR does not work in production? (monitors, etc.):
N/A
If this PR does not work as expected, how do I fix that state? Would rollback be straightforward?:
recall and rollback
If the above plan is more complex than “recall and rollback”, please tag the support PoC here (if it is the end of the week, tag both the current and next PoC):
N/A
Scale
Would this PR be expected to pose a risk at scale? Think of the shopping product at our largest stack.:
No
Would this PR be expected to perform a large number of database calls, and/or expensive database calls (e.g., row range scans, concurrent CAS)?:
No
Would this PR ever, with time and scale, become the wrong thing to do - and if so, how would we know that we need to do something differently?:
No
Development Process
Where should we start reviewing?:
Cell.java
If this PR is in excess of 500 lines excluding versions lock-files, why does it not make sense to split it?:
N/A