add 'TypeStrategy' to types #11888

clintropolis · 2021-11-08T09:11:52Z

Description

Following up on discussion in #11853, this PR introduces a new TypeStrategy interface which defines binary serde and comparison for values of any TypeSignature.

The binary serialization methods were consolidated from the static methods in Types, and the code I think is overall much more simple.

TypeStrategy deals in objects, so is not optimal for primitive values like longs and doubles, so the static methods for reading/writing these values with nulls remain but now live in TypeStrategies, along with all of the built-in implementations (string, long, double, float, array).

The comparator implementations match the orderings defined in druid-processing... we may wish to consider bringing some of them into druid-core, merging druid-core and druid-processing , or something else to try to consolidate things a bit better. The array comparator behaves mostly the same as postgres array comparison afaict, with the exception that we default to "nulls first" for most thing rather than "nulls last" so we do that instead.

ObjectByteStrategy is moved back into ObjectStrategy, it's registry replaced with one of TypeStrategy, and instead ComplexMetricsSerde has a new default implementation of a method called getTypeStrategy which wraps the ObjectStrategy methods, but could be overridden for more optimal implementations.

I think the utility of being able to get binary serde to use for things like buffer aggregators is on full display in the ExprEval serialization methods, so I think overall i'm in favor of this change, since basically anywhere you have a ColumnInspector (which includes RowSignature) you have the means to read, write, and compare values.

Key changed/added classes in this PR

TypeStrategy
TypeStrategies
Types
NullableTypeStrategy
ComplexMetricsSerde
ColumnType
ColumnTypeFactory
ExpressionType

This PR has:

been self-reviewed.
- using the concurrency checklist (Remove this item if the PR doesn't have any relation to concurrency.)
added documentation for new or modified features or behaviors.
added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
added or updated version, license, or notice information in licenses.yaml
added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
added integration tests.
been tested in a test Druid cluster.

… TypeSignature

cheddar · 2021-11-09T03:09:53Z

core/src/main/java/org/apache/druid/segment/column/TypeStrategy.java

+   * the values maximum size. If the value is null, the size will be {@link Byte#BYTES}, otherwise it will be
+   * {@link Byte#BYTES} + {@link #estimateSizeBytes(Object)}
+   */
+  default int estimateSizeBytesNullable(@Nullable T value)


I'm not sure about hand the null handling over to the implementation here. The strategy as I understand it is using a whole byte for whether there is a null value. A byte for whether it is null consumes 8x the space than it really needs, this can actually add up in unfortunate ways when there are lots of columns in the result set.

For example, a "better" implementation would be to start every row with a bitmap of all of the columns that are null. This would consume only a single bit per column rather than a byte per column and, if we were so inclined, could also be properly padded to try to enforce word-alignment.

Having an implementation that does it with a whole byte as a stepping stone is maybe okay. But, with the way this interface is built, the interface is forcing rather sub-optimal null-handling on the query engines that use this interface. I think it would be better to perhaps declare that a size of -1 means that the value is equivalent to null and have the thing external from this do something meaningful with that knowledge.

We would likely also need to adjust the signature of write to return an int to indicate the number of bytes written (once again, returning a -1 for "it was null")

Thanks, I agree, I pulled the null-byte wrapper handling into a special NullableTypeStrategy which can be used to wrap regular TypeStrategy implementations (which are now not expected to process nulls), and added lots of javadocs to suggest not using it unless you've got space to burn or no other options, so hopefully it will limit use.

gianm · 2021-11-17T00:04:36Z

Some high level thoughts:

estimateSizeBytesNullable appears to be the only reference to nullability. The other methods say they deal with non-null data. That seems inconsistent. Here I think I'm in favor of what @cheddar is suggesting, that this interface should not be null-aware, and null handling should be left to the caller. (Callers shouldn't need to use this interface if they get a null.)
To simplify life for callers that don't want to care about null handling, we could make a NullableTypeStrategy that accepts nulls and stores a null byte. It'd be composable with all the other TypeStrategy implementations, and lets certain callers stay simple while also letting other callers be fancy.
I'm confused about the intent of estimateSizeBytes: is it meant to be a maximum size? Like, can the estimate be an underestimate? Some of the javadocs and comments seem to suggest that is the case. If it's meant to be a max, we should call it specifically "max" instead of "estimate".
I'm not sure if anything could use it today, but, we should design this interface to support variable size serialization lengths, since we'll want it for various future data types. Those could work by having write take a max length, and returning -1 if it would need more than that. Then the caller could allocate more space and call write again. Or, maybe, write could return -the_amount_of_space_it_needs. In this case the "estimate" method really could return an estimate rather than a max.
Even with variable size support like (4) suggest, we will still have callers that can't deal with that. So there will need to be some API that supports those callers. I'm not sure what that'd look like; maybe a getMaxBytes method that returns the max, or -1 if there really isn't a known max? And if the max is -1, then callers that only support constant size serialization can't use the API?

clintropolis · 2021-11-29T21:34:30Z

estimateSizeBytesNullable appears to be the only reference to nullability. The other methods say they deal with non-null data. That seems inconsistent. Here I think I'm in favor of what @cheddar is suggesting, that this interface should not be null-aware, and null handling should be left to the caller. (Callers shouldn't need to use this interface if they get a null.)

Oops, this was a leftover from the first round of changes from @cheddar 's comment where I moved most of the null byte handling stuff out of TypeStrategy itself, but forgot to move this one out. Since then, I've moved all of the null byte handling into the special NullableTypeStrategy that should be used for callers that want to use the null byte from NullHandling, along with lots of javadocs to encourage other options if available since the byte per value is quite expensive.

To simplify life for callers that don't want to care about null handling, we could make a NullableTypeStrategy that accepts nulls and stores a null byte. It'd be composable with all the other TypeStrategy implementations, and lets certain callers stay simple while also letting other callers be fancy.

Did this, as well as adding a getNullableStrategy to TypeSignature that will automatically wrap the output of getStrategy with the NullableTypeStrategy

I'm confused about the intent of estimateSizeBytes: is it meant to be a maximum size? Like, can the estimate be an underestimate? Some of the javadocs and comments seem to suggest that is the case. If it's meant to be a max, we should call it specifically "max" instead of "estimate".

Before the most recent set of changes, estimate was actually being used to get the exact size of some value in bytes, in order to check that variably sized values did not exceed this maximum. This has been reworked as suggested in 4) to instead pass maxSizeBytes directly into the write family of methods, and now has been relaxed to only needing to be an estimate. The only thing still using it in the current PR is the heap based ExpressionLambdaAggregator, which uses it to check that the aggregator does not explode the heap in an unbounded way.

I'm not sure if anything could use it today, but, we should design this interface to support variable size serialization lengths, since we'll want it for various future data types. Those could work by having write take a max length, and returning -1 if it would need more than that. Then the caller could allocate more space and call write again. Or, maybe, write could return -the_amount_of_space_it_needs. In this case the "estimate" method really could return an estimate rather than a max.

Variably sized serialization was previously already supported (but required externally verifying that there was enough space with the estimate method). But, I liked the ideas here, so I have modified the write methods to now accept a maxSizeBytes parameter and went with -the_amount_of_extra_space_it_needs approach if this value (or buffer size) is exceeded, so that callers could choose to explode if the value was not able to be completely written, as well as know how much additional space would be required to fully write the value.

Even with variable size support like (4) suggest, we will still have callers that can't deal with that. So there will need to be some API that supports those callers. I'm not sure what that'd look like; maybe a getMaxBytes method that returns the max, or -1 if there really isn't a known max? And if the max is -1, then callers that only support constant size serialization can't use the API?

Variably sized support is in place, I'm not sure we need a getMaxBytes method as attempting to call write now requires specifying a max bytes, even for constant sized values, callers with constant sized serialization can either specify the correct size or some size larger than the actual size, which is sort of ugly, but also primitive numerics shouldn't really be using TypeStrategy directly, instead they should be using the static methods to avoid converting the java primitives, so maybe it isn't a huge deal.

cryptoe · 2021-12-01T12:27:20Z

core/src/main/java/org/apache/druid/segment/column/TypeStrategies.java

+    @Override
+    public int compare(Object[] o1, Object[] o2)
+    {
+      final int iter = Math.max(o1.length, o2.length);


should this be math.min as might have a scenario we can get indexOutofBoundException no ?

oops - yes, good catch, will add some tests since there obviously aren't any 😅

gianm · 2021-11-12T01:01:08Z

core/src/main/java/org/apache/druid/segment/column/TypeStrategy.java

+    final int oldPosition = buffer.position();
+    buffer.position(offset);
+    T value = read(buffer);
+    buffer.position(oldPosition);


Consider two changes:

Using try/finally so the buffer position is still valid if read throws an exception

Updating the javadoc to say that even though the buffer position is unchanged upon method exit, it may be changed temporarily while the method is running, so it isn't concurrency-safe.

gianm · 2021-11-12T01:03:42Z

core/src/main/java/org/apache/druid/segment/column/TypeStrategy.java

+    buffer.position(offset);
+    write(buffer, value);
+    final int size = buffer.position() - offset;
+    buffer.position(oldPosition);


Consider two changes:

Using try/finally so the buffer position is still valid if write throws an exception

Updating the javadoc to say that even though the buffer position is unchanged upon method exit, it may be changed temporarily while the method is running, so it isn't concurrency-safe.

gianm · 2021-11-30T02:55:17Z

core/src/main/java/org/apache/druid/segment/column/TypeStrategy.java

+  /**
+   * Estimate the size in bytes that writing this value to memory would require. This method is not required to be
+   * exactly correct, but many implementations might be. Implementations should err on the side of over-estimating if
+   * exact sizing is not efficient


I'd include something here about how this estimate is expected to be used. (Like an example)

That'll help people understand when they should use this method, and also understand how to implement it. I'm suggesting this because you gotta be careful with "estimate" methods: people have wildly differing ideas of how accurate they need to be, so it pays to be specific in the doc comments.

added to javadocs to better explain usage

gianm · 2021-12-02T13:07:23Z

core/src/main/java/org/apache/druid/segment/column/TypeStrategy.java

+
+
+  /**
+   * Read a non-null value from the {@link ByteBuffer} at the current {@link ByteBuffer#position()}. This will move


"non-null" isn't true for the NullableTypeStrategy impl, which is still a TypeStrategy, so this method contract should be adjusted.

split NullableTypeStrategy out to be standalone so the contract isn't confusing, it wasn't really used outside of tests as a TypeStrategy so it doesn't seem like a loss and there is no longer ambiguity of whether or not nulls are handled so i think is an improvement

gianm · 2021-12-02T13:10:15Z

core/src/main/java/org/apache/druid/segment/column/TypeStrategy.java

+   * Write a non-null value to the {@link ByteBuffer} at position {@link ByteBuffer#position()}. This will move the
+   * underlying position by the size of the value written, and returns the number of bytes written.
+   *
+   * If writing the value would take more than 'maxSizeBytes' (or the buffer limit, whichever is smaller), this method


Should we spec this to be an error if maxSizeBytes is greater than buffer.remaining()? Seems like a weird thing for a caller to do. Or is there a reason a caller might do that?

i had it this way so that callers that don't know or care about size can pass in an arbitrary value like Integer.MAX_VALUE, but it didn't seem very useful so i've made it an exception now, the old behavior was a bit strange coupled with the negative size written return values for when write doesn't have enough room since the caller knows that they definitely need to add that many bytes to maxSizeBytes to have enough space to complete the write

gianm · 2021-12-02T13:13:28Z

core/src/main/java/org/apache/druid/segment/column/TypeStrategy.java

+   * If writing the value would take more than 'maxSizeBytes' (or the buffer limit, whichever is smaller), this method
+   * will return a negative value indicating the number of additional bytes that would be required to fully write the
+   * value. Partial results may be written to the buffer when in this state, and the position may be left at whatever
+   * point the implementation ran out of space while writing the value.


Does this mean the caller is expected to save the position? Or is the caller safe to assume it can roll back by -retval?

i.e., should callers do this?

int written = write(buf, value, maxBytes); if (written < 0) { buf.position(buf.position() + written); // roll back buf, undoing the write }

or this?

int oldPosition = buf.position(); int written = write(buf, value, maxBytes); if (written < 0) { buf.position(oldPosition); // roll back buf, undoing the write }

the latter, caller should save the position before writing to handle an incomplete write, added javadocs to clarify

gianm · 2021-12-02T13:21:21Z

core/src/main/java/org/apache/druid/segment/column/TypeStrategy.java

+ * to set and reset buffer positions as appropriate for the offset based methods, but may be overridden if a more
+ * optimized implementation is needed.
+ */
+public interface TypeStrategy<T> extends Comparator<T>


Will stuff serialized with this interface be persisted to disk, sent between servers that might be running different versions of code, used in caches, etc? Or will it always be in-memory, single-server, & ephemeral?

Implementers are going to want to know the answer, so they can design their formats accordingly. (If persistence is required then they might want to add a version byte, avoid using native endianness, etc.)

added more javadocs indicating that this is primarily for transient stuffs, and if you want persistence should bring your own version and endian tracking and stuff

jihoonson · 2022-01-06T05:28:05Z

core/src/main/java/org/apache/druid/segment/column/TypeStrategies.java

+   *
+   * @return number of bytes written (always 9)
+   */
+  public static int writeNullableLong(ByteBuffer buffer, int offset, long value)


nit: should it be called writeLongWithNonNullMark or something rather than NullableLong? It reads like the long parameter can be null.

changed to readNotNullNullable.../writeNotNullNullable... to reduce confusion

jihoonson · 2022-01-06T05:52:13Z

core/src/main/java/org/apache/druid/segment/column/TypeStrategy.java

+   *
+   * @return number of bytes written
+   */
+  int write(ByteBuffer buffer, T value, int maxSizeBytes);


Should we document in the javadoc that this method can explode when maxSizeBytes > buffer.remaining()?

jihoonson · 2022-01-06T06:12:17Z

processing/src/main/java/org/apache/druid/segment/column/ObjectStrategyComplexTypeStrategy.java

+  @Override
+  public int write(ByteBuffer buffer, T value, int maxSizeBytes)
+  {
+    byte[] bytes = objectStrategy.toBytes(value);


Should this method also call TypeStrategies.checkMaxSize() first?

jihoonson

+1 after CI. Thank you @clintropolis!

clintropolis · 2022-01-11T01:11:22Z

thanks for review all 👍

* add TypeStrategy - value comparators and binary serialization for any TypeSignature (cherry picked from commit e583033)

clintropolis added 4 commits November 8, 2021 00:36

add TypeStrategy - value comparators and binary serialization for any…

2ac6654

… TypeSignature

cast expression buffer agg

5f72bc6

more javadoc

58cbbeb

better name

01c245b

clintropolis added the Area - Querying label Nov 8, 2021

dont cast unless different type

683354a

cheddar reviewed Nov 9, 2021

View reviewed changes

clintropolis added 3 commits November 10, 2021 02:17

Merge remote-tracking branch 'upstream/master' into type-strategy

ee0bd59

move nullable out of strategy, javadoc warnings

82ded8b

better name

4e5787c

clintropolis added 3 commits November 23, 2021 03:23

Merge remote-tracking branch 'upstream/master' into type-strategy

1f3ca49

Merge remote-tracking branch 'upstream/master' into type-strategy

2092e06

more better

6336ff5

cryptoe reviewed Dec 1, 2021

View reviewed changes

fix array comparator, add type strategy comparator tests

03d8f6a

gianm reviewed Dec 2, 2021

View reviewed changes

clintropolis added 3 commits December 3, 2021 12:01

Merge remote-tracking branch 'upstream/master' into type-strategy

da18dd0

split nullable type strategy to standalone type, more javadocs

64ac2b0

max bytes larger than the buffer limit is now an exception

5366575

clintropolis mentioned this pull request Dec 23, 2021

Grouping on arrays as arrays #12078

Merged

9 tasks

Merge remote-tracking branch 'upstream/master' into type-strategy

abef8fb

jihoonson reviewed Jan 6, 2022

View reviewed changes

clintropolis added 3 commits January 6, 2022 00:00

review stuff

ddb1676

remove assert

98933e9

cache strategy stuff on BaseTypeSignature

ea23aa7

jihoonson approved these changes Jan 7, 2022

View reviewed changes

clintropolis merged commit e583033 into apache:master Jan 11, 2022

clintropolis deleted the type-strategy branch January 11, 2022 01:12

clintropolis mentioned this pull request Jan 13, 2022

fix array type strategy write size tracking #12150

Merged

3 tasks

abhishekagarwal87 added this to the 0.23.0 milestone May 11, 2022

sachinsagare pushed a commit to pinterest/druid that referenced this pull request Oct 28, 2022

add 'TypeStrategy' to types (apache#11888)

de3ed8b

* add TypeStrategy - value comparators and binary serialization for any TypeSignature (cherry picked from commit e583033)

sachinsagare pushed a commit to sachinsagare/druid that referenced this pull request Nov 2, 2022

add 'TypeStrategy' to types (apache#11888)

ccf3b33

* add TypeStrategy - value comparators and binary serialization for any TypeSignature (cherry picked from commit e583033)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

add 'TypeStrategy' to types #11888

add 'TypeStrategy' to types #11888

clintropolis commented Nov 8, 2021 •

edited

Loading

cheddar Nov 9, 2021

clintropolis Nov 29, 2021

gianm commented Nov 17, 2021

clintropolis commented Nov 29, 2021

cryptoe Dec 1, 2021 •

edited

Loading

clintropolis Dec 1, 2021

gianm Nov 12, 2021

gianm Nov 12, 2021

gianm Nov 30, 2021

clintropolis Dec 7, 2021

gianm Dec 2, 2021

clintropolis Dec 7, 2021

gianm Dec 2, 2021

clintropolis Dec 7, 2021

gianm Dec 2, 2021 •

edited

Loading

clintropolis Dec 7, 2021

gianm Dec 2, 2021

clintropolis Dec 7, 2021

jihoonson Jan 6, 2022

clintropolis Jan 6, 2022

jihoonson Jan 6, 2022

jihoonson Jan 6, 2022

jihoonson left a comment

clintropolis commented Jan 11, 2022



		/**
		* Read a non-null value from the {@link ByteBuffer} at the current {@link ByteBuffer#position()}. This will move

add 'TypeStrategy' to types #11888

add 'TypeStrategy' to types #11888

Conversation

clintropolis commented Nov 8, 2021 • edited Loading

Description

Key changed/added classes in this PR

Choose a reason for hiding this comment

Choose a reason for hiding this comment

gianm commented Nov 17, 2021

clintropolis commented Nov 29, 2021

cryptoe Dec 1, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

gianm Dec 2, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jihoonson left a comment

Choose a reason for hiding this comment

clintropolis commented Jan 11, 2022

clintropolis commented Nov 8, 2021 •

edited

Loading

cryptoe Dec 1, 2021 •

edited

Loading

gianm Dec 2, 2021 •

edited

Loading