Replies: 8 comments 19 replies
-
CC @mbasmanova |
Beta Was this translation helpful? Give feedback.
-
Do you mean by "making TimestampWithTimezone primitive" that it would stop using the type extensibility framework? In general, we try only adding to the main library the types (but also functions, aggregates, operators) that are sufficiently generic to be used by more than one system integrating with Velox. Is the TimestampWithTimezone type Presto-specific? |
Beta Was this translation helpful? Give feedback.
-
It might be worth revisiting the decision to store timestamps with nanosecond precision. These are not supported by Presto or Spark and Substrait also supports only microsecond precision. https://substrait.io/types/type_classes/
|
Beta Was this translation helpful? Give feedback.
-
@kagamiori, @mbasmanova, we should prioritize this work since that will simplify Parquet Reader timestamp and timestamptz support. #4680 |
Beta Was this translation helpful? Give feedback.
-
+1. Please describe the changes you are planning to make and we can take it further. |
Beta Was this translation helpful? Give feedback.
-
@mbasmanova Don't we want to fix the TIMESTAMP scalar type in Velox as well in this fashion? That was my understanding. |
Beta Was this translation helpful? Give feedback.
-
I also just noticed that the Presto Timestamp With Timezone says
But none of the Hive file formats support a TimezoneTz type where the timezone is part of the value. |
Beta Was this translation helpful? Give feedback.
-
FYI, there is a correctness issue for queries that use this type: #10338 |
Beta Was this translation helpful? Give feedback.
-
TimestampWithTimeZone
is a primitive type in Presto. In contrast, it is implemented as aRow<int64_t, int16_t>
in Velox and hence treated as a complex type implicitly. This discussion aims at exploring the possibility of making TimestampWithTimezone in Velox a primitive type and the tradeoff of this potential change.Motivation
Making TimestampWithTimezone a primitive type brings better consistency between Velox and Presto, and reduces the chance of errors. For example, a TimestampWithTimezone value can be either NULL or non-NULL. But with the current implementation as
Row<int64_t, int16_t>
, it is possible to create a TimestampWithTimezone vector where only the first or the second field of a value is NULL. This is unexpected and likely not handled by existing UDFs.Changes needed
To make TimestampWithTimezone primitive, we need to redefine the type for it. After that, we need to update datetime functions that have TimestampWithTimezone argument or return types. We also need to add support for serializing and deserializing TimestampWithTimezone vectors.
Beta Was this translation helpful? Give feedback.
All reactions