Predicates on to_timestamp do not work as expected with "naive" timestamp strings #765
Comments
The core of the problem is that
So that certainly suggests we should not be applying any normalization to timestamps if there is no specific timezone set. Instead, we should return the raw "naive" timestamp (which corresponds to the arrow semantics for

Now this leaves open the question of "what do we do if the timestamp has an explicit timezone in it"? For example,

If the desired output timezone is

I feel this is very similar to the question that @velvia was getting at in #686. I will continue the conversation there.

#686 (comment) is the proposal for handling timezones more properly.
@alamb I think I'm in UTC+8. Here's what I have:

❯ select to_timestamp('2021-07-20 23:29:30');
+------------------------------------------+
| totimestamp(Utf8("2021-07-20 23:29:30")) |
+------------------------------------------+
| 2021-07-20 15:29:30                      |
+------------------------------------------+
1 row in set. Query took 0.001 seconds.

I assume that you are in a negative time zone, so that the resulting timestamp is larger.
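The eight-hour shift in the output above can be checked by hand. The following is a minimal sketch of that arithmetic using only the standard library; the function names are made up for illustration and are not part of DataFusion or arrow-rs, and the offset is passed in explicitly rather than read from the system.

```rust
/// Shift a local wall-clock time of day (seconds since midnight) to UTC,
/// given a fixed UTC offset in seconds. The result wraps around midnight.
/// Hypothetical helper, for illustration only.
fn local_to_utc_secs_of_day(local_secs: i64, utc_offset_secs: i64) -> i64 {
    (local_secs - utc_offset_secs).rem_euclid(86_400)
}

/// Format seconds since midnight as HH:MM:SS.
fn format_hms(secs: i64) -> String {
    format!("{:02}:{:02}:{:02}", secs / 3600, (secs % 3600) / 60, secs % 60)
}
```

Calling `format_hms(local_to_utc_secs_of_day(23 * 3600 + 29 * 60 + 30, 8 * 3600))` yields `15:29:30`, which matches the value printed for a machine in UTC+8.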
This is where the offset comes from (https://github.com/apache/arrow-rs/blob/e2bf158946e5d81912bc9166d87b86f0ad442afb/arrow/src/compute/kernels/cast_utils.rs#L135-L155):

/// Interprets a naive_datetime (with no explicit timezone offset)
/// using the local timezone and returns the timestamp in UTC (0
/// offset)
fn naive_datetime_to_timestamp(naive_datetime: &NaiveDateTime) -> i64 {
    // Note: Use chrono APIs that are different than
    // naive_datetime_to_timestamp to compute the utc offset to
    // try and double check the logic
    let utc_offset_secs = match Local.offset_from_local_datetime(naive_datetime) {
        LocalResult::Single(local_offset) => {
            local_offset.fix().local_minus_utc() as i64
        }
        _ => panic!("Unexpected failure converting to local datetime"),
    };
    let utc_offset_nanos = utc_offset_secs * 1_000_000_000;
    naive_datetime.timestamp_nanos() - utc_offset_nanos
}
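To make the effect of the final subtraction easier to see, here is a simplified stand-in that takes the UTC offset as a parameter instead of reading it from the machine's local timezone. It is only a sketch: `naive_to_utc_nanos` is a hypothetical name, and it deliberately leaves out the chrono lookup used in the quoted function.

```rust
const NANOS_PER_SEC: i64 = 1_000_000_000;

/// Hypothetical stand-in for the quoted logic: treat `naive_nanos` as a
/// local wall-clock value and subtract the given UTC offset to get UTC.
/// The offset is an explicit argument here (an assumption made for
/// illustration), not something looked up from the system.
fn naive_to_utc_nanos(naive_nanos: i64, utc_offset_secs: i64) -> i64 {
    naive_nanos - utc_offset_secs * NANOS_PER_SEC
}
```

With a positive offset such as `8 * 3600` the result is smaller than the naive value, with a negative offset it is larger, and with an offset of `0` it is unchanged. The same input string therefore produces a different timestamp depending on where the query runs.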
@alamb would you mind helping check whether this is the root cause?

@waitingkuo -- it certainly sounds plausible
Describe the bug
Given a TimestampNanosecondArray which pretty-prints as follows:

Queries involving a predicate such as time < to_timestamp('2021-07-20 23:29:30') do not filter any rows (even though they should filter the row with 2021-07-20 23:30:30).

To Reproduce
Expected behavior
The query should produce a single row with timestamp 2021-07-20 23:28:50. However, the actual query returns both rows.
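The expected and actual results can be modeled with plain integers. This is a rough sketch rather than the DataFusion code path: times are expressed as seconds since 2021-07-20 00:00:00 to keep the numbers readable, `hms` and `filter_before` are hypothetical helpers, and the four-hour shift in the usage note is an assumed example offset, since the reporter's timezone is not stated.

```rust
/// Seconds since 2021-07-20 00:00:00 for a wall-clock time on that day.
/// Used in place of full epoch values to keep the arithmetic simple.
fn hms(h: i64, m: i64, s: i64) -> i64 {
    h * 3600 + m * 60 + s
}

/// Keep only the rows that satisfy `time < threshold`.
fn filter_before(rows: &[i64], threshold: i64) -> Vec<i64> {
    rows.iter().copied().filter(|&t| t < threshold).collect()
}
```

For rows `[hms(23, 28, 50), hms(23, 30, 30)]` and the literal `hms(23, 29, 30)`, filtering against the unshifted literal keeps only the 23:28:50 row, which is the expected behavior. If the literal is first pushed later by an assumed four hours (`hms(23, 29, 30) + 4 * 3600`, as would happen with a local offset of UTC-4), both rows pass the predicate, which is the behavior reported here.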
Additional context
We saw this in IOx: https://github.com/influxdata/influxdb_iox/issues/2071