Update to arrow 26, change timezones #4039
@@ -219,7 +219,7 @@ pub fn return_type(
 }
 BuiltinScalarFunction::Now => Ok(DataType::Timestamp(
 TimeUnit::Nanosecond,
-Some("UTC".to_owned()),
+Some("+00:00".to_owned()),
This change is necessary because by default (without the chrono-tz feature) arrow-rs doesn't know how to interpret the "UTC" timezone. With the recent improvements to timezone handling, it now complains (it prints "Unknown Timezone" when rendering the timestamps). To be clear, this is an improvement over silently ignoring the timezone, which it did before.
FYI @waitingkuo
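To illustrate the point above: a fixed offset such as "+00:00" can be interpreted with plain string parsing, while a named zone such as "UTC" needs a timezone database to resolve. The sketch below uses only the Rust standard library. The function `parse_fixed_offset` is a hypothetical name made up for this illustration and is not arrow-rs code; it only mimics the behaviour described in the comment.

```rust
/// Hypothetical helper (not part of arrow-rs): parse a fixed-offset
/// timezone string of the form "+HH:MM" or "-HH:MM" into seconds east
/// of UTC. Anything else, including named zones such as "UTC", returns
/// None because resolving it would need a timezone database.
fn parse_fixed_offset(tz: &str) -> Option<i32> {
    let bytes = tz.as_bytes();
    // Expect exactly six bytes: sign, two digits, ':', two digits.
    if bytes.len() != 6 || bytes[3] != b':' {
        return None;
    }
    let sign = match bytes[0] {
        b'+' => 1,
        b'-' => -1,
        _ => return None,
    };
    // Reject anything that is not a plain ASCII digit in the HH and MM slots.
    if !bytes[1..3].iter().all(u8::is_ascii_digit)
        || !bytes[4..6].iter().all(u8::is_ascii_digit)
    {
        return None;
    }
    let hours: i32 = tz.get(1..3)?.parse().ok()?;
    let minutes: i32 = tz.get(4..6)?.parse().ok()?;
    if hours > 23 || minutes > 59 {
        return None;
    }
    Some(sign * (hours * 3600 + minutes * 60))
}

fn main() {
    for tz in ["+00:00", "UTC"] {
        match parse_fixed_offset(tz) {
            Some(secs) => println!("{}: offset of {} seconds", tz, secs),
            // Mirrors the "Unknown Timezone" message mentioned above.
            None => println!("{}: Unknown Timezone", tz),
        }
    }
}
```

Under these assumptions, switching the return type of `now()` to "+00:00" keeps the same instant semantics while avoiding the dependency on a timezone database.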
I basically applied this same pattern to a few other tests and now things look great 👍
I'm getting failures in
Somehow the implementation spilled memory, because it believes memory usage exceeded the limit. Maybe some change in arrow 26 is not shrinking the buffer somewhere?
@tustvold @Dandandan it seems like the limit is never applied for multi-column sorts when
Ah, great find. @tustvold might it be worth blocking the RC to fix this issue?
Aah, nice find, will work around it. Given it can easily be worked around by slicing the returned array, I don't think we need to block the RC.
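The workaround mentioned above (slicing the result when the sort does not honour its limit) can be sketched with plain vectors. Everything here is an assumption for illustration: `lexsort_indices_no_limit` is a hypothetical stand-in for a multi-column sort that ignores its limit argument, and `lexsort_indices_with_limit` is the wrapper that applies the limit afterwards. Neither is the arrow-rs API.

```rust
/// Hypothetical stand-in for a two-column sort that returns row indices
/// but ignores the requested limit (the behaviour reported above).
/// Assumes both columns have the same length.
fn lexsort_indices_no_limit(a: &[i32], b: &[i32], _limit: Option<usize>) -> Vec<usize> {
    let mut indices: Vec<usize> = (0..a.len()).collect();
    // Sort by the first column, breaking ties with the second.
    indices.sort_by_key(|&i| (a[i], b[i]));
    indices
}

/// Workaround: call the sort, then truncate the returned indices so the
/// caller never sees more than `limit` rows.
fn lexsort_indices_with_limit(a: &[i32], b: &[i32], limit: Option<usize>) -> Vec<usize> {
    let mut indices = lexsort_indices_no_limit(a, b, limit);
    if let Some(n) = limit {
        // truncate is a no-op when n >= indices.len().
        indices.truncate(n);
    }
    indices
}

fn main() {
    let a = [2, 1, 1, 3];
    let b = [0, 9, 5, 0];
    let top_two = lexsort_indices_with_limit(&a, &b, Some(2));
    println!("{:?}", top_two);
}
```

Truncating after the fact still pays for the full sort, so this only bounds the size of the output, which is the part that affects downstream memory accounting.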
First thing I recommend we do is to file an arrow-rs ticket explaining the issue and then we can discuss a new RC based on that issue. Doing that now
Filed apache/arrow-rs#2990 -- will work on that shortly
Proposed fix apache/arrow-rs#2991
I created a new release candidate for arrow 26.0.0
Arrow 26 has been released -- apache/arrow-rs#2953
I plan on creating DataFusion 14.0.0 rc1 once this is merged.
I am working on updating this PR and polishing it for merge
let arr_micros = TimestampMicrosecondArray::from_opt_vec(ts_micros, None);
let arr_millis = TimestampMillisecondArray::from_opt_vec(ts_millis, None);
let arr_secs = TimestampSecondArray::from_opt_vec(ts_secs, None);
let arr_nanos = TimestampNanosecondArray::from(ts_nanos);
❤️
@@ -237,25 +235,26 @@ fn timestamp_nano_ts_none_predicates() -> Result<()> {
 // a scan should have the now()... predicate folded to a single
 // constant and compared to the column without a cast so it can be
 // pushed down / pruned
-let expected = "Projection: test.col_int32\n Filter: test.col_ts_nano_utc < TimestampNanosecond(1666612093000000000, Some(\"UTC\"))\
-\n TableScan: test projection=[col_int32, col_ts_nano_none]";
+let expected =
Turns out this also fixes #3938 🎉
Sorry @alamb, I merged a PR that created a small conflict here
No worries -- I just fixed it
Benchmark runs are scheduled for baseline = 4a67d0d and contender = dd081d6. dd081d6 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
🚀 FYI @Ted-Jiang
* Update to arrow 26
* Update datafusion-cli
* Update datafusion-cli lockfile
* More test fixes
* Re-add dyn_cmp_dict dev-dependencies
* Update datafusion-cli lock
* fix: Update tests
* Update datafusion optimizer tests
* Fix test
Co-authored-by: Andrew Lamb <[email protected]>
Which issue does this PR close?
re apache/arrow-rs#2953
Rationale for this change
Get latest arrow
Fixes #3938
What changes are included in this PR?
Are there any user-facing changes?