Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-14442: [R] fix behaviour when converting timestamps with "" as tzone #12240

Closed
wants to merge 56 commits into from
Closed
Show file tree
Hide file tree
Changes from 54 commits
Commits
Show all changes
56 commits
Select commit Hold shift + click to select a range
eaf2fe9
added failing tests for timestamps
dragosmg Jan 24, 2022
d7fee58
a more robust test
dragosmg Jan 24, 2022
9f6e2ba
added with and without tz examples
dragosmg Jan 25, 2022
eba6e74
add the timezone if missing
dragosmg Jan 25, 2022
ec5e3a3
simplify unit test
dragosmg Jan 26, 2022
2dd042a
import rlang::inform
dragosmg Jan 26, 2022
4cf3f4a
add rlang::import to namespace
dragosmg Jan 26, 2022
3d9534e
account for POSIXct vectors without the `"tzone"` attribute
dragosmg Jan 26, 2022
f38a293
use `missing()` instead of `is.null()` to check for a missing attribute
dragosmg Jan 26, 2022
0238aa5
add a test for message and do not use `missing()` to check for missin…
dragosmg Jan 26, 2022
1a5fe71
use `any()` instead of `|` when looking for a missing `"tzone"` attri…
dragosmg Jan 26, 2022
2a2065a
remove tests for array from POSIXct without timezone
dragosmg Jan 26, 2022
69df927
no longer inform
dragosmg Jan 26, 2022
135af60
remove message expectation
dragosmg Jan 26, 2022
ea4ef87
remove inaccurate comment
dragosmg Jan 26, 2022
d9aa10a
additional unit tests for POSIXct -> array
dragosmg Jan 26, 2022
0bd4eab
switch test to `"Pacific/Marquesas"` as improbable timezone
dragosmg Jan 26, 2022
e1b1199
remove handling of missing or empty tzone from Array
dragosmg Feb 1, 2022
5466d39
changed rare timezone in tests to `"Pacific/Marquesas"`
dragosmg Feb 1, 2022
2742681
added systzone symbol
dragosmg Feb 1, 2022
a1540dc
update the definition of `InferArrowTypeFromVector`
dragosmg Feb 1, 2022
05ebbb9
update the type infer from `REALSXP` too
dragosmg Feb 1, 2022
463bce3
found what looks to be the right incantation for calling R functionality
dragosmg Feb 1, 2022
388ff7e
stop going through `symbols::` for `Sys.timezone()`
dragosmg Feb 1, 2022
79460d1
call `Sys.timezone()` directly
dragosmg Feb 1, 2022
6b27059
testing `TZ = NA` and `TZ = NULL` both result in a `NULL` `"tzone"` a…
dragosmg Feb 2, 2022
6a633b9
skip 1 test on windows
dragosmg Feb 4, 2022
09f3382
skip test 2 on windows
dragosmg Feb 4, 2022
5d455a9
skip test 3 on windows
dragosmg Feb 4, 2022
1816861
skip test 4 & 5 on windows
dragosmg Feb 4, 2022
bf5b374
skip tests 6, 7, 8, 9 & 10 on windows
dragosmg Feb 4, 2022
f9552f4
skip tests 11, 12, 13, and 14 on windows
dragosmg Feb 4, 2022
b0ffa4f
checking to see if `ignore_attr` helps when comparing objects
dragosmg Feb 4, 2022
68ccb24
ignore attributes only on windows
dragosmg Feb 4, 2022
8bac047
one more ignore attributes on windows
dragosmg Feb 4, 2022
e3f6a5b
defined and used `on_windows()` to selectively ignore attributes in e…
dragosmg Feb 4, 2022
28d193f
ignoring attributes for some more of the failing tests
dragosmg Feb 4, 2022
528647a
some more attributes ignored
dragosmg Feb 4, 2022
5638c6b
ignoring more attribute checks on windows
dragosmg Feb 4, 2022
166c708
un-skip tests on win
dragosmg Feb 7, 2022
6ca5a4e
ignore attributes on windows for yday extractions
dragosmg Feb 8, 2022
41b580f
added a bunch of TODOs related to ARROW-13168
dragosmg Feb 22, 2022
daac293
trying something
dragosmg Feb 22, 2022
0415048
skip 1 test + datetime column OS aware
dragosmg Feb 22, 2022
d543415
try Jon's suggestion
dragosmg Feb 22, 2022
3343026
some cleanup
dragosmg Mar 30, 2022
fa510ed
more cleanup
dragosmg Mar 30, 2022
d9d1d15
one more
dragosmg Mar 30, 2022
f472103
another one
dragosmg Mar 30, 2022
c071d07
Merge branch 'master' into timestampts_missing_timezone
dragosmg Mar 30, 2022
876c483
removed the `on_windows()` skip helper
dragosmg Mar 31, 2022
5903fb2
use `withr::with_timezone()`
dragosmg Mar 31, 2022
de58c8d
erroring with `with_timezone("" ...)` instead of `with_envvar(c(TZ = …
dragosmg Apr 1, 2022
e31bf0a
add test with a different timezone
dragosmg Apr 1, 2022
e5039a8
removed `rlang::inform()` import
dragosmg Apr 4, 2022
fb19ce8
switched back to `withr::with_envvar(c(TZ = ""))` instead of `withr::…
dragosmg Apr 4, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion r/R/arrow-package.R
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,7 @@
#' @importFrom rlang eval_tidy new_data_mask syms env new_environment env_bind set_names exec
#' @importFrom rlang is_bare_character quo_get_expr quo_get_env quo_set_expr .data seq2 is_interactive
#' @importFrom rlang expr caller_env is_character quo_name is_quosure enexpr enexprs as_quosure
#' @importFrom rlang is_list call2 is_empty as_function
#' @importFrom rlang is_list call2 is_empty as_function inform
dragosmg marked this conversation as resolved.
Show resolved Hide resolved
#' @importFrom tidyselect vars_pull vars_rename vars_select eval_select
#' @useDynLib arrow, .registration = TRUE
#' @keywords internal
Expand Down
6 changes: 4 additions & 2 deletions r/src/type_infer.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -72,7 +72,8 @@ std::shared_ptr<arrow::DataType> InferArrowTypeFromVector<INTSXP>(SEXP x) {
} else if (Rf_inherits(x, "POSIXct")) {
auto tzone_sexp = Rf_getAttrib(x, symbols::tzone);
if (Rf_isNull(tzone_sexp)) {
dragosmg marked this conversation as resolved.
Show resolved Hide resolved
dragosmg marked this conversation as resolved.
Show resolved Hide resolved
return timestamp(TimeUnit::MICRO);
auto systzone_sexp = cpp11::package("base")["Sys.timezone"];
return timestamp(TimeUnit::MICRO, CHAR(STRING_ELT(systzone_sexp(), 0)));
} else {
return timestamp(TimeUnit::MICRO, CHAR(STRING_ELT(tzone_sexp, 0)));
}
Expand All @@ -88,7 +89,8 @@ std::shared_ptr<arrow::DataType> InferArrowTypeFromVector<REALSXP>(SEXP x) {
if (Rf_inherits(x, "POSIXct")) {
auto tzone_sexp = Rf_getAttrib(x, symbols::tzone);
if (Rf_isNull(tzone_sexp)) {
return timestamp(TimeUnit::MICRO);
auto systzone_sexp = cpp11::package("base")["Sys.timezone"];
return timestamp(TimeUnit::MICRO, CHAR(STRING_ELT(systzone_sexp(), 0)));
} else {
return timestamp(TimeUnit::MICRO, CHAR(STRING_ELT(tzone_sexp, 0)));
}
Expand Down
29 changes: 25 additions & 4 deletions r/tests/testthat/test-Array.R
Original file line number Diff line number Diff line change
Expand Up @@ -260,18 +260,39 @@ test_that("array supports POSIXct (ARROW-3340)", {
expect_array_roundtrip(times2, timestamp("us", "US/Eastern"))
})

test_that("array supports POSIXct without timezone", {
# Make sure timezone is not set
withr::with_envvar(c(TZ = ""), {
test_that("array uses local timezone for POSIXct without timezone", {
withr::with_timezone("", {
times <- strptime("2019-02-03 12:34:56", format = "%Y-%m-%d %H:%M:%S") + 1:10
expect_array_roundtrip(times, timestamp("us", ""))
expect_equal(attr(times, "tzone"), NULL)
expect_array_roundtrip(times, timestamp("us", Sys.timezone()))

# Also test the INTSXP code path
skip("Ingest_POSIXct only implemented for REALSXP")
times_int <- as.integer(times)
attributes(times_int) <- attributes(times)
expect_array_roundtrip(times_int, timestamp("us", ""))
})

# If there is a timezone set, we record that
withr::with_timezone("Pacific/Marquesas", {
dragosmg marked this conversation as resolved.
Show resolved Hide resolved
times <- strptime("2019-02-03 12:34:56", format = "%Y-%m-%d %H:%M:%S") + 1:10
expect_equal(attr(times, "tzone"), "Pacific/Marquesas")
expect_array_roundtrip(times, timestamp("us", "Pacific/Marquesas"))
dragosmg marked this conversation as resolved.
Show resolved Hide resolved

times_with_tz <- strptime(
"2019-02-03 12:34:56",
format = "%Y-%m-%d %H:%M:%S",
tz = "Asia/Katmandu") + 1:10
expect_equal(attr(times, "tzone"), "Asia/Katmandu")
expect_array_roundtrip(times, timestamp("us", "Asia/Katmandu"))
})

# and although the TZ is NULL in R, we set it to the Sys.timezone()
withr::with_timezone(NA, {
times <- strptime("2019-02-03 12:34:56", format = "%Y-%m-%d %H:%M:%S") + 1:10
expect_equal(attr(times, "tzone"), NULL)
expect_array_roundtrip(times, timestamp("us", Sys.timezone()))
})
})

test_that("Timezone handling in Arrow roundtrip (ARROW-3543)", {
Expand Down
4 changes: 0 additions & 4 deletions r/tests/testthat/test-dplyr-funcs-datetime.R
Original file line number Diff line number Diff line change
Expand Up @@ -651,7 +651,6 @@ test_that("extract yday from date", {
})

test_that("leap_year mirror lubridate", {

compare_dplyr_binding(
.input %>%
mutate(x = leap_year(date)) %>%
Expand Down Expand Up @@ -679,7 +678,6 @@ test_that("leap_year mirror lubridate", {
))
)
)

})

test_that("am/pm mirror lubridate", {
Expand All @@ -699,10 +697,8 @@ test_that("am/pm mirror lubridate", {
),
format = "%Y-%m-%d %H:%M:%S"
)

)
)

})

test_that("extract tz", {
Expand Down