GH-35040: [Python] Skip test_cast_timestamp_to_string on Windows because it requires tz database #35735

jorisvandenbossche · 2023-05-24T06:41:10Z

Rationale for this change

Fix up of #35395, skipping one of the tests added in that PR on Windows, because the test requires access to a tz database.

…s because it requires tz database

github-actions · 2023-05-24T06:41:30Z

Closes: [C++] Change Scalar::CastTo to do safe cast by default and allow to specify cast options ? #35040

jorisvandenbossche · 2023-05-24T06:42:47Z

@github-actions crossbow submit -g wheel

github-actions · 2023-05-24T06:46:43Z

Revision: 64c0d83

Submitted crossbow builds: ursacomputing/crossbow @ actions-fe7da097ff

Task	Status
wheel-clean
wheel-macos-big-sur-cp310-arm64
wheel-macos-big-sur-cp311-arm64
wheel-macos-big-sur-cp38-arm64
wheel-macos-big-sur-cp39-arm64
wheel-macos-mojave-cp310-amd64
wheel-macos-mojave-cp311-amd64
wheel-macos-mojave-cp37-amd64
wheel-macos-mojave-cp38-amd64
wheel-macos-mojave-cp39-amd64
wheel-manylinux-2-28-cp310-amd64
wheel-manylinux-2-28-cp310-arm64
wheel-manylinux-2-28-cp311-amd64
wheel-manylinux-2-28-cp311-arm64
wheel-manylinux-2-28-cp37-amd64
wheel-manylinux-2-28-cp37-arm64
wheel-manylinux-2-28-cp38-amd64
wheel-manylinux-2-28-cp38-arm64
wheel-manylinux-2-28-cp39-amd64
wheel-manylinux-2-28-cp39-arm64
wheel-manylinux-2014-cp310-amd64
wheel-manylinux-2014-cp310-arm64
wheel-manylinux-2014-cp311-amd64
wheel-manylinux-2014-cp311-arm64
wheel-manylinux-2014-cp37-amd64
wheel-manylinux-2014-cp37-arm64
wheel-manylinux-2014-cp38-amd64
wheel-manylinux-2014-cp38-arm64
wheel-manylinux-2014-cp39-amd64
wheel-manylinux-2014-cp39-arm64
wheel-windows-cp310-amd64
wheel-windows-cp311-amd64
wheel-windows-cp37-amd64
wheel-windows-cp38-amd64
wheel-windows-cp39-amd64

pitrou · 2023-05-24T10:34:20Z

@rok does this look right? We are able to somehow use a timezone database on Windows, IIRC?

jorisvandenbossche · 2023-05-24T10:46:33Z

Yeah, the better fix would actually be to detect if the database is available, and only skip if not found on windows. Because this patch skips the tests altogether, even when it can actually work (eg on appveyor the tests pass because we download the tzdata during appveyor setup). But just to note: we currently do this plain skip for other tests requiring the database as well, so that is an existing issue with our test setup.

I assume checking if the path %USERPROFILE%\Downloads\tzdata exists might be a sufficient check at the moment (since that's the place where tz.cpp will look, and we currently don't yet allow customizing that in pyarrow).

danepitkin · 2023-05-24T17:23:54Z

Thanks for fixing! There are other tests in this file that also use pytest.importorskip("pytz"), should they be updated as well? It's not clear to me why this test case behaves differently.

rok

For completeness - I believe db download for windows was added by @wjones127 in #12536.

python/pyarrow/tests/test_scalars.py

wjones127 · 2023-05-25T14:59:35Z

For completeness - I believe db download for windows was added by @wjones127 in #12536.

That was just for C++ and R. I don't think we figured out how to do it for Python, since we need the text form of the IANA database while Python's timezone packages provide the binary form.

raulcd

We already have 3 tests being skipped on Windows for the same reason.
https://github.com/apache/arrow/blob/main/python/pyarrow/tests/test_compute.py#L1851
https://github.com/apache/arrow/blob/main/python/pyarrow/tests/test_compute.py#L2042
https://github.com/apache/arrow/blob/main/python/pyarrow/tests/test_compute.py#L2042
I am +1 to merge

jorisvandenbossche · 2023-05-31T12:17:46Z

python/pyarrow/tests/test_compute.py

+    if sys.platform != "win32":
+        # Locale-dependent formats don't match on Windows
+        formats.extend(["%c", "%x", "%X"])


The error on Appveyor we got was:

@pytest.mark.skipif(sys.platform == "win32" and not util.windows_has_tzdata(), reason="Timezone database is not installed on Windows") def test_strftime(): times = ["2018-03-10 09:00", "2038-01-31 12:23", None] timezones = ["CET", "UTC", "Europe/Ljubljana"] formats = ["%a", "%A", "%w", "%d", "%b", "%B", "%m", "%y", "%Y", "%H", "%I", "%p", "%M", "%z", "%Z", "%j", "%U", "%W", "%c", "%x", "%X", "%%", "%G", "%V", "%u"] for timezone in timezones: ts = pd.to_datetime(times).tz_localize(timezone) for unit in ["s", "ms", "us", "ns"]: tsa = pa.array(ts, type=pa.timestamp(unit, timezone)) for fmt in formats: options = pc.StrftimeOptions(fmt) result = pc.strftime(tsa, options=options) expected = pa.array(ts.strftime(fmt)) > assert result.equals(expected) E assert False E + where False = <built-in method equals of pyarrow.lib.StringArray object at 0x0000023767338600>(<pyarrow.lib.StringArray object at 0x0000023767338830>\n[\n "Sat Mar 10 09:00:00 2018",\n "Sun Jan 31 12:23:00 2038",\n null\n]) E + where <built-in method equals of pyarrow.lib.StringArray object at 0x0000023767338600> = <pyarrow.lib.StringArray object at 0x0000023767338600>\n[\n "03/10/18 09:00:00",\n "01/31/38 12:23:00",\n null\n].equals pyarrow\tests\test_compute.py:1872: AssertionError

So it seems that we create a string like "Sat Mar 10 09:00:00 2018", but the python version we compare with gives "03/10/18 09:00:00". According to docs for %c, the former (our result) is actually correct.
But since we are checking matching results in Python in this test, just skipping the ones where those don't match.

jorisvandenbossche · 2023-05-31T13:16:38Z

I updated the PR to actually check if the tzdata database was present, and only skip the tests if not. Because we do actually already download the tzdb on Appveyor (not on the nightly wheel builds, so therefore the tests were failing there).
While Python doesn't yet allow you to configure the path (and use the data from the tzdata package), as mentioned above, if you manually download it to the correct location, the C++ code will already discover it (as was done for R).

jorisvandenbossche · 2023-06-06T10:32:40Z

@github-actions crossbow submit -g wheel

jorisvandenbossche · 2023-06-06T12:39:21Z

@github-actions crossbow submit -g wheel

github-actions · 2023-06-06T12:43:13Z

Revision: 9720e36

Submitted crossbow builds: ursacomputing/crossbow @ actions-5e510fd809

Task	Status
wheel-clean
wheel-macos-big-sur-cp310-arm64
wheel-macos-big-sur-cp311-arm64
wheel-macos-big-sur-cp38-arm64
wheel-macos-big-sur-cp39-arm64
wheel-macos-mojave-cp310-amd64
wheel-macos-mojave-cp311-amd64
wheel-macos-mojave-cp37-amd64
wheel-macos-mojave-cp38-amd64
wheel-macos-mojave-cp39-amd64
wheel-manylinux-2-28-cp310-amd64
wheel-manylinux-2-28-cp310-arm64
wheel-manylinux-2-28-cp311-amd64
wheel-manylinux-2-28-cp311-arm64
wheel-manylinux-2-28-cp37-amd64
wheel-manylinux-2-28-cp37-arm64
wheel-manylinux-2-28-cp38-amd64
wheel-manylinux-2-28-cp38-arm64
wheel-manylinux-2-28-cp39-amd64
wheel-manylinux-2-28-cp39-arm64
wheel-manylinux-2014-cp310-amd64
wheel-manylinux-2014-cp310-arm64
wheel-manylinux-2014-cp311-amd64
wheel-manylinux-2014-cp311-arm64
wheel-manylinux-2014-cp37-amd64
wheel-manylinux-2014-cp37-arm64
wheel-manylinux-2014-cp38-amd64
wheel-manylinux-2014-cp38-arm64
wheel-manylinux-2014-cp39-amd64
wheel-manylinux-2014-cp39-arm64
wheel-windows-cp310-amd64
wheel-windows-cp311-amd64
wheel-windows-cp37-amd64
wheel-windows-cp38-amd64
wheel-windows-cp39-amd64

ursabot · 2023-06-07T01:58:41Z

Benchmark runs are scheduled for baseline = 4ec231b and contender = c78ef0f. c78ef0f is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Failed] ec2-t3-xlarge-us-east-2
[Failed ⬇️0.77% ⬆️0.06%] test-mac-arm
[Finished ⬇️0.0% ⬆️0.0%] ursa-i9-9960x
[Failed ⬇️0.27% ⬆️0.0%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] c78ef0f1 ec2-t3-xlarge-us-east-2
[Finished] c78ef0f1 test-mac-arm
[Finished] c78ef0f1 ursa-i9-9960x
[Failed] c78ef0f1 ursa-thinkcentre-m75q
[Failed] 4ec231b2 ec2-t3-xlarge-us-east-2
[Failed] 4ec231b2 test-mac-arm
[Finished] 4ec231b2 ursa-i9-9960x
[Failed] 4ec231b2 ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

apacheGH-35040: [Python] Skip test_cast_timestamp_to_string on Window…

64c0d83

…s because it requires tz database

jorisvandenbossche requested a review from AlenkaF as a code owner May 24, 2023 06:41

jorisvandenbossche requested a review from danepitkin May 24, 2023 06:41

github-actions bot added Component: Python awaiting committer review Awaiting committer review labels May 24, 2023

jorisvandenbossche mentioned this pull request May 24, 2023

GH-35658: [Packaging] Sync conda recipes with feedstocks #35637

Merged

wgtmac approved these changes May 24, 2023

View reviewed changes

pitrou requested a review from rok May 24, 2023 10:33

rok reviewed May 24, 2023

View reviewed changes

python/pyarrow/tests/test_scalars.py Outdated Show resolved Hide resolved

github-actions bot added awaiting changes Awaiting changes and removed awaiting committer review Awaiting committer review labels May 24, 2023

raulcd approved these changes May 27, 2023

View reviewed changes

github-actions bot added awaiting merge Awaiting merge and removed awaiting changes Awaiting changes labels May 27, 2023

rok approved these changes May 27, 2023

View reviewed changes

jorisvandenbossche added 2 commits May 31, 2023 11:51

actually check if tz database is available or not on Windows

27a9115

don't test locale specific ones on windows

1028a02

jorisvandenbossche commented May 31, 2023

View reviewed changes

github-actions bot added awaiting changes Awaiting changes and removed awaiting merge Awaiting merge labels May 31, 2023

This comment was marked as outdated.

Sign in to view

Merge remote-tracking branch 'upstream/main' into apacheGH-35040-fix-ci

9720e36

github-actions bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Jun 6, 2023

jorisvandenbossche merged commit c78ef0f into apache:main Jun 6, 2023

jorisvandenbossche deleted the GH-35040-fix-ci branch June 6, 2023 15:24

mattijn mentioned this pull request Jun 7, 2023

Sanitize timestamps in arrow tables vega/altair#3076

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

GH-35040: [Python] Skip test_cast_timestamp_to_string on Windows because it requires tz database #35735

GH-35040: [Python] Skip test_cast_timestamp_to_string on Windows because it requires tz database #35735

jorisvandenbossche commented May 24, 2023 •

edited

Loading

github-actions bot commented May 24, 2023

jorisvandenbossche commented May 24, 2023

github-actions bot commented May 24, 2023

pitrou commented May 24, 2023

jorisvandenbossche commented May 24, 2023 •

edited

Loading

danepitkin commented May 24, 2023

rok left a comment

wjones127 commented May 25, 2023

raulcd left a comment

jorisvandenbossche May 31, 2023

jorisvandenbossche commented May 31, 2023

jorisvandenbossche commented Jun 6, 2023

This comment was marked as outdated.

jorisvandenbossche commented Jun 6, 2023

github-actions bot commented Jun 6, 2023

ursabot commented Jun 7, 2023

GH-35040: [Python] Skip test_cast_timestamp_to_string on Windows because it requires tz database #35735

GH-35040: [Python] Skip test_cast_timestamp_to_string on Windows because it requires tz database #35735

Conversation

jorisvandenbossche commented May 24, 2023 • edited Loading

Rationale for this change

github-actions bot commented May 24, 2023

jorisvandenbossche commented May 24, 2023

github-actions bot commented May 24, 2023

pitrou commented May 24, 2023

jorisvandenbossche commented May 24, 2023 • edited Loading

danepitkin commented May 24, 2023

rok left a comment

Choose a reason for hiding this comment

wjones127 commented May 25, 2023

raulcd left a comment

Choose a reason for hiding this comment

jorisvandenbossche May 31, 2023

Choose a reason for hiding this comment

jorisvandenbossche commented May 31, 2023

jorisvandenbossche commented Jun 6, 2023

This comment was marked as outdated.

jorisvandenbossche commented Jun 6, 2023

github-actions bot commented Jun 6, 2023

ursabot commented Jun 7, 2023

jorisvandenbossche commented May 24, 2023 •

edited

Loading

jorisvandenbossche commented May 24, 2023 •

edited

Loading