Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-35040: [Python] Skip test_cast_timestamp_to_string on Windows because it requires tz database #35735

Merged
merged 4 commits into from
Jun 6, 2023

Conversation

jorisvandenbossche
Copy link
Member

@jorisvandenbossche jorisvandenbossche commented May 24, 2023

Rationale for this change

Fix up of #35395, skipping one of the tests added in that PR on Windows, because the test requires access to a tz database.

@jorisvandenbossche
Copy link
Member Author

@github-actions crossbow submit -g wheel

@github-actions
Copy link

Revision: 64c0d83

Submitted crossbow builds: ursacomputing/crossbow @ actions-fe7da097ff

Task Status
wheel-clean Github Actions
wheel-macos-big-sur-cp310-arm64 Github Actions
wheel-macos-big-sur-cp311-arm64 Github Actions
wheel-macos-big-sur-cp38-arm64 Github Actions
wheel-macos-big-sur-cp39-arm64 Github Actions
wheel-macos-mojave-cp310-amd64 Github Actions
wheel-macos-mojave-cp311-amd64 Github Actions
wheel-macos-mojave-cp37-amd64 Github Actions
wheel-macos-mojave-cp38-amd64 Github Actions
wheel-macos-mojave-cp39-amd64 Github Actions
wheel-manylinux-2-28-cp310-amd64 Github Actions
wheel-manylinux-2-28-cp310-arm64 Github Actions
wheel-manylinux-2-28-cp311-amd64 Github Actions
wheel-manylinux-2-28-cp311-arm64 Github Actions
wheel-manylinux-2-28-cp37-amd64 Github Actions
wheel-manylinux-2-28-cp37-arm64 Github Actions
wheel-manylinux-2-28-cp38-amd64 Github Actions
wheel-manylinux-2-28-cp38-arm64 Github Actions
wheel-manylinux-2-28-cp39-amd64 Github Actions
wheel-manylinux-2-28-cp39-arm64 Github Actions
wheel-manylinux-2014-cp310-amd64 Github Actions
wheel-manylinux-2014-cp310-arm64 Github Actions
wheel-manylinux-2014-cp311-amd64 Github Actions
wheel-manylinux-2014-cp311-arm64 Github Actions
wheel-manylinux-2014-cp37-amd64 Github Actions
wheel-manylinux-2014-cp37-arm64 Github Actions
wheel-manylinux-2014-cp38-amd64 Github Actions
wheel-manylinux-2014-cp38-arm64 Github Actions
wheel-manylinux-2014-cp39-amd64 Github Actions
wheel-manylinux-2014-cp39-arm64 Github Actions
wheel-windows-cp310-amd64 Github Actions
wheel-windows-cp311-amd64 Github Actions
wheel-windows-cp37-amd64 Github Actions
wheel-windows-cp38-amd64 Github Actions
wheel-windows-cp39-amd64 Github Actions

@pitrou pitrou requested a review from rok May 24, 2023 10:33
@pitrou
Copy link
Member

pitrou commented May 24, 2023

@rok does this look right? We are able to somehow use a timezone database on Windows, IIRC?

@jorisvandenbossche
Copy link
Member Author

jorisvandenbossche commented May 24, 2023

Yeah, the better fix would actually be to detect if the database is available, and only skip if not found on windows. Because this patch skips the tests altogether, even when it can actually work (eg on appveyor the tests pass because we download the tzdata during appveyor setup). But just to note: we currently do this plain skip for other tests requiring the database as well, so that is an existing issue with our test setup.

I assume checking if the path %USERPROFILE%\Downloads\tzdata exists might be a sufficient check at the moment (since that's the place where tz.cpp will look, and we currently don't yet allow customizing that in pyarrow).

@danepitkin
Copy link
Member

Thanks for fixing! There are other tests in this file that also use pytest.importorskip("pytz"), should they be updated as well? It's not clear to me why this test case behaves differently.

Copy link
Member

@rok rok left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For completeness - I believe db download for windows was added by @wjones127 in #12536.

python/pyarrow/tests/test_scalars.py Outdated Show resolved Hide resolved
@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting committer review Awaiting committer review labels May 24, 2023
@wjones127
Copy link
Member

For completeness - I believe db download for windows was added by @wjones127 in #12536.

That was just for C++ and R. I don't think we figured out how to do it for Python, since we need the text form of the IANA database while Python's timezone packages provide the binary form.

Copy link
Member

@raulcd raulcd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@github-actions github-actions bot added awaiting merge Awaiting merge and removed awaiting changes Awaiting changes labels May 27, 2023
Comment on lines +1862 to +1864
if sys.platform != "win32":
# Locale-dependent formats don't match on Windows
formats.extend(["%c", "%x", "%X"])
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error on Appveyor we got was:

    @pytest.mark.skipif(sys.platform == "win32" and not util.windows_has_tzdata(),
                        reason="Timezone database is not installed on Windows")
    def test_strftime():
        times = ["2018-03-10 09:00", "2038-01-31 12:23", None]
        timezones = ["CET", "UTC", "Europe/Ljubljana"]
    
        formats = ["%a", "%A", "%w", "%d", "%b", "%B", "%m", "%y", "%Y", "%H",
                   "%I", "%p", "%M", "%z", "%Z", "%j", "%U", "%W", "%c", "%x",
                   "%X", "%%", "%G", "%V", "%u"]
    
        for timezone in timezones:
            ts = pd.to_datetime(times).tz_localize(timezone)
            for unit in ["s", "ms", "us", "ns"]:
                tsa = pa.array(ts, type=pa.timestamp(unit, timezone))
                for fmt in formats:
                    options = pc.StrftimeOptions(fmt)
                    result = pc.strftime(tsa, options=options)
                    expected = pa.array(ts.strftime(fmt))
>                   assert result.equals(expected)
E                   assert False
E                    +  where False = <built-in method equals of pyarrow.lib.StringArray object at 0x0000023767338600>(<pyarrow.lib.StringArray object at 0x0000023767338830>\n[\n  "Sat Mar 10 09:00:00 2018",\n  "Sun Jan 31 12:23:00 2038",\n  null\n])
E                    +    where <built-in method equals of pyarrow.lib.StringArray object at 0x0000023767338600> = <pyarrow.lib.StringArray object at 0x0000023767338600>\n[\n  "03/10/18 09:00:00",\n  "01/31/38 12:23:00",\n  null\n].equals
pyarrow\tests\test_compute.py:1872: AssertionError

So it seems that we create a string like "Sat Mar 10 09:00:00 2018", but the python version we compare with gives "03/10/18 09:00:00". According to docs for %c, the former (our result) is actually correct.
But since we are checking matching results in Python in this test, just skipping the ones where those don't match.

@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting merge Awaiting merge labels May 31, 2023
@jorisvandenbossche
Copy link
Member Author

I updated the PR to actually check if the tzdata database was present, and only skip the tests if not. Because we do actually already download the tzdb on Appveyor (not on the nightly wheel builds, so therefore the tests were failing there).
While Python doesn't yet allow you to configure the path (and use the data from the tzdata package), as mentioned above, if you manually download it to the correct location, the C++ code will already discover it (as was done for R).

@jorisvandenbossche
Copy link
Member Author

@github-actions crossbow submit -g wheel

@github-actions

This comment was marked as outdated.

@jorisvandenbossche
Copy link
Member Author

@github-actions crossbow submit -g wheel

@github-actions github-actions bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Jun 6, 2023
@github-actions
Copy link

github-actions bot commented Jun 6, 2023

Revision: 9720e36

Submitted crossbow builds: ursacomputing/crossbow @ actions-5e510fd809

Task Status
wheel-clean Github Actions
wheel-macos-big-sur-cp310-arm64 Github Actions
wheel-macos-big-sur-cp311-arm64 Github Actions
wheel-macos-big-sur-cp38-arm64 Github Actions
wheel-macos-big-sur-cp39-arm64 Github Actions
wheel-macos-mojave-cp310-amd64 Github Actions
wheel-macos-mojave-cp311-amd64 Github Actions
wheel-macos-mojave-cp37-amd64 Github Actions
wheel-macos-mojave-cp38-amd64 Github Actions
wheel-macos-mojave-cp39-amd64 Github Actions
wheel-manylinux-2-28-cp310-amd64 Github Actions
wheel-manylinux-2-28-cp310-arm64 Github Actions
wheel-manylinux-2-28-cp311-amd64 Github Actions
wheel-manylinux-2-28-cp311-arm64 Github Actions
wheel-manylinux-2-28-cp37-amd64 Github Actions
wheel-manylinux-2-28-cp37-arm64 Github Actions
wheel-manylinux-2-28-cp38-amd64 Github Actions
wheel-manylinux-2-28-cp38-arm64 Github Actions
wheel-manylinux-2-28-cp39-amd64 Github Actions
wheel-manylinux-2-28-cp39-arm64 Github Actions
wheel-manylinux-2014-cp310-amd64 Github Actions
wheel-manylinux-2014-cp310-arm64 Github Actions
wheel-manylinux-2014-cp311-amd64 Github Actions
wheel-manylinux-2014-cp311-arm64 Github Actions
wheel-manylinux-2014-cp37-amd64 Github Actions
wheel-manylinux-2014-cp37-arm64 Github Actions
wheel-manylinux-2014-cp38-amd64 Github Actions
wheel-manylinux-2014-cp38-arm64 Github Actions
wheel-manylinux-2014-cp39-amd64 Github Actions
wheel-manylinux-2014-cp39-arm64 Github Actions
wheel-windows-cp310-amd64 Github Actions
wheel-windows-cp311-amd64 Github Actions
wheel-windows-cp37-amd64 Github Actions
wheel-windows-cp38-amd64 Github Actions
wheel-windows-cp39-amd64 Github Actions

@jorisvandenbossche jorisvandenbossche merged commit c78ef0f into apache:main Jun 6, 2023
@jorisvandenbossche jorisvandenbossche deleted the GH-35040-fix-ci branch June 6, 2023 15:24
@ursabot
Copy link

ursabot commented Jun 7, 2023

Benchmark runs are scheduled for baseline = 4ec231b and contender = c78ef0f. c78ef0f is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Failed] ec2-t3-xlarge-us-east-2
[Failed ⬇️0.77% ⬆️0.06%] test-mac-arm
[Finished ⬇️0.0% ⬆️0.0%] ursa-i9-9960x
[Failed ⬇️0.27% ⬆️0.0%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] c78ef0f1 ec2-t3-xlarge-us-east-2
[Finished] c78ef0f1 test-mac-arm
[Finished] c78ef0f1 ursa-i9-9960x
[Failed] c78ef0f1 ursa-thinkcentre-m75q
[Failed] 4ec231b2 ec2-t3-xlarge-us-east-2
[Failed] 4ec231b2 test-mac-arm
[Finished] 4ec231b2 ursa-i9-9960x
[Failed] 4ec231b2 ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants