
Failing test_proxy.py::test_sizeof_cudf #824

Closed

Comments

@pentschev (Member) commented Jan 10, 2022

I'm seeing dask_cuda/tests/test_proxy.py::test_sizeof_cudf failing locally:

___________________________________________________ test_sizeof_cudf ____________________________________________________

    def test_sizeof_cudf():
        cudf = pytest.importorskip("cudf")
        a = cudf.datasets.timeseries().reset_index()
        a_size = sizeof(a)
        pxy = proxy_object.asproxy(a)
        assert a_size == pytest.approx(sizeof(pxy))
        pxy._pxy_serialize(serializers=("dask",))
        assert a_size == pytest.approx(sizeof(pxy))
        assert pxy._pxy_get().is_serialized()
        pxy._pxy_cache = {}
>       assert a_size == pytest.approx(sizeof(pxy))
E       assert 86832408 == 87156456 ± 8.7e+01
E         +86832408
E         -87156456 ± 8.7e+01

dask_cuda/tests/test_proxy.py:609: AssertionError
================================================ short test summary info ================================================
FAILED dask_cuda/tests/test_proxy.py::test_sizeof_cudf - assert 86832408 == 87156456 ± 8.7e+01
=========================================== 1 failed, 91 deselected in 5.19s ============================================

The issue above is present with the following cuDF build:

cudf                      22.02.00a220110 cuda_11_py38_g0d5ec7f890_245    rapidsai-nightly
dask-cudf                 22.02.00a220110 cuda_11_py38_g0d5ec7f890_245    rapidsai-nightly
libcudf                   22.02.00a220110 cuda11_g0d5ec7f890_245    rapidsai-nightly

But an old environment with the following passes:

cudf                      22.02.00a220106 cuda_11_py38_g33f7f0d032_227    rapidsai-nightly
dask-cudf                 22.02.00a220106 cuda_11_py38_g33f7f0d032_227    rapidsai-nightly
libcudf                   22.02.00a220106 cuda11_g33f7f0d032_227    rapidsai-nightly

@shwina @quasiben did we have any changes in cuDF that could have affected the way sizeof is computed for any of its objects?

cc @madsbk
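For scale, the mismatch in the failure output above is roughly 0.37%, far beyond pytest.approx's default relative tolerance of 1e-6. A minimal sketch of the comparison, using the sizes from the traceback and math.isclose as a stand-in for pytest.approx:

```python
import math

# Sizes taken from the failure output above
a_size_deserialized = 87156456
a_size_serialized = 86832408

# The serialized frame is ~0.37% smaller than the live one
rel_diff = abs(a_size_serialized - a_size_deserialized) / a_size_deserialized

# pytest.approx defaults to a relative tolerance of 1e-6, which rejects this
assert not math.isclose(a_size_serialized, a_size_deserialized, rel_tol=1e-6)
# a 1% relative tolerance accepts the discrepancy
assert math.isclose(a_size_serialized, a_size_deserialized, rel_tol=1e-2)
print(f"relative difference: {rel_diff:.4%}")
```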

@shwina (Contributor) commented Jan 10, 2022

Yes, rapidsai/cudf#9544.

cc @vyasr

@pentschev (Member Author)

No, the one above has been taken care of via #767. Notice how the build that's passing above is from January 6th, while rapidsai/cudf#9544 was merged last November. It seems that the change affecting test_proxy.py::test_sizeof_cudf was introduced sometime after January 6th.

@pentschev (Member Author)

@madsbk it's not immediately clear whether there has been some change in cuDF that could have caused this. The closest candidate we have is rapidsai/rmm#931, but that was merged January 5th and my January 6th build was still passing, so I think that's not it. Given that the test checks for approximate return values, any ideas whether this could have come from a change in some other ProxyObject dependency?

@rapids-bot rapids-bot bot closed this as completed in #825 Jan 11, 2022
rapids-bot bot pushed a commit that referenced this issue Jan 11, 2022
Closes #824 by increasing the tolerance of `sizeof` test of `cudf.DataFrame`.

In the test we clear the `ProxyObject` cache, which makes `sizeof` measure the serialized dataframe. In order to avoid deserializing on every `sizeof` call, we accept this discrepancy between `sizeof` of serialized and deserialized objects.

Authors:
  - Mads R. B. Kristensen (https://github.com/madsbk)

Approvers:
  - Peter Andreas Entschev (https://github.com/pentschev)

URL: #825
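The cache-clearing behavior described in the commit message can be illustrated with a toy proxy. This is a hypothetical sketch, not the actual dask_cuda ProxyObject; the class and method names below are invented for illustration, with pickle standing in for dask serialization:

```python
import pickle
import sys

class SizeCachingProxy:
    """Toy stand-in for ProxyObject: caches sizeof, and once the cache
    is cleared on a serialized proxy, measures the serialized payload
    instead of deserializing the wrapped object."""

    def __init__(self, obj):
        self._obj = obj
        self._serialized = None
        self._cache = {}

    def serialize(self):
        # Replace the live object's representation with serialized bytes
        self._serialized = pickle.dumps(self._obj)

    def sizeof(self):
        if "sizeof" not in self._cache:
            if self._serialized is not None:
                # Measuring the serialized frames avoids a deserialize
                # round-trip, at the cost of a small size discrepancy
                self._cache["sizeof"] = len(self._serialized)
            else:
                self._cache["sizeof"] = sys.getsizeof(self._obj)
        return self._cache["sizeof"]
```

With a proxy like this, clearing `_cache` after `serialize()` makes the next `sizeof()` report the payload size, which is why the test needs a tolerance rather than exact equality.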