Describe the bug
Shuffles produce incorrect results (fewer rows written than expected) when using dask-cuda 24.06 and above.
I narrowed it down to the explicit-comms changes in rapidsai/dask-cuda#1323.
I can also confirm that with explicit comms disabled, the incorrect results do not occur.
Steps/Code to reproduce bug
Nothing minimal yet.
Expected behavior
Correct number of resulting rows.
Environment overview (please complete the following information)
Environment location: bare-metal
Method of NeMo-Curator install: from source
Environment details
If NVIDIA docker image is used you don't need to specify these.
Otherwise, please provide:
OS version: Ubuntu 22.04
Dask version: 2025.5.1 (dask-cuda 24.06)
Python version: 3.10
#147 skips explicit comms for 24.06. I haven't had a chance to test rapidsai/dask-cuda#1356 with the newer 24.08 versions to see whether explicit comms works as expected there. I'd like to keep this open until that's verified.
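For anyone needing the same workaround elsewhere, here is a minimal sketch of a version gate along the lines described above. It is not the code from #147: the helper names `parse_release` and `explicit_comms_config` are hypothetical, and the cutoff of 24.06 with no upper bound is an assumption taken from this report, since the fix in rapidsai/dask-cuda#1356 has not been verified yet. It also assumes `explicit-comms` is the Dask config key that dask-cuda reads.

```python
# Hypothetical helpers, not part of NeMo-Curator or dask-cuda.

# First dask-cuda release in which this issue reports bad shuffle results.
# Assumption: every later release is treated as affected until verified.
FIRST_AFFECTED = (24, 6)


def parse_release(version: str) -> tuple[int, int]:
    """Return (year, month) from a RAPIDS-style version such as '24.06.00'."""
    year, month = version.split(".")[:2]
    return int(year), int(month)


def explicit_comms_config(dask_cuda_version: str) -> dict[str, bool]:
    """Build a Dask config mapping that turns explicit comms off on affected releases."""
    affected = parse_release(dask_cuda_version) >= FIRST_AFFECTED
    # Assumption: "explicit-comms" is the config key consulted by dask-cuda.
    return {"explicit-comms": not affected}
```

The returned mapping is meant to be passed to `dask.config.set(...)` before the shuffle runs, for example with `dask_cuda.__version__` as the input. Once newer releases are confirmed to shuffle correctly, the gate would need an upper bound as well.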