Remove the hash-merge utility and switch it to use mainline dask.merge
We added the hash-merge utility because the dask_cudf repartition implementation differed from the shuffle function used by dask's merge.
The earlier dask.dataframe shuffle implementation used different code from dask_cudf's and was more memory-hungry.
We have since upstreamed our repartition function to mainline dask, so dask's merge should now have the same performance characteristics as the hash-merge utility. We should switch to it once scale testing verifies similar results.
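For context, here is a minimal pure-Python sketch of the general idea behind a hash-partitioned merge: both sides are repartitioned by a hash of the join key, and matching partitions are then merged independently. It is only an illustration. The names `hash_partition`, `merge_partition`, and `hash_merge` are hypothetical and are not taken from the utility in this repo, from dask, or from dask_cudf.

```python
from collections import defaultdict


def hash_partition(rows, key, npartitions):
    """Place each row in a bucket chosen by the hash of its join key."""
    parts = [[] for _ in range(npartitions)]
    for row in rows:
        parts[hash(row[key]) % npartitions].append(row)
    return parts


def merge_partition(left_rows, right_rows, key):
    """Inner-join two co-partitioned lists of dict rows on `key`."""
    index = defaultdict(list)
    for r in right_rows:
        index[r[key]].append(r)
    return [{**l, **r} for l in left_rows for r in index.get(l[key], [])]


def hash_merge(left, right, key, npartitions=4):
    """Repartition both sides by key hash, then merge partition by partition."""
    left_parts = hash_partition(left, key, npartitions)
    right_parts = hash_partition(right, key, npartitions)
    out = []
    # Rows with equal keys always land in the same partition index,
    # so each pair of partitions can be joined without seeing the rest.
    for lp, rp in zip(left_parts, right_parts):
        out.extend(merge_partition(lp, rp, key))
    return out


left = [{"id": 1, "x": "a"}, {"id": 2, "x": "b"}, {"id": 3, "x": "c"}]
right = [{"id": 2, "y": 20}, {"id": 3, "y": 30}, {"id": 4, "y": 40}]
result = sorted(hash_merge(left, right, "id"), key=lambda r: r["id"])
```

Because the merged rows do not depend on how many partitions are used, comparing the output of the utility against the mainline merge on the same inputs is one simple correctness check to run alongside the scale test.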