We found an issue while working on #3669 where the `Scalar` object returned from `min()` and `max()` had to be sent to the client before it could reach the other workers.
We tried several approaches: `Client.replicate`; calling `client.submit` with the min/max as arguments to build a new cudf DataFrame and then wrapping it with `from_delayed` so it would be available on each partition; creating a new `DataFrame` and expanding it; and a variety of other solutions that should have worked but didn't. In every case the min/max value was not sent to all of the workers, and in some cases it was not sent to any worker at all, even after calling `persist()` and `wait`.
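For reference, here is a minimal sketch of the kind of pattern we were attempting, assuming an already-running dask-CUDA cluster; the sample data, column name, and the `make_bounds_frame` helper are illustrative and not the exact code from the workflow:

```python
import cudf
import dask.dataframe as dd
import dask_cudf
from dask.distributed import Client, wait

# Assumes an existing scheduler / LocalCUDACluster is available.
client = Client()

gdf = cudf.DataFrame({"x": [3, 1, 4, 1, 5, 9]})
ddf = dask_cudf.from_cudf(gdf, npartitions=3)

# min()/max() yield scalar results; compute them as futures.
min_fut = client.compute(ddf["x"].min())
max_fut = client.compute(ddf["x"].max())
wait([min_fut, max_fut])

# Attempt 1: replicate the scalar futures onto every worker.
client.replicate([min_fut, max_fut])

# Attempt 2: submit a task that builds a one-row cudf DataFrame from the
# scalars, then wrap the resulting future with from_delayed.
def make_bounds_frame(lo, hi):
    # Hypothetical helper used only for this sketch.
    return cudf.DataFrame({"min": [lo], "max": [hi]})

bounds_fut = client.submit(make_bounds_frame, min_fut, max_fut)
bounds_ddf = dd.from_delayed(
    [bounds_fut], meta=cudf.DataFrame({"min": [0], "max": [0]})
)

# Even after persist()/wait(), the min/max data did not reliably show up
# on all (or sometimes any) of the workers.
bounds_ddf = bounds_ddf.persist()
wait(bounds_ddf)
```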
This is likely due to a bug in `dask`, or potentially something else we're not doing correctly. In any case, this needs further investigation since it could cause issues in multi-node workflows.