Currently, Dask and Dask-CUDA have no way of handling OOM errors other than restarting tasks or workers. Instead, they spill preemptively based on very conservative memory thresholds; for instance, most Dask-CUDA workflows start spilling when half the GPU memory is in use.
By using a new RMM resource adaptor such as rapidsai/rmm#892, we should be able to implement on-demand memory spilling.
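The idea behind such a resource adaptor can be sketched in plain Python: wrap an allocator so that on allocation failure a callback is invoked (e.g. to spill buffers to host memory), and the allocation is retried as long as the callback frees something. All names below are illustrative, not the real RMM API; the toy "device" only simulates a fixed memory capacity.

```python
class FailureCallbackAllocator:
    """Wraps an allocator; on MemoryError, calls a callback that tries to
    free memory (e.g. by spilling) and retries while it reports progress."""

    def __init__(self, allocate, on_failure):
        self._allocate = allocate      # underlying allocation function
        self._on_failure = on_failure  # returns True if it freed memory

    def __call__(self, nbytes):
        while True:
            try:
                return self._allocate(nbytes)
            except MemoryError:
                if not self._on_failure(nbytes):
                    raise  # nothing left to spill; propagate the OOM


# Toy "device" with a fixed 100-byte capacity.
device = {"used": 0, "capacity": 100}

def toy_allocate(nbytes):
    if device["used"] + nbytes > device["capacity"]:
        raise MemoryError
    device["used"] += nbytes
    return bytearray(nbytes)

spill_log = []

def spill_one_buffer(nbytes):
    # Pretend to move one 50-byte buffer off the device.
    if device["used"] < 50:
        return False  # nothing spillable remains
    device["used"] -= 50
    spill_log.append(50)
    return True

alloc = FailureCallbackAllocator(toy_allocate, spill_one_buffer)
a = alloc(40)
b = alloc(40)
c = alloc(40)  # exceeds capacity; triggers one spill, then succeeds
```

The point is that spilling happens only when an allocation actually fails, rather than at a conservative preemptive threshold.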
There's also a related Dask issue about spilling when `MemoryError`s are thrown (dask/distributed#3612). IIRC, RMM raises a `MemoryError` when it runs out of memory.
Use rapidsai/rmm#892 to implement spilling on demand. Requires [RMM](https://github.com/rapidsai/rmm) and JIT-unspill to be enabled.
The `device_memory_limit` still works as usual: when known allocations reach `device_memory_limit`, Dask-CUDA starts spilling preemptively. However, with this PR it should be possible to increase `device_memory_limit` significantly, since memory spikes will be handled by spilling on demand.
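As a configuration sketch (not runnable without a GPU and `dask_cuda` installed), enabling JIT-unspill alongside a higher preemptive limit might look like the following; `jit_unspill` and `device_memory_limit` are `LocalCUDACluster` parameters, but the specific values are illustrative and should be checked against your Dask-CUDA version:

```python
from dask_cuda import LocalCUDACluster
from distributed import Client

cluster = LocalCUDACluster(
    jit_unspill=True,           # required for spilling on demand
    device_memory_limit="0.8",  # preemptive limit; can now be set higher,
                                # since spikes are handled by on-demand spilling
)
client = Client(cluster)
```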
Closes #755
Authors:
- Mads R. B. Kristensen (https://github.com/madsbk)
Approvers:
- Peter Andreas Entschev (https://github.com/pentschev)
URL: #756