Current design
Every time a new key is inserted in Worker.data, if the managed memory (the output of sizeof) exceeds the target threshold, keys are spilled from the bottom of the LRU cache until the managed memory drops below target.
This is a synchronous process that does not release the event loop. This isn't great, but it's bounded, in the sense that it will never spill more bytes than the size of the key that was just inserted.
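For reference, Worker.data is roughly a zict.Buffer composed as in the sketch below. The spill directory, serialization functions, and byte figures are illustrative, and the real code in distributed adds extra bookkeeping around this:

```python
import pickle
import zict
from dask.sizeof import sizeof

# e.g. memory_limit * distributed.worker.memory.target
target_bytes = int(8e9 * 0.6)

# slow: spilled keys, pickled to one file per key on disk
slow = zict.Func(pickle.dumps, pickle.loads, zict.File("/tmp/dask-spill"))
# fast: in-memory dict; Buffer wraps it in an LRU weighted by sizeof
data = zict.Buffer(fast={}, slow=slow, n=target_bytes, weight=lambda k, v: sizeof(v))

# Inserting a key synchronously evicts LRU keys from fast to slow
# until the managed memory drops below target again
data["x"] = b"0" * 100_000_000
```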
Every 100ms (distributed.worker.memory.monitor-interval), measure the process memory through psutil. If the process memory exceeds the spill threshold, start spilling keys until the process memory goes below the target threshold (hysteresis cycle). This process re-measures process memory, calls garbage collection, and releases the event loop multiple times, and can potentially take many seconds.
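A simplified sketch of what that cycle does; this is not the actual distributed.worker_memory code, and the names and eviction granularity are illustrative:

```python
import asyncio
import gc
import psutil

proc = psutil.Process()

async def memory_monitor(data, spill_bytes, target_bytes):
    """Runs every 100ms (distributed.worker.memory.monitor-interval)."""
    memory = proc.memory_info().rss
    if memory <= spill_bytes:
        return
    # Hysteresis: once process memory exceeds spill, keep evicting
    # until it drops below target, not merely back below spill
    while memory > target_bytes and data.fast:
        data.fast.evict()                # spill one LRU key to disk
        await asyncio.sleep(0)           # release the event loop
        gc.collect()                     # encourage Python to return memory to the OS
        memory = proc.memory_info().rss  # re-measure, since sizeof can be inaccurate
```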
The intent of this design is to have a very responsive, cheap, but inaccurate first threshold and a slow-to-notice, expensive, but accurate second one. The design, however, is problematic in two cases:
When unmanaged memory (process minus managed) is very high, e.g. due to a memory leak, a large heap from the running user functions, or an underestimated output of sizeof(). In the extreme case of a leak, you're going to reach the spill threshold without ever having hit the target threshold, and then spill the whole contents of Worker.data all at once.
When unmanaged memory is negative, due to an overestimated output of sizeof(). This causes target to start spilling too soon, while there's still plenty of memory available.
Proposed design
In zict:
Add an offset property to zict.LRU. This property is added to total_weight for the purpose of deciding when to evict, as sketched below.
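A minimal sketch of the idea, as a hypothetical stand-in for zict.LRU (the real class has a richer API; offset here is purely illustrative):

```python
from collections import OrderedDict

class LRU:
    """Hypothetical minimal LRU mapping with an `offset` that counts
    against the weight budget (not the real zict.LRU)."""

    def __init__(self, n, on_evict, weight=len):
        self.n = n                 # target weight (bytes)
        self.on_evict = on_evict   # callback that spills (key, value) to slow
        self.weight = weight       # e.g. dask.sizeof.sizeof
        self.offset = 0            # extra weight, e.g. unmanaged process memory
        self.total_weight = 0
        self._d = OrderedDict()

    def __setitem__(self, key, value):
        # assumes key is not already present, for brevity
        self._d[key] = value
        self.total_weight += self.weight(value)
        self.evict_until_below_target()

    def __getitem__(self, key):
        self._d.move_to_end(key)  # mark as most recently used
        return self._d[key]

    def evict_until_below_target(self):
        # offset participates in the comparison, so high unmanaged memory
        # triggers spilling even when managed memory alone is below n
        while self._d and self.total_weight + self.offset > self.n:
            key, value = self._d.popitem(last=False)  # least recently used
            self.total_weight -= self.weight(value)
            self.on_evict(key, value)
```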
In distributed.worker_memory:
Every 100ms, measure process memory and calculate unmanaged memory.
If process memory is above the spill threshold and there is data in Worker.data.fast, run garbage collection and re-measure process memory.
Update Worker.data.fast.offset to the amount of unmanaged memory.
Manually trigger spilling in zict (see the sketch after this list).
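The worker_memory side could then look like this sketch, reusing the hypothetical LRU class from the previous snippet:

```python
import gc
import psutil

proc = psutil.Process()

def monitor_step(fast, spill_bytes):
    """Proposed 100ms cycle: fold unmanaged memory into the LRU's budget."""
    memory = proc.memory_info().rss
    if memory > spill_bytes and fast.total_weight > 0:
        gc.collect()                     # try to reclaim memory before spilling
        memory = proc.memory_info().rss  # re-measure after garbage collection
    # Unmanaged memory = process memory not accounted for by sizeof().
    # It may be negative when sizeof() overestimates, which lets fast
    # hold more data instead of spilling too soon.
    fast.offset = memory - fast.total_weight
    fast.evict_until_below_target()
```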
In distributed.worker_state_machine._transition_to_memory, distributed.Worker.execute, and distributed.Worker.get_data: no change, but the offset is now considered every time a key is inserted in fast.
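For example, with the hypothetical LRU class above, an ordinary insert now reacts to unmanaged memory:

```python
lru = LRU(n=1_000, on_evict=lambda k, v: print("spilled", k), weight=len)
lru["x"] = "A" * 400   # fits: 400 + offset 0 <= 1000
lru.offset = 500       # the monitor measured 500 bytes of unmanaged memory
lru["y"] = "B" * 400   # 800 + 500 > 1000, so "x" is spilled during the insert
```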
Notes
This could cause zict to synchronously spill many GiBs at once, without ever releasing the event loop. This change should be paired with Asynchronous Disk Access in Workers #4424.
Leaving the current thresholds unchanged, you'll start spilling a lot earlier; effectively, target is the new spill. I think it's safe to bump both by 0.1, making spill the same as pause.
We should rename "spill" to "aggressive_gc" to clarify its new meaning.