Historically, the Dask scheduler did not implement any particular logic to manage distributed data after it has been created. This can lead to imbalances in memory allocation across the cluster, excessive memory consumption, and counter-intuitive out-of-memory issues.
This talk introduces a new feature of Dask, the Active Memory Manager daemon, which aims to resolve these long-standing issues by removing unnecessary replicas and relocating the remaining data to even out the memory load among workers. The same system also enables more robust worker retirement, adaptive downscaling in the middle of a computation, and a redesign of the out-of-memory worker pause.
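As a minimal sketch, the Active Memory Manager can be switched on through the Dask configuration; the key names below follow the `distributed` package's configuration schema, while the interval and policy shown are illustrative rather than a recommended setup:

```yaml
# Fragment of a Dask config file (e.g. ~/.config/dask/distributed.yaml).
# Starts the Active Memory Manager daemon on the scheduler and runs a
# policy that drops unnecessary task replicas to free worker memory.
distributed:
  scheduler:
    active-memory-manager:
      start: true          # run the AMM automatically with the scheduler
      interval: 2s         # how often the AMM iterates (illustrative value)
      policies:
        - class: distributed.active_memory_manager.ReduceReplicas
```

The same settings can also be applied programmatically via `dask.config.set` before the scheduler starts, which is convenient for experimentation without editing config files.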
I spent 10 years working on the technical infrastructure underlying Monte Carlo simulations for finance. I'm currently working full time on improving the dask and dask.distributed open source packages.
Visit the speaker on GitHub.