Parallelised calculations using dask. #30

Closed

euronion opened this issue Jul 30, 2019 · 7 comments

@euronion
Collaborator

Keeping this here as a reminder. Not a pressing issue, but something that has been bothering me for some time.

I have not yet observed atlite using more than one processing core for calculations (e.g. wind speed to wind power conversion).
I would assume that this should be easy to enable, since the backend uses dask and xarray, but I have not been able to identify the culprit.

Checking for parallelised activity

Initially I was only monitoring the CPU utilisation using htop:
There, only one core seems to be active.

A better way is to use a dask dashboard and a local client:

from dask.distributed import Client
import xarray as xr

# Start a local cluster; the client object links to the dashboard
# (by default at http://localhost:8787) for monitoring task activity.
client = Client()
client

I also do not see any dask activity during calculations there.

Some background info

Checking cutout.data shows an xarray Dataset backed by plain numpy arrays.
Loading the dataset with the chunks option (xarray.open_dataset(..., chunks=...)) or rechunking it after the cutout has been loaded via cutout.data.chunk(...)
does show some dask activity during calculations, but it is slow; I suspect a lot of overhead from spawning and orchestrating the different workers and threads.
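
For reference, a minimal sketch of the two rechunking routes mentioned above; the file path and chunk sizes are placeholders, and cutout stands for an already prepared atlite cutout object:

import xarray as xr

# Route 1: open the cutout's NetCDF file directly with dask-backed chunks
# (path and chunk size are illustrative only).
ds = xr.open_dataset("cutout.nc", chunks={"time": 100})

# Route 2: rechunk the dataset of an already loaded (numpy-backed) cutout.
chunked = cutout.data.chunk({"time": 100})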

Low priority

With the new version (#20) calculations have become significantly faster and caching of datasets has become obsolete (it can now be done implicitly by xarray and dask).
So this is only interesting when converting large datasets or doing repeated conversions.

Current versions

> from platform import python_version
> python_version()
'3.7.3'

> import xarray
> xarray.__version__
'0.12.3'

> import dask
> dask.__version__
'2.1.0'
@coroa
Member

coroa commented Jul 30, 2019

The "culprit" is two-fold:

@coroa
Member

coroa commented Jul 30, 2019

To play with dask, you can use windows = False now.

@euronion
Collaborator Author

I see, thanks for pointing this semi-hidden feature out!

Setting windows = False and defining chunks={...} indeed allows running convert_wind with dask (the default performance is lower for me, though, so it obviously requires some tweaking).

Shall we leave the issue open as a reminder (for documentation & porting the non-dask conversion functions)?

@coroa
Member

coroa commented Jul 30, 2019

Yes, leave it open.

@euronion
Collaborator Author

[...] instead this would have to be rewritten as heat_demand = heat_demand.clip(lower=0.).
The pv module has several of these in-place patterns, since they generally run faster than
creating copies with clip, at the drawback of excluding dask.
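
For illustration, a minimal sketch of the two patterns being compared (heat_demand here is just a dummy DataArray standing in for the real field):

import numpy as np
import xarray as xr

heat_demand = xr.DataArray(np.random.randn(4, 3), dims=("time", "x"))

# In-place pattern: mutates the underlying numpy array directly.
# Fast, but requires the data in memory and excludes dask arrays.
heat_demand.values[heat_demand.values < 0] = 0.

# Dask-compatible pattern: returns a new (possibly lazy) DataArray.
heat_demand = heat_demand.clip(min=0.)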

Following up on the performance question, here are the two variants for comparison:

Using ds.values

# 1. Unloaded
%%time
d['temperature'].values[d['temperature'].values < 10] = 10
CPU times: user 595 ms, sys: 1.13 s, total: 1.73 s
Wall time: 3.89 s

# 2. Loaded
%%timeit
d['temperature'].values[d['temperature'].values < 10] = 10
87.8 ms ± 375 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

Using ds.clip(...)

# 1. Unloaded
%%time
d['temperature'] = d['temperature'].clip(min=10)
CPU times: user 604 ms, sys: 1.24 s, total: 1.85 s
Wall time: 3.89 s

# 2. Loaded
%%timeit
d['temperature'] = d['temperature'].clip(min=10)
197 ms ± 4.08 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Conclusion

#1. Unloaded was run after a fresh start of the script (notebook) with the cutout prepared
and assigned but the data still on disk.
#2. Loaded was in both cases executed right after #1. Unloaded and thus has the data already in memory.

The performance difference exists but is negligible IMHO. I strongly suggest we switch all occurrences to clipping operations, as this would make the code more homogeneous and probably significantly easier to maintain in the future.

@coroa
Member

coroa commented Aug 15, 2019

The pv module is the difficult beast, beware. I don't think it is as easy there, but feel free to start a new branch.

@coroa
Member

coroa commented Sep 19, 2019

It would probably make sense to retry the dask evaluation of wind generation with the new commit 1294373 on v0.2, which parallelizes interpolation with dask.
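
For context, a generic way to parallelise this kind of pointwise interpolation over a dask-backed array is xarray's apply_ufunc with dask="parallelized"; this is only an illustrative sketch, not necessarily what the commit does (the power curve and array shapes are made up):

import numpy as np
import xarray as xr

# Dummy dask-backed wind speed field (shape and chunking are illustrative).
wnd = xr.DataArray(
    np.random.rand(8760, 10, 10) * 25, dims=("time", "y", "x")
).chunk({"time": 1000})

# Toy power curve as sample points (speed in m/s, power in p.u.).
speeds = np.array([0., 3., 12., 25.])
power = np.array([0., 0., 1., 1.])

# apply_ufunc maps the interpolation over each chunk in parallel
# instead of pulling the whole array into memory first.
cap_factor = xr.apply_ufunc(
    lambda ws: np.interp(ws, speeds, power),
    wnd,
    dask="parallelized",
    output_dtypes=[float],
)
result = cap_factor.mean("time").compute()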

@fneum fneum added this to the Release v0.2 milestone Mar 3, 2020