Parallelised calculations using dask. #30

Closed

euronion opened this issue Jul 30, 2019 · 7 comments

@euronion
Collaborator

Keeping this here as a reminder. Not a pressing issue, but something that has been bothering me for some time.

I have not yet observed atlite using more than one processing core for calculations (e.g. wind speed to wind power conversion).
I would assume that this should be easy to enable, since the backend uses dask and xarray, but I have not been able to identify the culprit.

Checking for parallelised activity

Initially I was only monitoring the CPU utilisation using htop:
There, only one core seems to be active.

A better way is to use a dask dashboard and a local client:

from dask.distributed import Client
import xarray as xr

# Start a local cluster; the client object links to the dashboard
# (by default at http://localhost:8787) for monitoring task activity.
client = Client()
client

I also do not see any dask activity during calculations there.

Some background info

Checking cutout.data shows an xarray Dataset backed by plain numpy arrays.
Loading the dataset with the chunks option (xarray.open_dataset(..., chunks=...)) or rechunking it after the cutout has been loaded via cutout.data.chunk(...)
does show some dask activity during calculations, but it is slow; I suspect a lot of overhead from spawning and orchestrating the different workers and threads.
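
For reference, a minimal sketch of the two rechunking routes mentioned above; the file path and chunk sizes are placeholders, and cutout stands for an already prepared atlite cutout object:

import xarray as xr

# Route 1: open the cutout's NetCDF file directly with dask-backed chunks
# (path and chunk size are illustrative only).
ds = xr.open_dataset("cutout.nc", chunks={"time": 100})

# Route 2: rechunk the dataset of an already loaded (numpy-backed) cutout.
chunked = cutout.data.chunk({"time": 100})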

Low priority

With the new version (#20) calculations have become significantly faster and caching of datasets has become obsolete (it can now be done implicitly by xarray and dask).
So this is only interesting when converting large datasets or doing repeated conversions.

Current versions

> from platform import python_version
> python_version()
'3.7.3'

> import xarray
> xarray.__version__
'0.12.3'

> import dask
> dask.__version__
'2.1.0'
@coroa
Member

coroa commented Jul 30, 2019

The "culprit" is two-fold:

@coroa
Member

coroa commented Jul 30, 2019

To play with dask, you can use windows = False now.

@euronion
Collaborator Author

I see, thanks for pointing this semi-hidden feature out!

Setting windows = False and defining chunks={...} indeed allows running convert_wind with dask (the default performance is lower for me, though, so it obviously requires some tweaking).

Shall we leave the issue open as a reminder (for documentation & porting the non-dask conversion functions)?

@coroa
Member

coroa commented Jul 30, 2019

Yes, leave it open.

@euronion
Collaborator Author

[...] instead this would have to be rewritten as heat_demand = heat_demand.clip(lower=0.).
The pv module has several of these in-place patterns, since they generally run faster than
creating copies with clip, at the drawback of excluding dask.
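
For illustration, a minimal sketch of the two patterns being compared (heat_demand here is just a dummy DataArray standing in for the real field):

import numpy as np
import xarray as xr

heat_demand = xr.DataArray(np.random.randn(4, 3), dims=("time", "x"))

# In-place pattern: mutates the underlying numpy array directly.
# Fast, but requires the data in memory and excludes dask arrays.
heat_demand.values[heat_demand.values < 0] = 0.

# Dask-compatible pattern: returns a new (possibly lazy) DataArray.
heat_demand = heat_demand.clip(min=0.)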

Following up on the performance question, here are the two variants for comparison:

Using ds.values

# 1. Unloaded
%%time
d['temperature'].values[d['temperature'].values < 10] = 10
CPU times: user 595 ms, sys: 1.13 s, total: 1.73 s
Wall time: 3.89 s

# 2. Loaded
%%timeit
d['temperature'].values[d['temperature'].values < 10] = 10
87.8 ms ± 375 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

Using ds.clip(...)

# 1. Unloaded
%%time
d['temperature'] = d['temperature'].clip(min=10)
CPU times: user 604 ms, sys: 1.24 s, total: 1.85 s
Wall time: 3.89 s

# 2. Loaded
%%timeit
d['temperature'] = d['temperature'].clip(min=10)
197 ms ± 4.08 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Conclusion

#1. Unloaded was run after a fresh start of the script (notebook) with the cutout prepared
and assigned but the data still on disk.
#2. Loaded was in both cases executed right after #1. Unloaded and thus has the data already in memory.

The performance difference exists but is negligible IMHO. I strongly suggest we switch all occurrences to clipping operations, as this would make the code more homogeneous and probably significantly easier to maintain in the future.

@coroa
Member

coroa commented Aug 15, 2019

The pv module is the difficult beast, beware. I don't think it is as easy there, but feel free to start a new branch.

@coroa
Member

coroa commented Sep 19, 2019

It would probably make sense to retry the dask evaluation of wind generation with the new commit 1294373 on v0.2, which parallelizes interpolation with dask.
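
For context, a generic way to parallelise this kind of pointwise interpolation over a dask-backed array is xarray's apply_ufunc with dask="parallelized"; this is only an illustrative sketch, not necessarily what the commit does (the power curve and array shapes are made up):

import numpy as np
import xarray as xr

# Dummy dask-backed wind speed field (shape and chunking are illustrative).
wnd = xr.DataArray(
    np.random.rand(8760, 10, 10) * 25, dims=("time", "y", "x")
).chunk({"time": 1000})

# Toy power curve as sample points (speed in m/s, power in p.u.).
speeds = np.array([0., 3., 12., 25.])
power = np.array([0., 0., 1., 1.])

# apply_ufunc maps the interpolation over each chunk in parallel
# instead of pulling the whole array into memory first.
cap_factor = xr.apply_ufunc(
    lambda ws: np.interp(ws, speeds, power),
    wnd,
    dask="parallelized",
    output_dtypes=[float],
)
result = cap_factor.mean("time").compute()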

@fneum fneum added this to the Release v0.2 milestone Mar 3, 2020