Dask compatibility #76

Closed
wants to merge 338 commits into from

FabianHofmann (Contributor) commented Jun 3, 2020

This PR introduces a fully dask-compatible version. It builds on branch v0.2 and further improves performance and structure.
It goes into PR #20

closes #61
solves #48

euronion and others added 30 commits August 7, 2019 17:10
README: Finish switch to README.md from README.rst.
resource: Bugfix in retrieving local turbines.
FabianHofmann and others added 29 commits May 28, 2020 00:52
…n to cutout.data.attr,

				slices are directly processed in coordinates creation
	 - re-enable tmp_dir in cutout prepare
	 - to_netcdf has different mode 'a'/'w' depending on whether file exists
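
The mode selection mentioned for to_netcdf could look roughly like the sketch below. The helper name `netcdf_write_mode` is hypothetical and the PR's actual code is not shown here; the sketch only assumes that the chosen mode is later passed to xarray's `to_netcdf(path, mode=...)`.

```python
import os
import tempfile


def netcdf_write_mode(path):
    """Return 'a' (append) if the file already exists, else 'w' (write)."""
    # Hypothetical helper, not taken from the PR.
    return "a" if os.path.isfile(path) else "w"


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "cutout.nc")
    mode_before = netcdf_write_mode(target)  # file does not exist yet
    open(target, "wb").close()               # create an empty placeholder file
    mode_after = netcdf_write_mode(target)   # file exists now
```

Appending keeps variables that are already stored in the file, while writing starts a fresh file.
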
gebco.py: fix output of get_data
sarah.py: set interpolate always to true
- modify input variables of get_coords function.
	- update docstrings for cutout class
	- remove support for cutout_dir, add warning and pointer to migration function
	- remove support for data argument, as this requires further TODOs and can be worked around very easily
	- remove default for module, this argument must be given
	- abolish is_view cases
	- add assertions for argument requirements 'x', 'y', 'time', 'module' when building new cutout
	- Improve argument exception
	- Make cutout representation better
	- Ensure projection in cutout building
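
A check for the required arguments 'x', 'y', 'time' and 'module' might be sketched as follows. This is an illustration only: the function name `check_cutout_args` is hypothetical, and it raises a `TypeError` rather than using the assertions the commit message mentions.

```python
REQUIRED_ARGS = ("x", "y", "time", "module")


def check_cutout_args(**cutoutparams):
    """Raise if any argument needed to build a new cutout is missing."""
    # Hypothetical helper, not taken from the PR.
    missing = [name for name in REQUIRED_ARGS if cutoutparams.get(name) is None]
    if missing:
        raise TypeError(
            "Arguments " + ", ".join(missing) + " are required to build a new cutout."
        )
    return cutoutparams


# All required arguments given: the parameters pass through unchanged.
params = check_cutout_args(
    x=slice(5, 10), y=slice(45, 50), time="2013-01", module="era5"
)

# 'time' and 'module' missing: the error names both of them.
try:
    check_cutout_args(x=slice(5, 10), y=slice(45, 50))
    error_message = None
except TypeError as err:
    error_message = str(err)
```
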
tests: run tests for era5 and mixed ['sarah', 'era5'] cutouts
cutout.py: reenable data as an optional argument
fix typo in migration function
	- clean imports
	- fully integrate chunks as a cutout parameter and property
	- set chunking as standard loading of cutout
data.py - use cutout.chunks property
era5.py - use cutout.chunks property
sarah.py
	- use cutout.chunks property
	- make module dask friendly: this commit removes all .values calls, which prevent dask from chunking.
	- direct import of numpy functions which are often used
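
Why `.values` calls get in the way of dask can be seen in a small, generic example. It is not code from sarah.py; the array and its chunking are made up for illustration, and it assumes dask is installed alongside xarray.

```python
import dask.array
import numpy as np
import xarray as xr

# A small chunked (dask-backed) array standing in for cutout data.
raw = xr.DataArray(
    np.arange(12.0).reshape(3, 4), dims=("time", "x")
).chunk({"time": 1})

# Operating on the DataArray itself keeps the computation lazy and chunked.
lazy = (raw * 2).sum("x")
is_lazy = isinstance(lazy.data, dask.array.Array)

# Going through .values loads everything into memory as a numpy array.
eager = raw.values * 2
is_eager = isinstance(eager, np.ndarray)

# The lazy result is only evaluated on request.
result = lazy.compute()
```
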
convert.py:
	- restructure convert_and_aggregate function; this makes it faster if only a layout is given.
	- change show_progress to bool only
	- change layout to be xr.DataArray only
From xarray version 0.15.1, the .values of dimension coordinates cannot be assigned. You should
use the .assign_coords() method instead.

See release notes for xarray 0.15.1 "breaking changes":

http://xarray.pydata.org/en/stable/whats-new.html#v0-15-1-23-mar-2020
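
A minimal illustration of the replacement pattern, using a made-up array rather than code from this PR:

```python
import numpy as np
import xarray as xr

da = xr.DataArray(
    np.zeros((2, 3)),
    dims=("y", "x"),
    coords={"y": [50.0, 50.5], "x": [10.0, 10.5, 11.0]},
)

# Instead of assigning to da.x.values in place, build a new object
# that carries the replaced coordinate.
shifted = da.assign_coords(x=da.x + 0.25)
```

`assign_coords` returns a new object, so the original `da` keeps its old coordinate.
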
…erent modules

	is tested when initializing the cutout. The property 'projection'
	will then only look at the projection of the first module.
convert.py:
	- try out dense operation for indicator matrix multiplication
	- replace aggregate_matrix function with tensor dot (still figuring out performance)
gis.py:
	- argument shapes can now also be a geopandas frame
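
The tensor dot idea from the convert.py notes above can be sketched with plain numpy. The shapes and axis order here (time, y, x for the data and region, y, x for the dense indicator matrix) are assumptions for illustration, not taken from the PR.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.random((4, 3, 5))    # assumed layout: (time, y, x)
matrix = rng.random((2, 3, 5))  # assumed layout: (region, y, x)

# Contract over the two spatial axes in one call; the result is (time, region).
aggregated = np.tensordot(data, matrix, axes=([1, 2], [1, 2]))

# The same aggregation written as a flattened matrix product, for comparison.
flat = data.reshape(4, -1) @ matrix.reshape(2, -1).T
```

Both forms give the same numbers; which one is faster depends on array sizes and sparsity, which matches the commit's remark that performance is still being figured out.
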
Successfully merging this pull request may close these issues.

[v0.2] Large memory overhead in cutout.pv