Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Daskify NIR reflectance calculations #59

Merged
merged 13 commits into from
Apr 9, 2019
Merged

Conversation

pnuu
Copy link
Member

@pnuu pnuu commented Nov 29, 2018

This PR makes the internals of near-infrared calculations to use dask. All the existing code unit tests pass, so hopefully this won't break anything for existing non-SatPy users.

@coveralls
Copy link

coveralls commented Nov 29, 2018

Coverage Status

Coverage increased (+19.2%) to 74.146% when pulling 86086df on pnuu:feature-daskify into 1ad2b38 on pytroll:master.

pyspectral/radiance_tb_conversion.py Outdated Show resolved Hide resolved
pyspectral/radiance_tb_conversion.py Outdated Show resolved Hide resolved
@pnuu pnuu requested a review from adybbroe November 30, 2018 13:50
Copy link
Collaborator

@adybbroe adybbroe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Collaborator

@adybbroe adybbroe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Collaborator

@adybbroe adybbroe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suppose you could remove the lines commented out

if np.isscalar(tb_near_ir):
tb_nir = np.array([tb_near_ir, ])
else:
tb_nir = np.array(tb_near_ir)
tb_nir = np.asanyarray(tb_near_ir)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does np.asanyarray convert dask arrays to numpy arrays?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does:

In [1]: import dask.array as da                                                                                                                                      

In [2]: a = da.arange(5., chunks=2)                                                                                                                                  

In [3]: import numpy as np                                                                                                                                           

In [4]: type(np.asanyarray(a))                                                                                                                                       
Out[4]: numpy.ndarray

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@djhoese what do you suggest?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned on slack, I'd prefer if everything could be kept as dask arrays for as long as possible and if things have to be computed they should be done in a map_blocks or delayed function. As @pnuu mentioned on slack converting to numpy here made the code faster and use less memory in his test cases. I'm fairly certain this is because the array is used later on in a non-dask-friendly way and the dask array is actually being computed multiple times. I think I pointed them out on slack during the PCW.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this part of pyspectral wasn't dask friendly before, could we perhaps merge this PR now and make sure things works and then work further to improve the daskfriendliness?
@mraspaud @djhoese @pnuu ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with merging. Then we could also merge pytroll/satpy#529 so this PR would be usable via SatpY.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is working now ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there hasn't been anything that didn't work since I last touched this, just that it might not be quite optimal in performance and how things should be done with dask.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine. +0

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine, I will just try do some system testing off line first to be sure. Any suggested recipes/tests @pnuu ?

@adybbroe
Copy link
Collaborator

adybbroe commented Apr 9, 2019

I tested four RGBs: cloudtop_daytime, day_microphysics, day_microphysics_winter, natural_color on SEVIRI. And they are identical before and after the PRs (including the satpy PR corresponding to this one).

@adybbroe adybbroe merged commit d784e99 into pytroll:master Apr 9, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement PCW Pytroll Contributers Week work in progress
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants