Performance issues with xr.apply_ufunc() and jmd95 #9

Open
shanicetbailey opened this issue Dec 7, 2020 · 0 comments

shanicetbailey commented Dec 7, 2020

I was trying to compute the density equation (for sigma_2) using SOSE model data and fastjmd95, but ran into some performance issues, particularly with this line:

drhodt = xr.apply_ufunc(jmd95numba.drhodt, ds.SALT, ds.THETA, pref,
                        output_dtypes=[ds.THETA.dtype],
                        dask='parallelized').reset_coords(drop=True)

Workers (30 at most) kept dying for reasons I could not identify. This led me to run a matrix of computations to isolate the underlying problem: is xr.apply_ufunc() alone having difficulty, or is it the combination of apply_ufunc() and jmd95?
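A minimal sketch of how the two wrappers can be compared in isolation, on small randomized dask-backed data. Everything here is an assumption made for illustration: `dummy_drhodt` is a hypothetical stand-in that only mimics the three-argument call of `jmd95numba.drhodt`, and the array shape, chunking, dimension names and `pref` value are made up rather than taken from SOSE.

```python
import dask.array as dsa
import numpy as np
import xarray as xr


def dummy_drhodt(salt, theta, pref):
    # Hypothetical stand-in: same (salt, theta, pref) call shape as the real
    # function, but NOT the JMD95 equation of state.
    return salt + theta + pref


# Assumed shape and chunking: one chunk per step along the first dimension.
shape, chunks = (4, 6, 8), (1, 6, 8)
dims = ("time", "Z", "X")
salt = xr.DataArray(dsa.random.random(shape, chunks=chunks), dims=dims)
theta = xr.DataArray(dsa.random.random(shape, chunks=chunks), dims=dims)
pref = 2000.0  # assumed scalar reference pressure

# Option 1: let xarray wrap the function and hand the blocks to dask.
via_ufunc = xr.apply_ufunc(
    dummy_drhodt, salt, theta, pref,
    output_dtypes=[theta.dtype],
    dask="parallelized",
)

# Option 2: call dask directly on the underlying dask arrays.
via_blocks = dsa.map_blocks(
    dummy_drhodt, salt.data, theta.data, pref, dtype=theta.dtype
)

# Both routes should produce the same numbers; only the task graph differs.
print(np.allclose(via_ufunc.values, via_blocks.compute()))
```

If both routes agree and run quickly with the stand-in, swapping in the real function one wrapper at a time shows which combination triggers the slowdown.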

Please see my notebook for a full view of the issue and for the run times of combinations of the following options: https://nbviewer.jupyter.org/github/ocean-transport/WMT-project/blob/master/SOSE-budgets/optimization-computing-issue.ipynb

  1. xr.apply_ufunc()
  2. dsa.map_blocks()
  3. xr.map_blocks()
  4. fastjmd95
  5. dummy_function (a simple function, .sum(), chosen to check whether fastjmd95 itself is part of the problem)
  6. SOSE model data
  7. randomized data (to check whether the problem also stems from the model data)

Run times:

  1. 4min 4s: xr.apply_ufunc()-fastjmd95-model data
  2. all tasks go to one worker and it never executes: xr.apply_ufunc()-fastjmd95-randomized data
  3. 29.7 s: xr.apply_ufunc()-sum()-model data
  4. 15.6 s: xr.apply_ufunc()-sum()-randomized data
  5. 51.2 s: dsa.map_blocks()-fastjmd95-model data
  6. 1min 53s: dsa.map_blocks()-fastjmd95-randomized data
  7. 27.7 s: dsa.map_blocks()-sum()-model data
  8. 13.3 s: dsa.map_blocks()-sum()-randomized data
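The eight runs above are the cross product of two wrappers, two functions and two data sources. A minimal sketch of a timing harness for such a matrix follows; `time_matrix` is a hypothetical helper, and the lambdas and lists are placeholders that only carry the labels, so the sketch runs without a cluster. In a real run each wrapper would call the actual library function and force computation, for example with `.compute()` or `.load()`.

```python
import itertools
import time


def time_matrix(wrappers, functions, datasets):
    """Time every wrapper/function/dataset combination; return {label: seconds}."""
    timings = {}
    for (w_name, wrap), (f_name, func), (d_name, data) in itertools.product(
        wrappers.items(), functions.items(), datasets.items()
    ):
        start = time.perf_counter()
        wrap(func, data)  # with dask, the wrapper must trigger the computation
        timings[f"{w_name}-{f_name}-{d_name}"] = time.perf_counter() - start
    return timings


# Placeholders only: these do NOT call xarray, dask or fastjmd95.
wrappers = {
    "xr.apply_ufunc()": lambda func, data: func(data),
    "dsa.map_blocks()": lambda func, data: func(data),
}
functions = {
    "fastjmd95": lambda data: [2.0 * x for x in data],
    "sum()": sum,
}
datasets = {
    "model data": [1.0, 2.0, 3.0],
    "randomized data": [0.5, 0.25],
}

timings = time_matrix(wrappers, functions, datasets)
for label, seconds in timings.items():
    print(f"{label}: {seconds:.6f} s")
```

Because dictionaries preserve insertion order, the labels come out in the same order as the run-time list above.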

Any help figuring out what's going on would be appreciated.
