refactor bootstrap: first resample, then metric #355
Conversation
I'll review this once #354 is merged.
I am unsure why I don't see the speedup here. It is definitely faster than before.

asv continuous -f 1.1 d82f115748f325c240ea7f2f185081ee365f1701 HEAD -b benchmarks_perfect_model.ComputeSmall.time_compute_perfect_model
[ 50.00%] · For climpred commit 3a7cf0a8 <pr-355/bradyrx/AS_bootstrap_metric_refactor> (round 2/2):
[ 50.00%] ·· Benchmarking conda-py3.6-cftime-dask-numpy-xarray
[ 75.00%] ··· benchmarks_perfect_model.ComputeSmall.time_compute_perfect_model ok
[ 75.00%] ··· =========== =========== ============
-- comparison
----------- ------------------------
metric m2m m2c
=========== =========== ============
rmse 52.7±10ms 13.0±0.5ms
pearson_r 46.5±40ms 17.5±4ms
crpss 45.2±10ms 27.3±10ms
=========== =========== ============
[ 75.00%] · For climpred commit d82f1157 <master~6> (round 2/2):
[ 75.00%] ·· Building for conda-py3.6-cftime-dask-numpy-xarray..
[ 75.00%] ·· Benchmarking conda-py3.6-cftime-dask-numpy-xarray
[100.00%] ··· benchmarks_perfect_model.ComputeSmall.time_compute_perfect_model ok
[100.00%] ··· =========== =========== ===========
-- comparison
----------- -----------------------
metric m2m m2c
=========== =========== ===========
rmse 38.3±4ms 29.0±20ms
pearson_r 44.2±8ms 16.7±1ms
crpss 49.2±20ms 21.1±6ms
=========== =========== ===========
BENCHMARKS NOT SIGNIFICANTLY CHANGED.
Do you get a speedup @bradyrx? We should somehow compare a version from two weeks ago with this branch. However, I changed the keyword.
climpred/bootstrap.py
Outdated
else:
    pers_output = False

else:  # if resample_dim == 'member':
This is the new stuff: first bootstrapping, then metric. Later on, and in the classes, we can tear those calls apart.
Incredibly fast:

fosi = load_dataset('FOSI-SST')
le = load_dataset('CESM-LE')
dple = load_dataset('CESM-DP-SST')
%time bs = climpred.bootstrap.bootstrap_hindcast(dple, le, fosi, dim='init', resample_dim='member', iterations=1000)
/Users/aaron.spring/Coding/climpred/climpred/utils.py:141: UserWarning: Assuming annual resolution due to numeric inits. Change init to a datetime if it is another resolution.
'Assuming annual resolution due to numeric inits. '
CPU times: user 1.89 s, sys: 82.9 ms, total: 1.97 s
Wall time: 1.99 s
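To make the "first resample, then metric" idea above concrete, here is a minimal sketch in plain numpy/xarray (variable names and shapes are made up; this is not climpred's internal implementation): draw all resampling indices up front, build one large resampled object, and only then compute the metric once across all iterations.

import numpy as np
import xarray as xr

# Toy ensemble: 10 members x 20 init years (illustrative shapes only).
init = xr.DataArray(np.random.rand(10, 20), dims=["member", "init"])
verif = xr.DataArray(np.random.rand(20), dims=["init"])

iterations = 100
n_member = init.sizes["member"]

# 1) Resample first: member indices drawn with replacement for every iteration.
idx = np.random.randint(0, n_member, (iterations, n_member))
resampled = xr.concat([init.isel(member=i) for i in idx], dim="iteration")

# 2) Then compute the metric once, vectorized over the iteration dimension.
rmse = np.sqrt(((resampled.mean("member") - verif) ** 2).mean("init"))
print(rmse.sizes)  # Frozen({'iteration': 100})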
Sorry for the iterations keyword change. It seems to be causing problems. Awesome that this ran so fast! 1.99s for 1000 iterations. Now, that's just 1D. Do you have a comparison of the time for an earlier version? I.e. what is the time of that call on the currently released version? Maybe you can make a fake hindcast case (or use an MPI hindcast) with a 2D grid and many iterations using dask if you have time.
This is great. And yes, we can separate calls, etc. on the classes implementation. Is this ready for review? Ping me when it is. I'll be gone this afternoon through Sunday, so I can review first thing Monday.
Ready for review.
I think we should also check whether the new resample works on a dim without labels. I remember this failed.
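For reference, a quick hedged check of that point with plain xarray (not a climpred test case): positional indexing via isel does not require coordinate labels on the resampled dimension, so resampling a label-free dim should work.

import numpy as np
import xarray as xr

# DataArray whose 'member' dimension carries no coordinate labels.
da = xr.DataArray(np.random.rand(5, 3), dims=["member", "lead"])
assert "member" not in da.coords

# Resampling with replacement via positional indices still works.
idx = np.random.randint(0, da.sizes["member"], da.sizes["member"])
print(da.isel(member=idx).sizes)  # Frozen({'member': 5, 'lead': 3})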
On my MacBook:
On the supercomputer:
@aaronspring, thanks for all of the work on this. You've done a lot here! Glad we can speed up the bootstrapping a lot in this manner. Go ahead and squash & merge if it's ready, unless there's anything else that needs to be done here.
This won't be the last commit on parallel bootstrapping, but it's a big leap.
I didn't know about the autoclose feature. Nice but unexpected and potentially dangerous... fine with me now.
Yeah I think it looks for "closes #" in the PR or anything similar.
Description
- init which fails otherwise
- asv
- transform_coords=False only in xr.DataArrays (not in xr.Datasets)
- l --> lead
- fix, and f-string only if needed
- _get_chunksize with da.data.npartitions (see the sketch below)
- works on #145 #60 #369
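For context on the _get_chunksize / da.data.npartitions bullet above, a small hedged sketch of the dask attribute involved (the DataArray below is made up, and climpred's _get_chunksize helper itself is not reproduced here):

import dask.array as dsa
import xarray as xr

# A dask-backed DataArray split into four blocks along x.
da = xr.DataArray(dsa.ones((100, 100), chunks=(25, 100)), dims=["x", "y"])

# Total number of dask blocks backing the array.
print(da.data.npartitions)  # 4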
Type of change
Please delete options that are not relevant.
- asv (to detect performance changes)

How Has This Been Tested?
Please describe the tests that you ran to verify your changes. This could point to a cell in the updated notebooks. Or a snippet of code with accompanying figures here.
Checklist (while developing)
- pytest, if necessary.

Pre-Merge Checklist (final steps)
References
Please add any references to manuscripts, textbooks, etc.