-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unit tests are slow #213
Comments
I did implement creating-in-memory in #132. I thought it was being used in most places (there's a save-to-disk argument to the input-creation functions that defaults to not writing). It's possible we need to switch more of the tests over. But I think the combining-into-a-single-Dataset operation may be relatively expensive. The data is created with a lot of small variations on the inputs in different tests or when parameterizing - it might still be worth caching the created datasets, and maybe trying to reduce the number of variations, but it's probably quite a faff to do! |
I had an idea for caching the input data, to avoid re-creating it. Working on it now, PR hopefully coming soon... |
Looking at how long different tests take, the biggest chunks of time are the parallel interpolation tests (which you'd expect to be computationally heavy because they have four FFT transforms as well as the interpolation stencil operation), and the |
xarray 2022.10 causes further slow down of the tests -- deep copying seems to have gotten x20 slower See, e.g. #252 (comment)
Sounds good |
The deep copying is more expensive because it's now copying DataArrays in attributes -- and the Regions have two or three nested levels of this. So ways of removing that nesting are probably worth investigating. Can the Regions make use of views somehow perhaps? |
hmm... wonder if we actually need to deep-copy at all? I suspect the idea at the time was that it would be safer, but maybe it's not needed... |
It seems that prior to 2022.10, deep copying didn't deep copy the attributes (and therefore regions?), so presumably a shallow copy should be fine now? I switched some deep copies to shallow ones and it did make a substantial difference to the one test I've been running, but it's still x10 slower than before. |
Can I ask why xBOUT stores arrays of data in the attrs? Whilst xarray's |
The This change is a 4-liner, and substantially reduces the number of errors we get from xarray-2022.9.0. I'll make a draft PR as a starting point for fixing #275. |
The unit tests take really quite a long time to run. I used the pytest profiling extension, and it looks like creating and opening datasets takes the vast majority of the time.
Is there a lot of reading/writing to disk? Would it be possible to just do things in memory?
The text was updated successfully, but these errors were encountered: