New multimodel module can be slow and buggy for certain recipes #1201
First, let's try to stay factual and polite. SHOUTING is not generally considered that. On the points:
@zklaus: The different length of the time axis seems to be connected to leap years. I just wrote in the other issue:

> The warning and the error most probably mean that there is a different number of days in the 26 years the recipe tries to look at: either 9490 or 9497 days (probably depending on whether the calendar contains leap years or not).

```
(esmvaltool202006) b380216@mistralpp3% grep -in Running.extract_region.air_temperature.*time: /work/bd1083/b380216/output/recipe_pv_capacity_factor_20210629_095915/run/main_log_debug.txt
```
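A quick back-of-the-envelope check of those two cube lengths (an illustration only; it assumes the 26-year period contains 7 leap days):

```python
# 26 years of daily data, counted in two calendars:
years = 26
noleap_days = years * 365         # 9490 days in a 365-day ("noleap") calendar
gregorian_days = noleap_days + 7  # 9497 days if the period contains 7 leap days
print(noleap_days, gregorian_days)  # 9490 9497
```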
Yes @katjaweigel, that's exactly it, well spotted! About @zklaus's point on run times: I would agree, in principle, if it were a run time difference of tens of percent, but a factor of 4 or 5 is unacceptable. I am aware of this happening in certain situations; I did quite a few tests prior to this being merged and noticed it consistently, albeit those tests were done with significantly smaller amounts of data. This is a solid case with plenty of data, and unfortunately this runtime difference is a showstopper in my book, especially since the gain from the reduced memory requirement is relatively marginal 👍
Agreed! But we need a checker inside multimodel to either ignore or fix that AuxCoord (or others that might differ). Failing on a coordinate that is not used in the statistic itself is not the desired behavior.
On the runtime: @valeriupredoi, your opinion is noted and will of course be taken into consideration.

On the AuxCoord: This coordinate *is* used. What is the correct height of the resulting data? 2.0 m? 1.5 m? The weighted average? I don't see that the multimodel preprocessor is in a good position to make that call. Since the overwhelming majority of models use 2 m, and this is the height suggested by the standards, the best approach to me seems to be adding a fix to ACCESS-1.3 that changes the height coordinate.

On the leap day issue: How did the old preprocessor handle this? The only clear solution seems to be some form of time regridding, but I think that should be done by the user explicitly, thus raising an error (though possibly with a better message) is the right thing, imho.
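For illustration, a minimal sketch of what such a dataset fix could look like, assuming ESMValCore's usual `Fix`/`fix_metadata` pattern; the class name, placement, and coordinate handling here are hypothetical, not the fix that was eventually merged:

```python
import numpy as np

from ..fix import Fix  # ESMValCore's base class for dataset fixes


class Tas(Fix):
    """Hypothetical fix pinning the ACCESS-1.3 tas height coordinate to 2 m."""

    def fix_metadata(self, cubes):
        """Set the scalar height coordinate to the standard 2 m."""
        for cube in cubes:
            height = cube.coord('height')
            height.points = np.array([2.0])
        return cubes
```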
Currently, the preprocessor setting:

doesn't solve the leap year issue; it fails with:

and there are still cubes with 9490 and 9497 days.
Note that work on the performance is still in progress in #968; no performance improvement should be expected from the current implementation. The improvement that is included in
The old preprocessor would find the overlap via
If we're talking about issues with the new mmm handling that weren't there before the refactoring, I also found one while testing recipe_perfmetrics_land_CMIP5.yml, as mentioned in this comment: ESMValGroup/ESMValTool#2198 (comment)
Right on, but how can you explain the slowness?
Good call @bettina-gier 👍
Note that all the problems we have discussed so far (including the one brought up by @bettina-gier) are pre-existing bugs that have gone unnoticed until now. Reverting would therefore not solve any issue, but just cover up wrong results. I am at the moment not convinced that is the right way to go about it. Let's focus our energy now on understanding the problems and move to a solution strategy after that.
Klaus, those are not masked problems, but rather things we didn't care enough about to fix in the old module, and that we don't care enough about to have cause failures in the new module. We have an MM module that is broken in several respects, and the time we need to address those aspects is much longer than the time to revert and have a working module. We care about functionality foremost, and then about improving performance and fixing knickknacks. And it's not even that the performance of the new MM is better; as Bouwe says, it's still work in progress.
V, I disagree with that assessment and don't think you can speak for all of us here. As a case in point, trying to advance the issue of the uneven time axis: MOHC's HadGEM2-ES uses a 360-day calendar. Over the 25-year period of this recipe, that's a difference of 25*5 + 25/4 ≈ 131 days, or over 4 months of data that get chopped off the end of all models when you add HadGEM to the list of datasets. Worse, since in all calendars the time is kept as days since a reference point, the same time point (as an integer number of days since the reference) refers to different days of the year, towards the end of the period even to completely different seasons, which all get mixed up here. Whether summer days get mixed into winter or winter days into summer depends on the position of HadGEM in the list of datasets. I think scientists care about this sort of thing.
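To make the calendar mix-up concrete, a small illustration (not from the thread) using `cftime`, showing how the same integer offset decodes to entirely different seasons depending on the calendar:

```python
import cftime

# The same "days since" integer decodes to very different dates in
# different calendars, so naively aligning on the raw numbers mixes
# seasons towards the end of a multi-decade period.
units = "days since 1980-01-01"
for calendar in ("standard", "360_day"):
    date = cftime.num2date(9000, units, calendar=calendar)
    print(calendar, "->", date)
# standard -> 2004-08-22 00:00:00 (late summer)
# 360_day  -> 2005-01-01 00:00:00 (mid-winter)
```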
I don't speak for the others; I speak for the core functionality of a key module in ESMValTool, which is currently busted and needs serious repairs to get both its functionality and its performance up to standard. The interim solution is to replace it with the old module, which is known to work (and has been tested against reference results), while we perform the needed fixes and performance optimizations. It's as simple as that.
I gave a concrete example of where the old module seems to lead to very wrong results. Do you think that example is wrong? Do you think people are aware of this?
*Seems* is the key word; the old module has been tested thoroughly against reference results, unlike the new one. Have a go using the revert branch when you have some time, and if that is indeed a bug, then we list it here and weigh the arguments for and against reverting.
OK @zklaus @katjaweigel @bettina-gier, I have got the revert done in #1202 (I just need to fix the docs build, but that'll be done if we decide to revert). If you guys want to test it, or just run your recipes to actually get results, go for it! I think we need to talk about this asap; luckily we have the ESMValTool call on Thursday. @bouweandela, sign me up for this pls 😁
Sorry to hear that you're experiencing issues with the updated multi-model code (but good to see some real use-case testing coming in!). My intention has always been for the code to be more transparent, so that it will be easier to understand, debug, and optimise. While I understand it's annoying that this recipe doesn't work, let's not rush into reverting the update. I agree with @zklaus that we should prefer properly solving the bugs that are surfacing right now, rather than hiding them away. In the old branch there were no tests, notes, docstrings, etc. that would inform fellow developers about these "intentional workarounds". In updating the mmstats code over the past year or so, we stumbled upon many of these implicit assumptions, e.g. also in #685, and such things keep surfacing all the time, which is just as annoying, just not all at once. With respect to the specific issues mentioned here:
@Peter9192 thanks for the summary and the links to the related issues!
I think so too. Also relevant: #744 (and see #781 for an overview of many related developments).
Not sure what you mean by 'marking' here; I see no label or anything like that on the PR you linked. I think we've made a lot of progress in improving the development process, but this could've been smoother, I agree.
I think there are several reasons for that. For one, there's no easy way to do it. That's why we've spent a lot of time improving the tests and the test infrastructure (e.g. the testbot, the regression tests with sample data, and #1023). Bouwe is currently trying to set up infrastructure to run all recipes on Mistral, but this is still challenging. And for previous releases we actually did a lot of manual tests with recipes, so I would have expected issues to surface during this release as well. But of course, in this case it depends on the effort everyone puts into testing with their frequently used recipes.
@Peter9192 Thanks for your answer!
Cheers @Peter9192 🍺 Here's how I see it:
If we decide to use the older v2.2.0 MM module, we should have a quick-and-dirty v2.3.1 out asap, followed by a bugfix release 2.3.2 when the new MM (and the fx stuff) have been merged.
I like the idea of going back to the old version and waiting until the "shakedown" is finished. A bugfix release once we are happy with the new version sounds like the way to go.
Would it be an idea to have both functions side by side, i.e. let users call either one explicitly? Then you have the benefit of no breakage of the old recipes, giving users some time to migrate, while also having the new implementation available in the release for the shakedown. Once the new implementation has matured, the old version can be removed.
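For what it's worth, a hypothetical sketch of this side-by-side idea: `multi_model_statistics` is the real preprocessor function name, but the `implementation` keyword and the private helpers below are made up to illustrate the migration pattern, not ESMValCore's actual API:

```python
import warnings


def _multi_model_statistics_legacy(cubes, statistics):
    """The well-tested v2.2.0 implementation (placeholder)."""
    raise NotImplementedError


def _multi_model_statistics_new(cubes, statistics):
    """The refactored implementation under shakedown (placeholder)."""
    raise NotImplementedError


def multi_model_statistics(cubes, statistics, implementation="legacy"):
    """Dispatch to the old or the new implementation during the transition."""
    if implementation == "legacy":
        warnings.warn(
            "The legacy multi-model implementation will be removed once "
            "the new one has matured; consider implementation='new'.",
            FutureWarning,
        )
        return _multi_model_statistics_legacy(cubes, statistics)
    return _multi_model_statistics_new(cubes, statistics)
```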
Here is how I see it:
The recipes should surely be fixed, no matter what we decide about the multimodel preprocessor.
That's the right way to do it from a software-lifecycle point of view, absolutely, and in fact it'd be easy, since we don't have to change anything in the Tool. It'd also allow us to do a direct apples-to-apples comparison between the two modules. I like the idea 👍
@zklaus
There certainly may be changes necessary in the core, and a bugfix release is a real possibility. We may even decide to revert the multi-model statistics if weighing the issues and solutions makes that the best choice. But we should be clear where the problems lie, so as
Yes. Longitudes and latitudes must have bounds. I am not sure whether this is a simple dataset issue that needs a fix or a bug in the regridding; that should be discussed in the appropriate issue (see also the bounds sketch after this comment).
I think the issue is with different calendars, so at the moment one needs to avoid using multi-model statistics for daily data with mixed calendars. That seems quite sensible to me. A long-term solution probably involves conversions between calendars; this is commonly done by duplicating or removing days (see the conversion sketch after this comment), but the details should be discussed elsewhere. I would probably rather place such a thing in a (very useful!)
You are completely right about these points.
I'm not sure how much backlash there really would be. If it's only about a few CMIP5 datasets (that have a history of issues with their surface temperature), perhaps not many people care, though this may be more widespread.
Perhaps. I am really not sure either way. But should that also be done for the other offered statistics? What about the standard deviation or extreme percentiles? Either way, again, let's discuss this in the right issue. If we decide that averaging 1.5 m and 2 m tas is the way to go, let's state it that way and make clear exactly what behavior we want.
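On the bounds point above, a small user-side workaround sketch with iris (`guess_bounds` is standard iris API; whether guessing is appropriate for a given grid is exactly the question to settle in the dataset-fix discussion):

```python
import iris


def ensure_lat_lon_bounds(cube: iris.cube.Cube) -> iris.cube.Cube:
    """Guess missing longitude/latitude bounds in place (workaround only)."""
    for name in ("longitude", "latitude"):
        coord = cube.coord(name)
        if not coord.has_bounds():
            coord.guess_bounds()
    return cube
```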
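And on the calendar point: one existing implementation of the "duplicate or remove days" approach is xarray's `convert_calendar` (available since xarray 0.20). A sketch under those assumptions, purely to illustrate the concept rather than an ESMValCore feature:

```python
import numpy as np
import xarray as xr

# One year of daily data on the standard calendar (2000 is a leap year).
time = xr.cftime_range("2000-01-01", periods=366, freq="D", calendar="standard")
da = xr.DataArray(np.arange(366.0), coords={"time": time}, dims="time")

# Mapping onto a 360-day calendar removes days spread across the year.
da_360 = da.convert_calendar("360_day", align_on="year")
print(da_360.sizes["time"])  # 360
```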
I totally agree with Katja on the tas height issue.
@zklaus I'm not blaming the changes in the core for the issues themselves. I'm just not happy that there was not enough testing before the release to discover them, and that there was no warning about possible "breaking changes". It would be good if everything in the Tool were working when it is released, and at the moment that is not the case with the new core. So I wonder how that could best be solved?
@ruthlorenz, @katjaweigel, it's good to know that more models are affected (I confirmed this for both CMIP5 and CMIP6). I opened a new issue to discuss specifically the question of how to treat this; it would be great if you could help resolve the questions over there in #1204. @katjaweigel, on tests vs. releases: first, keep in mind that the currently released version of the Tool, 2.2.0, requires a core version <2.3.0, so there is no released version of the Tool that has these problems. Whether a change is breaking or not can be difficult to determine until it breaks something, more so if the new behavior is actually correct. I agree that we should try to keep all models in the recipes; even though that is sometimes not possible (for example, Manuel Schlund ran into a problem with a dataset that has meanwhile been retracted), just removing models because we can't deal with them anymore is certainly not the way to go.
I think all issues raised here are now being addressed in other issues. An overview can be found in ESMValGroup/ESMValTool#2218. Closing this. If you disagree, please feel free to reopen with a comment explaining the need.
Unfortunately I must consider reverting to the older multimodel module asap. The new module is EXTREMELY slow, buggy in different spots, and not that much more efficient in memory usage. Take this recipe, for instance:
I propose we retire the current implementation to a development branch and temporarily reinstate the 2.2.0 module.