Update tests for multimodel statistics preprocessor #1023
Conversation
One test is still failing (marked with xfail). Another issue that came up is related to the standard deviation and is described in #1024.
Nice job @stefsmeets! I think these modified tests make the requirements for a new implementation much more explicit. As far as I can see, the new tests cover all of the old tests except for two of them. Could you have a look at how we could add those within the new test framework?
@pytest.mark.xfail(reason='Multimodel statistics returns the original cubes.')
def test_edge_case_time_no_overlap_fail():
    """Test case when time coords do not overlap using span='overlap'.

    Expected behaviour: `multi_model_statistics` should fail if time
    points are not overlapping.
    """
I'm not sure if the desired behaviour here is to fail. I would say so if I were designing it from scratch, but maybe this is actually considered a 'feature' by some users/recipes? Any thoughts on this @valeriupredoi @LisaBock?
@schlunma what do you think?
My first intuition would also be to let it fail when there is no overlap, but I guess returning an empty cube (or no cube) could also be a reasonable option. I've never used that "feature", and I don't really know if anyone uses it. I would be fine with changing this behavior if it is not used anywhere.
Okay, let's leave it like this then 👍
Remember, the purpose of the test in general is to verify the integrity of the functional behavior, not to make assumptions of its own; so if the function fails when there is no overlap, the test should check for that 👍
Good point @valeriupredoi, the point of this PR is to pin down the definition so that we have a better description of the functionality that we want. Currently there is no test that explicitly checks for this scenario, so the behaviour is not defined.
Therefore we want to clarify the desired behaviour with this PR. Since this return statement is inconsistent with the normal output (a list is returned rather than a dict with statistics), I marked this with XFAIL to indicate a known problem (which will be fixed with #968 😁).
If this is behaviour that we need from this function, that is also OK with me; in that case I will adjust the test to ensure that a list of cubes is returned if there is no overlap, and update #968 accordingly.
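For illustration, here is a rough sketch (not the actual code in this PR) of how the two options being discussed could each be expressed as a test. The `make_cube` helper is made up for this example, and `ValueError` is only an assumption about which exception would be raised:

```python
import iris
import numpy as np
import pytest

from esmvalcore.preprocessor import multi_model_statistics


def make_cube(time_points):
    """Build a minimal 1-D cube with only a time coordinate (illustrative)."""
    time = iris.coords.DimCoord(
        np.array(time_points, dtype=float),
        standard_name='time',
        units='days since 1850-01-01',
    )
    return iris.cube.Cube(
        np.ones(len(time_points)),
        var_name='tas',
        units='K',
        dim_coords_and_dims=[(time, 0)],
    )


def test_no_overlap_fails():
    """Option 1: the preprocessor fails when time points do not overlap."""
    cubes = [make_cube([1, 2, 3]), make_cube([4, 5, 6])]
    # ValueError is an assumption; the exact exception type would be
    # part of the definition agreed on in this discussion.
    with pytest.raises(ValueError):
        multi_model_statistics(cubes, span='overlap', statistics=['mean'])


def test_no_overlap_returns_input():
    """Option 2: the original cubes are returned unchanged."""
    cubes = [make_cube([1, 2, 3]), make_cube([4, 5, 6])]
    result = multi_model_statistics(cubes, span='overlap', statistics=['mean'])
    assert result == cubes
```

Whichever option is chosen, only one of the two tests would be kept; the other describes behaviour that is explicitly not wanted.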
Sure, good call! I am perfectly fine with this as it is here; discussing further in the other PR is cool. The only bit that I think needs addressing is the array equality function below; other than that, good work here 🍺
Hi guys, great job you're doing here! I have a question regarding this PR: should it already include tests for all functionalities that we want the future implementation to have? If yes, do we want to include tests that ensure that the preprocessor works with cubes that have arbitrary coordinates? I think that's a really desired feature (#1018, #891, #890). Right now, I think all the tests only include the time coordinate.
Hi @schlunma, yes I think this test-driven development makes a lot of sense! It helps to map all the requirements for a new implementation.
Cool, sounds great! I will add some tests later today.
What's maybe even more important than tests for new desired features is tests for 'hidden' requirements. We recently saw an example of this in #975. Are there any other 'features' or 'requirements' you can think of that are currently not covered by the tests?
Ahh, nice, so that should be covered then.
Not really. One thing that recently came up is #1009 (handling of …).
One suggestion: since we want to support cubes with no time coordinate, I think that the …
I like the idea, but I'm not sure we'll address this in #968 already; it sounds more like a follow-up. But we can certainly open an issue about it and maybe already start thinking about a test that belongs to it.
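As a purely illustrative sketch of the kind of test being asked about above (not one of the tests actually added in this PR), a parametrised test for cubes that carry an extra, arbitrary dimension might look roughly like this:

```python
import iris
import numpy as np
import pytest

from esmvalcore.preprocessor import multi_model_statistics


def make_cube(extra_coord_name, extra_len=2):
    """Build a small 2-D (time, <extra>) cube filled with ones."""
    time = iris.coords.DimCoord([1.0, 2.0, 3.0], standard_name='time',
                                units='days since 1850-01-01')
    extra = iris.coords.DimCoord(np.arange(extra_len, dtype=float),
                                 long_name=extra_coord_name)
    return iris.cube.Cube(np.ones((3, extra_len)), var_name='tas', units='K',
                          dim_coords_and_dims=[(time, 0), (extra, 1)])


@pytest.mark.parametrize('extra_coord_name', ['air_pressure', 'depth', 'x'])
def test_mean_with_arbitrary_dimension(extra_coord_name):
    """The multi-model mean of identical cubes should reproduce the input,
    whatever extra dimension the cubes carry."""
    cubes = [make_cube(extra_coord_name) for _ in range(3)]
    result = multi_model_statistics(cubes, span='overlap', statistics=['mean'])
    np.testing.assert_allclose(result['mean'].data, cubes[0].data)
```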
Hi @ESMValGroup/esmvaltool-coreteam, I think this PR is ready to be merged!
@stefsmeets good job, tests are looking much more comprehensive now! Just a wee change related to the array equality checker func and a comment related to the overlap check from me 🍺
if np.ma.isMaskedArray(this) or np.ma.isMaskedArray(other):
    np.testing.assert_array_equal(this.mask, other.mask)

np.testing.assert_allclose(this, other)
Any reason why you add the same function as the one from the tests' init file here? Well, you are adding the allclose check to it, but why do you need it? Please use the imported function, and if there is a need for an almost-equal check, add it in the test that needs it.
Hi @valeriupredoi, I'm not entirely sure how to import this file without messing with sys.path. The default way for defining imports in pytest is in conftest.py, which we do not use at the moment.
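For context, one way to share such a helper without touching sys.path would be a fixture in a conftest.py next to the tests. This is only a hypothetical sketch, since, as noted above, the project does not use a conftest.py here:

```python
# conftest.py (hypothetical)
import numpy as np
import pytest


@pytest.fixture
def assert_array_allclose():
    """Provide an approximate comparison helper for (possibly masked) arrays."""

    def _assert(this, other):
        if np.ma.isMaskedArray(this) or np.ma.isMaskedArray(other):
            np.testing.assert_array_equal(this.mask, other.mask)
        np.testing.assert_allclose(this, other)

    return _assert
```

A test would then simply take `assert_array_allclose` as an argument and pytest would inject it automatically.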
No need to import it per se: it's already imported at module run level, so just use it from there without duplicating it here.
Like you said, the functions are different (array_allclose vs array_equal), so the other one cannot be used directly. Instead, I changed the function name to better reflect what it does. Does that work for you?
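Based on the snippet quoted above, the renamed helper presumably ends up looking something like this (a sketch, not necessarily the exact code in the PR):

```python
import numpy as np


def assert_array_allclose(this, other):
    """Assert that two (possibly masked) arrays are approximately equal."""
    if np.ma.isMaskedArray(this) or np.ma.isMaskedArray(other):
        # The masks must match exactly before the values are compared.
        np.testing.assert_array_equal(this.mask, other.mask)
    np.testing.assert_allclose(this, other)
```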
OK, we should move this one upstream into the init at some point and just use it from there.
Oh, and another comment: could you please tell me what the coverage percentage is now compared to the previous module?
We should account for a …
One way to do it is to add an additional (not the default) option that says …
Yip, agreed on both counts!
For …: note that the coverage says very little about the quality of the tests.
Shweet! Yes, it's a number that measures brute coverage; I just wanted to check there are no omissions in statement checks that were done before. This PR introduces a better quality test module 👍
Cheers for the good work @stefsmeets 🍺
Description
Hi everyone, we are developing a new implementation of the multi-model statistics preprocessor in #968. The goal of this PR is to better chart out the requirements for the preprocessor, and to have a more complete description of the functionality through the unit tests, so that no functionality or features are lost.
The current set of tests does not cover all the functionality and is difficult to extend, as it still makes use of the old unittest framework. This PR re-implements the functionality of the old tests using pytest.
Before you get started
Checklist
pre-commit or yamllint checks
To help with the number of pull requests: