Fx files are selected differently depending on what OS the code runs on #1159
Comments
OK I found the bugger: this line loops through all possible MIPs - for the test fail the explanation is that on a Debian-based OS (Ubuntu, like on my machine and GitHub Actions) the order in which the tables, and hence the mip list, are passed is random (but fixed each time the test runs) and it happens that @sloosvel @schlunma could you guys pls give priority to this issue, it's quite important for the FX functionality, and my apologies I've not spotted this at review point. @zklaus you may want to flag this for release, mate |
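To make the order dependence concrete, here is a minimal sketch of a "last valid mip wins" loop. It is not the ESMValCore code: `pick_last_valid` and the two mip orderings are hypothetical and only mimic the behaviour described in this comment.

```python
def pick_last_valid(mips, available):
    """Hypothetical stand-in for a loop that keeps overwriting its result."""
    chosen = None
    for mip in mips:
        if mip in available:
            chosen = mip  # every valid mip overwrites the previous one
    return chosen


# Assumption: files for the variable exist in both tables.
available = {"fx", "IyrAnt"}

# Two orderings of the same tables, as two machines might produce them.
order_a = ["IyrAnt", "Oyr", "fx"]
order_b = ["fx", "Oyr", "IyrAnt"]

result_a = pick_last_valid(order_a, available)
result_b = pick_last_valid(order_b, available)
print(result_a, result_b)
```

The same inputs give a different mip depending only on iteration order, which is why the selection should not rely on that order.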
I don't know, as far as I have seen in our experiments we only generate the fx vars in one of the possible mips. I think this test is more complicated because somehow it returns files in all of the possible mips, whereas in a "real life" application the code would go through all mips but only find one possible output. Although I am not sure whether at other institutes the fx vars are output in all the possible mips. But this should be solvable by specifying in the recipe which one you want to use. |
@sloosvel the data is inherently different - there are |
I have assigned the milestone. I will still cut the release branch soon, but this week is reserved for testing and fixing exactly this kind of issue before next week's release, so I agree that it would be good to address this soon. I also agree that the order can not be relied upon since it is arbitrary in dicts. You may be interested to learn that it is arbitrary in globbing as well, see, e.g., here. |
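As a small illustration of the globbing point: Python's `glob.glob` does not promise any particular order, so one way to get the same result on every OS is to sort explicitly. This is only a sketch; the helper `find_files_sorted` and the file names are made up for the example and are not taken from ESMValCore.

```python
import glob
import os
import tempfile


def find_files_sorted(pattern):
    """Return glob matches in a deterministic (sorted) order."""
    return sorted(glob.glob(pattern))


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file names, one per mip table.
    for name in ("sftgif_fx.nc", "sftgif_IyrAnt.nc", "sftgif_Oyr.nc"):
        open(os.path.join(tmp, name), "w").close()

    pattern = os.path.join(tmp, "sftgif_*.nc")
    found = [os.path.basename(path) for path in find_files_sorted(pattern)]

print(found)
```

Sorting makes the order reproducible, but it does not by itself decide which mip is the right one; that still needs an explicit rule.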
cheers @zklaus - I would suggest we try and understand the functional behaviour of selecting the fx mip first - do we want to use and load the first available fx mip or we want to select a preferred one? This will influence how #1160 gets solved too I think. Unfortunately I don't have the data knowledge to propose a solution, that's why I'm asking you guys that have worked more closely with this sort of data than meself. Definitely not the last come in the loop selection though 😁 |
possibly ping @ledm since he's been seeing FX stuff for a living, in oceans 🐟 |
Looking at the data request for |
cheers for the clarifications @zklaus 🍺 Here's an idea - can we set a "preferred" mip dictionary mapping variable mip to a set of preferred fx mips? This way we'd always select the ones that do make sense first then look for some exotic fx mips after? |
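A rough sketch of what such a "preferred" mapping could look like. Everything here is an assumption for illustration: the name `PREFERRED_FX_MIPS`, its contents, and the helper `select_fx_mip` are hypothetical and do not exist in ESMValCore.

```python
# Hypothetical mapping: variable mip -> fx mips to try first, in order.
PREFERRED_FX_MIPS = {
    "Amon": ["fx"],
    "Omon": ["Ofx", "fx"],
}


def select_fx_mip(variable_mip, available_mips, preferred=PREFERRED_FX_MIPS):
    """Pick an fx mip deterministically, preferred ones first."""
    for mip in preferred.get(variable_mip, []):
        if mip in available_mips:
            return mip
    # Fall back to the remaining ("exotic") mips in sorted order.
    remaining = sorted(available_mips)
    return remaining[0] if remaining else None


print(select_fx_mip("Amon", {"IyrAnt", "fx"}))
print(select_fx_mip("Omon", {"fx", "Ofx"}))
```

With this approach the result no longer depends on the order in which the tables are read, only on the mapping and on which mips actually have files.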
I had a more detailed look at this. I think that #999 introduced a bug here: ESMValCore/esmvalcore/_recipe.py Lines 344 to 360 in fc87d72
In the old implementation, there used to be a I agree that a default order for checking the ESMValCore/esmvalcore/_recipe.py Line 376 in fc87d72
|
I think in terms of preferred order, we should use fx files from |
I don't know. The mips depend on the tables, which change from project to project. And some variables exist in some projects but not in others. This criterion would work fine for CMIP6, but may be too specific to apply to all the projects. |
OK I suggest we fix that test in a way it doesn't fail on any OS, and keep the discussion here. If build tests run on a Debian machine, Klaus will be having issues building the package 👍 |
If it's just to fix the test, specifying the mip in the content of the recipe should be enough. Would that work for you? |
silly GH closing the issue on its own 🤭 I just merged a temporary fix that skips the tests that are failing in #1169 (cheers for reviewing it, Saskia!). This issue is still ongoing though... |
To summarize the issue:
Is this accurate? If that is the case, I suggest improving the documentation to clarify this and fixing the test permanently by including the explicit fx table there as well. I think @schlunma's #1216 is a good starting point for documentation improvements. |
Bug summary
this line loops through all possible MIPs - for the test fail the explanation is that on a Debian-based OS (Ubuntu, like on my machine and GitHub Actions) the order in which the tables, and hence the mip list, are passed is random (but fixed each time the test runs) and it happens that `IyrAnt` is the last actual mip in the loop that is valid, whereas on CentOS (JASMIN and the CI machine) the tables, and hence the mips, are ordered in UNIX sort order: numerals, then capitals, then lower case last, so `fx` is the last valid mip and the test passes. This, however, uncovers a potential vulnerability - this is a rather random process of selecting the mip, and is not solid in my view - like in the test case, for `sftgif` we have both `fx` and `IyrAnt` files available, and the file selection depends on the OS the code runs on - this is dodgy. Any ideas how to select the file that we need in a way that is consistent across OS's? This is potentially impactful since `fx` is not time-dependent whereas e.g. `Oyr` or `IyrAnt` are time-dependent!
Test fail that discovered the issue
This is an odd one: that test fails only on Ubuntu and OSX machines. The Ubuntu fail is reported in this closed test PR, with the test failing on both my machine and GA (both Ubuntu), and the OSX fail is here - the test is failing since the `filename` key returns a file from a list of files that does not have `fx` in it. I am not sure if this is an issue with the actual preprocessor or if it's just the test object that misbehaves on different platforms. I'd like to investigate more but I need to write some slides now instead 😖 This is, however, a sneaky bug 🪲 @zklaus the anaconda package build may fail coz of this (depending on what machine the tests are run on)