Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed search for fx files when no mip is given #1216

Merged
merged 9 commits into from
Jul 12, 2021
Merged

Fixed search for fx files when no mip is given #1216

merged 9 commits into from
Jul 12, 2021

Conversation

schlunma
Copy link
Contributor

@schlunma schlunma commented Jul 6, 2021

This PR fixes the automatic retrieval of fx files when no mip is specified.

I agree with @sloosvel that a predefined order for the mip tables is not useful due to the many different project tables. Therefore I just implemented an alphabetic ordering, i.e., the first mip that has data is automatically selected. To bypass this, the user has to define a specific mip in the recipe. I added a paragraph to the documentation that describes this process.

This should go into 2.3.1 since this fixes a bug that (1) lets the test fails and (2) might lead to the tool wrongly complaining about missing data. #1169 only disabled the tests and didn't actually fix the bug.

Closes #1159.

Link to documentation: https://esmvaltool--1216.org.readthedocs.build/projects/ESMValCore/en/1216/recipe/preprocessor.html#fx-variables-as-cell-measures-or-ancillary-variables


Before you get started

Checklist

It is the responsibility of the author to make sure the pull request is ready to review. The icons indicate whether the item will be subject to the 🛠 Technical or 🧪 Scientific review.


To help with the number pull requests:

@schlunma schlunma added the bug Something isn't working label Jul 6, 2021
@schlunma schlunma added this to the v2.3.1 milestone Jul 6, 2021
@schlunma schlunma self-assigned this Jul 6, 2021
@codecov
Copy link

codecov bot commented Jul 6, 2021

Codecov Report

Merging #1216 (8cd5dcf) into main (695fc48) will increase coverage by 0.01%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1216      +/-   ##
==========================================
+ Coverage   85.51%   85.53%   +0.01%     
==========================================
  Files         188      188              
  Lines        9136     9148      +12     
==========================================
+ Hits         7813     7825      +12     
  Misses       1323     1323              
Impacted Files Coverage Δ
esmvalcore/_recipe.py 91.25% <100.00%> (+0.13%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 695fc48...8cd5dcf. Read the comment docs.

@zklaus
Copy link

zklaus commented Jul 7, 2021

The issue with this approach is that there are experiments, where there are two or more fx files for the same variable and it is impossible to determine a generic right order, e.g. here. I think #1159 (comment) is relevant. In other words, just specify the correct table in the recipe.

Copy link

@zklaus zklaus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think any generic ordering can meaningfully resolve existing ambiguity in the table selection.

@zklaus
Copy link

zklaus commented Jul 7, 2021

I do like the documentation improvements. Let's work that into 2.4.0.

@zklaus zklaus modified the milestones: v2.3.1, v2.4.0 Jul 7, 2021
@schlunma
Copy link
Contributor Author

schlunma commented Jul 7, 2021

I don't think any generic ordering can meaningfully resolve existing ambiguity in the table selection.

I agree, that's why I added this note to the documentation:

.. note::
   Some fx variables exist in more than one table (e.g., ``volcello`` exists in
   CMIP6's ``Ofx`` and ``Omon`` table or ``sftgif`` exists in CMIP6's ``fx``
   and ``IyrAnt`` table). If a specific table is desired, this needs to be
   specified by ``mip``, otherwise the table is selected based on the
   alphabetical order and data availability.

Do you want the tool to fail if there's is ambiguity? Or issue a warning?

I still thinks this should go into 2.3.1 since this merely fixes a bug that was introduced in 2.3.0.

@zklaus
Copy link

zklaus commented Jul 7, 2021

I think the tool should fail if there is ambiguity.

As for the bug, are you referring to #1159? If so, in my view, that doesn't reflect the introduction of a new bug, but the uncovering of an existing bug in the test, namely the ambiguity. The solution seems clear: specify the correct table in the recipe used in the test. I am not sure why @sloosvel's comment there went unanswered.

If you mean another bug, I am sorry I missed it. In that case, could you point me to the right issue, please?

@schlunma
Copy link
Contributor Author

schlunma commented Jul 7, 2021

The tests were designed for the pre-#999 situation and did not fail there. This old implementation did only check tables that included fx in their name, so the new version is definitely an improvement here. However, the point is that the tests didn't fail on any machine before #999 and failed after it on some machines.

#1169 "solved" this by disabling the tests. I don't think that's the way to go. This PR fixes the tests by introducing a machine-independent ordering for the tables.

I agree with you that in case of ambiguity the tool should fail, but this would be a new feature and should go into v2.4 (I can open another PR for that).

@zklaus
Copy link

zklaus commented Jul 7, 2021

It seems we agree that the bug is in the test. We also agree that disabling the test is not really the right way to address this. But there seems to be a clear solution: fix the test. Why would we want to change the behavior of the tool to work around a buggy test?

@schlunma
Copy link
Contributor Author

schlunma commented Jul 7, 2021

@sloosvel
Copy link
Contributor

sloosvel commented Jul 7, 2021

I think the changes make the documentation much better.

I agree that that break should not have been removed, it went over my head. And in case of ambiguity (which I have yet to see an exp with the same fx var in multiple mips, but who knows), specifying the mip solves the issue so it's much simpler than trying to come up with a search order that would fit all projects.

@valeriupredoi
Copy link
Contributor

@sloosvel - @zklaus provides an example of exactly that - areacellg in two mips, here and there are others (I believe even volcello that could be fx or Ofx if it's time variant or not, Klaus correct me if I'm wrong pls). I like this approach but I think we need the run to stop if it encounters multiple mips for the same fx variable, and prompt the user to make up their mind and add valid mip. This saves us from any future headaches due to ESGF data standards variations. Note the current lazy approach to not produce these sort of variables for all exps and mips and shortshifting by making use of just a small data production eg fx vars are produced for one ensemble only or one experiment only

@zklaus
Copy link

zklaus commented Jul 7, 2021

(I believe even volcello that could be fx or Ofx if it's time variant or not, Klaus correct me if I'm wrong pls).

Almost right: volcello can be in one or more of Ofx, Omon, Oyr, and Odec.

@schlunma
Copy link
Contributor Author

schlunma commented Jul 7, 2021

All right, I will alter the code so that it raises an error if multiple mips are encountered that include a given fx variable. I still believe that this should go into the release 2.3.1, since it's clearly a fix of a broken feature.

@valeriupredoi
Copy link
Contributor

cheers @zklaus and cheers @schlunma too, and I too believe this should be part of 2.3.1 but Klaus is the release bossman 👍

@zklaus
Copy link

zklaus commented Jul 7, 2021

Alright, I am not against putting this in 2.3.1, but it still needs a bit of work (I spotted some things in the documentation at least and I think we don't have any formal review yet), so it might come down to a question of timing.

@schlunma schlunma modified the milestones: v2.4.0, v2.3.1 Jul 7, 2021
@schlunma
Copy link
Contributor Author

schlunma commented Jul 7, 2021

Alright, now the tool raises an error if an fx variable is found in multiple tables. I also adapted the documentation accordingly.

Copy link
Contributor

@valeriupredoi valeriupredoi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

excellent work, cheers @schlunma 🍺 One nag for me, I'd change that from a Note to a Warning 👍

doc/recipe/preprocessor.rst Outdated Show resolved Hide resolved
Copy link

@zklaus zklaus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, only made it just to the actual code; have to continue later. Anyway, I think the most important comment is the last one.

doc/recipe/preprocessor.rst Outdated Show resolved Hide resolved
doc/recipe/preprocessor.rst Outdated Show resolved Hide resolved
doc/recipe/preprocessor.rst Outdated Show resolved Hide resolved
doc/recipe/preprocessor.rst Outdated Show resolved Hide resolved
doc/recipe/preprocessor.rst Outdated Show resolved Hide resolved
doc/recipe/preprocessor.rst Outdated Show resolved Hide resolved
doc/recipe/preprocessor.rst Outdated Show resolved Hide resolved
doc/recipe/preprocessor.rst Outdated Show resolved Hide resolved
doc/recipe/preprocessor.rst Outdated Show resolved Hide resolved
esmvalcore/_recipe.py Show resolved Hide resolved
Copy link

@zklaus zklaus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry if something is very weird; I had some trouble with the github web interface.

doc/recipe/preprocessor.rst Show resolved Hide resolved
esmvalcore/_recipe.py Show resolved Hide resolved
doc/recipe/preprocessor.rst Outdated Show resolved Hide resolved
@schlunma
Copy link
Contributor Author

schlunma commented Jul 8, 2021

I don't know why the tests are failing, I though these

TypeError: cannot compare cftime.DatetimeGregorian(300, 1, 16, 12, 0, 0, 0) and datetime.datetime(300, 1, 16, 12, 0) (different calendars)

errors were fixed by #1173?

@valeriupredoi
Copy link
Contributor

pls merge main in here, if you done that since yesterday afternoon then the test container needs to rebuild with the new environment from #1197 (I belive it's done that last night) - give it a day, it'll do it, if not bugger @bouweandela see if it's stuck. Note that you can run the tests locally too and see it works

@schlunma
Copy link
Contributor Author

schlunma commented Jul 8, 2021

After the merge of main the error got more interesting 😄

ValueError: Tag 'andela_bouwe' does not exist in section 'authors'

@valeriupredoi
Copy link
Contributor

yes, @bouweandela is famous for causing SegFaults 🤣 It's a SegFault, Manu, am rerunning it just for you 😁

@schlunma
Copy link
Contributor Author

schlunma commented Jul 9, 2021

The error didn't disappear after 3 restarts, so I guess there is something wrong here... Tests run fine locally.

@valeriupredoi
Copy link
Contributor

valeriupredoi commented Jul 9, 2021

hang on, lemme investigate 🔍

@valeriupredoi
Copy link
Contributor

running tests locally I am getting this:

E       assert "Requested fx variable 'volcello' for dataset 'CanESM5' of project 'CMIP6' is available in more than one CMOR table for 'CMIP6': ['Ofx', 'Oyr', 'Odec', 'Omon']" in "Requested fx variable 'volcello' for dataset 'CanESM5' of project 'CMIP6' is available in more than one CMOR table for 'CMIP6': ['Oyr', 'Ofx', 'Odec', 'Omon']"
E        +  where "Requested fx variable 'volcello' for dataset 'CanESM5' of project 'CMIP6' is available in more than one CMOR table for 'CMIP6': ['Oyr', 'Ofx', 'Odec', 'Omon']" = <[RecursionError('maximum recursion depth exceeded while getting the repr of an object') raised in repr()] RecipeError object at 0x7fc19b08a4c0>.message

tests/integration/test_recipe.py:2799: AssertionError

and no SegFault after 2 runs, after updating to Python 3.9.6 too

@schlunma
Copy link
Contributor Author

schlunma commented Jul 9, 2021

Oh, that's again related to the machine-dependent ordering. Lemme change that real quick.

@valeriupredoi
Copy link
Contributor

you know what? That recursion depth error might have given me a massive clue why we get all those SegFaults anyway

@valeriupredoi
Copy link
Contributor

Oh, that's again related to the machine-dependent ordering. Lemme change that real quick.

you not running Ubuntu too on your machine?

@schlunma
Copy link
Contributor Author

schlunma commented Jul 9, 2021

Was doing the tests on mistral which runs on Red Hat Enterprise Linux

@valeriupredoi
Copy link
Contributor

valeriupredoi commented Jul 9, 2021

weird coz I think that Circle runs on CentOS - ah yes, not weird, I am silly, they're the same curry

@schlunma
Copy link
Contributor Author

schlunma commented Jul 9, 2021

Thanks so much V., that seems to have done the trick! 🎉

@zklaus Any further comments from your side? I hope I addressed them all.

@valeriupredoi
Copy link
Contributor

so funny - now the tests pass fine and no more SegFault - I am going to get to the bottom of those SegFaults so help me Lord Vader! 😁

@schlunma
Copy link
Contributor Author

schlunma commented Jul 9, 2021

Haha, I hope you will!

@zklaus zklaus merged commit cde17af into main Jul 12, 2021
@zklaus zklaus deleted the fix_fx_test branch July 12, 2021 14:57
@zklaus
Copy link

zklaus commented Jul 12, 2021

Nice work, @schlunma ! 🍻

zklaus pushed a commit that referenced this pull request Jul 21, 2021
* Fixed tests related to fx file retrieval

* Raise error when fx variable is present in multiple tables

* Update doc/recipe/preprocessor.rst

Co-authored-by: Valeriu Predoi <[email protected]>

* Updated documentation on fx variables

* Used :ref: to link area_statistics and volume_statistics

* Optimized fx doc

* Only raise ambiguity error if fx files are found in mutiple tables

* Fixed machine-dependent error-message

Co-authored-by: Valeriu Predoi <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fx files are selected differently depending on what OS the code runs on
4 participants