Allow for CMOR checks but DO NOT raise exceptions (at the user's risk!) #338
Comments
Which recipe are you trying? |
recipe_ipccwg1ar6ch3_highres |
Can you try a few other ones to see whether it is a general issue or specific to that recipe? |
I think |
|
...which in my case, it went on and finished the bloody recipe no problemo, without any |
If you want no CMOR checks (see also #88), you can disable them in the recipe by making sure that every preprocessor contains:
However, if you're missing a 2m height coordinate for a dataset, I would recommend implementing a fix for it; this shouldn't be much work. Note, however, that many preprocessor functions and diagnostic scripts depend on the data being in the format specified by the CMOR standard, so disabling the checks is not recommended. |
@bouweandela the issue I am raising is slightly more fundamental than simply adding a few lines in the recipe setting the checks to false and skipping those preprocessor steps: this is equivalent to a tank commander stripping the whole armor off the tank to allow for greater mobility. Instead I'd propose a mission-driven installation of armor:
This is something that I am sure @axel-lauer would agree with from the perspective of science friendliness 🍺 |
The 3-level idea proposed by @valeriupredoi sounds absolutely fantastic! The basic checks could be the default setting for doing "exploratory science", full checks the recommended setting for publications and no checks a "last aid" setting for "I know what I do, just read that data set" cases. I also particularly like the idea of logging all errors but not throwing exceptions in the "basic checks and fixes" and the "no checks" cases. |
There has been a lot of discussion regarding the strict CMOR checks that are currently implemented in ESMValTool v2. What I am hearing from many of you is that “It is incredibly frustrating that ESMValTool can't handle any minor deviation from CMOR data”. While the checks make a lot of sense from a “good practice” coding point of view, in the meantime they are turning out to be a huge obstacle to actually using the tool for what it is built for, namely facilitating routine and rapid evaluation of the CMIP ensemble. I therefore strongly support the proposal from @valeriupredoi to implement three different levels of checks to be selected by the user. I suggest implementing this immediately. If you (particularly @ESMValGroup/esmvaltool-coreteam) have further suggestions on this proposal, please make sure to comment here by the end of this week. I suggest that @valeriupredoi then implements his proposal early next week if there are no further strong, convincing arguments against it. |
I see ❤️ 's all round, I should stand in the next elections in the UK in December 🤣 Cheers @veyring for your note! I reckon it's best if @jvegasbsc and I work on this, rather than just myself, since Javi is the authority when it comes to CMOR+ESMValTool. Javi will hopefully recover from his illness by next week (get better soon, bud!!) so we can start the work 🍺 |
I said this, but I would like to clarify what I mean. I'm not frustrated that we have strict CMOR compliance built in, but rather that there is so much variability in the standardisation of CMIP datasets. This variability combined with the strictness of ESMValTool's CMOR checks means that almost no data can be read as is (in my experience). I've spent far more time working on fixes than on diagnostics or recipes. I wish that checks had been made earlier in the process, perhaps by the modelling groups or by WCRP, before submitting or publishing non-standardised data. This is a model-intercomparison project (cMIP!), so it shouldn't be too much to ask to require the data to be inter-comparable. |
@ledm asking scientists to use (and not just observe) a set of standards is like herding cats, man |
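Just to make sure it is more widely known, this is from @veyring's email:
There is a formal way to catch and report errors provided by ES-Doc, see: https://errata.es-doc.org/static/index.html?project=CMIP6
I have a feeling that once the SOD is done, I'm going to need a few days to report all the issues with msftyz. |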
Poor SODs, they gonna hate us 😁
Dr Valeriu Predoi.
Computational scientist
NCAS-CMS
University of Reading
Department of Meteorology
Reading RG6 6BB
United Kingdom
…On Wed, 6 Nov 2019, 09:07 Lee de Mora wrote:
Just to make sure it more widely known, this is from @veyring's email:
There is a formal way to catch and report errors provided by ES-Doc, see:
https://errata.es-doc.org/static/index.html?project=CMIP6
I have a feeling that once the SOD is done, I'm going to need a few days
to report all the issues with msftyz.
|
Information from Karl Taylor: "There is extensive QC in place to check the global attributes and some of the variable attributes, which are used extensively throughout the CMIP6 infrastructure, but the QC of the data itself is left to those contributing data. Those groups using CMOR to write their data have subjected it to additional QC checks not part of the ESGF basic checks, and I suspect that fewer groups may be using CMOR this round than used it in previous phases, so this might explain some of the problems. See https://goo.gl/NmuENr for documentation on what QC is done by PrePARE as part of the publication procedure on ESGF." Please report issues to the responsible modeling groups as described at https://pcmdi.llnl.gov/CMIP6/Guide/dataUsers.html#6-reporting-suspected-errors. The modeling groups will verify the problems and then enter the information into the errata data base at https://errata.es-doc.org/static/index.html. |
I am going to create a dedicated issue for reporting dataset issues and stop the conversation here since this issue deals with implementing the three-tiered checks and should not blow up on multiple fronts 🍺 |
Before starting to code the intermediate level of checking, it would be really useful to know what @valeriupredoi is actually going to implement. That is, we need to have a list of which errors are serious enough to stop and/or which errors will only cause a warning. @ledm and @axel-lauer you seem to have an opinion on this, so maybe you could help out by starting to compile such a list? It would also be good to know what to do with errors not in the list, i.e. should the default behaviour be to stop if an error is encountered, unless it's on the list of errors that are hopefully safe to ignore? |
So this is best laid down if we actually list the types of metadata checks (note I don't propose tiering any of the data checks) performed by the tool. I propose naming the check levels NONE, MEDIUM and STRICT and, for each of the listed checks, I propose the following actions depending on the check level. Note that for NONE we should log all issues but not report any error (not raise any exception):
I invite people to chime in. Also @jvegasbsc or anyone else pls add any other check that I may have omitted 🍺 |
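A minimal sketch of how the three proposed levels could behave. The `CheckLevel` enum, the `CMORCheckError` class and the `report` helper are hypothetical names defined here for illustration, not taken from the ESMValTool code base. The `critical` flag is a placeholder for whatever the per-check list ends up deciding: under NONE every issue is only logged, under MEDIUM only issues flagged as critical raise, and under STRICT everything raises.

```python
import logging
from enum import IntEnum

logger = logging.getLogger(__name__)


class CheckLevel(IntEnum):
    """Hypothetical check levels, from most to least permissive."""
    NONE = 0
    MEDIUM = 1
    STRICT = 2


class CMORCheckError(Exception):
    """Hypothetical exception raised when a metadata check fails."""


def report(message, level, critical=False):
    """Log an issue and raise only if the check level calls for it.

    NONE:   always log, never raise.
    MEDIUM: raise only for issues flagged as critical.
    STRICT: raise for every issue.
    Returns True when the issue was logged without raising.
    """
    must_raise = level == CheckLevel.STRICT or (
        level == CheckLevel.MEDIUM and critical)
    if must_raise:
        logger.error(message)
        raise CMORCheckError(message)
    logger.warning(message)
    return True
```

With this shape, a missing height2m coordinate would stop a run only if it were flagged as critical at MEDIUM, or if the user had asked for STRICT.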
Basically I am trying to replicate iris's behavior towards a Regular Joe netCDF file for MEDIUM with the addition of extra security for units and monotonic coord values; STRICT is strict like hell 😈 |
Just a table version of @valeriupredoi list above; no other changes.
|
darn swanky table @zklaus cheers 🍺 |
the implementation should be very straightforward since |
Some of the checks in the table deal with errors that have automatic fixes. Would it be necessary in these cases to add this intermediate level of reporting?
"Prob a piece of cake for @jvegasbsc or @sloosvel"
Well, thanks for assuming that. But I currently have more than enough cake with the primavera metrics. So you are more than welcome to help out if you can / want to. |
Sure thing, I will! In a work retreat til Wed but will start helping myself
with cake on Wed 😁
…On Mon, 11 Nov 2019, 12:22 sloosvel wrote:
Some of the checks on the table deal with the errors with automatic fixes.
Would in this cases be necessary to add this intermediate level of
reporting?
Prob a piece of cake for @jvegasbsc or @sloosvel
Well thanks for assuming that. But I currently have more than enough cake
with the primavera metrics. So you are more than welcome to help out if you
can / want to.
|
Enjoy the retreat! Wednesday is fine for me to start working on that. |
The error handling proposed by Klaus is great. 3 small issues with datasets have turned out to be very annoying in the past, so I would propose to write warnings to the log file instead of handling them as errors when cmor check is set to medium, i.e.:
|
I would like to propose a different naming scheme for the different levels of checking, e.g. Regarding the technical implementation, I would propose to add the keyword argument |
@bouweandela I like the naming convention, especially Do you not think the |
I think that using |
I also prefer the command line option, I think it's safer. So in summary, the report_error functions should be adapted to general report functions that give either an error, a warning or nothing (as @valeriupredoi pointed out) depending on the level of checking given by --cmor-checks? |
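One way the summary above could look, as a rough sketch only. The `--cmor-checks` option name comes from this thread; the level names, the default, the `ACTIONS` table and the `minor` flag are assumptions made for illustration and are not ESMValTool's actual interface.

```python
import argparse
import logging

logger = logging.getLogger(__name__)

# Hypothetical level names, lower-cased for the command line.
LEVELS = ("none", "medium", "strict")

# Hypothetical mapping: (level, issue is minor?) -> what report() does.
ACTIONS = {
    ("strict", False): "error",
    ("strict", True): "error",
    ("medium", False): "error",
    ("medium", True): "warning",
    ("none", False): "warning",
    ("none", True): "ignore",
}


def build_parser():
    """Build a parser with the --cmor-checks option discussed here."""
    parser = argparse.ArgumentParser(description="check-level sketch")
    parser.add_argument(
        "--cmor-checks",
        choices=LEVELS,
        default="strict",  # arbitrary default chosen for this sketch
        help="how strictly CMOR metadata issues are handled",
    )
    return parser


def report(message, level, minor=False):
    """Give an error, a warning or nothing, depending on the level."""
    action = ACTIONS[(level, minor)]
    if action == "error":
        raise ValueError(message)
    if action == "warning":
        logger.warning(message)
    return action
```

For example, `build_parser().parse_args(["--cmor-checks", "medium"]).cmor_checks` yields `"medium"`, and passing that to `report` with `minor=True` only logs a warning.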
yup, started work here #374 |
the data issue template has its first use in #384 |
Not good, guys, not good - analyses that need to get past nit-picky CMOR exceptions still stumble at them and return exceptions with the tool stopping. I thought the whole point of cmor_strict: false was to bypass all these roadblocks 🍺 Running some IPCC stuff for @LisaBock and am getting pestered by height2m not present. Who cares about bloody height2m? 😠