
Multi model statistics: missing files for individual models #975

Closed
remi-kazeroni opened this issue Feb 1, 2021 · 6 comments · Fixed by #978
Labels: bug

@remi-kazeroni (Contributor)

Describe the bug
When using the multi_model_statistics preprocessor step, the preproc directory contains only the file with the multi-model statistics. The files for the individual datasets are missing.

Note that everything seems fine with the release_2.1 branch. We should aim to fix this before the next release. @ESMValGroup/esmvaltool-coreteam

Please attach

  • The recipe that you are trying to run; you can find a copy in the run directory of the output directory
  • The main_log_debug.txt file; this can also be found in the run directory of the output directory
    recipe_test_mmm.yml.txt
    main_log_debug.txt
@remi-kazeroni (Contributor, Author)

Mistral is being super slow today, so I haven't been able to create an up-to-date environment... Mine is about one week old...

@schlunma (Contributor) commented Feb 1, 2021

I'm pretty sure this was introduced in #949:

The old version of the code added all products to the output of multi_model_statistics:

if output_products:
    products |= statistic_products
    return products
return statistic_products

The new version does not:

statistics_products = set()
for statistic, cube in statistics_cubes.items():
    statistics_product = output_products[statistic]
    statistics_product.cubes = [cube]
    for product in products:
        statistics_product.wasderivedfrom(product)
    logger.info("Generated %s", statistics_product)
    statistics_products.add(statistics_product)
return statistics_products
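
A minimal sketch of how the old behaviour could be restored on top of the new code, reusing the variable names from the snippet above (this is only an illustration, not necessarily what an eventual fix would look like):

# Hypothetical variant: also return the input products together with the
# statistics, mirroring what the old `products |= statistic_products` branch did
return products | statistics_products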

@schlunma schlunma assigned Peter9192 and unassigned schlunma Feb 1, 2021
@Peter9192 (Contributor)

Good spot! The behaviour shouldn't have changed like this, but also I think maybe we shouldn't change it back. Let me explain.

#949 is actually working towards support for an ensemble statistics preprocessor. We first implemented all of this in #673, but then figured it had become too big and wanted to split it up into smaller tasks. Hence I copied some of the changes to #949. At that point it was not necessary to change the behaviour, and I should have spotted that.

However, when we continue towards the ensemble statistics, we do need this behaviour to change. See for example this use case. If you, at some point in the future, want to combine multi-model and ensemble statistics in a single preprocessor (e.g. first compute ensemble means, then multi-model statistics across those ensemble means), you don't want to pass the original products to the multi-model step as well.

So a preprocessor that contains multimodel should only return multimodel products. An alternative way to get the output from the original cubes as well is to split the preprocessor:

preprocessors:
  everything_else: &everything_else
    area_statistics: ... 
    regrid_time: ...  
  multimodel:
    <<: *everything_else
    multi_model_statistics: ...

variables:
  tas_all_cubes:
    preprocessor: everything_else
  tas_mmstats:
    preprocessor: multimodel
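
(For reference: the &everything_else anchor and the <<: merge key are plain YAML; they let the multimodel preprocessor reuse the shared steps, so only the multi_model_statistics step is added on top, while the single-model output still comes from the everything_else preprocessor.)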

Do you know which recipes are affected by this? And do you agree this is the better solution in the long run? Then the question is: should we change it back for now and then change it again in the future, or leave it like this, document the changes in the changelog, and make a PR to fix the affected recipes? @remi-kazeroni

@LisaBock (Contributor) commented Feb 2, 2021

@Peter9192
The problem is that many diagnostics are affected by this modification, including several IPCC diagnostics in recipe_flato13ipcc.yml (Perfmetrics, Temperature Anomaly, Pattern Correlation, ...) and in recipe_collins13ipcc.yml, as well as diagnostics in recipe_lauer13jclim.yml and several more... All these diagnostics show the single models as well as the multi-model mean in their plots.

Since your suggestion means that we would have to preprocess and regrid all models twice for these diagnostics, and first of all adjust them all to this modification, I would suggest adding an option to the multi_model_statistics preprocessor. The default could be the old behaviour, as used by several diagnostics, and the switch could suppress the additional output of the single models.
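
A sketch of what such a recipe option might look like; the option name keep_input_datasets is hypothetical here and chosen only for illustration, and the other arguments are just example values:

preprocessors:
  multimodel:
    multi_model_statistics:
      span: overlap
      statistics: [mean]
      keep_input_datasets: true  # hypothetical switch: also keep the single-model files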

And please check more carefully next time how new features affect existing diagnostics! Sorry, but I need the old behaviour as soon as possible because I have to create the new IPCC figures right now!

@Peter9192 (Contributor)

@LisaBock thanks for the clarification. I think the solution you propose is also a good idea; I'll look into it.

If you need the old behaviour right away, please have a look at this tip from Bouwe.

@Peter9192 (Contributor) commented Feb 2, 2021

Here's a fix: #978

@LisaBock @remi-kazeroni can you check whether this solves your problems? I haven't tested it thoroughly yet, as I get the impression you want it sooner rather than later.
