
Building my comparison part 1: datasets_setup.py and atlas subdirectories

jservonnat edited this page Feb 9, 2018 · 14 revisions

The C-ESM-EP is a way to apply collections of diagnostics to a set of simulations.

It is intimately linked with the concept of comparison.

In the C-ESM-EP vocabulary, a comparison is a directory containing:

  • subdirectories for the atlases (collections of diagnostics); each one contains a parameter file that controls the execution of the diagnostics in main_C-ESM-EP.py
  • a python file datasets_setup.py: this is where the user specifies the datasets that will be taken as inputs of the C-ESM-EP

Foreword: keep only the atlases you need for your comparison

We advise keeping in your comparison directory only the atlas subdirectories you need. The C-ESM-EP runs only the atlases present in the comparison directory, and the C-ESM-EP frontpage links only to those atlases. Do not hesitate to remove the subdirectories you do not need, to avoid unnecessary computation and storage of results. The atlas subdirectories (with their parameter files) remain available in standard_comparison (or a 'git pull' away) and in share/optional_atlas if you need them later.

Adding my datasets to datasets_setup.py

In datasets_setup.py, there is a python list 'models' whose elements are python dictionaries describing how to access the datasets. They are essentially the set of keyword/value pairs provided to the CliMAF ds() function to access the data, without the keyword 'variable'. We will now see how to add your own datasets.

In CliMAF, the different data structures are described by CliMAF 'projects'. Each CliMAF project provides access to the datasets through keywords/values:

- simulation
- frequency
- period

and other keywords that are specific to the projects.

The most commonly used projects are 'CMIP5' (CMIP5 archive on Ciclad) and 'IGCM_OUT' (data tree produced by libIGCM = most of the model outputs produced at IPSL).

Here is an example of a CMIP5 dataset definition and an IPSLCM6 coupled model simulation in datasets_setup.py:

models = [
   dict(project = 'CMIP5',
        model = 'IPSL-CM5A-LR',
        experiment = 'historical',
        simulation = 'r1i1p1',
        frequency = 'monthly',
        period = '1980-2005'
       ),
   dict(project = 'IGCM_OUT',
        root = '/ccc/store/cont003/thredds',
        login = 'p86caub',
        model = 'IPSLCM6',
        simulation = 'CM605-LR-pdCtrl01',
        frequency = 'monthly',
        clim_period = 'last_20Y'
       )
]
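As a minimal sketch of how such a dictionary relates to a ds() call: the entry is completed with the 'variable' keyword before being handed to CliMAF. The helper name build_ds_kwargs and the variable 'tas' are illustrative assumptions, not part of the C-ESM-EP; the ds() call itself needs a CliMAF installation and is therefore only shown as a comment.

```python
def build_ds_kwargs(model_entry, variable):
    """Return keyword arguments for a ds() call (hypothetical helper).

    The entry from 'models' is copied so that the original dictionary
    is left untouched, then completed with the 'variable' keyword.
    """
    kwargs = dict(model_entry)
    kwargs['variable'] = variable
    return kwargs


cmip5_entry = dict(project='CMIP5',
                   model='IPSL-CM5A-LR',
                   experiment='historical',
                   simulation='r1i1p1',
                   frequency='monthly',
                   period='1980-2005')

# 'tas' is only an example variable name
tas_kwargs = build_ds_kwargs(cmip5_entry, 'tas')

# With CliMAF loaded in the session, the dataset would then be
# requested with something like:
#   dat = ds(**tas_kwargs)
```

Copying the entry rather than modifying it in place lets the same element of models be reused for several variables.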

Using the common_keys

In datasets_setup.py, you will see a mechanism to specify common keys for the elements of models. This avoids repeating, in every dictionary of models, the keywords that are identical across a set of datasets. The mechanism in standard_comparison/datasets_setup.py adds to each IGCM_OUT dataset dictionary the key/value pairs that are not already specified in it.

Example with a set of simulations for an ORCHIDEE meeting:

models = [
      # -- Coupled models
      dict(project='IGCM_OUT', login='p86fair', simulation='CM6014-pd-splith-01', color='green' ),
      dict(project='IGCM_OUT', login='p86maf', simulation='CM6014-pd-split-D-01', color='red'),
      dict(project='IGCM_OUT', login='p86maf', simulation='CM6014-pd-ttop-01', color='blue'),

      # -- LMDZOR
      dict(project='IGCM_OUT', login='p86ghatt', model='LMDZOR', status='PROD',
           experiment='ref4438', simulation='CL5.4438.L6010.ref'),
      dict(project='IGCM_OUT', login='p86ghatt', model='LMDZOR', status='PROD',
           experiment='ref4438', simulation='CL5.4438.L6010.alt1'),

      # -- ORCHIDEE offline
      dict(project='IGCM_OUT', login='p529bast', model='OL2', status='PROD',
           experiment='ref4783', simulation='FG2.4783.v3'),
      dict(project='IGCM_OUT', login='p529bast', model='OL2', status='PROD',
           experiment='ref4783', simulation='FG2.4783.v4'),
      dict(project='IGCM_OUT', login='p529bast', model='OL2', status='PROD',
           experiment='ref4783', simulation='FG3.4783.v3'),
      dict(project='IGCM_OUT', login='p529bast', model='OL2', status='PROD',
           experiment='ref4783', simulation='FG3.4783.v4'),

]

# -- Provide a set of common keys to the elements of models
# ---------------------------------------------------------------------------- >
common_keys = dict(
           root='/ccc/store/cont003/thredds', login='*',
           model='IPSLCM6',
           frequency='monthly',
           clim_period='last_10Y',
           ts_period='full',
           )

for model in models:
    if model['project'] == 'IGCM_OUT':
        # -- Set the experiment from the simulation name
        if '-pi' in model['simulation']:
            model.update(dict(experiment='piControl'))
        if '-pd' in model['simulation']:
            model.update(dict(experiment='pdControl'))
        # -- Add the common keys that the dataset does not already define
        for key in common_keys:
            if key not in model:
                model.update({key: common_keys[key]})
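The "add only what is missing" step of this mechanism can also be written with dict.setdefault. The sketch below is an equivalent formulation for illustration only; the function name apply_common_keys is an assumption and does not exist in datasets_setup.py, and the experiment detection from the simulation name is deliberately left out.

```python
def apply_common_keys(models, common_keys, project='IGCM_OUT'):
    """Add the common keys to the datasets of one project (hypothetical helper).

    A key already present in a dataset dictionary is never overwritten.
    Datasets from other projects are left unchanged.
    """
    for model in models:
        if model.get('project') != project:
            continue
        for key, value in common_keys.items():
            # setdefault only assigns when the key is absent
            model.setdefault(key, value)
    return models


models = [
    dict(project='IGCM_OUT', login='p86maf', simulation='CM6014-pd-ttop-01'),
    dict(project='CMIP5', model='IPSL-CM5A-LR'),
]
common_keys = dict(login='*', model='IPSLCM6', frequency='monthly')

apply_common_keys(models, common_keys)
# The first dataset keeps its own login and gains model and frequency;
# the CMIP5 dataset is not modified.
```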

ts_period, clim_period and the period manager

The C-ESM-EP contains diagnostics on climatological averages and others on time series. The user can therefore specify one period for the climatologies, clim_period, and another for the time series, ts_period. Example:

   dict(project = 'IGCM_OUT',
        root = '/ccc/store/cont003/thredds',
        login = 'p86caub',
        model = 'IPSLCM6',
        simulation = 'CM605-LR-pdCtrl01',
        frequency = 'monthly',
        clim_period = 'last_20Y',
        ts_period   = 'full'
       )

clim_period and ts_period accept either explicit dates (e.g. 1980-2000, 2100_2169...) or 'instructions' such as 'last_20Y', 'first_1Y' or 'full' (self-explanatory).

These instructions are a user-friendly way to work on the last or first XX years of a simulation without having to look up the corresponding dates yourself.

This task is handled by the period manager, a C-ESM-EP functionality. It works for IGCM_OUT (and other IGCM_OUT-related projects) and CMIP5, and will work for the upcoming CMIP6 project. If you want to add your project to the C-ESM-EP and use the period manager, contact jerome . servonnat at lsce . ipsl . fr

The period manager works for monthly (CMIP5 and IGCM_OUT) and seasonal (IGCM_OUT only) frequencies. For the latter, use instructions like 'last_SE' or 'first_SE'.
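To picture what translating an instruction into dates involves for monthly data, here is a minimal sketch. It is not the C-ESM-EP period manager: the function resolve_period, its arguments, and the 'YYYY-YYYY' output format are assumptions made for illustration, and the first and last available years are passed in by hand instead of being discovered from the files. The seasonal instructions ('last_SE', 'first_SE') are not covered.

```python
import re


def resolve_period(instruction, first_year, last_year):
    """Turn 'last_NY', 'first_NY' or 'full' into explicit years (hypothetical).

    first_year and last_year are the bounds of the available data.
    Anything that is not a recognised instruction is assumed to be an
    explicit period already and is returned unchanged.
    """
    if instruction == 'full':
        return '%04d-%04d' % (first_year, last_year)
    match = re.fullmatch(r'(last|first)_([1-9]\d*)Y', instruction)
    if match is None:
        return instruction
    which, nyears = match.group(1), int(match.group(2))
    # Do not ask for more years than the simulation contains
    nyears = min(nyears, last_year - first_year + 1)
    if which == 'last':
        return '%04d-%04d' % (last_year - nyears + 1, last_year)
    return '%04d-%04d' % (first_year, first_year + nyears - 1)


# Example with a simulation assumed to cover 1850 to 2009
clim_period = resolve_period('last_20Y', 1850, 2009)
ts_period = resolve_period('full', 1850, 2009)
```

With these assumed bounds, 'last_20Y' corresponds to 1990-2009 and 'full' to 1850-2009.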