How do we treat reanalysis with members? #945

Open
jvegreg opened this issue Jan 14, 2021 · 2 comments
@jvegreg
Contributor

jvegreg commented Jan 14, 2021

We have started working on reading the ORAS4 and ORAS5 reanalyses in ESMValTool (see https://www.ecmwf.int/en/forecasts/dataset/ocean-reanalysis-system-5). We plan to CMORize them on the fly using fixes, because this is not compute-intensive: it only involves renaming and fixing some metadata.

I have a first draft for ORAS4, but there is one problem we are not sure how to address: this reanalysis has 5 realizations instead of the single one we usually have for observations.

I can see two options:

  1. Create a new project that supports members, and do the same any time we need an extra key for an observation or reanalysis dataset.
  2. Allow different paths to be specified for different datasets in the same project, particularly for the input_file and output_file keys. This would also remove the need to rename the original files after download.

An example of a config-developer entry for the second option, where the first level is the current DRS and the second level is the dataset-specific part:

native6:
  cmor_strict: false
  input_dir:
    default: 'Tier{tier}/{dataset}'
  input_file:
    default: 
      default: '*.nc'
      ERA5: 'era5_{era5_name}*{era5_freq}.nc'
      ORAS4: 's4_{ensemble}_*.nc'
  output_file:
    default: 
      default: '{project}_{dataset}_{type}_{version}_{mip}_{short_name}'
      ORAS4: '{project}_{dataset}_{type}_{ensemble}_{version}_{mip}_{short_name}'
  cmor_type: 'CMIP6'
  cmor_default_table_prefix: 'CMIP6_'
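
To illustrate how the two-level lookup in the example above could be resolved, here is a minimal Python sketch. It is not the ESMValCore implementation: the names `NATIVE6_INPUT_FILE`, `select_pattern` and `input_file_glob` are hypothetical, and the ensemble value `"0"` is only a placeholder, since the actual ORAS4 member labels are not given here.

```python
# Hypothetical sketch: none of these names are part of the ESMValCore API.
NATIVE6_INPUT_FILE = {
    "default": {  # first level: DRS name
        "default": "*.nc",  # second level: dataset name
        "ERA5": "era5_{era5_name}*{era5_freq}.nc",
        "ORAS4": "s4_{ensemble}_*.nc",
    },
}


def select_pattern(patterns, drs, dataset):
    """Pick the pattern for a DRS and dataset, falling back to 'default'."""
    by_dataset = patterns.get(drs, patterns["default"])
    return by_dataset.get(dataset, by_dataset["default"])


def input_file_glob(patterns, variable, drs="default"):
    """Fill the selected pattern with the facets of a recipe variable."""
    template = select_pattern(patterns, drs, variable["dataset"])
    # str.format ignores facets that the template does not use.
    return template.format(**variable)


# Placeholder facets for one ORAS4 member.
oras4_glob = input_file_glob(
    NATIVE6_INPUT_FILE, {"dataset": "ORAS4", "ensemble": "0"}
)
```

With this layout, a dataset without its own entry simply falls back to the `default` pattern of the selected DRS, so only datasets that need an extra key such as `ensemble` have to be listed.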

Any other ideas? @ESMValGroup/esmvaltool-coreteam

jvegreg added the enhancement (New feature or request) label and self-assigned this issue on Jan 14, 2021
@stefsmeets
Contributor

stefsmeets commented Jan 18, 2021

Hi @jvegasbsc, we had a working solution for your second suggestion as part of #785, and I was quite happy with it. It was not terribly difficult to do and worked well, but there was some resistance because it would require significantly overhauling config-developer. In the end the PR became too big, so I closed it.

We went for this structure:
https://github.com/ESMValGroup/ESMValCore/blob/cafa8618ab089d3d1d37aecf3bfe5dc97bcc1217/esmvalcore/config/drs-default.yml

I still think that overhauling config-developer is a good idea to make changes like the one you are proposing easier to implement (#887).

@bouweandela
Member

Option two is also proposed in #494. It might be useful for large datasets where users typically have no control over the directory structure or file names.

The new structure for config-developer proposed by @stefsmeets is very nice, but it is important that we keep changes as backward compatible as possible. We will probably also want to adopt intake (see #31) rather than keep working on our own _data_finder.py.
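
As a rough illustration of what backward compatibility could look like for a nested, per-dataset layout, the Python sketch below normalizes older-style values into that layout. It assumes that existing config-developer files give `input_file` either as a plain string or as a mapping from DRS name to string. The function `normalize_file_patterns` is hypothetical and not part of ESMValCore.

```python
def normalize_file_patterns(value):
    """Convert old-style patterns to a {drs: {dataset: pattern}} mapping.

    Assumed accepted forms (hypothetical):
      - a plain string, used for every DRS and dataset
      - a mapping of DRS name to string
      - a mapping of DRS name to a {dataset: pattern} mapping (new style)
    """
    if isinstance(value, str):
        return {"default": {"default": value}}
    return {
        drs: {"default": entry} if isinstance(entry, str) else dict(entry)
        for drs, entry in value.items()
    }


# An old-style entry and a new-style entry end up with the same shape.
old_style = normalize_file_patterns("*.nc")
new_style = normalize_file_patterns(
    {"default": {"default": "*.nc", "ORAS4": "s4_{ensemble}_*.nc"}}
)
```

Normalizing once at load time would let the rest of the file-finding code handle a single shape, while existing configuration files keep working unchanged.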
