[Meta] Packaging Spinup-NEMO benchmark #27

Open
ma595 opened this issue Dec 11, 2024 · 0 comments

ma595 commented Dec 11, 2024

We are now in a position to package the Spinup-NEMO code:

High-level view of the algorithm:

  1. [PRE-RUN] Run DINO for at least 50-100 years. A Slurm script is provided in the NEMO notes. If we need to train on more data, we then need to concatenate the simulation outputs (*grid_T.nc) using ncrcat.
  2. Evaluate metrics here.
  3. Run the resampling notebook (see the resample_dino_data branch).
    This notebook converts the monthly 2D SSH output of DINO (DINO_1m_grid_T.nc) to annual output (DINO_1m_To_1y_grid_T.nc). Temperature and salinity (3D) are already sampled annually and are in DINO_1y_grid_T.nc. These files can then be read in the notebook updated for DINO (which still works for NEMO).
  4. Load the output in the updated Jumper.ipynb notebook and run it to create the projected state.
  5. Evaluate metrics at this point? This relies on having sufficient output from step 1.
  6. Prepare restart file:
    Combine the mesh_mask[0000].nc files and the last DINO_[<time>]_restart_[<process>].nc files using the REBUILD_NEMO tools.
    Create the new restart file by running main_restart.py:
    main_restart.py --restart_path /path/to/nemo_data/ --radical DINO_00576000_restart --mask_file /path/to/mesh_mask.nc --prediction_path /path/to/simus_predicted
    main_restart.py has been modified to work on DINO data. This is in the run_with_DINO_data branch.
  7. Restart DINO with updated restart file.
  8. Evaluate metrics at regular intervals (e.g. every 10 years) to see how close the correction brings us to the ground truth. Does it converge or diverge?
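
As a starting point for automating step 6, here is a minimal sketch of how a driver script might assemble the main_restart.py call. The flag names and example values are the ones quoted in step 6; the helper name `build_restart_command` and the use of `sys.executable` to launch the script are assumptions made for illustration, not part of the existing code.

```python
import shlex
import sys


def build_restart_command(restart_path, radical, mask_file, prediction_path,
                          script="main_restart.py"):
    """Return the argv list for the restart step (hypothetical helper).

    The flags are the ones quoted in the issue; launching the script through
    the current Python interpreter is an assumption.
    """
    return [
        sys.executable, script,
        "--restart_path", restart_path,
        "--radical", radical,
        "--mask_file", mask_file,
        "--prediction_path", prediction_path,
    ]


cmd = build_restart_command(
    restart_path="/path/to/nemo_data/",
    radical="DINO_00576000_restart",
    mask_file="/path/to/mesh_mask.nc",
    prediction_path="/path/to/simus_predicted",
)

# Print the command for inspection; a driver could pass `cmd` to subprocess.run.
print(shlex.join(cmd))
```

Keeping the command as a list rather than a single string avoids shell quoting problems if any of the paths contain spaces.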

Specification for packaging the above:

The aim is to provide a tool that allows people at IPSL to do ML-assisted spinup without external help. The packaging should enable researchers to experiment with different parameters and inputs, and to add other metrics:

  • Option to change initial data period for training.
  • Provide different jump strategies so that the spinup tool can be tested, e.g. 10-year intervals, linearly scaling to 30 years.
  • Provide more metrics: make the code sufficiently modular that more metrics can be added, or provide documentation that makes it easy to understand how to extend it.
  • Modular restart toolbox.
    • algorithms like PCA.
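
One way the jump strategies could be made pluggable is to express each as a generator of jump lengths in years. The sketch below is only one reading of "10 year intervals, linearly scaling to 30 years": it offers a constant schedule and a schedule that grows linearly from 10 to 30 years. The function names and parameters are hypothetical.

```python
from typing import Iterator


def constant_jumps(n_jumps: int, length: int = 10) -> Iterator[int]:
    """Yield the same jump length (in years) n_jumps times."""
    for _ in range(n_jumps):
        yield length


def linear_jumps(n_jumps: int, first: int = 10, last: int = 30) -> Iterator[int]:
    """Yield jump lengths growing linearly from `first` to `last` years."""
    if n_jumps < 1:
        return
    if n_jumps == 1:
        yield first
        return
    step = (last - first) / (n_jumps - 1)
    for i in range(n_jumps):
        # Round to whole years, since the outputs are assumed to be annual.
        yield round(first + i * step)


print(list(constant_jumps(3)))
print(list(linear_jumps(5)))
```

A driver could then accept any iterable of jump lengths, so that new strategies can be tested without changing the rest of the tool.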
  1. Provide a script that automates the above steps:
    • This needs to be executed end-to-end
      • Docker
      • pip install
  2. Provide documentation:
    • A report with an output file. What should this report provide? Some idea of how far we are from the ground truth at various points?
    • Metrics (provide instructions on how to add more metrics).
  3. Take in other inputs, e.g. .npy from Etienne's diffusion process.
  4. Run iteratively:
    • NEMO + projection multiple times.

Development strategy:

  • Prioritise the restart.py script initially.
    • Establish whether there are any issues with this task: review the scripts, list the tasks to do and come back with questions. Ideally, produce a very simple README.md explaining the steps to produce the restart file. By 16.01.25.
  • Convert notebooks into scripts / modules.
    • Resampling.py
    • What do main_forecast.py and main_restart.py do? They can probably be modified to provide an automated tool.
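
As a rough idea of what the core of a Resampling.py module might contain, the sketch below reduces a monthly time series at a single grid point to annual values. It assumes an unweighted mean over each block of 12 months, which may not match what the resampling notebook actually does; the function name `annual_means` is hypothetical, and a real module would presumably operate on the netCDF fields rather than on a plain list.

```python
from typing import List, Sequence

MONTHS_PER_YEAR = 12


def annual_means(monthly: Sequence[float]) -> List[float]:
    """Average consecutive blocks of 12 monthly values into annual values.

    Assumption: every month carries equal weight (month length is ignored).
    """
    if len(monthly) % MONTHS_PER_YEAR != 0:
        raise ValueError("expected a whole number of years of monthly values")
    return [
        sum(monthly[i:i + MONTHS_PER_YEAR]) / MONTHS_PER_YEAR
        for i in range(0, len(monthly), MONTHS_PER_YEAR)
    ]


# Two years of synthetic monthly values for a single grid point.
series = [1.0] * 12 + [3.0] * 12
print(annual_means(series))
```

Moving this logic into a function makes it callable from an end-to-end script and testable without opening the notebook.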
Status: In Progress
3 participants