Sorting of variables for transfer learning #98
Do you mean this reordering functionality https://anemoi-datasets.readthedocs.io/en/latest/using/selecting.html#reorder ? In any case, you are very welcome to create a pull request, as this would clarify the topic.
It is something similar. However, it needs a key called `sort_vars`, which is provided in the anemoi-training config and is a boolean. If `sort_vars` is True during pretraining (no transfer learning), the variable set is sorted alphabetically. This means that when performing transfer learning from a pretrained checkpoint with `sort_vars=True` (in both pretraining and transfer learning), dataset1, dataset2, etc. will have the same variable ordering as the pretrained checkpoint (i.e. the original dataset).
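To illustrate the two behaviours described above, here is a minimal sketch. The function name `order_variables` and the `checkpoint_variables` parameter are illustrative, not the actual anemoi-training API:

```python
# Hypothetical sketch of what sort_vars=True could do in both phases.
# Names here (order_variables, checkpoint_variables) are illustrative.

def order_variables(variables, checkpoint_variables=None):
    """Sort variables alphabetically, or re-sort them to match the
    order stored in a pretrained checkpoint when one is given."""
    if checkpoint_variables is None:
        # Pretraining: plain alphabetical sort.
        return sorted(variables)
    # Transfer learning: keep the checkpoint's ordering for variables the
    # checkpoint knows about, append any new ones alphabetically at the end.
    known = [v for v in checkpoint_variables if v in variables]
    extra = sorted(v for v in variables if v not in checkpoint_variables)
    return known + extra

# Pretraining: alphabetical order.
print(order_variables(["t2m", "10u", "msl"]))          # ['10u', 'msl', 't2m']
# Transfer learning: follow the checkpoint's original order.
print(order_variables(["t2m", "10u", "msl"],
                      ["msl", "t2m", "10u"]))          # ['msl', 't2m', '10u']
```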
Instead of an additional keyword, would this work (assuming it was implemented)?

```python
ds = open_dataset(
    dataset,
    reorder='sort',
)
```
I have introduced a PR that checks for these errors in different areas of the training cycle: ecmwf/anemoi-training#120. In situations where we can easily suggest a fix, the logger will write out a suggested "reorder" command with the values a model would expect. Happy to take feedback on this one, but it seems to be parallel to this issue.
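The check described above could look roughly like the following sketch: compare the variables the model expects with those the dataset provides and, when they differ only in order, log a suggested `reorder` value. The function name and log wording are assumptions, not the PR's actual code:

```python
# Hypothetical sketch of a variable-order check that suggests a fix.
# suggest_reorder and the log message are illustrative names only.

import logging

logging.basicConfig(level=logging.WARNING)
LOG = logging.getLogger("anemoi-check")

def suggest_reorder(expected, actual):
    """Return a reorder list that makes `actual` match `expected`,
    None if they already match, or raise if the variable sets differ."""
    if list(expected) == list(actual):
        return None  # already aligned, nothing to suggest
    if sorted(expected) != sorted(actual):
        raise ValueError("Datasets differ in variables, not just their order")
    # Same variables, different order: the expected list itself is the fix.
    LOG.warning("Variable order mismatch; try: reorder=%s", list(expected))
    return list(expected)

fix = suggest_reorder(["msl", "t2m", "10u"], ["10u", "msl", "t2m"])
```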
Yes, something like that. Should I implement this feature, or would you like to?
I made a draft PR. The reason it is a draft is that I have created unit tests covering a global (ERA5) and a LAM dataset, but these datasets are located on my local computer. I was wondering if I should upload or include a small LAM and global dataset somewhere to run the unit tests in the CI/CD pipeline. The draft PR does not include the unit tests for now, but if desired I can add them. Link:
There is currently an implementation that sorts variables alphabetically, which is performed by default. However, for transfer learning we ideally want to keep the variable ordering of the pretrained model: when including a stretched grid, the variable list has to match the original order used in pretraining. This means we have to re-sort the variable list to match the previous list so that it makes sense to the pretrained model.
I have already implemented this on our MetNo fork (more specifically in src/anemoi/datasets/data/misc.py, in the function _open_dataset); it is stable and has been tested. I was wondering if I could create a branch with the latest version and open a pull request to include this feature in anemoi-datasets, as more member states wish to do stretched-grid and transfer learning.
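The re-sorting described above amounts to computing a permutation of the variable axis. The sketch below is illustrative (not the MetNo fork's actual code): it builds the index list that maps a new dataset's variable order onto the order the pretrained checkpoint was trained with:

```python
# Illustrative sketch: build the index permutation that re-sorts a new
# dataset's variables to match a pretrained checkpoint's ordering.
# permutation_to is a hypothetical helper name.

def permutation_to(pretrained_order, new_order):
    """Indices perm such that [new_order[i] for i in perm] == pretrained_order."""
    position = {name: i for i, name in enumerate(new_order)}
    return [position[name] for name in pretrained_order]

pretrained = ["msl", "t2m", "10u"]   # order baked into the checkpoint
new = sorted(pretrained)             # alphabetical order of the new dataset
perm = permutation_to(pretrained, new)
# Applying perm along the variable axis restores the checkpoint's ordering.
assert [new[i] for i in perm] == pretrained
```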