Model versioning #17

Open
VarIr opened this issue Jul 3, 2020 · 1 comment

Comments

@VarIr
Collaborator

VarIr commented Jul 3, 2020

Currently, deepnog ships one model per eggnog level and network architecture.
If we ever decide to retrain certain models, users would have to devise their own strategies to tell models apart or to pin a specific model (e.g., for reproducibility), such as manually moving files around or renaming them.
Retraining could sometimes make sense, however. For example, we might want to use different data splits, or increase the share of training sequences relative to test sequences to squeeze a little more performance out of the model.

We should at least introduce some versioning, model identifiers, etc., that are stored with the model. Could be a simple string inside the model_dict. This could even be "backported" to existing models.

Ideally, automatic model download should also be version-aware. Currently, a user who has already downloaded a model will not receive any updated model.

@VarIr
Collaborator Author

VarIr commented Oct 5, 2020

To summarize some key points of the recent discussion:

Models will receive a metadata field that holds the following information:

  • UUID as model identifier
  • Date and timestamp of training
  • Training parameters (incl. learning rate, scheduler, number of epochs, etc.)
  • Orthology DB name
  • Taxonomic level in DB
  • Metadata format version (v1 for now, v2 if this ever needs to be extended)

Technically, this can be implemented as a dict that is serialized into the .pth model file. This can be backported to old models.
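
A minimal sketch of what such a metadata dict could look like, including a backport helper for old models. All key names (`metadata`, `model_id`, `trained_at`, ...), the helper functions, and the example values are assumptions for illustration, not an agreed format. Writing the dict to the .pth file (e.g., via `torch.save`) is left out so the sketch runs without PyTorch.

```python
import uuid
from datetime import datetime, timezone

METADATA_FORMAT_VERSION = "v1"  # bump to "v2" if the fields ever change


def build_metadata(training_params, database, tax_level):
    """Assemble the metadata fields listed above (key names are hypothetical)."""
    return {
        "format_version": METADATA_FORMAT_VERSION,
        "model_id": str(uuid.uuid4()),
        "trained_at": datetime.now(timezone.utc).isoformat(),
        "training_params": dict(training_params),
        "database": database,
        "tax_level": tax_level,
    }


def backport_metadata(model_dict, database, tax_level):
    """Attach metadata to an old model_dict; leave already-tagged ones untouched."""
    if "metadata" not in model_dict:
        meta = build_metadata({}, database, tax_level)
        # Training date and parameters of old models may not be recoverable.
        meta["trained_at"] = None
        meta["training_params"] = None
        model_dict["metadata"] = meta
    return model_dict


# Example values only; a real model_dict would also hold the trained weights.
model_dict = {"model_state_dict": {}}
model_dict["metadata"] = build_metadata(
    {"learning_rate": 0.01, "scheduler": "StepLR", "n_epochs": 15},
    database="eggNOG5",
    tax_level=2,
)
```

Keeping the metadata under a single key means old loaders that only read the weights would keep working, and the format version tells newer loaders which fields to expect.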

Model filenames obtain a version hint, e.g. the date, or v1, v2, etc., and a "latest" pointer to the most up-to-date version.
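
One way the version hint and "latest" lookup might work, assuming a `_v<N>.pth` suffix. The naming scheme, the function names, and the example file stems are hypothetical; a date-based hint would need a different parser.

```python
import re

# Hypothetical naming scheme: <stem>_v<N>.pth
_VERSIONED = re.compile(r"^(?P<stem>.+)_v(?P<version>\d+)\.pth$")


def versioned_filename(stem, version):
    """Build a model filename carrying a version hint."""
    return f"{stem}_v{version}.pth"


def resolve_latest(filenames, stem):
    """Return the highest-versioned file for `stem`, or None if there is none."""
    best = None
    for name in filenames:
        match = _VERSIONED.match(name)
        if match and match.group("stem") == stem:
            # Compare versions numerically so that v10 sorts after v2.
            version = int(match.group("version"))
            if best is None or version > best[0]:
                best = (version, name)
    return best[1] if best else None


available = [
    versioned_filename("deepnog_eggNOG5_2", 1),
    versioned_filename("deepnog_eggNOG5_2", 2),
    versioned_filename("deepnog_eggNOG5_2", 10),
    versioned_filename("deepnog_eggNOG5_1236", 1),
]
latest = resolve_latest(available, "deepnog_eggNOG5_2")
```

The "latest" pointer itself could then be a symlink or a small index file on the download server that is updated whenever `resolve_latest` changes its answer.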

The client subcommand deepnog infer will take a use_latest boolean flag to use the latest model (otherwise, the currently installed one is used). A warning or info message could be issued to users when new models are available.
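
A rough sketch of how the flag and the warning could interact. The `--use-latest` spelling, the integer version numbers, and both function names are assumptions; this is not the existing deepnog CLI code.

```python
import argparse
import warnings


def build_parser():
    """Toy parser with only the proposed flag on the infer subcommand."""
    parser = argparse.ArgumentParser(prog="deepnog")
    subparsers = parser.add_subparsers(dest="command", required=True)
    infer = subparsers.add_parser("infer")
    infer.add_argument(
        "--use-latest",
        action="store_true",
        help="use the latest available model instead of the installed one",
    )
    return parser


def select_version(installed, latest, use_latest):
    """Pick the model version to run; `installed` is None if nothing is cached."""
    if installed is None:
        return latest  # nothing installed yet, so fetch the latest
    if latest > installed:
        if use_latest:
            return latest
        warnings.warn(
            f"Model v{latest} is available (installed: v{installed}); "
            "pass --use-latest to switch."
        )
    return installed


args = build_parser().parse_args(["infer", "--use-latest"])
chosen = select_version(installed=1, latest=2, use_latest=args.use_latest)
```

Defaulting to the installed model keeps existing analyses reproducible, while the warning makes sure users learn that a retrained model exists.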
