Is there a plan to upgrade the r941_prom_hac_g360+g422 model to any of the latest models soon? #6547

Open
mthang opened this issue Nov 11, 2024 · 5 comments

@mthang
Contributor

mthang commented Nov 11, 2024

The Galaxy clair3 tool currently uses the r941_prom_hac_g360+g422 and r941_prom_sup_g5014 models.

#test_model "the_model_name" r941_prom_hac_g360+g422 $(dirname $(which run_clair3.sh))/models/r941_prom_hac_g360+g422
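For context, the path in that line resolves to a `models` folder sitting next to `run_clair3.sh`. A minimal sketch of the same lookup in Python, useful for checking which models an installed package actually ships. It assumes each model is a sub-directory of `models`; the function names `clair3_models_dir` and `list_models` are made up for illustration and are not part of clair3 or Galaxy.

```python
import shutil
from pathlib import Path
from typing import List, Optional


def clair3_models_dir(script_path: Optional[str] = None) -> Optional[Path]:
    """Return the models/ directory next to run_clair3.sh, or None if the script is not found."""
    # Mirrors $(dirname $(which run_clair3.sh))/models from the tool wrapper.
    script = script_path or shutil.which("run_clair3.sh")
    if script is None:
        return None
    return Path(script).parent / "models"


def list_models(models_dir: Path) -> List[str]:
    """List model names, assuming one sub-directory per model."""
    if not models_dir.is_dir():
        return []
    return sorted(p.name for p in models_dir.iterdir() if p.is_dir())


if __name__ == "__main__":
    models = clair3_models_dir()
    if models is None:
        print("run_clair3.sh is not on PATH")
    else:
        print(models)
        for name in list_models(models):
            print(" ", name)
```

Run inside the conda environment that provides clair3, this prints the models folder and the names found in it.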

I would like to find out whether the tool will be upgraded to any of the models listed here: https://github.com/nanoporetech/rerio/tree/master/clair3_models .

Thanks

@pvanheus
Contributor

Yes! My plan is to do two things:

  1. Propose on the bioconda side that the 4 latest models should be included in the bioconda recipe (i.e. downloaded at build time and added to the existing models folder)
  2. Update the Galaxy tool

If (1) is not accepted on the bioconda side we will need to build a tool to download models and / or a data manager. I'd prefer not to go that route.
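To make the data manager option a bit more concrete: a Galaxy data manager reports what it installed by writing a JSON document with a `data_tables` key. A rough sketch of the rows such a manager could emit for downloaded models follows. The table name `clair3_models` and the `value` / `name` / `path` columns are assumptions for illustration, not an existing Galaxy data table, and the download step itself is left out.

```python
import json
from pathlib import Path
from typing import Dict, List


def data_table_entry(model_dir: Path) -> Dict[str, str]:
    """Build one row for a hypothetical 'clair3_models' data table."""
    name = model_dir.name
    return {"value": name, "name": name, "path": str(model_dir)}


def data_manager_payload(model_dirs: List[Path], table: str = "clair3_models") -> str:
    """Serialise rows in the {"data_tables": {table: [...]}} shape a data manager writes."""
    rows = [data_table_entry(d) for d in sorted(model_dirs)]
    return json.dumps({"data_tables": {table: rows}}, indent=2, sort_keys=True)


if __name__ == "__main__":
    # Hypothetical install location; the model name is one already used by the tool.
    print(data_manager_payload([Path("/data/clair3_models/r941_prom_sup_g5014")]))
```

The tool wrapper would then read the model path from the data table instead of from the `models` folder shipped with the package.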

Does this sound reasonable to you?

@mthang
Contributor Author

mthang commented Nov 11, 2024

Totally. (1) is probably the better solution. I would also avoid dealing with a data manager if possible. Do you know when the models will be available?

@bernt-matthias
Contributor

How big are the models?

@pvanheus
Contributor

@bernt-matthias the latest are 310 MB in size in total.

I have since realised that my plan in (1) will not work. The models in the rerio repository are distributed under the Oxford Nanopore Technologies PLC. Public License v. 1.0 which does not appear to be an Open Source license (see the discussion here). Or at least I cannot verify that it is an Open Source license.

So distributing the models with the clair3 software would not work (Bioconda is for open source software). Which means, I think, that a data manager would need to be built - but could this be used on a public Galaxy server? One of the particularly concerning (for me) parts of the Oxford Nanopore Technologies PLC. Public License v. 1.0 is this:

"Each Contributor hereby grants You a world-wide, royalty-free,
non-exclusive license under Contributor copyrights Licensable by such
Contributor to use, reproduce, make available, modify, display,
perform, distribute, and otherwise exploit solely for Research Purposes
its Contributions, either on an unmodified basis, with Modifications,
or as part of a Larger Work."

What does "Research Purposes" mean?

I know that some Galaxy servers have negotiated for the use of "Free for Academic Use" software on their servers. I have written to ONT to ask in which way these models may be re-used in a public context.

@bernt-matthias
Contributor

> latest are 310 MB in size in total.

Then (to me) a data manager seems better than inclusion in bioconda.

> I know that some Galaxy servers have negotiated for the use of "Free for Academic Use" software on their servers.

Yep. Sometimes a statement in the help / description was sufficient; sometimes a boolean parameter was needed, where users have to agree to the license.
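The boolean-parameter approach could look roughly like this on the script side. This is only a sketch: in a real Galaxy tool the boolean would be declared in the tool XML, and the `--accept-license` and `--model` flag names here are hypothetical.

```python
import argparse
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the (hypothetical) command line a wrapper script might receive."""
    parser = argparse.ArgumentParser(description="Run with a licensed model")
    parser.add_argument("--model", required=True, help="name of the model to use")
    parser.add_argument(
        "--accept-license",
        action="store_true",
        help="set when the user has agreed to the model license",
    )
    return parser.parse_args(argv)


def check_license(accepted: bool) -> None:
    """Stop with an error message unless the license was explicitly accepted."""
    if not accepted:
        raise SystemExit("The model license must be accepted before this model can be used.")


if __name__ == "__main__":
    args = parse_args()
    check_license(args.accept_license)
    print(f"License accepted, continuing with model {args.model}")
```

Leaving the flag off by default means a job cannot use the model unless the user has ticked the box.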
