lineageTree.pb update: new lineages (pango-designation release v1.18.1) #35
Conversation
The tree is a severely pruned version of the UCSC tree (2023-02-15) re-rooted to lineage A with 50 randomly selected descendants per lineage.
PRing directly into master instead of making a prerelease branch in cov-lineages/pangolin-data because this update is UShER-only.
@AngieHinrichs Are you planning to update the hash and alias key?
Yes, I will tag release v1.18.1 on pangolin-data and pangolin-assignment very soon, probably in the next hour. By request from GISAID (if I understand correctly), we let a new release sit on the main branch for at least half a day before tagging the release. New development-mode installations will pick up the not-yet-released data, but conda installs will continue with the current release until the new tag triggers a bioconda update.
I mention alias_key.json with the thought that it might drive the report_collation piece; as an example, I ran into an error with v1.18.1.
Thank you for your response, and apologies if this error is unrelated!
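For context on why a stale alias_key.json can break downstream reporting, here is a minimal sketch of how an alias_key.json-style mapping expands a compressed Pango lineage name. The entries below are a tiny inline sample in the real file's format, not the full contents of any release; the `expand` helper is hypothetical, not pangolin's own code.

```python
import json

# Tiny inline sample in the alias_key.json format (root lineages map to
# an empty string; aliases map to their full dotted parent path).
ALIAS_KEY = json.loads("""
{
  "A": "",
  "B": "",
  "BA": "B.1.1.529",
  "BQ": "BA.5.3.1.1.1.1"
}
""")

def expand(lineage: str) -> str:
    """Replace a leading alias with its full dotted path, if present."""
    prefix, _, rest = lineage.partition(".")
    full = ALIAS_KEY.get(prefix, "")
    if not full:  # root lineages (A, B) or an alias missing from the file
        return lineage
    return f"{full}.{rest}" if rest else full

print(expand("BQ.1.1"))  # BA.5.3.1.1.1.1.1.1
```

If a lineage name in the new tree uses an alias that is absent from an out-of-date alias_key.json, expansion falls through the `not full` branch, which is the kind of mismatch that surfaces as errors in collation steps.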
@OnePotato2 sorry I misunderstood your question. Did you get the error from a freshly installed pangolin, or did you update pangolin and then get the error? What method did you use to install (and/or update) pangolin?
This was a fresh install using a .yaml to specify the modules ( https://github.com/cov-lineages/pangolin-data/archive/refs/tags/v1.18.1.tar.gz for example), followed by adding the assignment cache with pangolin --add-assignment-cache
OK, alias_key.json comes from the cov-lineages/pango-designation repository. Does your .yaml also have v1.18.1 (not v1.18) for pango-designation?
For pangolin v3, I had been including pango-designation, but with the roll-up of pangoLEARN and pango-designation into pangolin-data, I had been using the four dependency repositories alone to build it.
This works in cases where the five parts of pangolin-data/pangolin_data/data/ are updated simultaneously, and meant explicitly following the direction from pangolin-data directly. But with the issues on the server building the pangoLEARN model, the minor revision of the protobuf and the __init__.py alone deviates from that. I suspect that if I go back and add pango-designation as a separate explicit dependency it would be happy.
Ah, I see now! I didn't realize that @aineniamh's pangoLEARN updates copied in alias_key.json from pango-designation, and I also forgot about the designation hash file, although I guess I should have known that from commit comments like "adding latest hash, alias file and rf model corresponding to v1.18". And now I see that's what you were referring to by "hash and alias key" in your first comment. Sorry I wasn't paying enough attention to the part of the update that isn't the UShER tree. I don't already have a process to update the designation hash on my systems, but I could probably set that up without too much trouble. If I can get that working before @aineniamh's server is repaired then I will tag a […]
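Something along these lines might be all it takes to sync the alias file (a hypothetical sketch, not a working release process: the install-path lookup and the raw-file URL pattern are assumptions, and the designation hash file would need the same treatment):

```shell
# Hypothetical manual sync: fetch alias_key.json at a pinned
# pango-designation tag and drop it into the installed pangolin-data
# package, mirroring what the pangoLEARN release updates did.
TAG=v1.18.1
DATA_DIR=$(python -c "import pangolin_data, os; print(os.path.dirname(pangolin_data.__file__))")/data
curl -sL "https://raw.githubusercontent.com/cov-lineages/pango-designation/${TAG}/pango_designation/alias_key.json" \
  -o "${DATA_DIR}/alias_key.json"
```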
Thanks again @OnePotato2 -- as far as I can tell, the out-of-date alias_key.json in pangolin-data v1.18.1 was the cause of the error reported in cov-lineages/pangolin#510. So I went ahead and pushed out the update sooner than we normally would (usually we wait at least a half day after merging changes into main before tagging a new release), because a lot of people were getting errors.
Thank you as well for your comprehensive support during the associated server outage! Tangentially, have you had success passing pangolin to srun in slurm and having usher-sampled run properly? For me it keeps declining and running the fallback usher.
I don't use slurm but I have had trouble getting it to run on my cluster nodes too -- it seems to me that the dependency on OpenMPI makes the installation really finicky, and usher-sampled exits with errors that are mysterious to me but might have something to do with system-level configuration? Since I have access to a server with a lot of CPUs, and usher-sampled is so much faster than original usher, it's actually much faster for me to run a bunch of pangolin commands with gnu parallel than to do the cluster run with original usher anyway! So I haven't been motivated enough to get to the bottom of why usher-sampled fails in the cluster environment despite running fine on the command line. Perhaps @yceh and/or @yatisht will have more insights about debugging OpenMPI errors - you could try filing an issue with the specific error messages from running […]
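The gnu parallel pattern I mean is roughly this (a sketch with hypothetical file paths; it assumes the input is pre-split into per-chunk FASTA files and that pangolin and GNU parallel are on PATH):

```shell
# Run one pangolin process per FASTA chunk, at most 8 at a time.
# {} is the input file; {/.} is its basename without the extension.
mkdir -p results
ls chunks/*.fasta | parallel -j 8 'pangolin {} --outdir results --outfile {/.}.csv'
```

The per-chunk CSVs can then be concatenated (dropping repeated header lines) into one report.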
@yceh have you tried |
As an update, it seems like usher-sampled will run in slurm sbatch commands, but I haven't stumbled into an MPI setting that allows it to run under "srun" calls. Is this something you would be interested in having raised as an issue on the usher page? |
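For concreteness, the pattern I'm seeing looks like this (hypothetical file names and core counts):

```shell
# Works: batch submission, usher-sampled runs as expected.
sbatch -c 8 --wrap "pangolin sequences.fasta --outdir results/"

# Doesn't work for me: under srun, usher-sampled declines and pangolin
# falls back to classic usher -- possibly srun's own MPI/PMI launch
# context conflicting with usher-sampled's mpirun (my guess only).
srun -c 8 pangolin sequences.fasta --outdir results/
```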
Sorry, I haven't tried slurm before. I mostly just used mpirun to run it manually. (It was sort of a workaround for TBB stalling when there are too many cores.) |