lineageTree.pb update: new lineages (pango-designation release v1.18.1) #35

Merged Feb 21, 2023 (1 commit)

AngieHinrichs
Member

The tree is a severely pruned version of the UCSC tree (2023-02-15) re-rooted to lineage A with 50 randomly selected descendants per lineage.

@AngieHinrichs
Member Author

PRing directly into master instead of making a prerelease branch in cov-lineages/pangolin-data because this update is UShER-only.

AngieHinrichs merged commit 30458b5 into cov-lineages:main on Feb 21, 2023
@OnePotato2

@AngieHinrichs Are you thinking to update the hash and alias key?

@AngieHinrichs
Member Author

AngieHinrichs commented Feb 22, 2023 via email

@OnePotato2

> @AngieHinrichs Are you thinking to update the hash and alias key?

I mention alias_key.json because I suspect it drives the report_collation step. As an example, here is an error that I ran into with v1.18.1:

...site-packages/pangolin/utils/report_collation.py", line 104, in get_recombinant_parents
    for parent in alias_dict[lineage_parts[0]]:
KeyError: 'XBP'

Thank you for your response and apologies if this error is unrelated!
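
For readers following along: the KeyError above is what a plain dictionary lookup raises when a lineage prefix has no entry in the alias table. Below is a minimal sketch of a pre-flight check. It assumes that alias_key.json maps each alias prefix to either a string or, for recombinants, a list of parent lineages. The helper `missing_alias_prefixes` and the sample table are hypothetical and not part of pangolin.

```python
def missing_alias_prefixes(alias_dict, lineages):
    """Return the sorted lineage prefixes that have no entry in alias_dict."""
    missing = set()
    for lineage in lineages:
        # The prefix is everything before the first dot, e.g. "BA" in "BA.2".
        prefix = lineage.split(".")[0]
        if prefix not in alias_dict:
            missing.add(prefix)
    return sorted(missing)


# Hypothetical, truncated table standing in for an out-of-date alias_key.json.
stale_aliases = {"A": "", "B": "", "XA": ["B.1.1.7", "B.1.177"]}

print(missing_alias_prefixes(stale_aliases, ["B.1.1.7", "XA", "XBP"]))  # prints ['XBP']
```

Running a check like this against the designated lineages before assignment would surface a stale alias file as a short list of prefixes rather than as a traceback mid-run.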

@AngieHinrichs
Member Author

@OnePotato2 sorry I misunderstood your question. Did you get the error from a freshly installed pangolin, or did you update pangolin and then get the error? What method did you use to install (and/or update) pangolin?

@OnePotato2

This was a fresh install using a .yaml to specify the modules ( https://github.com/cov-lineages/pangolin-data/archive/refs/tags/v1.18.1.tar.gz for example), followed by adding assignment-cache from pangolin --add-assignment-cache

@AngieHinrichs
Member Author

OK, alias_key.json comes from the cov-lineages/pango-designation repository. Does your .yaml also have v1.18.1 (not v1.18) for pango-designation?

@OnePotato2

For pangolin v3, I had been including pango-designation, but with the roll-up of pLearn and pango-designations into pangolin-data, I had been using the four dependency repositories alone to build it:

- pip:
  - https://github.com/cov-lineages/pangolin/archive/refs/tags/v4.2.tar.gz
  - https://github.com/cov-lineages/scorpio/archive/refs/tags/v0.3.17.tar.gz
  - https://github.com/cov-lineages/constellations/archive/refs/tags/v0.1.10.tar.gz
  - https://github.com/cov-lineages/pangolin-data/archive/refs/tags/v1.18.1.tar.gz

This works in cases where the five parts of pangolin-data/pangolin_data/data/ are updated simultaneously, and it follows the direction from pangolin-data directly. But with the issues on the server that builds the pLearn model, the minor revision of the protobuf and the __init__.py alone deviates from that. I suspect that if I go back and add pango-designation as a separate explicit dependency, it would be happy.
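
To make the pinned versions easy to compare, here is a minimal sketch that extracts the repository name and tag from each archive URL in the list above. It assumes every entry follows the GitHub `archive/refs/tags/<tag>.tar.gz` pattern. The helper `pinned_tags` is hypothetical and not part of pangolin or conda.

```python
import re

# Matches GitHub tag archives: github.com/<org>/<repo>/archive/refs/tags/<tag>.tar.gz
TAG_URL = re.compile(
    r"github\.com/[^/]+/(?P<repo>[^/]+)/archive/refs/tags/(?P<tag>.+)\.tar\.gz$"
)


def pinned_tags(urls):
    """Map each repository name to the tag pinned in its archive URL."""
    pins = {}
    for url in urls:
        match = TAG_URL.search(url)
        if match:  # URLs that do not follow the pattern are skipped
            pins[match.group("repo")] = match.group("tag")
    return pins


urls = [
    "https://github.com/cov-lineages/pangolin/archive/refs/tags/v4.2.tar.gz",
    "https://github.com/cov-lineages/scorpio/archive/refs/tags/v0.3.17.tar.gz",
    "https://github.com/cov-lineages/constellations/archive/refs/tags/v0.1.10.tar.gz",
    "https://github.com/cov-lineages/pangolin-data/archive/refs/tags/v1.18.1.tar.gz",
]

pins = pinned_tags(urls)
print(pins["pangolin-data"])        # prints v1.18.1
print("pango-designation" in pins)  # prints False
```

The second print shows the point at issue: with this list, nothing pins pango-designation separately, so the alias and hash files are whatever pangolin-data bundles.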

@AngieHinrichs
Member Author

AngieHinrichs commented Feb 23, 2023

> the minor revision of the protobuf and the __init__.py alone deviates from that.

Ah I see now! I didn't realize that @aineniamh's pangoLEARN updates copied in alias_key.json from pango-designation and I also forgot about the designation hash file, although I guess I should have known that from commit comments like "adding latest hash, alias file and rf model corresponding to v1.18". And now I see that's what you were referring to by "hash and alias key" in your first comment. Sorry I wasn't paying enough attention to the part of the update that isn't the usher tree.

I don't already have a process to update the designation hash on my systems, but I could probably set that up without too much trouble. If I can get that working before @aineniamh's server is repaired then I will tag a [v1.18.2 Edit: v1.18.1.1, just in case there is a pango-designation release v1.18.2 in the future] minor release with the alias_key.json and lineages.hash.csv properly updated. Thank you for pointing that out!

@AngieHinrichs
Member Author

Thanks again @OnePotato2 -- as far as I can tell, the out-of-date alias_key.json in pangolin-data v1.18.1 was the cause of the error reported in cov-lineages/pangolin#510. So I went ahead and pushed out the update sooner than we normally would (usually we wait at least a half day after merging changes into main before tagging a new release), because a lot of people were getting errors.

@OnePotato2

Thank you as well for your comprehensive support during the associated server outage! Tangentially, have you had success passing pangolin to srun in slurm and having Usher-Sampled run properly? I keep having it decline and run the fall-back usher

@AngieHinrichs
Member Author

> Tangentially, have you had success passing pangolin to srun in slurm and having Usher-Sampled run properly? I keep having it decline and run the fall-back usher

I don't use slurm but I have had trouble getting it to run on my cluster nodes too -- it seems to me that the dependency on OpenMPI makes the installation really finicky, and usher-sampled exits with errors that are mysterious to me but might have something to do with system-level configuration? Since I have access to a server with a lot of CPUs, and usher-sampled is so much faster than original usher, it's actually much faster for me to run a bunch of pangolin commands with gnu parallel than to do the cluster run with original usher anyway! So I haven't been motivated enough to get to the bottom of why usher-sampled fails in the cluster environment despite running fine on the command line. Perhaps @yceh and/or @yatisht will have more insights about debugging OpenMPI errors - you could try filing an issue with the specific error messages from running usher-sampled --help in your slurm environment.
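
For readers without gnu parallel, the single-server approach described above can be approximated in Python. This is a rough sketch under stated assumptions, not the setup used in this thread: the input is split into FASTA chunks, one pangolin command is built per chunk, and the commands run concurrently. The helpers `split_fasta`, `pangolin_command`, and `run_all`, the chunk file names, and the sample sequences are all hypothetical. The chunks would need to be written to those files before the commands are run.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def split_fasta(text, n_chunks):
    """Distribute FASTA records round-robin into at most n_chunks strings."""
    records = []
    for line in text.splitlines(keepends=True):
        if line.startswith(">"):
            records.append(line)   # a header line starts a new record
        elif records:
            records[-1] += line    # sequence lines extend the current record
    chunks = [[] for _ in range(n_chunks)]
    for i, record in enumerate(records):
        chunks[i % n_chunks].append(record)
    return ["".join(chunk) for chunk in chunks if chunk]


def pangolin_command(fasta_path, outfile, threads=1):
    """Build the argument list for one pangolin run on one chunk file."""
    return ["pangolin", fasta_path, "--outfile", outfile, "--threads", str(threads)]


def run_all(commands, max_workers=4, runner=subprocess.run):
    """Run the commands concurrently; runner is injectable for dry runs."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(runner, commands))


# Hypothetical three-record input, split into two chunks.
fasta = ">s1\nACGT\n>s2\nGGGG\n>s3\nTTTT\n"
chunks = split_fasta(fasta, 2)
commands = [
    pangolin_command(f"chunk_{i}.fasta", f"chunk_{i}.csv")
    for i in range(len(chunks))
]
print(len(chunks))  # prints 2
# run_all(commands)  # needs pangolin on PATH and the chunk files on disk
```

Threads are enough here because each worker only waits on an external process. The per-chunk CSV outputs would still need to be concatenated afterwards.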

@yatisht

yatisht commented Feb 24, 2023

@yceh have you tried usher-sampled on slurm? It would be good to support it.

@OnePotato2

> @yceh have you tried usher-sampled on slurm? It would be good to support it.

As an update, it seems like usher-sampled will run in slurm sbatch commands, but I haven't stumbled into an MPI setting that allows it to run under "srun" calls. Is this something you would be interested in having raised as an issue on the usher page?

@yceh

yceh commented Mar 8, 2023

Sorry, I haven't tried slurm before. I mostly just used mpirun to run it manually. (It was sort of a workaround for TBB stalling when there are too many cores.)
