SILVA #1306

Open
jplfaria opened this issue Dec 12, 2024 · 7 comments · May be fixed by #1307
Labels
New (used in combination with prefix, metaprefix, or collection for new entries), Prefix

Comments

@jplfaria

jplfaria commented Dec 12, 2024

Prefix

silva

Name

SILVA rRNA database project

Homepage

https://www.arb-silva.de/

Source Code Repository

No response

Description

SILVA is a comprehensive, quality-checked, and regularly updated resource of aligned ribosomal RNA (rRNA) gene sequences for Bacteria, Archaea, and Eukaryotes. It provides a consistent taxonomic framework, commonly used in microbial ecology and diversity studies.

License

CC-BY-SA-4.0

Publications

doi:10.1093/nar/gks1219 | doi:10.1093/nar/gkt1209

Example Local Unique Identifier

11084

Regular Expression Pattern for Local Unique Identifier

^\d+$

URI Format String

https://www.arb-silva.de/search?q=$1
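
As a minimal sketch of how the pattern and URI format string above fit together, the following checks a local unique identifier against the regular expression and substitutes it for `$1`. The function name `expand` is hypothetical, and this is not presented as the Bioregistry's own implementation.

```python
import re

# Values copied from the request above.
PATTERN = re.compile(r"^\d+$")
URI_FORMAT = "https://www.arb-silva.de/search?q=$1"


def expand(local_id: str) -> str:
    """Return the URI for local_id, or raise ValueError if it fails the pattern."""
    if not PATTERN.fullmatch(local_id):
        raise ValueError(f"{local_id!r} does not match {PATTERN.pattern}")
    return URI_FORMAT.replace("$1", local_id)


print(expand("11084"))
```

A well-formed URI only shows that the identifier passes the pattern; it does not show that the resulting page resolves to a record.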

Wikidata Property

No response

Contributor Name

Jose P. Faria

Contributor GitHub

jplfaria

Contributor ORCiD

0000-0001-9302-7250

Contributor Email

[email protected]

Contact Name

Frank Oliver Glöckner

Contact ORCiD

0000-0001-8528-9023

Contact GitHub

frankolivergloeckner

Contact Email

[email protected]

Additional Comments

No response

@jplfaria jplfaria added the New and Prefix labels Dec 12, 2024
@github-actions github-actions bot linked a pull request Dec 12, 2024 that will close this issue
@cthoyt
Member

cthoyt commented Dec 12, 2024

Thanks @jplfaria. I made a few updates, including putting @frankolivergloeckner as the contact person - it's Bioregistry policy to have a single point of contact and not a group email.

@cthoyt
Member

cthoyt commented Dec 12, 2024

@jplfaria However, it's not clear what the actual semantic space is here. The URI format doesn't seem to work with your example. Can you link me to a page that shows something for the example accession number 11084?

@cthoyt
Member

cthoyt commented Dec 12, 2024

Is it possible that the AJ001010 in https://www.arb-silva.de/browser/ssu-138.2/AJ001010 is actually the kind of accession number that SILVA creates, and that 11084 was an NCBI Taxonomy ID?

@jplfaria
Author

jplfaria commented Dec 12, 2024

I was going off the folder that I believe has the taxonomy data:
https://www.arb-silva.de/no_cache/download/archive/current/Exports/taxonomy/
In that folder, my plan was to work with the file https://www.arb-silva.de/fileadmin/silva_databases/current/Exports/taxonomy/tax_slv_ssu_138.2.txt.gz. Take a look at the contents there and let me know if that makes sense. There are additional files with mappings, etc., but I was going to start with just that one. Also, I decided to make this ontology file only for the SSU data, since that is the most commonly used. We could, in theory, have a separate ontology file for the LSU data or merge both, which I am not sure is a good idea.

@cthoyt
Member

cthoyt commented Dec 12, 2024

Archaea;Crenarchaeota;Nitrososphaeria;Nitrosotaleales;	42950	order		138
Archaea;Crenarchaeota;Nitrososphaeria;Nitrosotaleales;Nitrosotaleaceae;	42951	family		138
Archaea;Crenarchaeota;Nitrososphaeria;Nitrosotaleales;Nitrosotaleaceae;Candidatus Nitrosotalea;	42952	genus		138

Here are a few lines from that file. I think they correspond to https://www.arb-silva.de/browser/ssu-121/AB050229/ but I can't figure out where any of the numbers (42950, 42951, 42952) appear on the web page. Also, what does AB050229 mean?
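
A minimal sketch for reading rows like the ones quoted above, assuming the columns are tab-separated and hold the taxonomic path, a numeric ID, the rank, an empty column, and a release number. The field names in `TaxRow` are guesses for illustration; nothing in this thread documents what SILVA calls them.

```python
from typing import NamedTuple


class TaxRow(NamedTuple):
    path: list[str]   # lineage split on ';' (assumed separator)
    taxid: int        # the number whose meaning is in question
    rank: str
    remark: str       # empty in the quoted rows; purpose unknown
    release: str


def parse_row(line: str) -> TaxRow:
    """Split one tab-separated row into its five assumed columns."""
    path, taxid, rank, remark, release = line.rstrip("\n").split("\t")
    lineage = [name for name in path.split(";") if name]
    return TaxRow(lineage, int(taxid), rank, remark, release)


sample = "Archaea;Crenarchaeota;Nitrososphaeria;Nitrosotaleales;\t42950\torder\t\t138"
row = parse_row(sample)
print(row.path[-1], row.taxid, row.rank)
```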

@jplfaria
Author

jplfaria commented Dec 12, 2024

We are both having trouble understanding what is going on here. Perhaps I opened this entry too early. I will spend more time trying to understand the resource so that I can represent this entry properly.

@jplfaria
Author

The IDs in the file (42950, 42951, etc.) seem to be taxonomy IDs that exist only in that file; I can't find them anywhere else. When I search for those IDs as NCBI Taxonomy IDs, the organisms are different.
Another issue I don't understand is that the taxonomy file has no taxa at the "species" level. I understand that SILVA doesn't always have all taxonomy levels, since the taxonomy may not be fully resolved from the SSU sequence alone, but in the GTDB metadata file we use to build the GTDB ontology I see mappings to SILVA at the species level.
I may need to email the authors to understand what is happening.
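
One way to check the missing species level is to tally the rank column across the file. This is a sketch under the assumption that the file is tab-separated with the rank in the third column, as in the rows quoted earlier in this thread; `rank_counts` is a hypothetical helper, and the sample holds only those three quoted rows.

```python
from collections import Counter
from typing import Iterable


def rank_counts(lines: Iterable[str]) -> Counter:
    """Count how often each rank appears (assumed third tab-separated column)."""
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 3:
            counts[fields[2]] += 1
    return counts


sample = [
    "Archaea;Crenarchaeota;Nitrososphaeria;Nitrosotaleales;\t42950\torder\t\t138",
    "Archaea;Crenarchaeota;Nitrososphaeria;Nitrosotaleales;Nitrosotaleaceae;\t42951\tfamily\t\t138",
    "Archaea;Crenarchaeota;Nitrososphaeria;Nitrosotaleales;Nitrosotaleaceae;Candidatus Nitrosotalea;\t42952\tgenus\t\t138",
]
counts = rank_counts(sample)
print(counts)
print("species" in counts)
```

For a locally downloaded copy of the full file, the handle returned by `gzip.open(path, "rt")` can be passed to `rank_counts` directly.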
