Add code/description how to create a CompDb from MassBank #66
Comments
Is there an advantage to this compared to using the SDF from MoNA?
I cannot speak to the content. What I like about MassBank is that a) the license is pretty clear, so data can be (re)shared, b) MassBank makes releases, which allows the data to be "frozen" (important for reproducible research), and c) extracting the data directly from their database is easier than importing from text files (SDF and/or JSON).
OMG - did not expect that. So, MassBank has one compound for each spectrum. Far from being a normalized database :(
Yes, and the IDs differ between the different labs. The only common thing for cross-mapping could be the InChIKey, but I have never tried that so far.
The problem is that not all compounds have an InChIKey, which makes it really tricky. Well, for now I will import the data as is.
Do all of them have a SMILES? Then the InChIKey could be calculated with this one: https://github.com/CDK-R/rinchi
Indeed - it seems that all of them have SMILES. Good point - maybe you could chime in here too: MassBank/MassBank-web#266
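For illustration only, the cross-mapping idea from the comments above could look roughly like the sketch below: records that share an InChIKey are collapsed into one compound, and records without an InChIKey are kept as their own compound, since they cannot be cross-mapped. The record layout, field names and spectrum IDs are made up for this example and are not MassBank's actual schema. Computing a missing InChIKey from SMILES (e.g. with rinchi) is not shown.

```python
from collections import defaultdict

# Hypothetical records, one compound entry per spectrum.
# Field names and spectrum IDs are invented for illustration.
records = [
    {"spectrum_id": "SP000001", "name": "Caffeine",
     "inchikey": "RYYVLZVUVIJVGH-UHFFFAOYSA-N"},
    {"spectrum_id": "SP000002", "name": "caffeine",
     "inchikey": "RYYVLZVUVIJVGH-UHFFFAOYSA-N"},
    {"spectrum_id": "SP000003", "name": "Unknown lipid", "inchikey": None},
]


def group_by_inchikey(records):
    """Map each compound key to the spectrum IDs that belong to it."""
    groups = defaultdict(list)
    for rec in records:
        key = rec["inchikey"]
        if not key:
            # No InChIKey: cannot be cross-mapped, so keep the record
            # as its own compound under a placeholder key.
            key = "NOKEY:" + rec["spectrum_id"]
        groups[key].append(rec["spectrum_id"])
    return dict(groups)


compounds = group_by_inchikey(records)
for key, spectra in compounds.items():
    print(key, spectra)
```

Whether grouping on the full InChIKey is the right granularity (as opposed to, say, only its first block) is an open design choice that the thread does not settle.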
MassBank releases their databases at regular intervals and shares the data with a rather open license, which makes them an ideal candidate for annotation databases that could be distributed via Bioconductor's AnnotationHub.

Explanation: I'm building so-called EnsDb databases for all species for each release of Ensembl. These databases are self-contained SQLite files with gene, transcript, exon and protein annotations and can be downloaded/fetched from AnnotationHub. This is very convenient for the user. CompDb databases could be distributed in a similar fashion.

What I will try next is to define simple scripts to easily import data from the MassBank (MySQL database) into a CompDb database.
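
As a rough sketch of the general idea (not the actual CompDb layout and not the CompoundDb API), the snippet below writes compound rows into a self-contained SQLite database together with a metadata table that records the source release, so the data stays "frozen" to one version. The table names, column names, the `build_compound_db` helper and the release label are all assumptions made for illustration. The hard-coded `rows` list stands in for the result of a query against the MassBank MySQL database.

```python
import sqlite3


def build_compound_db(rows, source_version, path=":memory:"):
    """Write compound rows plus release metadata into a SQLite database.

    Hypothetical helper: the schema is an illustrative assumption,
    not the real CompDb schema.
    """
    con = sqlite3.connect(path)
    con.execute("CREATE TABLE metadata (name TEXT PRIMARY KEY, value TEXT)")
    con.execute(
        "CREATE TABLE compound ("
        "compound_id TEXT PRIMARY KEY, name TEXT, inchikey TEXT, smiles TEXT)"
    )
    # Record where the data came from and which release it represents.
    con.executemany(
        "INSERT INTO metadata VALUES (?, ?)",
        [("source", "MassBank"), ("source_version", source_version)],
    )
    con.executemany(
        "INSERT INTO compound VALUES (:compound_id, :name, :inchikey, :smiles)",
        rows,
    )
    con.commit()
    return con


# Stand-in for rows fetched from the MassBank MySQL database.
rows = [
    {
        "compound_id": "CMP1",
        "name": "Caffeine",
        "inchikey": "RYYVLZVUVIJVGH-UHFFFAOYSA-N",
        "smiles": "CN1C=NC2=C1C(=O)N(C(=O)N2C)C",
    },
]

# "hypothetical-release" is a placeholder, not a real MassBank version.
con = build_compound_db(rows, source_version="hypothetical-release")
print(con.execute("SELECT name, value FROM metadata ORDER BY name").fetchall())
```

Passing a file path instead of `":memory:"` would produce a single file that can be shared as is, which is the property that makes this kind of database convenient to distribute.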