Replies: 1 comment
-
Okay. I've been reviewing the annotation format a bit, and it seems like
So a reaction from RECON3 which looks like this
References:PMID:1847521
GENE_ASSOCIATION: HGNC:2228
Confidence Level: 0
NOTES: SAB
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:vCard="http://www.w3.org/2001/vcard-rdf/3.0#" xmlns:vCard4="http://www.w3.org/2006/vcard/ns#" xmlns:bqbiol="http://biomodels.net/biology-qualifiers/" xmlns:bqmodel="http://biomodels.net/model-qualifiers/">
<rdf:Description rdf:about="#R_34DHPHAMT">
<bqbiol:is>
<rdf:Bag>
<rdf:li rdf:resource="http://identifiers.org/ec-code/2.1.1.6"/>
</rdf:Bag>
</bqbiol:is>
</rdf:Description>
</rdf:RDF>
in SBML level 3 should look something like
NOTES: SAB
The gene association will use GeneProduct, so that note will be removed. Since the reference is now in the annotation, that note will be removed as well. If so, what should I do with Confidence Level (in reactions) and SMILES (in metabolites)? There is no namespace defined for them, as far as I can tell. Thank you
-
Hi. I've been working on updating #988 (merging devel into it, tidying up), and most of it works.
The Notes as present in #988 are broken. I also don't understand what they're supposed to do, so I thought I'd open a discussion on what we want notes to be/do and potential API.
@matthiaskoenig @cdiener @Midnighter @synchon What do SBML and other formats say about Notes? What do you guys want? Does anybody else have strong opinions on this?
My general understanding is that
0) Notes can be str or dict or ???
I think that notes should be a string, and that anything in notes that is not free text can go into annotations, transformed into identifiers when relevant.
Specialist class? If so, something like metadata, KeyValuePairs from #988?
<html:body xmlns:html="http://www.w3.org/1999/xhtml">
<html:p>CONFIDENCE_LEVEL: 1</html:p>
<html:p>GENE_ASSOCIATION: HGNC:2228</html:p>
<html:p>NOTES: SAB</html:p>
<html:p>SUBSYSTEM: Tyrosine metabolism</html:p>
</html:body>
GENE_ASSOCIATION is turned into a GPR.
NOTES: SAB is a signature of the person who added it. It should be possible to add such notes automatically or manually (using code).
SUBSYSTEM goes to groups and to the subsystem field of reactions.
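To make the mapping above concrete, here is a minimal sketch of pulling the `KEY: value` paragraphs out of a notes body with the standard library. It is not how the cobrapy SBML reader does it; `parse_notes` is a hypothetical helper name, and it assumes every paragraph has the form `KEY: value`.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def parse_notes(xhtml: str) -> dict:
    """Turn each <p>KEY: value</p> paragraph of a notes body into a dict entry.

    Hypothetical helper for illustration only, not part of cobrapy.
    """
    root = ET.fromstring(xhtml)
    pairs = {}
    for p in root.iter(f"{{{XHTML_NS}}}p"):
        text = "".join(p.itertext()).strip()
        # Split on the first colon only, so values like "HGNC:2228" stay whole.
        key, sep, value = text.partition(":")
        if sep:  # paragraphs without a colon are ignored in this sketch
            pairs[key.strip()] = value.strip()
    return pairs


notes = """<html:body xmlns:html="http://www.w3.org/1999/xhtml">
<html:p>CONFIDENCE_LEVEL: 1</html:p>
<html:p>GENE_ASSOCIATION: HGNC:2228</html:p>
<html:p>NOTES: SAB</html:p>
<html:p>SUBSYSTEM: Tyrosine metabolism</html:p>
</html:body>"""

parsed = parse_notes(notes)
print(parsed)
```

From a dict like this, GENE_ASSOCIATION and SUBSYSTEM could be routed to the GPR and groups, leaving NOTES and CONFIDENCE_LEVEL as the open question.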
As an additional example, some metabolites in RECON3 have entries in their notes that don't have identifiers:
<html:body xmlns:html="http://www.w3.org/1999/xhtml">
<html:p>CHARGE: -4</html:p>
<html:p>FORMULA: C48H74N7O20P3S</html:p>
<html:p>SMILES: [H]C@@(CCC(=O)C(C)C(=O)SCCNC(=O)CCNC(=O)C@HC(C)(C)COP(O)(=O)OP(O)(=O)OC[C@H]1OC@Hn1cnc2c(N)ncnc12)[C@@]1([H])CC[C@@]2([H])[C@]3([H])C@HC[C@]4([H])CC@HCC[C@]4(C)[C@@]3([H])CC[C@]12C</html:p>
</html:body>
I would take CHARGE to met.charge
FORMULA to met.formula
SMILES can go into annotations, but it doesn't have identifiers - SMILES is not a database, but a way of writing a molecule's structure as an ASCII string. See SMILES on Wikipedia. If it does not go into annotations, should notes stay a dict? What do you guys think?
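A rough sketch of that routing, to show what would be left over in notes. `Met` is a hypothetical stand-in class, not the cobrapy metabolite, `apply_notes` is a made-up helper name, and the SMILES value is a short placeholder rather than the RECON3 string.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Met:
    """Hypothetical stand-in for a metabolite; not the cobrapy class."""

    charge: Optional[int] = None
    formula: Optional[str] = None
    notes: dict = field(default_factory=dict)


def apply_notes(met: Met, pairs: dict) -> Met:
    """Move CHARGE and FORMULA onto attributes; keep all other keys in notes."""
    for key, value in pairs.items():
        if key == "CHARGE":
            met.charge = int(value)
        elif key == "FORMULA":
            met.formula = value
        else:
            # Keys without a natural home (e.g. SMILES) stay in notes for now.
            met.notes[key] = value
    return met


met = apply_notes(
    Met(),
    {"CHARGE": "-4", "FORMULA": "C48H74N7O20P3S", "SMILES": "CCO"},  # placeholder SMILES
)
print(met)
```

In this version the leftover keys only survive if notes is a dict (or some key-value class), which is the design question raised above.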
There are also other examples of things that don't really belong in annotations, such as reaction notes that look something like
<html:body xmlns:html="http://www.w3.org/1999/xhtml">
<html:p>CONFIDENCE_LEVEL: 2</html:p>
<html:p>GENE_ASSOCIATION: HGNC:11455</html:p>
<html:p>NOTES: -this is a lumped rxn w/ an inferred reduction rxn that occurs in conjunction w/ sulfotransferase step; this reduction step has to occur in order for the transformation (2amac-->Lcyst) to be valid -used NADPH as electron donor since it is a common cofactor involved in biosynthetic steps utilizing oxygen (see Neema for further explanations) MM</html:p>
<html:p>SUBSYSTEM: Methionine and cysteine metabolism</html:p>
</html:body>
The SBML reader turns the GENE_ASSOCIATION into GPR, which makes sense.
NOTES: should definitely be part of notes.
Where should the confidence scores (CONFIDENCE_LEVEL) go? Throwing them away is an option, but not a great one to me.
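One possible shape, purely as a sketch for discussion: keep the free-text NOTES as a plain string, keep leftovers such as CONFIDENCE_LEVEL in a small key-value mapping, and write both back out as notes paragraphs. `split_notes` and `render_notes` are hypothetical names, the set of "already handled" keys is an assumption, and the NOTES text is shortened from the example above.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
ET.register_namespace("html", XHTML_NS)

# Assumption: these keys already have a home (GPR, groups) and are dropped from notes.
HANDLED_KEYS = ("GENE_ASSOCIATION", "SUBSYSTEM")


def split_notes(pairs: dict):
    """Return (free_text, extras): NOTES as a string, the unhandled rest as a dict."""
    extras = {k: v for k, v in pairs.items() if k not in HANDLED_KEYS}
    free_text = extras.pop("NOTES", "")
    return free_text, extras


def render_notes(free_text: str, extras: dict) -> str:
    """Write the extras and the free text back as <html:p>KEY: value</html:p> lines."""
    body = ET.Element(f"{{{XHTML_NS}}}body")
    items = dict(extras)
    if free_text:
        items["NOTES"] = free_text
    for key, value in items.items():
        p = ET.SubElement(body, f"{{{XHTML_NS}}}p")
        p.text = f"{key}: {value}"
    return ET.tostring(body, encoding="unicode")


pairs = {
    "CONFIDENCE_LEVEL": "2",
    "GENE_ASSOCIATION": "HGNC:11455",
    "NOTES": "lumped rxn (2amac-->Lcyst)",  # shortened for the example
    "SUBSYSTEM": "Methionine and cysteine metabolism",
}
free_text, extras = split_notes(pairs)
xhtml = render_notes(free_text, extras)
print(xhtml)
```

This keeps the confidence score through a read/write cycle without requiring an identifiers.org namespace for it.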