Fix to Issue#83 #106
…; (2) there exists a ChEBI id in the SDF file but not in the obo file, causing KeyError when reading ontology; (3) intermediate compound documents' ids are lists of 1 string, not single strings
…values as a temp fix
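For illustration, problems (2) and (3) from the commit message above could be guarded against roughly as follows. This is only a sketch, not the code in this PR: `normalize_id`, `lookup_ontology`, and the toy ids and fields are hypothetical names chosen for the example.

```python
def normalize_id(raw_id):
    """Unwrap an id that arrives as a list of one string (problem 3)."""
    if isinstance(raw_id, list):
        if len(raw_id) != 1:
            raise ValueError(f"expected exactly one id, got {raw_id!r}")
        return raw_id[0]
    return raw_id


def lookup_ontology(ontology, chebi_id):
    """Return the ontology fields for chebi_id, or None when the id is in
    the SDF file but missing from the obo data (problem 2). Using dict.get
    avoids the KeyError that plain indexing would raise."""
    return ontology.get(chebi_id)


# Toy data with made-up ids; real ChEBI records look different.
toy_ontology = {"CHEBI:TOY1": {"is_a": ["CHEBI:TOY0"]}}

single_id = normalize_id(["CHEBI:TOY1"])           # list of one string -> string
present = lookup_ontology(toy_ontology, single_id)  # ontology fields found
absent = lookup_ontology(toy_ontology, "CHEBI:TOY2")  # None, no KeyError
```

Returning `None` for a missing id leaves it to the caller to decide whether to skip the ontology fields or log the mismatch.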
Looks good. Just two minor comments, we can then merge.
My one minor comment aside, this PR looks good to me
Please ignore the failed checks for now. They are related to how BioThings API 0.10.x changes things around.
Summary
This PR mainly updates the ChEBI parser and fixes issue #83.
Key agreements we reached on reading the ontology fields:
is_a
relationships
Code structure of the new ChEBI parser
The original code was refactored into a CompoundReader, which reads the sdf file for chemical structure fields. A new OntologyReader is added to read the obo file for ontology fields.
A new ChebiParser is created to hold one instance of each of the two Reader classes and to generate ChEBI documents in the following order:
Read the sdf file. For each ChEBI id i, generate a compound document.
If i also exists in the ontology network, generate an ontology document as well and merge it with the compound document to form the final ChEBI document.
New entries in mapping
Document Samples
CHEBI:45783, imatinib (chemical + ontology fields)
CHEBI:25106, macrolide (ontology fields only)
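As a rough sketch of how the two kinds of sample above could arise, the generation order described under "Code structure of the new ChEBI parser" might look like the following. The class names come from this PR, but every method name, field name, and id here is a hypothetical stand-in, and the readers take in-memory dicts instead of parsing real sdf and obo files. The final loop, which emits ontology-only documents, is an assumption inferred from the macrolide sample rather than a step stated in the description.

```python
class CompoundReader:
    """Stand-in for the sdf reader: holds {chebi_id: chemical fields}."""

    def __init__(self, compounds):
        self.compounds = compounds

    def iter_documents(self):
        for chebi_id, fields in self.compounds.items():
            yield {"_id": chebi_id, **fields}


class OntologyReader:
    """Stand-in for the obo reader: holds {chebi_id: ontology fields}."""

    def __init__(self, nodes):
        self.nodes = nodes

    def __contains__(self, chebi_id):
        return chebi_id in self.nodes

    def ids(self):
        return self.nodes.keys()

    def document(self, chebi_id):
        return {"_id": chebi_id, **self.nodes[chebi_id]}


class ChebiParser:
    """Holds one instance of each reader and merges their documents."""

    def __init__(self, compound_reader, ontology_reader):
        self.compound_reader = compound_reader
        self.ontology_reader = ontology_reader

    def iter_documents(self):
        seen = set()
        # Step 1: one compound document per id found in the sdf data.
        for doc in self.compound_reader.iter_documents():
            chebi_id = doc["_id"]
            seen.add(chebi_id)
            # Step 2: merge ontology fields when the id is in the ontology.
            if chebi_id in self.ontology_reader:
                doc.update(self.ontology_reader.document(chebi_id))
            yield doc
        # Assumption: ids present only in the ontology still get a document.
        for chebi_id in self.ontology_reader.ids():
            if chebi_id not in seen:
                yield self.ontology_reader.document(chebi_id)


# Toy data with made-up ids and fields.
parser = ChebiParser(
    CompoundReader({"CHEBI:X": {"formula": "toy"}, "CHEBI:Y": {"formula": "toy2"}}),
    OntologyReader({"CHEBI:X": {"is_a": ["CHEBI:Z"]}, "CHEBI:Z": {"is_a": []}}),
)
docs = list(parser.iter_documents())
```

With this toy input, `CHEBI:X` carries both chemical and ontology fields, `CHEBI:Y` carries chemical fields only, and `CHEBI:Z` carries ontology fields only.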