OWLQueries
We want to be able to run complex queries over ontologies. Sometimes we want to include inferred axioms, sometimes non-logical constructs (aka OWL annotations). What are our options?
DL queries should be familiar to users of Protege through the "DL Query" tab. They are possible via the OWLAPI reasoner interface: use the OWL API to build up a class expression, then ask the reasoner for its ancestors/descendants.
Problems:
- DL Queries are limited to the logical structure of the ontology - elements such as labels, synonyms, obo-subsets (aka annotations) cannot be incorporated without contorting the ontology somehow
- DL Queries do not allow for closed world negation (AKA SPARQL filters).
- Not all reasoners support DL queries
- In particular, our favorite reasoner ELK only allows you to query using named classes (also true of JCEL?)
- Reasoning can be slow, particularly if you have a lot of individuals
SPARQL does not suffer from these limitations, but has its own problems:
- not supported in the OWL API
- querying over entailments not always supported
- SPARQL is very awkward and low-level for use with OWL class expressions
SPARQL-DL is in theory the best of both worlds. It has a nice OWL syntax (yet another one...) for expressing queries. You can use closed world negation and combine logical relationships with annotations.
However:
- Not a W3C standard
- Future not clear. E.g.
- will updates be supported?
- how does SPARQL-DL track SPARQL, if at all?
- OWLAPI support is not good
- OWLTools comes with the Dresden sparql-dl jar, but this does not support all of SPARQL-DL
OBO-Edit provides a visual query editor for building complex queries that combine lexical elements with logical relationships, closed-world negation and entailments.
- Doesn't have a standard syntax - must use GUI (check this..?)
- Non-standard
- Limited to OBO-Edit reasoner (EL-ish expressivity, slow)
One possibility is to load all inferences into a relational database and use SQL. We are aware of two schemas that support this:
- OBD
- GOLD (Gene Ontology schema)
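As a rough illustration of the relational approach, the sketch below loads a few pre-computed links into an in-memory SQLite database and queries them with plain SQL, including closed world negation via NOT EXISTS. The single-table `inferred_link` layout is a hypothetical simplification and is not the OBD or GOLD schema; the `EX:` classes are invented for the example, while FBbt_00005106, FBbt_00005801 and BFO_0000050 are the identifiers used in the commands on this page.

```python
import sqlite3

# Hypothetical one-table layout, for illustration only.
# This is NOT the actual OBD or GOLD schema.
SCHEMA = """
CREATE TABLE inferred_link (
    subject   TEXT NOT NULL,
    predicate TEXT NOT NULL,
    object    TEXT NOT NULL
);
"""

# Links assumed to have been computed by a reasoner beforehand.
# The EX:* classes are made up for this example.
ROWS = [
    ("EX:neuron_a", "subClassOf", "FBbt_00005106"),
    ("EX:neuron_b", "subClassOf", "FBbt_00005106"),
    ("EX:neuron_a", "BFO_0000050", "FBbt_00005801"),
]


def open_db():
    """Create an in-memory database holding the toy inferred links."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO inferred_link VALUES (?, ?, ?)", ROWS)
    return conn


def neurons_in_mb(conn):
    """Neurons with an inferred part_of link to the mushroom body."""
    sql = """
    SELECT n.subject FROM inferred_link AS n
    JOIN inferred_link AS p ON p.subject = n.subject
    WHERE n.predicate = 'subClassOf' AND n.object = 'FBbt_00005106'
      AND p.predicate = 'BFO_0000050' AND p.object = 'FBbt_00005801'
    ORDER BY n.subject
    """
    return [row[0] for row in conn.execute(sql)]


def neurons_not_known_in_mb(conn):
    """Closed world negation: neurons with no such link in the table."""
    sql = """
    SELECT n.subject FROM inferred_link AS n
    WHERE n.predicate = 'subClassOf' AND n.object = 'FBbt_00005106'
      AND NOT EXISTS (
          SELECT 1 FROM inferred_link AS p
          WHERE p.subject = n.subject
            AND p.predicate = 'BFO_0000050'
            AND p.object = 'FBbt_00005801')
    ORDER BY n.subject
    """
    return [row[0] for row in conn.execute(sql)]
```

The second query only means "not present in the table", so its answer is only as complete as the inferences that were loaded.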
See examples for the Prolog OWL shell
In the current absence of "one query language to rule them all", OWLTools takes a pragmatic approach and allows you to mix and match as the situation calls for. It provides convenience methods for common processing operations that help with querying.
Command line example:
owltools fly_anatomy_XP.obo --reasoner hermit --sparql-dl "SELECT * WHERE {SubClassOf(?x, <http://purl.obolibrary.org/obo/FBbt_00005106>)}"
Unfortunately, the SPARQL-DL library used is very limited. In the future this should hopefully allow more powerful queries.
Say we want to query for neurons in the mushroom body. This can be expressed as a DL query:
owltools fly_anatomy_XP.obo --reasoner hermit --reasoner-query "FBbt_00005106 and (BFO_0000050 some FBbt_00005801)"
(support for labels on command line in future)
Note that ELK is much faster than HermiT (we heart ELK). Why don't we use it instead? One current limitation of ELK is that the reasoner implementation only allows queries on named classes. A common(?) workaround is to name the query expression. OWLTools will do this for you with the -m option (to materialize the query expression as a class):
owltools fly_anatomy_XP.obo --reasoner-query -r elk -m "FBbt_00005106 and (BFO_0000050 some FBbt_00005801)"
Danger Will Robinson! - this trick involves a pretty major deviation from OWL semantics....
What if we want to throw in closed world negation? E.g. neurons that are not inferred to be part of the mushroom body?
owltools fly_anatomy_XP.obo --query-cw "FBbt_00005106 and not (BFO_0000050 some FBbt_00005801)"
(of course, the DL expression actually means neurons that are not part of the MB, which is different from neurons that are not provably part of the MB... but it's useful in the absence of a standard query language)
Note that no external reasoner is used here (it is assumed that classification is done in advance). Instead, the query is answered by the OWLTools graph walking algorithm, which takes property chains etc. into account. See the OWLTools graph package for more details.
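To make the idea of a closed world query over a pre-classified graph concrete, here is a minimal sketch. It is a simplified stand-in and not the actual OWLTools graph walking algorithm: it only follows subClassOf and part_of (BFO_0000050) edges, treats part_of as transitive, and ignores property chains. The `EX:` classes and the `EDGES` table are invented for illustration.

```python
from collections import deque

IS_A = "subClassOf"
PART_OF = "BFO_0000050"

# Edges of an already-classified toy ontology: child -> [(relation, parent)].
# The EX:* classes are made up for this example.
EDGES = {
    "EX:neuron_a": [(IS_A, "FBbt_00005106"), (PART_OF, "EX:mb_lobe")],
    "EX:neuron_b": [(IS_A, "FBbt_00005106")],
    "EX:mb_lobe": [(PART_OF, "FBbt_00005801")],
}


def ancestors(cls):
    """Return the set of (relation, ancestor) pairs reachable from cls.

    A path made only of subClassOf edges yields subClassOf; any path that
    contains a part_of edge yields part_of (part_of treated as transitive).
    """
    seen = set()
    queue = deque([(IS_A, cls)])
    while queue:
        rel, node = queue.popleft()
        for edge_rel, parent in EDGES.get(node, []):
            combined = PART_OF if PART_OF in (rel, edge_rel) else IS_A
            if (combined, parent) not in seen:
                seen.add((combined, parent))
                queue.append((combined, parent))
    return seen


def query_cw(genus, relation, filler, negate=False):
    """Classes under genus that do (or, if negate, do not) reach filler."""
    hits = []
    for cls in sorted(EDGES):
        reachable = ancestors(cls)
        if (IS_A, genus) not in reachable:
            continue
        if ((relation, filler) in reachable) != negate:
            hits.append(cls)
    return hits
```

Calling `query_cw("FBbt_00005106", PART_OF, "FBbt_00005801", negate=True)` returns the classes for which no part_of path to the mushroom body was found in the graph. That is "not provably part of", not the DL "not part of".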
In future this kind of dark art will be made obsolete by a working SPARQL-DL query regime.
SPARQL isn't supported with the OWLAPI, and as OWLTools is mostly a wrapper onto the OWLAPI, there is no SPARQL support at this time. This may be provided in future if there is demand.
One possibility is to load all inferences into a triplestore and provide a means of accessing this. This does not fit well into the OWL-centric nature of this tool.
Adding individuals to an ontology can slow down reasoning considerably. Another challenge is that ELK does not currently handle individuals.
Often we do not need complete ABox reasoning, in particular if the individuals are not interconnected.
One strategy is to translate individuals to classes:
owltools myi.owl --i2c -o file://`pwd`/myc.owl
The resulting ontology can then be reasoned over with ELK.
In future, OWLTools may support hybrid query strategies. For example, we might have individuals stored in a Solr server with pre-classified facets. It would be nice to issue a DL query and have it executed via a mixture of reasoning (TBox) and Solr queries (ABox).