Inefficient Knowledge Base SPARQL query for labels #4570
Comments
For larger KBs, the queries INCEpTION uses are not unlikely to time out. Usually, it boils down to INCEpTION not supporting the full-text indexing capabilities of the respective endpoint. For example, the AllegroGraph FTI is not supported by INCEpTION at the moment. Also, INCEpTION assumes that knowledge bases do not have reasoning enabled and therefore uses somewhat more complex queries that, e.g., look up subproperties of the label property at query time. |
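To illustrate the last point: without a reasoner, a client has to expand the label property to its subproperties inside the query itself. The following Python sketch builds such a query string. The helper name `build_label_query` and the exact graph pattern are assumptions made for illustration; this is not the query INCEpTION actually generates.

```python
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def build_label_query(label_iri: str, needle: str, limit: int = 100) -> str:
    """Build a SPARQL query that matches labels via the label property
    or any of its subproperties (hypothetical helper, not INCEpTION code)."""
    # Escape backslashes and quotes so the needle is a valid SPARQL string literal.
    escaped = needle.replace("\\", "\\\\").replace('"', '\\"')
    return f"""PREFIX rdfs: <{RDFS}>
SELECT DISTINCT ?subj ?label WHERE {{
  # Resolve subproperties at query time because no reasoner is assumed.
  ?pLabel rdfs:subPropertyOf* <{label_iri}> .
  ?subj ?pLabel ?label .
  FILTER(CONTAINS(LCASE(STR(?label)), LCASE("{escaped}")))
}}
LIMIT {limit}"""


query = build_label_query(RDFS + "label", "bus")
print(query)
```

The `rdfs:subPropertyOf*` property path is what an endpoint with reasoning enabled would make redundant, and the unindexed `CONTAINS` filter is the part that a full-text index would typically replace.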
FWIW, this KB is not big; my issue is likely a DB problem, but that particular Later today, I can also compare on Stardog, which I use at work |
If I would remove it, I would have trouble with KBs that make use of subproperties and either do not have a reasoner enabled or do not contain the fully resolved |
It might be considered to make the clause non-optional - that would mean that concepts without a label cannot be retrieved at all. I believe retrieving concepts without a label is something we hardly do these days... |
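The trade-off described here can be shown with a small in-memory analogy, assuming the optional clause is the one that fetches the label. The data and function names below are made up for illustration; an optional pattern behaves like a left join, a mandatory one like an inner join.

```python
# Toy data: two concepts, only one of which has a label (made-up IRIs).
concepts = ["ex:Bus", "ex:Tram"]
labels = {"ex:Bus": "Bus"}


def with_optional_label(concepts, labels):
    """Mimic an OPTIONAL label pattern: keep every concept, label may be None."""
    return [(c, labels.get(c)) for c in concepts]


def with_required_label(concepts, labels):
    """Mimic a mandatory label pattern: drop concepts that have no label."""
    return [(c, labels[c]) for c in concepts if c in labels]


print(with_optional_label(concepts, labels))  # [('ex:Bus', 'Bus'), ('ex:Tram', None)]
print(with_required_label(concepts, labels))  # [('ex:Bus', 'Bus')]
```

With the mandatory variant, `ex:Tram` disappears from the result, which is the "concepts without a label cannot be retrieved" effect mentioned above.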
Not remove the whole graph pattern, just the |
Ah, I see you replied while I was typing. Alternatively, UNION can have better performance than optionals but since the query is filtered by matching labels, would it return anything if there were no labels? You probably use a similar, unfiltered query to retrieve the list of in the KB screen, right? |
Fortunately, the SPARQL part of INCEpTION has a pretty comprehensive test suite. Removing the optional does not cause any of the tests to fail. So if it helps, let's do that :) |
The queries are generated using INCEpTION's |
- Remove "optional" to speed up query on AllegroGraph backends
Without the optional, querying wikibus is still a bit sluggish. I'd expect that situation would improve by supporting the AllegroGraph FTI. |
About moving the |
I think you should ignore that suggestion. It probably doesn't matter if the pattern is inside or outside.
Do you also get response times around 0.5 s? |
I will have to beef up that server at some point, probably :) |
…nt-Knowledge-Base-SPARQL-query-for-labels #4570 - Inefficient Knowledge Base SPARQL query for labels
* release/31.x: #4570 - Inefficient Knowledge Base SPARQL query for labels
…ty-annotation * main: #4570 - Inefficient Knowledge Base SPARQL query for labels No issue: Small improvement to log message. #4565 - Improve system-level backup documentation #4558 - Better verification for feature names
I get response times much longer than 0.5s - look for
|
Ah, that still seems to have the optional... let me check again... but at least it does not time out. |
Ok, I found the proper Now, removing that and running the tests, I can say that Wikidata (Blazegraph, I suppose) really doesn't like it without the optional, and I get several failing tests on Wikidata if I remove it. E.g. this query returns:
But this one does time out:
So I don't think removing the optional is the right approach. |
@tpluscode If you control the Wikibus endpoint, could you enable freetext indices on it so we can test this query?
|
AllegroGraph is quite picky.... eclipse-rdf4j/rdf4j#4923 |
For the time being I disabled reasoning on that endpoint to be able to work with it from INCEpTION. I just switched it back on.
I added a broad index. Queries like above are now blazing fast :) |
Good. I have FTI support for AllegroGraph in INCEpTION now - let's see how that works... |
Yup, that's fast now:
|
After the restart of your server, querying the hierarchy to construct the concept tree for the KB page took quite long (over a minute). |
I think I'll roll back the change made under this issue. Here we have the FTS support: #4573 Might be interesting to check if the FTS-driven query also works now even if you turn the inference back on. |
Inferencing is already enabled again on query.wikibus.org/query |
Good. Just tested again - looks like the FTS is the solution for you then. No need to adjust the |
Just a bit of a pity that there is this little incompatibility between RDF4J and AllegroGraph right now. |
* release/31.x: Revert "#4570 - Inefficient Knowledge Base SPARQL query for labels"
Superseded by #4573 |
Describe the bug
I found that the patterns used in the KB query to retrieve instances are suboptimal, and in the case of AllegroGraph, they somehow make the query fail completely when reasoning is enabled on the SPARQL endpoint.
To Reproduce
Steps to reproduce the behavior:
https://query.wikibus.org/query
(IRI schema RDF)
Expected behavior
Some data should be returned, but AllegroGraph times out.
Screenshots
No response
Environment
31.1 (2024-02-20 23:27:52, build c78ce3c5)
Additional context
I found that the query works when I disable reasoning and consistently takes about 2.5-3 seconds (reasoning has no effect on the result of that query)
I found the query in question in the logs, having these patterns
I don't know what the deal with AllegroGraph is (I filed a ticket there too), but the OPTIONAL graph pattern is to blame. I find it unnecessary. I also don't see why it should be outside the next BGP containing the pattern ?subj ?pMatch ?m .
Instead, I'd move it inside and remove OPTIONAL to make it much faster. With reasoning enabled, it returns in about 0.5 seconds once caches are filled. Without reasoning it goes down to about 200 ms.
Here's my rendition of that query: https://s.zazuko.com/2gQvGBw
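The proposed restructuring can be sketched as follows. Since the logged patterns are not reproduced in this thread, the content of the OPTIONAL block (a subproperty lookup for `?pMatch`) is an assumption, and `label_patterns` is a hypothetical helper; only the pattern `?subj ?pMatch ?m .` is taken from the description above.

```python
def label_patterns(keep_optional: bool) -> str:
    """Return the label-matching part of the WHERE clause in one of two shapes
    (illustrative only; the lookup pattern is an assumption)."""
    lookup = "?pMatch rdfs:subPropertyOf* rdfs:label ."  # assumed content of the OPTIONAL block
    match = "?subj ?pMatch ?m ."                         # pattern quoted in the issue
    if keep_optional:
        # Original shape: OPTIONAL lookup placed outside the group holding the match pattern.
        return f"OPTIONAL {{ {lookup} }}\n{{ {match} }}"
    # Proposed shape: lookup moved inside the group, OPTIONAL keyword dropped.
    return f"{{\n  {lookup}\n  {match}\n}}"


original_shape = label_patterns(keep_optional=True)
proposed_shape = label_patterns(keep_optional=False)
print(original_shape)
print(proposed_shape)
```

In the proposed shape both triple patterns form a single basic graph pattern, so the endpoint can plan them together instead of evaluating a left join first.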