forked from ad-freiburg/qlever
Completely refactor the fulltext operations (ad-freiburg#1093)
As of this commit, the fulltext index (triggered by `ql:contains-word` and `ql:contains-entity`) uses two basic operations:

1. `TextIndexScanForWord`: For a given word or prefix, return all text records that contain the word (possibly together with the matched word in the case of a prefix, and the score of the match).
2. `TextIndexScanForEntity`: For a given word or prefix, return a superset of all pairs of `(text, entity)` where the entity is contained in the text according to `ql:contains-entity` and the text contains the word. For technical reasons this is a superset: we always have to scan the complete block from the half-inverted index, and that block might belong to a shorter prefix.

The general processing is then as follows:

* For each word or prefix that appears as part of the object of a `ql:contains-word` triple, a `TextIndexScanForWord` is created.
* For each entity or variable that appears as the object of a `ql:contains-entity` triple, a `TextIndexScanForEntity` is created.
* The rest of the query processing is handled by the "ordinary" query planner using the normal operations like JOIN that are also used to process standard SPARQL queries.

This is much cleaner than the old `TextOperationWith[out]Filter` operations, which combined the functionality of the above scan operations with JOIN operations, for two reasons. First, the old approach led to a lot of code duplication (the code for a join of two tables was duplicated for the fulltext module). Second, the new approach makes queries easier to optimize and to reason about, because the runtime information trees become much clearer when the scans and joins are represented separately.
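
To illustrate the split between scans and joins, here is a minimal toy model in Python. It is not QLever's code: the in-memory `RECORDS` table, the function names, the trailing `*` used to mark a prefix, and the `block_prefix_len` parameter are all assumptions made for the illustration, and scores are left out. It only shows the shape of the idea: two independent scans that each return a plain table, and a generic join that knows nothing about text search.

```python
from collections import defaultdict

# Hypothetical toy data: text record id -> (words, entities).
# This is NOT the on-disk format of the real fulltext index.
RECORDS = {
    1: (["astronaut", "walked", "moon"], ["<Neil_Armstrong>"]),
    2: (["astronomer", "observed", "moon"], ["<Galileo_Galilei>"]),
    3: (["painter", "italy"], ["<Leonardo_da_Vinci>"]),
}


def text_index_scan_for_word(word_or_prefix):
    """Return rows (text, matched_word) for records containing the word.

    A trailing '*' marks a prefix search (an assumption of this sketch).
    """
    is_prefix = word_or_prefix.endswith("*")
    needle = word_or_prefix.rstrip("*")
    rows = []
    for text_id, (words, _entities) in sorted(RECORDS.items()):
        for word in words:
            matches = word.startswith(needle) if is_prefix else word == needle
            if matches:
                rows.append((text_id, word))
    return rows


def text_index_scan_for_entity(word_or_prefix, block_prefix_len=3):
    """Return a SUPERSET of the (text, entity) pairs for the word.

    The toy "block" covers every word sharing the first `block_prefix_len`
    characters, so records that match only the shorter prefix are included.
    """
    block = word_or_prefix.rstrip("*")[:block_prefix_len]
    rows = []
    for text_id, (words, entities) in sorted(RECORDS.items()):
        if any(word.startswith(block) for word in words):
            rows.extend((text_id, entity) for entity in entities)
    return rows


def join(left, right):
    """Generic hash join of two tables on their first column."""
    index = defaultdict(list)
    for row in right:
        index[row[0]].append(row[1:])
    return [l + r for l in left for r in index.get(l[0], [])]


# Roughly: ?text ql:contains-word "astrona*" . ?text ql:contains-entity ?entity
word_rows = text_index_scan_for_word("astrona*")
entity_rows = text_index_scan_for_entity("astrona*")
result = join(word_rows, entity_rows)
```

In this toy setup the entity scan also returns the record that contains "astronomer", because it reads the whole block for the shorter prefix "ast". The join with the word scan then removes that extra row, which is why returning a superset from the entity scan does not change the final result.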
Showing 33 changed files with 1,624 additions and 863 deletions.