Add triples of `ql:has-pattern` predicate to PSO and POS #1226
Codecov Report
Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1226      +/-   ##
==========================================
+ Coverage   84.33%   84.34%   +0.01%
==========================================
  Files         304      304
  Lines       29058    29100      +42
  Branches     3437     3446       +9
==========================================
+ Hits        24505    24544      +39
  Misses       3153     3153
- Partials     1400     1403       +3
1-1 with Johannes, looks great, some minor changes left
Very minor changes left
Awesome, thanks a lot!
Quality Gate passed
The SonarCloud Quality Gate passed, but some issues were introduced: 1 new issue.
…tern` into RAM (#1223)
PRs #1168 and #1177 have added the subject patterns as two additional columns to the OSP&OPS and PSO&POS permutations. PR #1226 has added the triples of the `ql:has-pattern` predicate to the PSO&POS permutations. Now use this information instead of the old patterns, which cost a lot of RAM. We tried a few queries involving patterns and the speed is very similar to that of the previous implementation.
NOTE: This is an index-breaking change. The old `.index.patterns` file stored two things: the `ql:has-pattern` predicate (for each subject, its pattern) and the mapping from each pattern to the predicates it consists of. Now the `.index.patterns` file stores only the latter. The file size is therefore significantly reduced and no longer depends on the size of the dataset, but only on the number of distinct patterns, which is typically small.
For example, for Wikidata, the file size was reduced from 17 GB to 2.8 GB. For UniProt, the reduction is from 152 GB (which does not fit into the RAM of our standard machines) to something very small (because UniProt is very regular and there are only very few distinct patterns).
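The split between the two kinds of pattern information can be illustrated with a small Python sketch. This is not QLever's implementation (which is written in C++); the function name `build_patterns` and the data layout are hypothetical and only meant to show why the pattern table grows with the number of distinct patterns while the per-subject mapping grows with the dataset.

```python
from collections import defaultdict


def build_patterns(triples):
    """Illustrative only: derive patterns from (subject, predicate, object) triples.

    Returns:
      pattern_table: list where index = pattern id, value = tuple of predicates.
                     Its size depends only on the number of distinct patterns.
      has_pattern:   dict subject -> pattern id (one entry per subject, so its
                     size grows with the dataset).
    """
    # Collect the set of distinct predicates for every subject.
    preds_by_subject = defaultdict(set)
    for s, p, _ in triples:
        preds_by_subject[s].add(p)

    pattern_ids = {}  # sorted predicate tuple -> pattern id
    has_pattern = {}  # subject -> pattern id
    for s, preds in preds_by_subject.items():
        key = tuple(sorted(preds))
        # Reuse the id if this predicate set was seen before, else assign the next id.
        has_pattern[s] = pattern_ids.setdefault(key, len(pattern_ids))

    pattern_table = [None] * len(pattern_ids)
    for key, pid in pattern_ids.items():
        pattern_table[pid] = key
    return pattern_table, has_pattern


example_triples = [
    ("a", "p1", "x"), ("a", "p2", "y"),
    ("b", "p1", "z"), ("b", "p2", "w"),
    ("c", "p1", "q"),
]
pattern_table, has_pattern = build_patterns(example_triples)
```

In this toy example, subjects `a` and `b` share one pattern, so three subjects yield only two entries in `pattern_table`. A very regular dataset behaves the same way at scale: many subjects, few patterns.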
The PSO and POS permutations now also contain the triples of the internal `ql:has-pattern` predicate. These will be used as a fallback for the new pattern implementation (which will come with one of the next commits). Note that we don't need the triples in the other four permutations, so the pair PSO&POS now has more triples than SPO&SOP and OSP&OPS.
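As a rough illustration of why the permutation pairs end up with different triple counts, the following Python sketch builds six sorted permutations and adds the internal triples only to PSO and POS. It is a simplified model under stated assumptions, not the actual index builder: the names `ORDERS` and `build_permutations` and the `pattern:0` object values are hypothetical.

```python
HAS_PATTERN = "ql:has-pattern"

# Column order used as the sort key for each permutation (0 = S, 1 = P, 2 = O).
ORDERS = {
    "SPO": (0, 1, 2), "SOP": (0, 2, 1),
    "PSO": (1, 0, 2), "POS": (1, 2, 0),
    "OSP": (2, 0, 1), "OPS": (2, 1, 0),
}


def build_permutations(triples, internal_triples):
    """Illustrative only: sort the triples in all six orders.

    The internal triples are added to PSO and POS only, so these two
    permutations hold more triples than the other four.
    """
    perms = {}
    for name, order in ORDERS.items():
        rows = list(triples)
        if name in ("PSO", "POS"):
            rows += internal_triples
        perms[name] = sorted(rows, key=lambda t: tuple(t[i] for i in order))
    return perms


triples = [("a", "p1", "x"), ("b", "p1", "y")]
# Hypothetical encoding of "subject has pattern 0" as an internal triple.
internal = [("a", HAS_PATTERN, "pattern:0"), ("b", HAS_PATTERN, "pattern:0")]
perms = build_permutations(triples, internal)
```

Because the predicate comes first in PSO and POS, all `ql:has-pattern` triples form one contiguous block in those two permutations, which is what makes a lookup of a subject's pattern by predicate possible there.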