Add triples of ql:has-pattern predicate to PSO and POS #1226

Merged: 11 commits merged on Jan 17, 2024

@joka921 joka921 commented Jan 16, 2024

The PSO and POS permutations now also contain the triples of the internal `ql:has-pattern` predicate. These triples will serve as a fallback for the new pattern implementation, which will come with one of the next commits. Note that the other four permutations don't need these triples, so the pair PSO&POS now has more triples than SPO&SOP and OSP&OPS.
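To illustrate the idea, here is a minimal Python sketch, not QLever's actual C++ implementation. It treats each permutation as a sort order over the same triples and appends the internal `ql:has-pattern` triples only to PSO and POS. The function `build_permutations`, the sample triples, and the integer pattern IDs are all hypothetical.

```python
# Hypothetical sketch: permutations as sort orders of (S, P, O) triples.
HAS_PATTERN = "ql:has-pattern"

def build_permutations(triples, internal_triples):
    """Return a dict mapping each permutation name to its sorted triples.

    Only PSO and POS additionally receive the internal triples.
    """
    index = {}
    for order in ("SPO", "SOP", "PSO", "POS", "OSP", "OPS"):
        rows = list(triples)
        if order in ("PSO", "POS"):
            rows += internal_triples
        # Sort key: reorder each triple's components according to `order`.
        key = lambda t, order=order: tuple(t["SPO".index(c)] for c in order)
        index[order] = sorted(rows, key=key)
    return index

# Assumed example data: <a> has predicates {p1, p2}, <b> has {p1}.
triples = [
    ("<a>", "<p1>", "<x>"),
    ("<a>", "<p2>", "<y>"),
    ("<b>", "<p1>", "<z>"),
]
internal = [("<a>", HAS_PATTERN, 0), ("<b>", HAS_PATTERN, 1)]

index = build_permutations(triples, internal)
sizes = {name: len(rows) for name, rows in index.items()}
```

In this toy example, PSO and POS each hold five triples while the other four permutations hold three, which mirrors the size difference described above.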

codecov bot commented Jan 16, 2024

Codecov Report

Attention: 6 lines in your changes are missing coverage. Please review.

Comparison is base (0bd2b6c) 84.33% compared to head (4a9bf23) 84.34%.

| Files | Patch % | Lines |
|---|---|---|
| src/engine/idTable/CompressedExternalIdTable.h | 76.47% | 0 Missing and 4 partials ⚠️ |
| src/engine/idTable/IdTable.h | 50.00% | 0 Missing and 1 partial ⚠️ |
| src/index/IndexImpl.h | 95.45% | 0 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1226      +/-   ##
==========================================
+ Coverage   84.33%   84.34%   +0.01%     
==========================================
  Files         304      304              
  Lines       29058    29100      +42     
  Branches     3437     3446       +9     
==========================================
+ Hits        24505    24544      +39     
  Misses       3153     3153              
- Partials     1400     1403       +3     

@hannahbast hannahbast changed the title from "Add additional permutations that contain the ql:has-pattern predicate." to "Add additional permutation pair for the ql:has-pattern predicate" Jan 16, 2024

@hannahbast hannahbast left a comment

1-1 with Johannes, looks great, some minor changes left

@hannahbast hannahbast left a comment

Very minor changes left

@hannahbast hannahbast marked this pull request as ready for review January 17, 2024 12:44
@hannahbast hannahbast changed the title from "Add additional permutation pair for the ql:has-pattern predicate" to "Add triples of ql:has-pattern predicate to PSO and POS" Jan 17, 2024

@hannahbast hannahbast left a comment

Awesome, thanks a lot!

Quality Gate passed

The SonarCloud Quality Gate passed, but some issues were introduced.

1 New issue
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

@joka921 joka921 merged commit f7c2c32 into ad-freiburg:master Jan 17, 2024
18 checks passed
@joka921 joka921 deleted the additional-permutations branch January 17, 2024 14:01
hannahbast pushed a commit that referenced this pull request Jan 18, 2024
…tern` into RAM (#1223)

PRs #1168 and #1177 added the subject patterns as two additional columns to the OSP&OPS and PSO&POS permutations. PR #1226 added the triples of the `ql:has-pattern` predicate to the PSO&POS permutations. Now use this information instead of the old patterns, which cost a lot of RAM. We tried a few queries involving patterns, and the speed is very similar to that of the previous implementation.

NOTE: This is an index-breaking change. The old `.index.patterns` file stored both the `ql:has-pattern` predicate (for each subject, its pattern) and the information about which pattern consists of which predicates. Now the `.index.patterns` file stores only the latter. The file size is therefore significantly reduced and no longer depends on the size of the dataset, only on the number of distinct patterns (typically few). For example, for Wikidata, the file size dropped from 17 GB to 2.8 GB. For UniProt, the reduction is from 152 GB (which does not fit into the RAM of our standard machines) to something very small (because UniProt is very regular and has only very few distinct patterns).
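The following Python sketch illustrates why the size of the pattern file no longer depends on the dataset size. It is an illustration under assumptions, not QLever's actual code or file format: the function `compute_patterns`, the sample predicates, and the representation of a pattern as a sorted tuple of predicates are all hypothetical.

```python
# Hypothetical sketch: split pattern data into
#  (1) pattern -> predicates (what the patterns file would still hold) and
#  (2) subject -> pattern, expressed as ql:has-pattern triples.
def compute_patterns(triples):
    """Return (patterns, has_pattern_triples) for a list of (S, P, O) triples."""
    preds_by_subject = {}
    for s, p, _ in triples:
        preds_by_subject.setdefault(s, set()).add(p)

    pattern_ids = {}  # pattern (sorted tuple of predicates) -> pattern id
    has_pattern = []
    for s in sorted(preds_by_subject):
        pattern = tuple(sorted(preds_by_subject[s]))
        # Assign the next free id the first time a pattern is seen.
        pid = pattern_ids.setdefault(pattern, len(pattern_ids))
        has_pattern.append((s, "ql:has-pattern", pid))

    # List of patterns, ordered by their id.
    patterns = sorted(pattern_ids, key=pattern_ids.get)
    return patterns, has_pattern

# Assumed, very regular dataset: 1000 subjects, all with the same predicates.
triples = [
    (f"<protein{i}>", p, f"v{i}")
    for i in range(1000)
    for p in ("<name>", "<mass>")
]
patterns, has_pattern = compute_patterns(triples)
```

Here `patterns` holds a single entry no matter how many subjects there are, while `has_pattern` grows with the number of subjects. In the sketch, only the former corresponds to what the commit message says remains in the `.index.patterns` file.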