Improve query speed for large KBs when using RDF4J Lucene FTS #4209

Closed
reckart opened this issue Sep 27, 2023 · 0 comments

reckart commented Sep 27, 2023

Is your feature request related to a problem? Please describe.
When working with a very large KB, certain queries to the FTS index can return a very high number of matches. These matches are then post-processed using regexes, which can make the processing really slow. This is aggravated by the fact that the Lucene FTS queries currently do not have any limit.

Describe the solution you'd like
For the other FTS backends we support, we can limit the number of results produced by the FTS. This should also be made possible for the Lucene FTS.

@reckart reckart added this to the 29.4 milestone Sep 27, 2023
@reckart reckart self-assigned this Sep 27, 2023
reckart added a commit that referenced this issue Sep 27, 2023
- Factor Lucene FTS query out into a helper class
- Use sub-selects to impose a limit on the FTS matches
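
As a rough illustration of the sub-select idea named in the commit above (not the code from that commit), the FTS match can be wrapped in an inner `SELECT ... LIMIT n` so that any expensive post-processing in the outer query only sees a bounded number of candidates. The sketch below assumes the RDF4J LuceneSail search vocabulary (`search:matches`, `search:query`, `search:score`). The class name, method name, variable names and the `rdfs:label` join are hypothetical and chosen only for this example.

```java
// Hypothetical helper, for illustration only; not taken from the INCEpTION code base.
final class LimitedFtsQuerySketch {
    // Namespace of the RDF4J LuceneSail search vocabulary
    static final String SEARCH_NS = "http://www.openrdf.org/contrib/lucenesail#";

    /**
     * Builds a SPARQL query in which the Lucene FTS match sits inside a
     * sub-select with a LIMIT, so the outer pattern handles at most
     * maxMatches candidates.
     */
    static String buildQuery(String term, int maxMatches) {
        if (maxMatches < 1) {
            throw new IllegalArgumentException("maxMatches must be positive");
        }
        // Escape backslashes and double quotes so the term is a valid SPARQL string literal
        String escaped = term.replace("\\", "\\\\").replace("\"", "\\\"");
        return String.join("\n",
            "PREFIX search: <" + SEARCH_NS + ">",
            "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>",
            "SELECT ?subject ?label WHERE {",
            "  {",
            "    # Inner query: only the FTS lookup, capped by LIMIT",
            "    SELECT ?subject ?score WHERE {",
            "      ?subject search:matches [",
            "        search:query \"" + escaped + "\" ;",
            "        search:score ?score ] .",
            "    }",
            "    LIMIT " + maxMatches,
            "  }",
            "  # Outer query: costly post-processing (e.g. regex FILTERs) would go here",
            "  ?subject rdfs:label ?label .",
            "}");
    }

    public static void main(String[] args) {
        System.out.println(buildQuery("Darmstadt*", 100));
    }
}
```

Note that without an `ORDER BY` in the inner query, which matches survive the limit is up to the store, so a cap like this trades completeness for speed.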
reckart added a commit that referenced this issue Sep 28, 2023
- Set a default fuzzy query length of 3 for local KB queries
- Enable limiting the results returned internally by local KBs
reckart added a commit that referenced this issue Sep 28, 2023
…query-speed-for-large-KBs-when-using-RDF4J-Lucene-FTS-2

#4209 - Improve query speed for large KBs when using RDF4J Lucene FTS
@reckart reckart closed this as completed Sep 28, 2023
reckart added a commit that referenced this issue Sep 28, 2023
* release/29.x:
  #4209 - Improve query speed for large KBs when using RDF4J Lucene FTS
  #4212 - Unable to mark local KB as read-only
reckart added a commit that referenced this issue Oct 2, 2023
* main: (42 commits)
  #4221 - Upgrade dependencies
  #4219 - Improve speed of importing a large KB
  #4217 - Speed up properties list loading on KB page
  #4209 - Improve query speed for large KBs when using RDF4J Lucene FTS
  #4212 - Unable to mark local KB as read-only
  [maven-release-plugin] prepare for next development iteration
  [maven-release-plugin] prepare release inception-29.3
  No issue. Fix PubMed test.
  #4192 - Upgrade dependencies
  No issue. Fix PubMed test.
  #4205 - Ability to refresh document to load pending suggestions
  #4158 - Exception when annotating something after a longer pause
  #4158 - Exception when annotating something after a longer pause
  No issue: Enable parallel builds
  #4158 - Exception when annotating something after a longer pause
  #4158 - Exception when annotating something after a longer pause
  #4192 - Upgrade dependencies
  #4201 - Layer export as JSON does not include coloring rules
  #4199 - Jumping to the end of a long annotation sometimes does not work
  #4198 - Show layer name instead of no-label
  ...

% Conflicts:
%	inception/inception-workload-dynamic/src/main/java/de/tudarmstadt/ukp/inception/workload/dynamic/annotation/DynamicAnnotatorWorkflowActionBarItemGroup.java
%	pom.xml
@reckart reckart added this to Kanban Aug 7, 2024
@reckart reckart moved this to 🍹 Done in Kanban Aug 7, 2024