Research: Elasticsearch relevancy scores #5842
Comments
From my research, I determined that we can implement a relevancy sort with relatively low effort and no re-index.

How Relevance Scores Work

Before scoring, Elasticsearch narrows the set of candidate documents by applying a boolean test, so that only documents matching the query are included. It then calculates a score for each document in this set using the BM25 algorithm. The BM25 algorithm considers how often the query terms appear in the document (term frequency), how rare those terms are across the index (inverse document frequency), and the length of the field relative to the average field length.
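To make the three factors concrete, here is a minimal sketch of the textbook BM25 score for a single term in a single document. The function name and arguments are illustrative, not part of any Elasticsearch client, and the exact constants in Elasticsearch's Lucene implementation may differ from this textbook form. The defaults `k1=1.2` and `b=0.75` match Elasticsearch's documented BM25 defaults.

```python
import math


def bm25_term_score(term_freq, doc_len, avg_doc_len, num_docs, docs_with_term,
                    k1=1.2, b=0.75):
    """Textbook BM25 score of one query term against one document (illustrative)."""
    # Inverse document frequency: terms found in fewer documents score higher.
    idf = math.log(1 + (num_docs - docs_with_term + 0.5) / (docs_with_term + 0.5))
    # Length normalization: fields longer than average are penalized.
    length_norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    # Term frequency saturates instead of growing linearly with repeats.
    tf = term_freq * (k1 + 1) / (term_freq + length_norm)
    return idf * tf


if __name__ == "__main__":
    # Same term, same corpus: more occurrences and a shorter field both help.
    print(bm25_term_score(1, 100, 100, 1000, 10))
    print(bm25_term_score(3, 100, 100, 1000, 10))
    print(bm25_term_score(1, 400, 100, 1000, 10))
```

Because the score is a sum of such per-term values over all query terms, it has no fixed upper bound.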
There is no maximum for a relevance score; however, we can set a min_score to filter out less relevant results.

How Nested Relevance Scores Work

Elasticsearch calculates a relevancy score for each nested document based on the query it matches, much as it would score a regular document. After computing the scores for the nested documents, Elasticsearch aggregates them to produce a final score for the parent document.
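As a rough illustration of that aggregation, the sketch below mimics how a nested query's `score_mode` combines child scores, and builds a request body that also sets `min_score`. The helper names, the nested path, and the field used in the usage example are hypothetical; only `nested`, `path`, `score_mode`, and `min_score` are Elasticsearch request keys.

```python
def combine_nested_scores(child_scores, score_mode="avg"):
    """Combine nested-document scores the way score_mode describes (illustrative)."""
    if score_mode == "none":
        # Child scores are ignored and the parent gets a score of 0.
        return 0.0
    if not child_scores:
        raise ValueError("at least one matching nested document is required")
    if score_mode == "avg":
        return sum(child_scores) / len(child_scores)
    if score_mode == "max":
        return max(child_scores)
    if score_mode == "min":
        return min(child_scores)
    if score_mode == "sum":
        return sum(child_scores)
    raise ValueError(f"unknown score_mode: {score_mode}")


def build_nested_query(path, inner_query, score_mode="avg", min_score=None):
    """Build a search body for a nested query, optionally with a min_score cutoff."""
    body = {
        "query": {
            "nested": {"path": path, "query": inner_query, "score_mode": score_mode}
        }
    }
    if min_score is not None:
        # Hits scoring below this threshold are dropped from the results.
        body["min_score"] = min_score
    return body


if __name__ == "__main__":
    scores = [1.0, 2.0, 3.0]
    for mode in ("max", "avg", "sum"):
        print(mode, combine_nested_scores(scores, mode))
    # "pages" and "pages.text" are hypothetical names for this sketch.
    print(build_nested_query("pages", {"match": {"pages.text": "contribution"}},
                             score_mode="max", min_score=1.0))
```

Choosing `max` ranks a parent by its single best nested match, while `sum` favors parents with many matching nested documents.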
Here is an example of using the max, avg, and sum score_modes: Examples

Also, when testing locally, we can use the Explain API to determine exactly how a document got its score.
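A minimal sketch of calling the Explain API from Python with only the standard library might look like the following. The base URL, index name, document id, and field are placeholders, and the helper names are made up for this example. The endpoint shape (`/<index>/_explain/<id>` with a `query` body) and the `value`, `description`, and `details` keys of the explanation follow the Elasticsearch documentation.

```python
import json
from urllib import request


def build_explain_request(base_url, index, doc_id, query):
    """Build (but do not send) an Explain API request for one document."""
    url = f"{base_url.rstrip('/')}/{index}/_explain/{doc_id}"
    data = json.dumps({"query": query}).encode("utf-8")
    return request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def summarize_explanation(explanation, depth=0):
    """Flatten the nested explanation tree into indented 'value: description' lines."""
    lines = [f"{'  ' * depth}{explanation['value']}: {explanation['description']}"]
    for detail in explanation.get("details", []):
        lines.extend(summarize_explanation(detail, depth + 1))
    return lines


if __name__ == "__main__":
    # Placeholder cluster, index, id, and field; adjust for a local setup.
    req = build_explain_request(
        "http://localhost:9200", "my-index", "1", {"match": {"title": "report"}}
    )
    print(req.get_method(), req.full_url)
    # To actually send it against a running cluster:
    # with request.urlopen(req) as resp:
    #     result = json.load(resp)
    #     print("\n".join(summarize_explanation(result["explanation"])))
```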
Here is an example of the scoring algorithm for a mur
If we wanted to sort by the document_hit_score, we would have to use a workaround, as this is not directly supported by Elasticsearch. We could use scripted sorting, for instance:
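The sketch below shows only the general shape of a `_script` sort in a search body. It is not the workaround used in this project: it assumes, purely for illustration, that the value to sort on is available as a numeric doc-value field, and the function name and field name are hypothetical.

```python
def build_script_sort_body(query, field_name, order="desc"):
    """Build a search body that sorts hits by a Painless script (illustrative)."""
    return {
        "query": query,
        "sort": [
            {
                "_script": {
                    "type": "number",
                    "script": {
                        "lang": "painless",
                        # Assumption: field_name is a numeric field with doc values.
                        # Documents missing the field sort as 0.
                        "source": (
                            "doc[params.field].size() == 0 "
                            "? 0 : doc[params.field].value"
                        ),
                        "params": {"field": field_name},
                    },
                    "order": order,
                }
            }
        ],
    }


if __name__ == "__main__":
    # "hit_score" is a hypothetical field name used only in this sketch.
    print(build_script_sort_body({"match_all": {}}, "hit_score"))
```

Script sorts run the script for every matching document, so they are usually slower than sorting on `_score` or a plain field.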
What we’re after
Research relevancy scores and how they work. Some research has been documented by Mark at GSA that could better inform this effort.
Related ticket(s)
Action item(s)
Completion criteria