Return the output of the ranking rules for each search result #594
Replies: 15 comments 3 replies
-
Thanks for the write-up @loiclec! Regarding returning the score in the response: I'm not sure we want the scores by default, given how much data they add to the response (+95% in the example you provided; obviously the relative overhead decreases with document size). Regarding normalization, I'm unsure how the scores are computed right now. Are they independent from, e.g., the number of documents in the index? I'm a bit concerned that for the scores to be comparable, the rules need to be similar, or at least carefully weighted (and the same number of rules). I wonder if we could find a simpler solution to avoid pitfalls such as "accidentally making an entire index less relevant". On the whole, I really like the direction you're setting here! It feels like it opens up a lot of possibilities!
-
Thanks @loiclec 👍 I'm cross-referencing a previous product discussion (#379).
-
Hello, this is pretty interesting. I like the idea of having detailed information on how a hit is found in an index. Like @dureuill, I think having this level of detail by default is not a good idea; there is too much information for normal use cases. What would be great is to have a single score per hit on regular searches. This would allow a user to query Meilisearch and make sure that only hits with a score over 0.7, for example, are returned.
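The single-score-per-hit idea above could be applied client-side as a simple filter. A minimal sketch, assuming each hit carries a hypothetical `_rankingScore` field between 0.0 and 1.0 (the field name and data are illustrative, not a confirmed API):

```python
# Sketch: client-side filtering of hits by a single global relevancy score.
# Assumes each hit carries a hypothetical "_rankingScore" field in [0.0, 1.0].

def filter_by_score(hits, threshold=0.7):
    """Keep only hits whose ranking score meets the threshold."""
    return [hit for hit in hits if hit.get("_rankingScore", 0.0) >= threshold]

hits = [
    {"id": 1, "title": "Exact match", "_rankingScore": 0.98},
    {"id": 2, "title": "Close match", "_rankingScore": 0.74},
    {"id": 3, "title": "Loose match", "_rankingScore": 0.31},
]
print(filter_by_score(hits))  # only ids 1 and 2 survive the 0.7 cutoff
```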
-
Thanks @loiclec, I'm really interested too. It could improve the UX for some use cases by showing whether a result has a high score or not.
-
It may be too early to know, but could it be developed directly into the
-
Amazing idea! We would love to have federated search, as we have a few problems right now that would be easily solved if we could merge the results of a multi-index search! Returning the score in the response by default seems unnecessary. Maybe it would be better with a parameter (like a boolean `score`) or a totally different API endpoint, so everyone could decide which to use.
-
Sharing and copying/pasting the initial message from @AymanHamdoun on #614:

**Feature Description**

There is a very useful feature in Algolia where you can send a parameter to get ranking info back with each hit. Feel free to check their API Reference. Basically, the results array would then look like this:

```json
[
  {
    "id": 111,
    "title": "Some Document Title",
    "_rankingInfo": {
      "nbTypos": 0,
      "proximityDistance": 0,
      "nbExactWords": 0,
      "words": 0
    }
  },
  ...
]
```

**How can it be helpful?**

Assume you have several indices, each containing a certain type of entity, and you want to search each of these indices and provide one list of mixed results to your users. Let's say you have a Movies index and a Series index, and I search for "Help".

It would be very helpful to have the rankingInfo of each result, so I can simply merge the two result sets into one array and sort it by the ranking info attributes (matched words descending, typos ascending, exact words descending, etc.). This is a huge part of my work, actually, and Algolia makes my result-merging logic easy because it provides the rankingInfo for my results. If I were to compute these values manually, it would be a waste of time, because the search engine already computed them at one point and I shouldn't need to compute them again. Also, if I computed them myself, I might do so in a slightly different manner than the search engine, which would cause some weird inconsistencies.
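The merging logic described above can be sketched with an ordinary sort over the per-hit ranking info. This is a minimal illustration using the `_rankingInfo` field names from the Algolia-style example; the documents are made up:

```python
# Sketch: merging hits from two indexes (Movies and Series) by their
# per-hit ranking info, then sorting the combined list.

def ranking_key(hit):
    info = hit["_rankingInfo"]
    # More matched words and exact words are better (negate for descending);
    # fewer typos are better (ascending).
    return (-info["words"], info["nbTypos"], -info["nbExactWords"])

movies = [{"title": "The Help",
           "_rankingInfo": {"words": 1, "nbTypos": 0, "nbExactWords": 1}}]
series = [{"title": "Help Wanted",
           "_rankingInfo": {"words": 1, "nbTypos": 1, "nbExactWords": 0}}]

merged = sorted(movies + series, key=ranking_key)
print([h["title"] for h in merged])  # "The Help" first: same words, fewer typos
```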
-
Adding my support for a feature like this, along with my use case. I display my search results on a Mon-Fri calendar, so a result might be an event occurring on Mondays and Wednesdays at 1pm. I have a limit of 20 results, so when a user makes a very specific search (e.g. many keywords), the calendar gets clogged visually with 15-19 irrelevant results. This is because Meilisearch always tries to return the search limit by pruning search keywords / expanding to find potential typos.

I'm fine with this behavior leading to some visual clogging on broad searches (e.g. one keyword), but I want to make the difference in relevancy more obvious. For example, using this feature I could use the ranking rules' output to create a color mapping on the calendar for each result. The most relevant results could be red, and less relevant ones would go down a gradient of yellow > green > blue. A very specific search would then have a few red results, but the sharp dropoff in relevancy would mean the rest would be green/blue; for a broad search, most of the results would be red/yellow, emphasizing their similar relevancy.

If I tried to build this feature now using only the order of the search results, I wouldn't be able to distinguish the sudden dropoff in relevancy of a very specific search from the similar relevancy of a broad search.

As a potential suggestion to address the issue of significantly increasing the response data, perhaps one option could be to calculate an aggregated "relevancy score" from the ranking rule output. A perfectly relevant result would have a score of 1.0, with corresponding reductions for each missed word, typo, and so on.
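The color-gradient idea above boils down to bucketing an aggregated score. A minimal sketch, assuming a hypothetical aggregated relevancy score in [0.0, 1.0] (the thresholds are arbitrary choices for illustration):

```python
# Sketch: mapping a hypothetical aggregated relevancy score (0.0-1.0)
# to a color bucket for the calendar UI described above.

def score_to_color(score):
    """Bucket a relevancy score into a red > yellow > green > blue gradient."""
    if score >= 0.9:
        return "red"      # highly relevant
    if score >= 0.7:
        return "yellow"
    if score >= 0.4:
        return "green"
    return "blue"         # barely relevant, likely from query pruning

scores = [0.95, 0.92, 0.35, 0.2]  # a specific search: sharp relevancy dropoff
print([score_to_color(s) for s in scores])  # ['red', 'red', 'blue', 'blue']
```

The dropoff is now visible at a glance: two red results followed by blue ones, instead of twenty visually identical calendar entries.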
-
Hello everyone! 👋 Quick update on our side: we've started working on a solution to return ranking details. Stay tuned for updates, and feel free to keep the feedback coming!
-
Hello everyone 👋 We just released a 🧪 prototype that allows displaying ranking details when searching, and we'd love your feedback.

**How to get the prototype?**

Using Docker, use the following command:

Alternatively, you can also build the prototype from source by checking out the corresponding branch.

**How to use the prototype?**

You can find some usage examples below, or look at the original PR for more details.

**Getting the ranking details to customize the result UI**

**Getting the ranking details for the results of multiple indexes, to be able to re-rank documents coming from distinct indexes**

**Questions we have for you**

Feedback and bug reports while using this prototype are encouraged! Thanks in advance for your involvement. It means a lot to us ❤️
-
Hello again 👋 We just released a new version of this prototype with some improvements, thanks to your feedback ❤️

**How to get the prototype?**

Using Docker, use the following command:

Alternatively, you can also build the prototype from source by checking out the corresponding branch.

**What has changed?**
-
Update: scoring details will be available as an experimental feature in Meilisearch. You can find the details in #674.
-
Hello everyone 👋 We have just released the first RC (release candidate) of Meilisearch containing this new feature!

```shell
docker run -it --rm -p 7700:7700 -v $(pwd)/meili_data:/meili_data getmeili/meilisearch:v1.3.0-rc.0
```

You are welcome to leave your feedback in this discussion. If you encounter any bugs, please report them here. 🎉 The official and stable release containing this change will be available on July 31st, 2023.
-
@macraig Is it, or will it be, possible to filter on the scores too? It would be useful to say "only return results > 0.8", etc. (per rule, ideally).
-
Hey folks 👋 v1.3 has been released! 🦁 You can now get global and detailed ranking scores for your documents ✨ Note: document ranking score details are considered experimental; you can find the usage instructions in #674. 📚 https://www.meilisearch.com/docs/learn/core_concepts/relevancy#ranking-score
-
I'd like to make the output of each ranking rule visible for each document returned by a search request.
Let's start with a simple example. Given the following ranking rules:
and the search query:
and the results:
We could return the following information:
With the caveat that the output of some ranking rules could be unavailable, because it was not necessary to execute them:
And an additional caveat: the output of a ranking rule would not necessarily be a number. For example, with the `sort` ranking rule, we could have results such as:

**Why**
First, I would find it very useful for debugging search-relevancy problems. It would also help users understand how Meilisearch works, and help them fine-tune their settings and improve the relevancy of their search results.
Second, this is a building block towards automatically-aggregated multi-index search queries (i.e. federated search).
Federated Search
Given a per-index mapping from the ranking rules' outputs to a vector of numbers, we would be able to merge two sets of search results by sorting them by their score.
For example, with the ranking rules:
and the output:
The mapping could work as follows:
Then, we can compare the scores of two search results from different indexes by comparing their score components one by one (lexicographically):
Note, however, that we may need to normalise each result's scores further so that they are comparable. This could be done by giving a weight to each ranking rule in the two indexes, which could also work when the indexes have different ranking rules, with (very) carefully chosen weights. More options can be considered to make the results comparable, such as adding dummy ranking rules with constant scores (e.g. `["dummy", 0.5]`) to tweak the per-score-component sorting behaviour. There is unfortunately no way to perform the sorting other than lexicographically, though.

Note also that I haven't considered what the score should be if a ranking rule's output is unknown (`= null`). This is an open problem for later, which could always be (desperately) resolved by forcing the execution of all ranking rules.

Finally, we could consider tweaking the search algorithm so that it can stop its search when reaching results that fall below a given score. Let's say we perform a federated search on two indexes.
First, we perform a search on `indexA`, which gives 20 results with the normalized scores:

Then, when we perform the request on `indexB`, we want to stop searching as soon as a document's score falls below `[0.2, 0.85, 0.42524, 0.912]`, because we know it will rank below the 20th position (such an optimisation becomes more difficult to implement for search queries starting from a given offset, for pagination).
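The early-stopping idea above can be sketched as a cutoff check while scanning the second index. The scores and the small top-k are illustrative (the text uses 20 results); the key assumption is that each index yields its results best-first:

```python
# Sketch: stop scanning indexB once a candidate's score vector falls below
# the worst score that currently makes the cut from indexA.

TOP_K = 3  # kept small for the example; the discussion above uses 20

index_a_scores = [[0.9, 0.9], [0.8, 0.7], [0.2, 0.85]]  # sorted, best first
cutoff = index_a_scores[TOP_K - 1]  # worst score still inside the top-k

kept = []
for score in [[0.95, 0.1], [0.5, 0.5], [0.1, 0.99], [0.05, 0.1]]:  # indexB, best first
    if score < cutoff:  # lexicographic comparison; every later (worse)
        break           # result is also below the cutoff, so stop early
    kept.append(score)

print(kept)  # [[0.95, 0.1], [0.5, 0.5]]
```

Because `indexB` returns results best-first, the first score below the cutoff guarantees everything after it is too, which is what makes cutting the search short safe.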