Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disallow kNN searches with index alias filters #79654

Closed

Conversation

mayya-sharipova
Copy link
Contributor

This PR throws an exception for kNN searches on filtered aliases.
We don't allow kNN searches on filtered aliases as currently filters are
applied only after kNN searches are done, which may lead to returning
less than k results.

In the future, we want to apply filters while doing a kNN search.
Once implemented, we will allow kNN searches on filtered aliases.

Relates to #78473

This PR throws an exception for kNN searches on filetered aliases.
We don't allow kNN searches on filtered aliases as currently filters are
applied only after kNN searches are done, which may lead to returning
less than k results.

In the future, we want to apply filters while doing a kNN search.
Once implemented, we will allow kNN searches on filtered aliases.

Relates to elastic#78473
@mayya-sharipova mayya-sharipova added the :Search/Search Search-related issues that do not fall into other categories label Oct 21, 2021
@elasticmachine elasticmachine added the Team:Search Meta label for search team label Oct 21, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search (Team:Search)

@jtibshirani jtibshirani mentioned this pull request Oct 21, 2021
17 tasks
Copy link
Contributor

@jtibshirani jtibshirani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This approach makes sense to me. Maybe we could double-check if the data management team is okay with it -- it looks like they've done a lot of work on IndicesOptions through their work on the index APIs.

@@ -699,6 +707,9 @@ private void executeSearch(SearchTask task, SearchTimeProvider timeProvider, Sea
searchService.getResponseCollectorService(), nodeSearchCounts);
final Set<String> indicesAndAliases = indexNameExpressionResolver.resolveExpressions(clusterState, searchRequest.indices());
aliasFilter = buildPerIndexAliasFilter(clusterState, indicesAndAliases, indices, remoteAliasMap);
if (searchRequest.indicesOptions().forbidFilteredAliases()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is being introduced as a general option, maybe we should check it at a lower layer, like in IndicesService#buildAliasFilter. The other indices options apply across endpoints, not just to search.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback. I've addressed it in bd484b1, but I like the previous approach more – keeping it only in TransportSearchAction, as we consider indicesOptions().forbidFilteredAliases() only a temporary measure, and want to remove it later. WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the current approach but don't feel strongly. It'd be great to hear the data management team's opinion here.

- Move all request building logic to KnnSearchRequestBuilder
- Move a check for allowing filtered aliases to IndicesService
@mayya-sharipova mayya-sharipova added the :Data Management/Indices APIs APIs to create and manage indices and templates label Oct 25, 2021
@elasticmachine elasticmachine added the Team:Data Management Meta label for data/management team label Oct 25, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@mayya-sharipova
Copy link
Contributor Author

@jtibshirani Thanks for the review. I've addressed your feedback!

@mayya-sharipova
Copy link
Contributor Author

mayya-sharipova commented Oct 25, 2021

@elastic/es-data-management We wanted to check if you are ok with us introducing a new IndicesOptions.Option of FORBID_FILTERED_ALIASES that will disallow forbidding filtered aliases. We need this option temporarily for the knn search as currently it doesn't support searching across filtered aliases.

@mayya-sharipova
Copy link
Contributor Author

mayya-sharipova commented Oct 25, 2021

Chatted with @elastic/es-data-management, and they suggested that introducing an extra Indices options just for this use case seems to be excessive.

Since we haven't found any other elegant solution, @jtibshirani and me decided to close this PR and instead to update documentation with a note that doing _knn_search on filtered aliases is not recommended, as filter is applied post knn search, and hence the number of results returned can be less < k, and 0 in the worst case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Data Management/Indices APIs APIs to create and manage indices and templates :Search/Search Search-related issues that do not fall into other categories Team:Data Management Meta label for data/management team Team:Search Meta label for search team v8.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants