[BUG] KNN Queries Fails when segment replication is enabled and replicas has deleted docs #1807

Closed
navneet1v opened this issue Jul 9, 2024 · 1 comment
Assignees
Labels
bug Something isn't working v2.16.0

Comments

@navneet1v
Collaborator

What is the bug?
When segment replication is enabled on an index with at least 1 replica, a k-NN search served by a replica throws an exception if the index contains deleted docs.

How can one reproduce the bug?
Steps to reproduce the behavior:

  1. Set up a cluster with more than 1 node.
  2. Create a k-NN index with segment replication enabled and at least 1 replica.
  3. Index some data into the index.
  4. Delete a few docs from the index.
  5. Run the search with preference=_replica.
  6. An error is thrown.
PUT table
{
  "settings": {
    "index": {
      "knn": true,
      "number_of_replicas": 1
    }
  }, 
  "mappings": {
    "_source": {
      "enabled": true
    }, 
    "properties": {
      "my_vector1": {
        "type": "knn_vector",
        "dimension": 1,
        "method": {
          "name": "hnsw",
          "space_type": "l2",
          "engine": "faiss"
        }
      },
      "rowId": {
        "type": "keyword",
        "store": true
      },
      "applicationId": {
        "type": "keyword",
        "store": true
      }
    }
  }
}
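
Note that the request above does not itself set a replication type, so it presumably relies on segment replication already being the default for the cluster. The following is a minimal Python sketch of the same index body with segment replication requested explicitly. It assumes the index-level setting `replication.type` with the value `SEGMENT`, as described in the OpenSearch segment replication documentation; the helper name `build_index_body` is hypothetical and only builds the JSON, it does not send anything to a cluster.

```python
import json


def build_index_body(replicas=1, segment_replication=True):
    """Build the create-index body from the report, optionally with segment replication."""
    index_settings = {"knn": True, "number_of_replicas": replicas}
    if segment_replication:
        # Assumption: index-level setting name and value taken from the
        # OpenSearch segment replication docs; it is not part of the original request.
        index_settings["replication.type"] = "SEGMENT"
    return {
        "settings": {"index": index_settings},
        "mappings": {
            "_source": {"enabled": True},
            "properties": {
                "my_vector1": {
                    "type": "knn_vector",
                    "dimension": 1,
                    "method": {"name": "hnsw", "space_type": "l2", "engine": "faiss"},
                },
                "rowId": {"type": "keyword", "store": True},
                "applicationId": {"type": "keyword", "store": True},
            },
        },
    }


body = build_index_body()
# The printed JSON can be pasted as the body of `PUT table`.
print(json.dumps(body, indent=2))
```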

Now index some data.

PUT /_bulk?refresh=true
{ "index": { "_index": "table" } }
{"my_vector1":[1], "rowId": "recWtYS3a5LENkhte", "applicationId": "appw3EJpjHiON9qhd", "id": "1"}
{ "index": { "_index": "table"} }
{"my_vector1":[2], "rowId": "recZRfK7DnN0K0ZKM", "applicationId": "appw3EJpjHiON9qhd", "id": "2"}
{ "index": { "_index": "table"} }
{"my_vector1":[3], "rowId": "ABCrecsCBwqA7E7OWvqw", "applicationId": "appw3EJpjHiON9qhd", "id": "3"}
{ "index": { "_index": "table"} }
{"my_vector1":[4], "rowId": "recWtYS3a5LENkhte", "applicationId": "appw5EJpjHiON9qhd", "id": "4"}
{ "index": { "_index": "table"} }
{"my_vector1":[5], "rowId": "recZRfK7DnN0K0ZKM", "applicationId": "appw6EJpjHiON9qhd", "id": "5"}
{ "index": { "_index": "table"} }
{"my_vector1":[6], "rowId": "recsCBwqA7E7OWvqw", "applicationId": "appw3EJpjHiON9qhd", "id": "6"}
{ "index": { "_index": "table" } }
{"my_vector1":[7], "rowId": "recWtYS3a5LENkhte", "applicationId": "appw3EJpjHiON9qhd", "id": "7"}
{ "index": { "_index": "table"} }
{"my_vector1":[8], "rowId": "recZRfK7DnN0K0ZKM", "applicationId": "appw3EJpjHiON9qhd", "id": "8"}
{ "index": { "_index": "table"} }
{"my_vector1":[9], "rowId": "ABCrecsCBwqA7E7OWvqw", "applicationId": "appw3EJpjHiON9qhd", "id": "9"}
{ "index": { "_index": "table"} }
{"my_vector1":[10], "rowId": "recWtYS3a5LENkhte", "applicationId": "appw5EJpjHiON9qhd", "id": "10"}
{ "index": { "_index": "table"} }
{"my_vector1":[11], "rowId": "recZRfK7DnN0K0ZKM", "applicationId": "appw6EJpjHiON9qhd", "id": "11"}
{ "index": { "_index": "table"} }
{"my_vector1":[12], "rowId": "recsCBwqA7E7OWvqw", "applicationId": "appw3EJpjHiON9qhd", "id": "12"}
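
For a scripted reproduction, the bulk body above can be generated instead of typed by hand. This is a rough Python sketch: the helper `build_bulk_body` and the placeholder `rowId` and `applicationId` values are hypothetical, and only the one-dimensional vectors 1 through 12 mirror the request above. The bulk API expects newline-delimited JSON that ends with a newline, which the helper takes care of.

```python
import json


def build_bulk_body(index, docs):
    """Return an NDJSON bulk body: one action line plus one source line per doc."""
    lines = []
    for doc in docs:
        # No "_id" in the action line, so document ids are auto-generated,
        # just as in the request shown in the report.
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Placeholder field values (hypothetical); only the vectors follow the report.
docs = [
    {"my_vector1": [i], "rowId": f"row-{i}", "applicationId": "app-1", "id": str(i)}
    for i in range(1, 13)
]
bulk_body = build_bulk_body("table", docs)
print(bulk_body)
```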
Delete some documents from the index.
DELETE table/_doc/<Id>
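
Because the bulk request does not set `_id` in its action lines, the document ids are auto-generated, and the `id` field in each document is only a regular source field. The `<Id>` placeholder therefore has to be filled with an `_id` taken from a search response. A small Python sketch of that lookup follows; it assumes the standard search response shape (`hits.hits[]._id`), and the helper name and the sample ids are hypothetical.

```python
from urllib.parse import quote


def delete_paths(search_response, index, limit):
    """Build `DELETE <index>/_doc/<_id>` paths for the first `limit` hits."""
    hits = search_response.get("hits", {}).get("hits", [])
    # quote() keeps the path valid even if an id contains reserved characters.
    return [f"{index}/_doc/{quote(hit['_id'], safe='')}" for hit in hits[:limit]]


# Hypothetical, trimmed-down search response used only for illustration.
sample_response = {
    "hits": {
        "hits": [
            {"_index": "table", "_id": "hypothetical-id-1"},
            {"_index": "table", "_id": "hypothetical-id-2"},
            {"_index": "table", "_id": "hypothetical-id-3"},
        ]
    }
}

paths = delete_paths(sample_response, "table", 2)
for path in paths:
    print("DELETE", path)
```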

Now run the search with preference=_replica so that it is served by a replica shard.

POST table/_search?preference=_replica
{
  "query": {
    "knn": {
      "my_vector1": {
        "vector": [
          4
        ],
        "k": 10
      }
    }
  }
}

Error Logs:

Caused by: NotSerializableExceptionWrapper[class_cast_exception: class org.apache.lucene.index.SoftDeletesDirectoryReaderWrapper$SoftDeletesFilterCodecReader cannot be cast to class org.apache.lucene.index.SegmentReader (org.apache.lucene.index.SoftDeletesDirectoryReaderWrapper$SoftDeletesFilterCodecReader and org.apache.lucene.index.SegmentReader are in unnamed module of loader 'app')]
    at org.opensearch.knn.index.query.KNNWeight.doANNSearch(KNNWeight.java:198)
    at org.opensearch.knn.index.query.KNNWeight.scorer(KNNWeight.java:122)
    at org.apache.lucene.search.Weight.scorerSupplier(Weight.java:135)
    at org.apache.lucene.search.BooleanWeight.scorerSupplier(BooleanWeight.java:596)
    at org.apache.lucene.search.Weight.bulkScorer(Weight.java:165)
    at org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:449)
    at org.opensearch.search.internal.ContextIndexSearcher$1.bulkScorer(ContextIndexSearcher.java:383)
    at org.opensearch.search.internal.ContextIndexSearcher.searchLeaf(ContextIndexSearcher.java:324)
    at org.opensearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:283)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:552)
    at org.opensearch.search.query.QueryPhase.searchWithCollector(QueryPhase.java:362)
    at org.opensearch.search.query.QueryPhase$DefaultQueryPhaseSearcher.searchWithCollector(QueryPhase.java:449)
    at org.opensearch.search.query.QueryPhase$DefaultQueryPhaseSearcher.searchWith(QueryPhase.java:433)
    at org.opensearch.search.query.QueryPhaseSearcherWrapper.searchWith(QueryPhaseSearcherWrapper.java:60)
    at org.opensearch.neuralsearch.search.query.HybridQueryPhaseSearcher.searchWith(HybridQueryPhaseSearcher.java:49)
    at org.opensearch.search.query.QueryPhase.executeInternal(QueryPhase.java:284)
    at org.opensearch.search.query.QueryPhase.execute(QueryPhase.java:157)
    at org.opensearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:556)
    at org.opensearch.search.SearchService.executeQueryPhase(SearchService.java:620)
    at org.opensearch.search.SearchService$2.lambda$onResponse$0(SearchService.java:589)
    at org.opensearch.action.ActionRunnable.lambda$supply$0(ActionRunnable.java:74)
    at org.opensearch.action.ActionRunnable$2.doRun(ActionRunnable.java:89)
    at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52)
    at org.opensearch.threadpool.TaskAwareRunnable.doRun(TaskAwareRunnable.java:78)
    at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52)
    at org.opensearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:59)
    at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:922)
    at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.lang.Thread.run(Thread.java:840)
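
The exception message indicates that the leaf reader seen by `KNNWeight.doANNSearch` is a `SoftDeletesFilterCodecReader` wrapper rather than a bare `SegmentReader`, so a direct cast fails. One general way to cope with wrapped readers is to peel off the wrappers until the underlying reader is reached. The Python sketch below only illustrates that idea with stand-in classes: the class names mirror the Lucene types, but everything here is hypothetical, it is not Lucene or OpenSearch API, and it is not necessarily how the fix was implemented.

```python
class SegmentReader:
    """Stand-in for the concrete per-segment reader."""

    def __init__(self, segment_name):
        self.segment_name = segment_name


class FilterCodecReader:
    """Stand-in for a wrapper that delegates to another reader."""

    def __init__(self, delegate):
        self.delegate = delegate


def unwrap_segment_reader(reader):
    """Follow delegates until a SegmentReader is found, or fail loudly."""
    while isinstance(reader, FilterCodecReader):
        reader = reader.delegate
    if not isinstance(reader, SegmentReader):
        raise TypeError(f"cannot unwrap {type(reader).__name__} to SegmentReader")
    return reader


# A reader wrapped twice, loosely mimicking the soft-deletes wrapper on the replica.
wrapped = FilterCodecReader(FilterCodecReader(SegmentReader("_0")))
print(unwrap_segment_reader(wrapped).segment_name)
```

Treating the wrapped reader as if it were already a `SegmentReader` is the step that corresponds to the failing cast in the trace; unwrapping first avoids that assumption.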

What is the expected behavior?
Search should be successful.

What is your host/environment?

  • Plugins: KNN

Do you have any screenshots?
NA


@navneet1v
Collaborator Author

The fix was merged in OpenSearch 2.16.

Projects
Status: Done