Integrate model inference to build query #20

jmazanec15 · 2022-10-13T21:10:31Z

Description

Integrates ml-commons model inference capabilities to transform NeuralQueryBuilder into a KNNQueryBuilder. Minor changes to parsing logic and build.gradle to fix bugs. Minor enhancement to MLCommonsClientAccessor to add single sentence inference.

Added unit tests to test functionality.

Working on integration tests. Confirmed it works on a local cluster by setting up k-NN index with 1K docs and running the following query:

$ curl -XPOST "localhost:9200/test-index/_search?_source_excludes=cool_field&pretty" -H 'Content-Type: application/json' -d'
{
  "query": {
    "neural": {
      "cool_field": {
        "query_text": "Hello world!",
        "model_id": "rlAg04MB3cG1ZCLOBuDF",
        "k": 1000
      }
    }
  },
  "size": 5
}
'
{
  "took" : 14,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1000,
      "relation" : "eq"
    },
    "max_score" : 0.002477972,
    "hits" : [
      {
        "_index" : "test-index",
        "_id" : "613",
        "_score" : 0.002477972,
        "_source" : { }
      },
      {
        "_index" : "test-index",
        "_id" : "864",
        "_score" : 0.0024763448,
        "_source" : { }
      },
      {
        "_index" : "test-index",
        "_id" : "190",
        "_score" : 0.002475292,
        "_source" : { }
      },
      {
        "_index" : "test-index",
        "_id" : "818",
        "_score" : 0.002458808,
        "_source" : { }
      },
      {
        "_index" : "test-index",
        "_id" : "340",
        "_score" : 0.0024474028,
        "_source" : { }
      }
    ]
  }
}

Issues Resolved

#14

Check List

New functionality includes testing.
- All tests pass
New functionality has been documented.
- New functionality has javadoc added
Commits are signed as per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Integrates ml-commons model inference capabilities to transform NeuralQueryBuilder into a KNNQueryBuilder. Minor changes to parsing logic and build.gradle to fix bugs. Minor enhancement to MLCommonsClientAccessor to add single sentence inference. Added UTs. Signed-off-by: John Mazanec <[email protected]>

navneet1v · 2022-10-13T21:40:01Z

src/main/java/org/opensearch/neuralsearch/ml/MLCommonsClientAccessor.java

+     * @param inputText {@link List} of {@link String} on which inference needs to happen
+     * @param listener {@link ActionListener} which will be called when prediction is completed or errored out
+     */
+    public void inferenceSentence(


You don't need this: we already have a function that does this for a list of sentences. Please try to use that.

I'm going to be writing the code to convert the functionality to single sentence/single vector anyway. Why not incorporate it here so it can be used by other features of the code as well?

All you need to do is do a get on the vector list, and creating a new function just for that is over-abstraction. We already have 2 versions of this function in the class, and adding a third, very specific one seems like overkill for now.

I prefer to keep this here. I don't see a downside to having it. The name of the method "inferenceSentence" distinguishes it from "inferenceSentences", so there won't be any confusion about when to use which. Also, I think single versus collection isn't so specific that it couldn't be useful outside my current use case. Many other Java interfaces/classes have methods for both.

All you need to do is do a get on the vector list,

True, but you also have to ensure only 1 vector is returned - so each client would also have to add this check/error handling. Using this function, we can centralize that check. Additionally, it is clunky to pass in a List<List<Float>> listener when you expect that there will only be 1 List<Float>

Given that the purpose of this class is to build an easy to use abstraction over MLClient, I think it fits to add this method here.

I can add parity method with inferenceSentences where filters are passed in as well.

Can we abstract the single sentence function in the queryBuilder class only? I just want to avoid confusion around the usage, and to avoid one more function for the use case where TARGET_RESPONSE_FILTERS is not used, just for a single sentence.

I don't think there is any usage confusion, given that the method name and signature are descriptive of the functionality.

In terms of maintainability, I don't think it makes sense for other components that will want to use the inferenceSentence method to depend on the queryBuilder class. I think that the ml package should be responsible for providing easy/intuitive interaction with ml-commons for the components in the plugin and should therefore be able to house this functionality.

That being said, I think it makes sense to either integrate it into this class or build a new class that uses this class to provide higher level abstractions. In a similar case, OpenSearch has a "HighLevel" rest client that is built on top of the lower level RestClient to provide more functionality. We could do this, or we could treat the MLCommonsClientAccessor as the higher level client.
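The centralized check argued for in this thread can be sketched as follows. This is a minimal illustration, not the code merged in this PR: VectorListener, SentencesInference, and SingleSentenceInference are hypothetical stand-ins so the example compiles without OpenSearch or ml-commons on the classpath, whereas the plugin itself works with ActionListener and MLCommonsClientAccessor. The wrapper calls the list-based method with one sentence and fails the listener unless exactly one vector comes back.

```java
import java.util.List;

// Hypothetical stand-in for an async callback; the plugin uses OpenSearch's ActionListener.
interface VectorListener<T> {
    void onResponse(T result);

    void onFailure(Exception e);
}

// Hypothetical stand-in for the existing list-based inference call.
interface SentencesInference {
    void inferenceSentences(String modelId, List<String> inputText, VectorListener<List<List<Float>>> listener);
}

// Wraps the list-based call so that callers can work with a single sentence and a single vector.
final class SingleSentenceInference {
    private final SentencesInference delegate;

    SingleSentenceInference(SentencesInference delegate) {
        this.delegate = delegate;
    }

    void inferenceSentence(String modelId, String inputText, VectorListener<List<Float>> listener) {
        delegate.inferenceSentences(modelId, List.of(inputText), new VectorListener<List<List<Float>>>() {
            @Override
            public void onResponse(List<List<Float>> vectors) {
                // Centralized check: every caller would otherwise have to repeat this.
                if (vectors.size() != 1) {
                    listener.onFailure(new IllegalStateException("Expected exactly 1 vector but got " + vectors.size()));
                    return;
                }
                listener.onResponse(vectors.get(0));
            }

            @Override
            public void onFailure(Exception e) {
                listener.onFailure(e);
            }
        });
    }
}
```

With a wrapper like this, a caller passes a listener typed for one List<Float> and never has to unwrap a List<List<Float>> itself.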

navneet1v · 2022-10-13T21:48:06Z

src/main/java/org/opensearch/neuralsearch/plugin/query/NeuralQueryBuilder.java

+        if (vectorSupplier() != null && vectorSupplier.get() != null) {
+            return new KNNQueryBuilder(fieldName(), vectorSupplier.get(), k());
+        }


Why do we need these checks? If the query builder is running, the vectorSupplier will be null, won't it?

VectorSupplier is null initially and then gets set during rewrite. The actual vector will not get set until the async call finishes completely.

navneet1v · 2022-10-13T21:48:51Z

src/main/java/org/opensearch/neuralsearch/plugin/query/NeuralQueryBuilder.java

+            queryRewriteContext.registerAsyncAction(
+                ((client, actionListener) -> ML_CLIENT.inferenceSentence(modelId(), queryText(), ActionListener.wrap(floatList -> {
+                    vectorSetOnce.set(vectorAsListToArray(floatList));
+                    actionListener.onResponse(null);


Why are we passing null to onResponse?

This was how it was done for Geo: https://github.com/opensearch-project/OpenSearch/blob/main/server/src/main/java/org/opensearch/index/query/AbstractGeometryQueryBuilder.java#L519.

The rewrite context specifies the listener response type as a wildcard. That being said, we don't control the listener that is passed in, so all we can do is return null.

I think the reason for calling onResponse(null) is that we have already set the response on the vector supplier.

Can we add some more details on how this rewrite function works with the first if condition (vectorSupplier != null, and so on)?

What I am getting from this whole function is that the rewrite function can be called at least 2 times: one time vectorSupplier will be null, and the next time it will not be null. Is that understanding correct?

I think the reason for calling onResponse(null) is that we have already set the response on the vector supplier.

We are not setting the response, but instead setting the value of the supplier.

Can we add some more details on how this rewrite function works with the first if condition (vectorSupplier != null, and so on)?
What I am getting from this whole function is that the rewrite function can be called at least 2 times: one time vectorSupplier will be null, and the next time it will not be null. Is that understanding correct?

Yes, will add additional context in the comment. From my understanding, the query will be rewritten until the rewrite method returns the object itself. So, on the first rewrite, the supplier will be null. On the second rewrite, the supplier will be set, but the vector may be null, depending on whether the async call finished in time. Let me double check this though and add a comment.

I think the reason for calling onResponse(null) is that we have already set the response on the vector supplier.

We are not setting the response, but instead setting the value of the supplier.

I meant the value only.

Can we add some more details on how this rewrite function works with the first if condition (vectorSupplier != null, and so on)?
What I am getting from this whole function is that the rewrite function can be called at least 2 times: one time vectorSupplier will be null, and the next time it will not be null. Is that understanding correct?

Yes, will add additional context in the comment. From my understanding, the query will be rewritten until the rewrite method returns the object itself. So, on the first rewrite, the supplier will be null. On the second rewrite, the supplier will be set, but the vector may be null, depending on whether the async call finished in time. Let me double check this though and add a comment.

My understanding is that if we keep on returning the NeuralQueryBuilder, then rewrites will keep on happening (which we are doing).

To stop the query rewrite, we added the (vectorSupplier() != null && vectorSupplier.get() != null) condition so that we can return another QueryBuilder (KNNQueryBuilder), which only happens when we have a response from MLCommonsClientAccessor.

Please confirm whether this understanding is correct.

On the second rewrite, the supplier will be set, but the vector may be null, depending on whether the async call finished in time. Let me double check this though and add a comment.

I think when the second rewrite happens, the supplier vector shouldn't be null, because the second call only happens once the async call completes, since the for loop returns here. Once the recursive call is complete and a KNNQueryBuilder is returned, the passed-in listener will be executed and continue the query flow.

My understanding is that if we keep on returning the NeuralQueryBuilder, then rewrites will keep on happening (which we are doing).

To stop the query rewrite, we added the (vectorSupplier() != null && vectorSupplier.get() != null) condition so that we can return another QueryBuilder (KNNQueryBuilder), which only happens when we have a response from MLCommonsClientAccessor.

Please confirm whether this understanding is correct.

Yes, this is correct.

I think when the second rewrite happens, the supplier vector shouldn't be null, because the second call only happens once the async call completes, since the for loop returns here. Once the recursive call is complete and a KNNQueryBuilder is returned, the passed-in listener will be executed and continue the query flow.

Right, that's correct, but I return a copy at the end instead of "this" to prevent a case where rewrite stops early.
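The rewrite rounds discussed in this thread can be modeled with a small, self-contained sketch. It is a simplification under stated assumptions, not the OpenSearch implementation: SketchQuery, SketchContext, SketchKnnQuery, SketchNeuralQuery, and SketchRewriter are hypothetical names, the "inference" is faked by using the text length as a one-element vector, and pending actions run synchronously between rounds. What it tries to show is the rule described above: rewriting continues while rewrite returns a different object and stops once it returns the same reference.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Simplified, hypothetical model of a rewriteable query; not the OpenSearch interfaces.
interface SketchQuery {
    SketchQuery rewrite(SketchContext ctx);
}

// Collects deferred actions, loosely modeled on registerAsyncAction on the rewrite context.
final class SketchContext {
    private final List<Runnable> pending = new ArrayList<>();

    void registerAsyncAction(Runnable action) {
        pending.add(action);
    }

    void runPending() {
        List<Runnable> toRun = new ArrayList<>(pending);
        pending.clear();
        toRun.forEach(Runnable::run);
    }
}

final class SketchKnnQuery implements SketchQuery {
    final float[] vector;

    SketchKnnQuery(float[] vector) {
        this.vector = vector;
    }

    @Override
    public SketchQuery rewrite(SketchContext ctx) {
        return this; // nothing left to rewrite, so the loop stops here
    }
}

final class SketchNeuralQuery implements SketchQuery {
    private final String queryText;
    private final Supplier<float[]> vectorSupplier; // null until the first rewrite round

    SketchNeuralQuery(String queryText, Supplier<float[]> vectorSupplier) {
        this.queryText = queryText;
        this.vectorSupplier = vectorSupplier;
    }

    @Override
    public SketchQuery rewrite(SketchContext ctx) {
        if (vectorSupplier != null) {
            float[] vector = vectorSupplier.get();
            return vector == null ? this : new SketchKnnQuery(vector);
        }
        AtomicReference<float[]> holder = new AtomicReference<>();
        // Fake inference (assumption for the sketch): the vector is just the text length.
        ctx.registerAsyncAction(() -> holder.set(new float[] { queryText.length() }));
        // A new object signals that another rewrite round is needed.
        return new SketchNeuralQuery(queryText, holder::get);
    }
}

final class SketchRewriter {
    static SketchQuery rewriteFully(SketchQuery original, SketchContext ctx) {
        SketchQuery current = original;
        for (int round = 0; round < 16; round++) {
            SketchQuery rewritten = current.rewrite(ctx);
            if (rewritten == current) {
                return current; // same reference: rewriting is finished
            }
            current = rewritten;
            ctx.runPending(); // deferred actions complete before the next round
        }
        throw new IllegalStateException("too many rewrite rounds");
    }
}
```

In this model, calling SketchRewriter.rewriteFully(new SketchNeuralQuery("Hello world!", null), new SketchContext()) takes three rounds: the first registers the deferred action and returns a copy holding the supplier, the second sees a populated supplier and returns a SketchKnnQuery, and the third returns the same SketchKnnQuery reference, which ends the loop.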

src/main/java/org/opensearch/neuralsearch/common/VectorUtil.java

Signed-off-by: John Mazanec <[email protected]>

navneet1v · 2022-10-14T00:31:33Z

build.gradle

@@ -59,6 +59,7 @@ opensearchplugin {
    classname "${projectPath}.${pathToPlugin}.${pluginClassName}"
    licenseFile rootProject.file('LICENSE')
    noticeFile rootProject.file('NOTICE')
+    extendedPlugins = ['opensearch-knn']


Should we add MLCommons here as well, or is this just for the plugins we depend on at compile time?

For ml-commons, we are okay because we just take the client as a dependency.

navneet1v · 2022-10-14T00:35:10Z

src/main/java/org/opensearch/neuralsearch/plugin/query/NeuralQueryBuilder.java

+        // Rewrites will continuously happen until the supplier is set and the vector is generated. Rewrites will stop
+        // once this object is returned. Hence, if we get here, we need to return a new object
+        return new NeuralQueryBuilder(fieldName(), queryText(), modelId(), k(), vectorSupplier());


Isn't it the case that if we reach this line, rewrites will happen continuously? Isn't the comment a little off?

Yes, that's my understanding. Let me provide a better comment on this.

Here is what I was basing this off of: https://github.com/opensearch-project/OpenSearch/blob/main/server/src/main/java/org/opensearch/index/query/QueryBuilder.java#L90-L98

~~Here is the logic: https://github.com/opensearch-project/OpenSearch/blob/main/server/src/main/java/org/opensearch/index/query/Rewriteable.java#L83~~

Actually this one: https://github.com/opensearch-project/OpenSearch/blob/main/server/src/main/java/org/opensearch/index/query/Rewriteable.java#L117

navneet1v

Overall code looks good to me. Minor comments.

navneet1v · 2022-10-15T06:00:18Z

src/main/java/org/opensearch/neuralsearch/plugin/query/NeuralQueryBuilder.java

+    private static MLCommonsClientAccessor ML_CLIENT;
+
+    public static void initialize(MLCommonsClientAccessor mlClient) {
+        NeuralQueryBuilder.ML_CLIENT = mlClient;
+    }
+


Question: Can we initialize the MLCommonsClientAccessor via the constructor of the NeuralQueryBuilder class? Is it not possible to do so?

Good question. I thought a little bit about this, but I am not sure how to handle stream constructors. Any ideas on this?

Let's leave it as is, then. I am worried about the StreamInput constructor; it should not cause an NPE for MLCommonsClientAccessor.

Sure, will leave as is

navneet1v · 2022-10-15T06:01:11Z

src/main/java/org/opensearch/neuralsearch/plugin/query/NeuralQueryBuilder.java

+            queryRewriteContext.registerAsyncAction(
+                ((client, actionListener) -> ML_CLIENT.inferenceSentence(modelId(), queryText(), ActionListener.wrap(floatList -> {
+                    vectorSetOnce.set(vectorAsListToArray(floatList));
+                    actionListener.onResponse(null);


I think the reason for calling onResponse(null) is that we have already set the response on the vector supplier.

We are not setting the response, but instead setting the value of the supplier.

I meant the value only.

navneet1v · 2022-10-15T06:04:54Z

src/main/java/org/opensearch/neuralsearch/plugin/query/NeuralQueryBuilder.java

+            queryRewriteContext.registerAsyncAction(
+                ((client, actionListener) -> ML_CLIENT.inferenceSentence(modelId(), queryText(), ActionListener.wrap(floatList -> {
+                    vectorSetOnce.set(vectorAsListToArray(floatList));
+                    actionListener.onResponse(null);


Can we add some more details on how this rewrite function works with the first if condition (vectorSupplier != null, and so on)?
What I am getting from this whole function is that the rewrite function can be called at least 2 times: one time vectorSupplier will be null, and the next time it will not be null. Is that understanding correct?

Yes, will add additional context in the comment. From my understanding, the query will be rewritten until the rewrite method returns the object itself. So, on the first rewrite, the supplier will be null. On the second rewrite, the supplier will be set, but the vector may be null, depending on whether the async call finished in time. Let me double check this though and add a comment.

My understanding is that if we keep on returning the NeuralQueryBuilder, then rewrites will keep on happening (which we are doing).

To stop the query rewrite, we added the (vectorSupplier() != null && vectorSupplier.get() != null) condition so that we can return another QueryBuilder (KNNQueryBuilder), which only happens when we have a response from MLCommonsClientAccessor.

Please confirm whether this understanding is correct.

zane-neo · 2022-10-16T02:08:23Z

src/main/java/org/opensearch/neuralsearch/plugin/query/NeuralQueryBuilder.java

+            queryRewriteContext.registerAsyncAction(
+                ((client, actionListener) -> ML_CLIENT.inferenceSentence(modelId(), queryText(), ActionListener.wrap(floatList -> {
+                    vectorSetOnce.set(vectorAsListToArray(floatList));
+                    actionListener.onResponse(null);


My understanding here: the doRewrite method is invoked by rewriteAndFetch, which is recursive because it wraps its own invocation in the listener. Once the predict invocation is done, the vectorSupplier is populated with valid data and a KNNQueryBuilder is returned. The next recursive call hits the check on vectorSupplier and returns the KNNQueryBuilder again, and then this method stops.

When the action gets invoked in executeAsyncActions, the actionListener.onResponse(null) call here invokes the wildcard listener's onResponse to count down. So this is a mandatory and necessary operation.

Right, so we need to call the listener with null so that the countdown happens.

@jmazanec15 please add these code links as documentation on top of this code.
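A minimal sketch of the countdown behaviour described in this thread, under stated assumptions: SketchCountDownListener and SketchAsyncRunner are hypothetical names, and the real logic lives in OpenSearch's rewrite context, whose details may differ. The point it illustrates is that the value passed to onResponse is ignored; the call itself is what counts down, so an action that stores its result elsewhere still has to call onResponse(null) for the rewrite to continue.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical counting listener: the response value is ignored, only the call matters.
final class SketchCountDownListener {
    private final AtomicInteger remaining;
    private final Runnable onAllDone;

    SketchCountDownListener(int count, Runnable onAllDone) {
        this.remaining = new AtomicInteger(count);
        this.onAllDone = onAllDone;
    }

    void onResponse(Object ignored) {
        // Continue the flow only after every registered action has reported back.
        if (remaining.decrementAndGet() == 0) {
            onAllDone.run();
        }
    }
}

final class SketchAsyncRunner {
    // Runs every registered action with a shared listener and continues once all have responded.
    static void executeAsyncActions(List<Consumer<SketchCountDownListener>> actions, Runnable onAllDone) {
        if (actions.isEmpty()) {
            onAllDone.run();
            return;
        }
        SketchCountDownListener listener = new SketchCountDownListener(actions.size(), onAllDone);
        for (Consumer<SketchCountDownListener> action : actions) {
            action.accept(listener);
        }
    }
}
```

In this model, an action that sets the vector but never calls listener.onResponse(null) leaves the counter above zero, so the continuation never runs.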

ylwu-amzn · 2022-10-18T05:00:20Z

src/main/java/org/opensearch/neuralsearch/plugin/query/NeuralQueryBuilder.java

-        // Rewrites will continuously happen until the supplier is set and the vector is generated. Rewrites will stop
-        // once this object is returned. Hence, if we get here, we need to return a new object
-        return new NeuralQueryBuilder(fieldName(), queryText(), modelId(), k(), vectorSupplier());
+        return this;


I think the third commit fixed the rescore exception about too many rewrite rounds. The key part is not returning a new NeuralQueryBuilder. That's why the PoC code worked for rescore before.

Thanks for fixing the rescore issue. The code LGTM.
Please also test other possible use cases.

ylwu-amzn

LGTM

navneet1v · 2022-10-18T15:15:39Z

src/main/java/org/opensearch/neuralsearch/plugin/query/NeuralQueryBuilder.java

+            queryRewriteContext.registerAsyncAction(
+                ((client, actionListener) -> ML_CLIENT.inferenceSentence(modelId(), queryText(), ActionListener.wrap(floatList -> {
+                    vectorSetOnce.set(vectorAsListToArray(floatList));
+                    actionListener.onResponse(null);


@jmazanec15 please add these code links as documentation on top of this code.

navneet1v · 2022-10-18T15:16:56Z

src/main/java/org/opensearch/neuralsearch/plugin/query/NeuralQueryBuilder.java

+                    actionListener.onResponse(null);
+                }, actionListener::onFailure)))
+            );
+            return new NeuralQueryBuilder(fieldName(), queryText(), modelId(), k(), vectorSetOnce::get);


[Query]: Is there any reason why we are returning a new query object here instead of "this"? Also, I am not able to understand in which case line number 218 will be hit.
vectorSupplier can be null or not null.

We are not returning this here because we are changing the supplier of the query builder. If we return this, this check https://github.com/opensearch-project/OpenSearch/blob/e44158d4d10d4f8905895ffa50bf9398b8550667/server/src/main/java/org/opensearch/index/query/Rewriteable.java#L109 will see the same reference, which indicates that another round of rewrites does not need to be performed. So instead, we copy it. This is how Geo does it: https://github.com/opensearch-project/OpenSearch/blob/main/server/src/main/java/org/opensearch/index/query/AbstractGeometryQueryBuilder.java#L509.

vectorSupplier can be null or not null.

Good point. I'll simplify.

Signed-off-by: John Mazanec <[email protected]>

jmazanec15 requested a review from a team October 13, 2022 21:10

navneet1v reviewed Oct 13, 2022

src/main/java/org/opensearch/neuralsearch/common/VectorUtil.java

Update access to package

cb516ce

Signed-off-by: John Mazanec <[email protected]>

jmazanec15 requested a review from navneet1v October 13, 2022 22:01

navneet1v reviewed Oct 14, 2022

jmazanec15 requested a review from navneet1v October 14, 2022 04:18

navneet1v reviewed Oct 15, 2022

zane-neo reviewed Oct 16, 2022

jmazanec15 requested review from navneet1v and zane-neo October 17, 2022 18:56

jmazanec15 force-pushed the issue-14 branch from 1bbde5e to b7b5af3 Compare October 18, 2022 00:55

ylwu-amzn reviewed Oct 18, 2022

ylwu-amzn approved these changes Oct 18, 2022

navneet1v reviewed Oct 18, 2022

jmazanec15 force-pushed the issue-14 branch 2 times, most recently from 43540a3 to b1d826c Compare October 18, 2022 17:33

Add fix for doRewrite to query builder

6c236ff

Signed-off-by: John Mazanec <[email protected]>

jmazanec15 force-pushed the issue-14 branch from b1d826c to 6c236ff Compare October 18, 2022 17:37

Refactor doRewrite method

00585a3

Signed-off-by: John Mazanec <[email protected]>

jmazanec15 requested a review from navneet1v October 18, 2022 17:45

navneet1v approved these changes Oct 19, 2022

jmazanec15 merged commit 272d803 into opensearch-project:main Oct 19, 2022

jmazanec15 mentioned this pull request Oct 19, 2022

Upgrade plugin to 2.4 and refactor zip dependencies #25

Merged

1 task

jmazanec15 added the Features label (Introduces a new unit of functionality that satisfies a requirement) Nov 3, 2022
