support question answering model #2208

rbhavna · 2024-03-16T03:58:34Z

Description

Support question answering models. This PR adds question answering to the list of models supported by ml-commons. The model expects a question and a context, and returns an answer based on the provided context. Below is a sample Predict API request to a QA model and its expected output:

POST /_plugins/_ml/models/m6LBVI4BuIvXgszWP7KN/_predict
{
    "question": "Where do I live",
    "context":  "I am Clara. I live in Texas."
}

// Response
{
    "inference_results": [
        {
            "output": [
                {
                    "result": "Texas"
                }
            ]
        }
    ]
}

Issues Resolved

#1873

Check List

New functionality includes testing.
- All tests pass
New functionality has been documented.
- New functionality has javadoc added
Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

dhrubo-os · 2024-03-16T04:00:44Z

common/src/main/java/org/opensearch/ml/common/input/MLInput.java

@@ -269,7 +269,7 @@ public static MLInput parse(XContentParser parser, String inputAlgoName) throws
            }
        }
        MLInputDataset inputDataSet = null;
-        if (algorithm == FunctionName.TEXT_EMBEDDING || algorithm == FunctionName.SPARSE_ENCODING || algorithm == FunctionName.SPARSE_TOKENIZE) {
+        if (algorithm == FunctionName.TEXT_EMBEDDING || algorithm == FunctionName.SPARSE_ENCODING || algorithm == FunctionName.SPARSE_TOKENIZE || algorithm == FunctionName.QUESTION_ANSWERING) {


Let's add a map, this branch will get longer and longer....
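A minimal sketch of that suggestion, not the change that was merged: the chain of `||` comparisons can be replaced by a single membership test against a fixed collection. This sketch uses an `EnumSet` rather than a map, since only membership is needed. The `FunctionName` enum below is a hypothetical stand-in that lists only the values mentioned in the diff plus one other, and `TEXT_DOCS_ALGORITHMS` and `usesTextDocsInput` are illustrative names that do not come from ml-commons.

```java
import java.util.EnumSet;
import java.util.Set;

public class TextDocsAlgorithms {
    // Hypothetical stand-in for org.opensearch.ml.common.FunctionName;
    // only a handful of values are listed for illustration.
    enum FunctionName { TEXT_EMBEDDING, SPARSE_ENCODING, SPARSE_TOKENIZE, QUESTION_ANSWERING, KMEANS }

    // Assumed name: algorithms that share the same input parsing branch,
    // mirroring the condition in the diff above.
    static final Set<FunctionName> TEXT_DOCS_ALGORITHMS = EnumSet.of(
        FunctionName.TEXT_EMBEDDING,
        FunctionName.SPARSE_ENCODING,
        FunctionName.SPARSE_TOKENIZE,
        FunctionName.QUESTION_ANSWERING
    );

    // One membership test replaces the growing chain of == comparisons.
    static boolean usesTextDocsInput(FunctionName algorithm) {
        return algorithm != null && TEXT_DOCS_ALGORITHMS.contains(algorithm);
    }

    public static void main(String[] args) {
        System.out.println(usesTextDocsInput(FunctionName.QUESTION_ANSWERING));
        System.out.println(usesTextDocsInput(FunctionName.KMEANS));
    }
}
```

With this shape, supporting another algorithm means adding one entry to the set instead of extending the `if` condition.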

dhrubo-os · 2024-03-16T04:07:17Z

...main/java/org/opensearch/ml/engine/algorithms/question_answering/QuestionAnsweringModel.java

+
+@Log4j2
+@Function(FunctionName.QUESTION_ANSWERING)
+public class QuestionAnsweringModel extends DLModel {


This model will also eventually use TextEmbeddingModelConfig, which is not ideal. Let's create a separate model config for this type of model.

I am trying to see which config fields we will need for the QA model and will add them accordingly.

...java/org/opensearch/ml/engine/algorithms/question_answering/QuestionAnsweringTranslator.java

.../java/org/opensearch/ml/engine/algorithms/question_answering/QuestionAnsweringModelTest.java

dhrubo-os · 2024-03-16T04:09:55Z

@HenryL27 could you please review the PR?

codecov · 2024-03-16T04:23:55Z

Codecov Report

Attention: Patch coverage is 62.45734%, with 110 lines in your changes missing coverage. Please review.

Project coverage is 81.70%. Comparing base (189f2a2) to head (4de3643).
Report is 5 commits behind head on main.

❗ Current head 4de3643 differs from pull request most recent head fb127f2. Consider uploading reports for the commit fb127f2 to get more accurate results

Files	Patch %	Lines
.../ml/common/model/QuestionAnsweringModelConfig.java	60.43%	27 Missing and 9 partials ⚠️
...ain/java/org/opensearch/ml/engine/ModelHelper.java	30.76%	34 Missing and 2 partials ⚠️
...n/java/org/opensearch/ml/common/input/MLInput.java	13.33%	11 Missing and 2 partials ⚠️
.../ml/common/input/nlp/QuestionAnsweringMLInput.java	82.05%	3 Missing and 4 partials ⚠️
...ommon/transport/register/MLRegisterModelInput.java	25.00%	3 Missing and 3 partials ⚠️
...rc/main/java/org/opensearch/ml/common/MLModel.java	0.00%	2 Missing and 2 partials ⚠️
...opensearch/ml/common/output/model/ModelTensor.java	0.00%	4 Missing ⚠️
...ansport/upload_chunk/MLRegisterModelMetaInput.java	33.33%	2 Missing and 2 partials ⚠️

Additional details and impacted files

@@             Coverage Diff              @@
##               main    #2208      +/-   ##
============================================
- Coverage     81.90%   81.70%   -0.21%     
- Complexity     5719     5755      +36     
============================================
  Files           547      552       +5     
  Lines         23075    23325     +250     
  Branches       2378     2409      +31     
============================================
+ Hits          18900    19057     +157     
- Misses         3230     3302      +72     
- Partials        945      966      +21

Flag	Coverage Δ
ml-commons	`81.70% <62.45%> (-0.21%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

ylwu-amzn · 2024-03-16T16:55:01Z

common/src/main/java/org/opensearch/ml/common/input/nlp/TextDocsMLInput.java

@@ -25,7 +25,7 @@
 * ML input class which supports a list fo text docs.
 * This class can be used for TEXT_EMBEDDING model.
 */
-@org.opensearch.ml.common.annotation.MLInput(functionNames = {FunctionName.TEXT_EMBEDDING, FunctionName.SPARSE_ENCODING, FunctionName.SPARSE_TOKENIZE})
+@org.opensearch.ml.common.annotation.MLInput(functionNames = {FunctionName.TEXT_EMBEDDING, FunctionName.SPARSE_ENCODING, FunctionName.SPARSE_TOKENIZE, FunctionName.QUESTION_ANSWERING})


I remember you plan to add a new input/output type for the QA model. Will you add it in the next commit?

Yes, I am pushing a new commit with the following input/output formats:

// Input
{
    "question": "What color is Apple",
    "context": "I like Apples. Because they are red"
}

// Output
{
    "inference_results": [
        {
            "output": [
                {
                    "result": "red"
                }
            ]
        }
    ]
}

Do we still need to add FunctionName.QUESTION_ANSWERING here?

Yeah, good catch, we don't need it here anymore. Thanks @dhrubo-os, I will remove it.
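As a rough illustration of the dedicated input type discussed in this thread, the sketch below models a question/context pair, validates both fields, and serializes them to the request body shown above. It is an assumption-laden sketch, not the class added by this PR: `QaInputSketch`, `requireText`, and `toJson` are hypothetical names, and the hand-rolled escaping covers only quotes and backslashes.

```java
public final class QaInputSketch {
    private final String question;
    private final String context;

    public QaInputSketch(String question, String context) {
        // Both fields are required; a QA model needs a question and a context to answer from.
        this.question = requireText(question, "question");
        this.context = requireText(context, "context");
    }

    private static String requireText(String value, String field) {
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException(field + " must not be empty");
        }
        return value;
    }

    // Minimal escaping for illustration only: handles backslashes and double quotes.
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Builds the JSON body in the same shape as the sample Predict request.
    public String toJson() {
        return "{\"question\":\"" + escape(question) + "\",\"context\":\"" + escape(context) + "\"}";
    }

    public static void main(String[] args) {
        QaInputSketch input = new QaInputSketch("What color is Apple", "I like Apples. Because they are red");
        System.out.println(input.toJson());
    }
}
```

Keeping the two fields in their own type, rather than reusing the text-docs input, is what makes the extra `FunctionName.QUESTION_ANSWERING` entry in the annotation above unnecessary.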

opensearch-trigger-bot · 2024-03-19T07:51:06Z

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.x 2.x
# Navigate to the new working tree
cd .worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-2208-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 c560fcca6ee2e413e7ebc2503a2e64ad691e6e2b
# Push it to GitHub
git push --set-upstream origin backport/backport-2208-to-2.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-2208-to-2.x.

opensearch-trigger-bot · 2024-03-19T07:51:24Z

The backport to 2.13 failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.13 2.13
# Navigate to the new working tree
cd .worktrees/backport-2.13
# Create a new branch
git switch --create backport/backport-2208-to-2.13
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 c560fcca6ee2e413e7ebc2503a2e64ad691e6e2b
# Push it to GitHub
git push --set-upstream origin backport/backport-2208-to-2.13
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.13

Then, create a pull request where the base branch is 2.13 and the compare/head branch is backport/backport-2208-to-2.13.

* support question answering model Signed-off-by: Bhavana Ramaram <[email protected]> (cherry picked from commit c560fcc)

rbhavna requested review from b4sjoo, dhrubo-os, jngz-es, model-collapse, ylwu-amzn, zane-neo, Zhangxunmt, austintlee, HenryL27 and sam-herman as code owners March 16, 2024 03:58

rbhavna had a problem deploying to ml-commons-cicd-env March 16, 2024 03:58 — with GitHub Actions Error

rbhavna had a problem deploying to ml-commons-cicd-env March 16, 2024 03:58 — with GitHub Actions Failure

rbhavna temporarily deployed to ml-commons-cicd-env March 16, 2024 03:58 — with GitHub Actions Inactive

dhrubo-os reviewed Mar 16, 2024

rbhavna temporarily deployed to ml-commons-cicd-env March 16, 2024 04:26 — with GitHub Actions Inactive

ylwu-amzn reviewed Mar 16, 2024

rbhavna had a problem deploying to ml-commons-cicd-env March 18, 2024 18:42 — with GitHub Actions Error

rbhavna had a problem deploying to ml-commons-cicd-env March 18, 2024 18:42 — with GitHub Actions Failure

rbhavna temporarily deployed to ml-commons-cicd-env March 18, 2024 18:43 — with GitHub Actions Inactive

rbhavna force-pushed the qa_model branch from 3ec4e8c to 6714bdc Compare March 18, 2024 18:47

rbhavna had a problem deploying to ml-commons-cicd-env March 19, 2024 06:53 — with GitHub Actions Failure

rbhavna had a problem deploying to ml-commons-cicd-env March 19, 2024 06:53 — with GitHub Actions Error

rbhavna had a problem deploying to ml-commons-cicd-env March 19, 2024 06:53 — with GitHub Actions Failure

dhrubo-os approved these changes Mar 19, 2024

rbhavna had a problem deploying to ml-commons-cicd-env March 19, 2024 07:28 — with GitHub Actions Failure

rbhavna had a problem deploying to ml-commons-cicd-env March 19, 2024 07:28 — with GitHub Actions Error

rbhavna had a problem deploying to ml-commons-cicd-env March 19, 2024 07:28 — with GitHub Actions Failure

rbhavna had a problem deploying to ml-commons-cicd-env March 19, 2024 07:28 — with GitHub Actions Error

rbhavna had a problem deploying to ml-commons-cicd-env March 19, 2024 07:28 — with GitHub Actions Failure

b4sjoo approved these changes Mar 19, 2024

rbhavna merged commit c560fcc into opensearch-project:main Mar 19, 2024
3 of 9 checks passed

rbhavna added backport 2.x backport 2.13 labels Mar 19, 2024

rbhavna added a commit to rbhavna/ml-commons that referenced this pull request Mar 19, 2024

support question answering model (opensearch-project#2208)

cd89fb5

* support question answering model Signed-off-by: Bhavana Ramaram <[email protected]> (cherry picked from commit c560fcc)

rbhavna mentioned this pull request Mar 19, 2024

[Backport 2.x] support question answering model (#2208) #2223

Closed

5 tasks

rbhavna added a commit to rbhavna/ml-commons that referenced this pull request Mar 19, 2024

support question answering model (opensearch-project#2208)

07f7a5b

* support question answering model Signed-off-by: Bhavana Ramaram <[email protected]> (cherry picked from commit c560fcc)

rbhavna added a commit to rbhavna/ml-commons that referenced this pull request Mar 19, 2024

support question answering model (opensearch-project#2208)

e0fc394

* support question answering model Signed-off-by: Bhavana Ramaram <[email protected]> (cherry picked from commit c560fcc)

rbhavna mentioned this pull request Mar 19, 2024

[Backport 2.x] support question answering model (#2208) #2224

Merged

5 tasks

rbhavna added a commit to rbhavna/ml-commons that referenced this pull request Mar 19, 2024

support question answering model (opensearch-project#2208)

b704700

* support question answering model Signed-off-by: Bhavana Ramaram <[email protected]> (cherry picked from commit c560fcc)

rbhavna added a commit to rbhavna/ml-commons that referenced this pull request Mar 19, 2024

support question answering model (opensearch-project#2208)

e991801

* support question answering model Signed-off-by: Bhavana Ramaram <[email protected]> (cherry picked from commit c560fcc)

rbhavna mentioned this pull request Mar 19, 2024

[Backport 2.13] support question answering model (#2208) #2225

Merged

5 tasks

rbhavna added a commit that referenced this pull request Mar 19, 2024

support question answering model (#2208) (#2225)

e236177

* support question answering model Signed-off-by: Bhavana Ramaram <[email protected]> (cherry picked from commit c560fcc)

rbhavna added a commit that referenced this pull request Mar 19, 2024

support question answering model (#2208) (#2224)

680da8b

* support question answering model Signed-off-by: Bhavana Ramaram <[email protected]> (cherry picked from commit c560fcc)

jngz-es mentioned this pull request Apr 1, 2024

[FEATURE] Add QA model support #2114

Closed
