support question answering model #2208

Merged
merged 5 commits into from
Mar 19, 2024

Collaborator

@rbhavna rbhavna commented Mar 16, 2024

Description

Support question answering model. This PR adds a question answering (QA) model to the list of models supported by ml-commons. The model expects a question and a context, and returns an answer based on the context provided. Below is a sample predict API request to a QA model and its expected output:

POST /_plugins/_ml/models/m6LBVI4BuIvXgszWP7KN/_predict
{
    "question": "Where do I live",
    "context":  "I am Clara. I live in Texas."
}
// Response
{
    "inference_results": [
        {
            "output": [
                {
                    "result": "Texas"
                }
            ]
        }
    ]
}
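
As a rough illustration of the request above, the sketch below builds the same POST request from Java using the JDK's `java.net.http` classes. It only constructs the request and does not send it. The class name, the helper names, and the `http://localhost:9200` base URL are assumptions made for this example; they are not part of this PR or of ml-commons.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class QaPredictRequestSketch {

    // Escapes backslashes and double quotes only. This is enough for the sample
    // strings here, but it is not a complete JSON string encoder.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Builds a JSON body with the two fields shown in the PR description.
    static String buildBody(String question, String context) {
        return "{\"question\": \"" + escape(question) + "\", \"context\": \"" + escape(context) + "\"}";
    }

    // Builds (but does not send) the predict request. baseUrl is an assumption
    // supplied by the caller; the path follows the sample request above.
    static HttpRequest buildRequest(String baseUrl, String modelId, String question, String context) {
        URI uri = URI.create(baseUrl + "/_plugins/_ml/models/" + modelId + "/_predict");
        return HttpRequest.newBuilder(uri)
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(buildBody(question, context)))
            .build();
    }

    public static void main(String[] args) {
        String question = "Where do I live";
        String context = "I am Clara. I live in Texas.";
        // Hypothetical local cluster address; the model ID is the one from the sample.
        HttpRequest request = buildRequest("http://localhost:9200", "m6LBVI4BuIvXgszWP7KN", question, context);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(buildBody(question, context));
    }
}
```

In real code a JSON library would be a safer way to build the body, and the request would be sent with an `HttpClient` configured for the cluster's authentication.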
            

Issues Resolved

#1873

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@@ -269,7 +269,7 @@ public static MLInput parse(XContentParser parser, String inputAlgoName) throws
 }
 }
 MLInputDataset inputDataSet = null;
-if (algorithm == FunctionName.TEXT_EMBEDDING || algorithm == FunctionName.SPARSE_ENCODING || algorithm == FunctionName.SPARSE_TOKENIZE) {
+if (algorithm == FunctionName.TEXT_EMBEDDING || algorithm == FunctionName.SPARSE_ENCODING || algorithm == FunctionName.SPARSE_TOKENIZE || algorithm == FunctionName.QUESTION_ANSWERING) {
Collaborator

Let's add a map, this branch will get longer and longer....
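
One way to read this suggestion is sketched below: the chained `||` comparisons are replaced by a lookup in a fixed collection, so supporting another algorithm means adding one entry. The sketch uses an `EnumSet` rather than a map because only a membership test is needed here; a map from function name to parser would be the natural extension if each algorithm needed its own handling. The enum is a hypothetical stand-in for `FunctionName`, and the class, constant, and method names are invented for illustration.

```java
import java.util.EnumSet;
import java.util.Set;

public class TextDocsAlgorithmsSketch {

    // Hypothetical stand-in for FunctionName, reduced to the values in the diff
    // plus one value that should not match.
    enum FunctionName { TEXT_EMBEDDING, SPARSE_ENCODING, SPARSE_TOKENIZE, QUESTION_ANSWERING, KMEANS }

    // Algorithms that take the branch shown in the diff. Adding a new one is a
    // one-line change here instead of a longer if-condition.
    static final Set<FunctionName> TEXT_DOCS_ALGORITHMS = EnumSet.of(
        FunctionName.TEXT_EMBEDDING,
        FunctionName.SPARSE_ENCODING,
        FunctionName.SPARSE_TOKENIZE,
        FunctionName.QUESTION_ANSWERING
    );

    // Equivalent to the chained == comparisons in the original condition.
    static boolean usesTextDocsInput(FunctionName algorithm) {
        return TEXT_DOCS_ALGORITHMS.contains(algorithm);
    }

    public static void main(String[] args) {
        for (FunctionName name : FunctionName.values()) {
            System.out.println(name + " -> " + usesTextDocsInput(name));
        }
    }
}
```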


@Log4j2
@Function(FunctionName.QUESTION_ANSWERING)
public class QuestionAnsweringModel extends DLModel {
Collaborator

This model will also eventually use TextEmbeddingModelConfig, which is not ideal. Let's create a separate model config for this type of model.

Collaborator Author

I am working out which config fields the QA model needs and will add them accordingly.

@dhrubo-os
Collaborator

@HenryL27 could you please review the PR?

codecov bot commented Mar 16, 2024

Codecov Report

Attention: Patch coverage is 62.45734%, with 110 lines in your changes missing coverage. Please review.

Project coverage is 81.70%. Comparing base (189f2a2) to head (4de3643).
Report is 5 commits behind head on main.

❗ Current head 4de3643 differs from pull request most recent head fb127f2. Consider uploading reports for the commit fb127f2 to get more accurate results

Files                                                    Patch %   Lines
.../ml/common/model/QuestionAnsweringModelConfig.java    60.43%    27 Missing and 9 partials ⚠️
...ain/java/org/opensearch/ml/engine/ModelHelper.java    30.76%    34 Missing and 2 partials ⚠️
...n/java/org/opensearch/ml/common/input/MLInput.java    13.33%    11 Missing and 2 partials ⚠️
.../ml/common/input/nlp/QuestionAnsweringMLInput.java    82.05%    3 Missing and 4 partials ⚠️
...ommon/transport/register/MLRegisterModelInput.java    25.00%    3 Missing and 3 partials ⚠️
...rc/main/java/org/opensearch/ml/common/MLModel.java    0.00%     2 Missing and 2 partials ⚠️
...opensearch/ml/common/output/model/ModelTensor.java    0.00%     4 Missing ⚠️
...ansport/upload_chunk/MLRegisterModelMetaInput.java    33.33%    2 Missing and 2 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #2208      +/-   ##
============================================
- Coverage     81.90%   81.70%   -0.21%     
- Complexity     5719     5755      +36     
============================================
  Files           547      552       +5     
  Lines         23075    23325     +250     
  Branches       2378     2409      +31     
============================================
+ Hits          18900    19057     +157     
- Misses         3230     3302      +72     
- Partials        945      966      +21     
Flag Coverage Δ
ml-commons 81.70% <62.45%> (-0.21%) ⬇️


@rbhavna rbhavna temporarily deployed to ml-commons-cicd-env March 16, 2024 04:26 — with GitHub Actions Inactive
@@ -25,7 +25,7 @@
 * ML input class which supports a list fo text docs.
 * This class can be used for TEXT_EMBEDDING model.
 */
-@org.opensearch.ml.common.annotation.MLInput(functionNames = {FunctionName.TEXT_EMBEDDING, FunctionName.SPARSE_ENCODING, FunctionName.SPARSE_TOKENIZE})
+@org.opensearch.ml.common.annotation.MLInput(functionNames = {FunctionName.TEXT_EMBEDDING, FunctionName.SPARSE_ENCODING, FunctionName.SPARSE_TOKENIZE, FunctionName.QUESTION_ANSWERING})
Collaborator

I remember you planned to add a new input/output type for the QA model. Will you add it in the next commit?

Collaborator Author

Yes, I am pushing a new commit with the following input/output formats:

// Input
{
    "question": "What color is Apple",
    "context": "I like Apples. Because they are red"
}
// Output
{
    "inference_results": [
        {
            "output": [
                {
                    "result": "red"
                }
            ]
        }
    ]
}
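
To make the idea of a dedicated input type concrete, here is a minimal sketch of a value class that holds the two fields and rejects missing or blank values up front. It is not the `QuestionAnsweringMLInput` class added by this PR, whose contents are not shown in this conversation; the class name, accessors, and validation rules below are assumptions chosen for illustration.

```java
import java.util.Objects;

public class QaInputSketch {

    // Hypothetical holder for the two request fields shown above.
    static final class QaInput {
        private final String question;
        private final String context;

        QaInput(String question, String context) {
            // Fail early instead of passing incomplete input to the model.
            this.question = Objects.requireNonNull(question, "question must not be null");
            this.context = Objects.requireNonNull(context, "context must not be null");
            if (question.isBlank() || context.isBlank()) {
                throw new IllegalArgumentException("question and context must not be blank");
            }
        }

        String question() {
            return question;
        }

        String context() {
            return context;
        }
    }

    public static void main(String[] args) {
        QaInput input = new QaInput("What color is Apple", "I like Apples. Because they are red");
        System.out.println(input.question() + " | " + input.context());
    }
}
```

Keeping the two fields in their own type, rather than reusing the text-docs input, lets validation and serialization for QA requests evolve without touching the embedding code path.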

Collaborator

Do we still need to add FunctionName.QUESTION_ANSWERING here?

Collaborator Author

@rbhavna rbhavna Mar 18, 2024

Yeah, good catch, we don't need it here anymore. Thanks @dhrubo-os, I will remove it.

@rbhavna rbhavna had a problem deploying to ml-commons-cicd-env March 18, 2024 18:42 — with GitHub Actions Error
@rbhavna rbhavna had a problem deploying to ml-commons-cicd-env March 18, 2024 18:42 — with GitHub Actions Failure
@rbhavna rbhavna temporarily deployed to ml-commons-cicd-env March 18, 2024 18:43 — with GitHub Actions Inactive
@rbhavna rbhavna had a problem deploying to ml-commons-cicd-env March 19, 2024 06:53 — with GitHub Actions Failure
@rbhavna rbhavna had a problem deploying to ml-commons-cicd-env March 19, 2024 06:53 — with GitHub Actions Error
@rbhavna rbhavna had a problem deploying to ml-commons-cicd-env March 19, 2024 07:28 — with GitHub Actions Failure
@rbhavna rbhavna had a problem deploying to ml-commons-cicd-env March 19, 2024 07:28 — with GitHub Actions Error
@rbhavna rbhavna merged commit c560fcc into opensearch-project:main Mar 19, 2024
3 of 9 checks passed
@opensearch-trigger-bot
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.x 2.x
# Navigate to the new working tree
cd .worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-2208-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 c560fcca6ee2e413e7ebc2503a2e64ad691e6e2b
# Push it to GitHub
git push --set-upstream origin backport/backport-2208-to-2.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-2208-to-2.x.

@opensearch-trigger-bot
Contributor

The backport to 2.13 failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.13 2.13
# Navigate to the new working tree
cd .worktrees/backport-2.13
# Create a new branch
git switch --create backport/backport-2208-to-2.13
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 c560fcca6ee2e413e7ebc2503a2e64ad691e6e2b
# Push it to GitHub
git push --set-upstream origin backport/backport-2208-to-2.13
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.13

Then, create a pull request where the base branch is 2.13 and the compare/head branch is backport/backport-2208-to-2.13.

rbhavna added a commit to rbhavna/ml-commons that referenced this pull request Mar 19, 2024
* support question answering model

Signed-off-by: Bhavana Ramaram <[email protected]>
(cherry picked from commit c560fcc)
rbhavna added a commit that referenced this pull request Mar 19, 2024
* support question answering model

Signed-off-by: Bhavana Ramaram <[email protected]>
(cherry picked from commit c560fcc)
4 participants