tutorial: generate embedding for arrays of object #2477

Merged
merged 1 commit into from
May 28, 2024

ylwu-amzn
Collaborator

Description

Add tutorial: generate embedding for arrays of object

Issues Resolved

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@ylwu-amzn ylwu-amzn merged commit 0722df1 into opensearch-project:main May 28, 2024
11 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request May 28, 2024
Signed-off-by: Yaliang Wu <[email protected]>
(cherry picked from commit 0722df1)
Create sub-pipeline to generate embedding for one item in the array.

This pipeline contains 3 processors
- set processor: The `text_embedding` processor is unable to identify "_ingest._value.title". You need to copy "_ingest._value.title" to a temporary field for text_embedding to process it.
Collaborator

How will customers know what the name of the temporary field will be? If the original field name is xyz, will the embedding field be xyz_embedding?

Collaborator Author

It can be an arbitrary name; just make sure it is not the same as any field already used in the document.
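To make the naming point concrete, here is a minimal sketch of how the sub-pipeline body could be assembled in Python. Only the `set` step and the `_ingest._value.title` source come from the tutorial text quoted above. The temporary field name `title_tmp`, the output field `_ingest._value.title_embedding`, the trailing `remove` step, and the placeholder model ID are assumptions made for illustration.

```python
import json


def build_sub_pipeline(model_id, source_field="title", tmp_field="title_tmp"):
    """Build an ingest pipeline body that embeds one field of the current array item.

    tmp_field is an arbitrary name (assumption: "title_tmp"); it only needs to
    differ from every field that already exists in the document.
    """
    return {
        "description": "Embed one item of an array (illustrative sketch)",
        "processors": [
            # Copy the current item's field to a temporary top-level field,
            # because text_embedding cannot read "_ingest._value.<field>" directly.
            {
                "set": {
                    "field": tmp_field,
                    "value": "{{_ingest._value." + source_field + "}}",
                }
            },
            # Embed the temporary field. The target name is an assumption.
            {
                "text_embedding": {
                    "model_id": model_id,
                    "field_map": {
                        tmp_field: "_ingest._value." + source_field + "_embedding"
                    },
                }
            },
            # Assumed cleanup step: drop the temporary field again.
            {"remove": {"field": tmp_field}},
        ],
    }


body = build_sub_pipeline("your_embedding_model_id")
print(json.dumps(body, indent=2))
```

The printed JSON is what would be sent as the request body when creating the sub-pipeline; swapping `tmp_field` for any other unused name leaves the rest of the definition unchanged.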

}
```

Create pipeline with foreach processor:
Collaborator

Maybe we can add a link for the foreach processor to explain what this processor does?

Collaborator Author

@ylwu-amzn ylwu-amzn May 28, 2024

Unfortunately, OpenSearch doesn't have documentation for the foreach processor yet; see https://opensearch.org/docs/latest/ingest-pipelines/processors/index-processors/. There is an issue tracking the gap: opensearch-project/documentation-website#4647
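Until that documentation exists, a rough picture of what a foreach processor does may help. The sketch below is plain Python, not OpenSearch code: it builds a possible foreach pipeline body and then models the processor as a loop that exposes each array element as `_ingest._value` and runs an inner step on it. The array field name `books`, the pipeline name `embedding_sub_pipeline`, the temporary field `title_tmp`, and the `fake_embed` function are hypothetical stand-ins.

```python
def foreach_pipeline_body(array_field, sub_pipeline_name):
    """Pipeline body that runs a named sub-pipeline once per array element."""
    return {
        "processors": [
            {
                "foreach": {
                    "field": array_field,
                    "processor": {"pipeline": {"name": sub_pipeline_name}},
                }
            }
        ]
    }


def fake_embed(text):
    # Stand-in for a model call. This is NOT a real embedding.
    return [float(len(text)), float(sum(map(ord, text)) % 97)]


def embed_title(doc, ctx):
    """Toy version of the sub-pipeline: set, embed, remove."""
    item = ctx["_ingest"]["_value"]
    doc["title_tmp"] = item["title"]                        # set
    item["title_embedding"] = fake_embed(doc["title_tmp"])  # text_embedding
    del doc["title_tmp"]                                    # remove


def simulate_foreach(doc, array_field, inner):
    """Run `inner` once per element, exposing the element as _ingest._value."""
    for item in doc.get(array_field, []):
        ctx = {"_ingest": {"_value": item}}
        inner(doc, ctx)
    return doc


pipeline = foreach_pipeline_body("books", "embedding_sub_pipeline")
doc = {"books": [{"title": "first book"}, {"title": "second book"}]}
result = simulate_foreach(doc, "books", embed_title)
print(result)
```

The point of the model is that changes made through `_ingest._value` land on the array element itself, while the temporary field lives at the top level of the document and is gone once each iteration finishes.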

"description": "This is first book"
},
{
"title": "second book",
Collaborator

How did you get this title here?

Collaborator Author

Good catch. This should be removed; I will fix it in the next PR soon.

ylwu-amzn added a commit that referenced this pull request May 28, 2024
Signed-off-by: Yaliang Wu <[email protected]>
(cherry picked from commit 0722df1)

Co-authored-by: Yaliang Wu <[email protected]>
@ochrist-eis

Thanks a lot for providing this tutorial! 👍

@ylwu-amzn ylwu-amzn temporarily deployed to ml-commons-cicd-env May 28, 2024 17:53 — with GitHub Actions Inactive
opensearch-trigger-bot bot pushed a commit that referenced this pull request Sep 30, 2024
Signed-off-by: Yaliang Wu <[email protected]>
(cherry picked from commit 0722df1)
dhrubo-os pushed a commit that referenced this pull request Sep 30, 2024
Signed-off-by: Yaliang Wu <[email protected]>
(cherry picked from commit 0722df1)

Co-authored-by: Yaliang Wu <[email protected]>

5 participants