feat: add acs vector store #2041
Hey @aydan-at-microsoft 👋! We use semantic commit messages to streamline the release process. Examples of commit messages with semantic prefixes:
To test your commit locally, please follow our guide on building from source.
Summary by GPT-4
The code changes in this commit include:
- Adding support for vector fields in Azure Search Writer.
- Adding a new option vectorCols to specify the vector columns and their dimensions.
- Modifying the writeHelper function to accept an additional parameter isVectorField.
- Adding new test cases for handling vector fields.
The main purpose of these changes is to enable users to work with vector fields in Azure Search Writer, which can be useful for scenarios like similarity search and other machine learning tasks that require vector representations of data.
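To make the vectorCols option in the summary above more concrete, here is a minimal sketch of how such an option could be parsed and checked. It assumes the option is passed as a JSON array of objects with name and dimension keys; that shape, the helper name parse_vector_cols, and the specific checks (top-level names only, positive integer dimensions, no duplicates) are assumptions for illustration, not the code in this PR.

```python
import json


def parse_vector_cols(option_value):
    """Parse a JSON-encoded vectorCols option into {column name: dimension}.

    Hypothetical helper for illustration only; the JSON shape and the checks
    below are assumptions, not the SynapseML implementation.
    """
    specs = json.loads(option_value)
    if not isinstance(specs, list):
        raise ValueError("vectorCols must be a JSON array of objects")

    dimensions = {}
    for spec in specs:
        name = spec["name"]
        dimension = spec["dimension"]
        # Assumed rule: vector fields must be top-level, so reject dotted paths.
        if "." in name:
            raise ValueError(f"nested vector field not supported: {name}")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if (
            isinstance(dimension, bool)
            or not isinstance(dimension, int)
            or dimension <= 0
        ):
            raise ValueError(f"dimension for {name} must be a positive integer")
        if name in dimensions:
            raise ValueError(f"duplicate vector column: {name}")
        dimensions[name] = dimension
    return dimensions


# Example: one vector column named "embedding" with 1536 dimensions.
vector_cols = parse_vector_cols('[{"name": "embedding", "dimension": 1536}]')
print(vector_cols)
```

Validating the option up front means a malformed specification fails before any index is created or any rows are written.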
Suggestions
The changes in this PR look good and no suggestions are needed.
cognitive/src/main/scala/com/microsoft/azure/synapse/ml/cognitive/search/AzureSearch.scala
cognitive/src/main/scala/com/microsoft/azure/synapse/ml/cognitive/search/AzureSearchAPI.scala
...itive/src/test/scala/com/microsoft/azure/synapse/ml/cognitive/search/SearchWriterSuite.scala
Amazing Start!
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
Codecov Report
@@            Coverage Diff             @@
##           master    #2041      +/-   ##
==========================================
- Coverage   87.07%   87.01%   -0.06%
==========================================
  Files         306      306
  Lines       16063    16117      +54
  Branches      852      858       +6
==========================================
+ Hits        13987    14025      +38
- Misses       2076     2092      +16
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
* add vector column option
* add the vector option
* vector fields are added and code compiles, untested
* fix bug on checkparity when the index exists
* add FloatType to edm-spark type conversions
* fix synonymmap
* core functionality works
* add no nested field vector check
* add vector validation check
* modify vector columns behavior when column doesn't exist in df schema
* add another test
* clean up the unit test file
* add more tests
* add openai embedding pipeline test
* address comments
* address comments
* address comments
* update notebook
* change index name in notebook
chore: bump to spark 3.3.1
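The commit list above mentions adding FloatType to the EDM/Spark type conversions. As a rough sketch of what such a conversion can look like, the snippet below maps a few Spark type names to Azure Search EDM type names and builds a simplified field entry for a vector column. The mapping table, the function names edm_type and vector_field, and the reduced field shape (name, type, dimensions only) are assumptions for illustration; the actual conversion lives in the Scala sources touched by this PR and may differ.

```python
# Assumed mapping from Spark type names to Azure Search EDM type names.
# Illustrative only; not copied from the SynapseML sources.
SPARK_TO_EDM = {
    "string": "Edm.String",
    "boolean": "Edm.Boolean",
    "integer": "Edm.Int32",
    "long": "Edm.Int64",
    "double": "Edm.Double",
    "float": "Edm.Single",  # the kind of entry a FloatType conversion adds
}


def edm_type(spark_type_name, is_array=False):
    """Return the EDM type name for a Spark type name (hypothetical helper)."""
    if spark_type_name not in SPARK_TO_EDM:
        raise ValueError(f"unsupported Spark type: {spark_type_name}")
    base = SPARK_TO_EDM[spark_type_name]
    # Arrays map to EDM collections of the element type.
    return f"Collection({base})" if is_array else base


def vector_field(name, dimension):
    """Build a simplified, hypothetical index field entry for a vector column."""
    return {
        "name": name,
        "type": edm_type("float", is_array=True),
        "dimensions": dimension,
    }


print(vector_field("embedding", 1536))
```

Vector fields in Azure Search are collections of single-precision floats, which is why a float conversion is a prerequisite for writing embeddings.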
Related Issues/PRs
#xxx
What changes are proposed in this pull request?
Add ACS vector store for vector search.
How is this patch tested?
Does this PR change any dependencies?
Does this PR add a new feature? If so, have you added samples on website?
- In the website/docs/documentation folder, make sure you choose the correct class (estimators/transformers) and namespace.
- Make sure DocTable points to the correct API link.
- Run yarn run start to make sure the website renders correctly.
- Add <!--pytest-codeblocks:cont--> before each Python code block to enable auto-tests for Python samples.
- Make sure the WebsiteSamplesTests job passes in the pipeline.