FEATURE: Enhance embedding functionality with batch and image support. #55

cyrus2281 · 2024-11-01T14:49:56Z

What is the purpose of this change?

The purpose of this change is to enhance the LangChain embeddings component by adding support for batch processing of embeddings and extending its functionality to include image embeddings. This allows for more efficient handling of multiple embeddings in a single operation and broadens the use cases to include image data alongside text data.

How is this accomplished?

This is accomplished by modifying the input schema to accept a list of items instead of a single text input and introducing methods to handle different types of embeddings (document, query, and image). The invoke method has been refactored to process all item types consistently, delegating the actual embedding process to specialized methods for each type.
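As a rough illustration of the dispatch described above, the sketch below normalizes the input to a list and routes it to one handler per embedding type. It is not the PR's actual code: `StubEmbedder`, `embed_items`, and the `embed_image` method are hypothetical names used only for this example, and the stub returns fake one-dimensional vectors instead of calling a real model.

```python
from typing import Any, Callable, Dict, List


class StubEmbedder:
    """Hypothetical stand-in for a LangChain embeddings object."""

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        # Fake vectors: one float per text, derived from its length.
        return [[float(len(t))] for t in texts]

    def embed_query(self, text: str) -> List[float]:
        return [float(len(text))]

    def embed_image(self, uris: List[str]) -> List[List[float]]:
        # Assumed method; not every embedding model supports images.
        return [[float(len(u))] for u in uris]


def embed_items(
    embedder: Any, items: Any, embedding_type: str = "document"
) -> Dict[str, List[List[float]]]:
    # Accept either a single item or a list of items.
    items = items if isinstance(items, list) else [items]

    # One handler per supported type; queries are embedded one at a time.
    handlers: Dict[str, Callable[[List[str]], List[List[float]]]] = {
        "document": embedder.embed_documents,
        "query": lambda queries: [embedder.embed_query(q) for q in queries],
        "image": embedder.embed_image,
    }
    if embedding_type not in handlers:
        raise ValueError(f"Unsupported embedding type: {embedding_type}")
    return {"embeddings": handlers[embedding_type](items)}


result = embed_items(StubEmbedder(), ["hello", "hi"], "query")
```

Keeping the handlers in a dictionary means a new embedding type only needs one more entry, and the output shape stays the same for every type.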

Anything reviewers should focus on/be aware of?

Reviewers should focus on the changes to the input and output schemas and how the invoke method handles the embedding operations for different types of data. Ensure that the batch processing logic works as intended for all supported embedding types and that the new image embedding functionality is properly integrated.

These changes are backward incompatible, but no one is using this component yet.

gitstream-cm · 2024-11-01T14:50:54Z

Please mark whether you used Copilot to assist coding in this PR

Copilot Assisted

cyrus2281 · 2024-11-01T14:51:02Z

src/solace_ai_connector/components/general/langchain/langchain_embeddings.py

+        for query in queries:
+            embeddings.append(self.component.embed_query(query))
+        return {"embeddings": embeddings}


[CMT] LangChain doesn't expose a function to embed a batch of queries in a single call, so I had to loop here.
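For context, LangChain's `Embeddings` interface has `embed_documents`, which takes a list of texts, and `embed_query`, which takes a single string. The sketch below shows the difference in call counts using a hypothetical `CountingEmbedder` stub rather than a real model, so the vectors are placeholders.

```python
from typing import List


class CountingEmbedder:
    """Hypothetical stub mimicking the two LangChain Embeddings methods."""

    def __init__(self) -> None:
        self.calls = 0

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        self.calls += 1  # one call covers the whole batch
        return [[float(len(t))] for t in texts]

    def embed_query(self, text: str) -> List[float]:
        self.calls += 1  # one call per query
        return [float(len(text))]


texts = ["a", "bb", "ccc"]

# Documents: the whole list goes through in one call.
doc_embedder = CountingEmbedder()
doc_vectors = doc_embedder.embed_documents(texts)

# Queries: no list-based method, so each query is embedded in a loop.
query_embedder = CountingEmbedder()
query_vectors = [query_embedder.embed_query(t) for t in texts]
```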

sonarqube-solacecloud · 2024-11-04T15:13:17Z

SonarQube Quality Gate

0 Bugs
0 Vulnerabilities
0 Security Hotspots
0 Code Smells

13.0% Coverage
0.0% Duplication

gregmeldrum

LG

alimosaed · 2024-11-04T16:30:39Z

LG

alimosaed · 2024-11-04T16:22:54Z

src/solace_ai_connector/components/general/llm/langchain/langchain_embeddings.py

        embedding_type = data.get("type", "document")

-        embeddings = None
+        items = [items] if type(items) != list else items


[SUG]: Consider validating the input at this point so that bad values fail early instead of causing an error in later steps. For example, check for None and/or check the content of items against the embedding type (document, image, or query).
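One possible shape for the suggested validation is sketched below. The helper name `validate_items` and the rule that every item must be a non-empty string (text for documents and queries, a path or URI for images) are assumptions made for this example, not part of the PR.

```python
from typing import Any, List

# Types named in the review comment.
SUPPORTED_TYPES = ("document", "query", "image")


def validate_items(items: Any, embedding_type: str) -> List[str]:
    """Return items as a list, or raise ValueError with a clear message."""
    if embedding_type not in SUPPORTED_TYPES:
        raise ValueError(f"Unsupported embedding type: {embedding_type!r}")
    if items is None:
        raise ValueError("'items' must not be None")

    # Accept a single item or a list of items.
    items = items if isinstance(items, list) else [items]
    if not items:
        raise ValueError("'items' must contain at least one entry")

    # Assumption: every supported type is passed as a non-empty string.
    for index, item in enumerate(items):
        if not isinstance(item, str) or not item.strip():
            raise ValueError(
                f"items[{index}] must be a non-empty string "
                f"for type {embedding_type!r}"
            )
    return items
```

Calling a helper like this at the top of `invoke` would surface a clear error message before any embedding call is made.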

alimosaed added 16 commits October 18, 2024 16:35

added litellm component

a3f3870

support chat history

8a7b8a7

trimmed comments

be89285

dynamically get the model parameters

ab3032b

added llm load balancer

0cadcc0

added the AI PR reviewer workflow

2565990

fixed minor issues

9dc1a67

controlled session id

26b07d3

refactored chat history and reused codes

c3d1dab

resolved conflicts

afcc8ff

fixed minor logging issue

9c87e41

reverted minor changes

60f7b9f

handle all LiteLLM inferences and embedding requests by the load balancer

f862a44

updated documents

a4cf8fd

fix: remove useless import command

07e7f2c

refactor: restructure the LLM components

ea59e7e

cyrus2281 self-assigned this Nov 1, 2024

cyrus2281 requested review from efunneko and a team November 1, 2024 14:50

cyrus2281 commented Nov 1, 2024

cyrus2281 force-pushed the cyrus/feature/embedding branch from e57aa96 to a6af36a Compare November 1, 2024 15:53

Added support for batch embedding + image embedding

071e76b

cyrus2281 requested review from gregmeldrum and alimosaed November 1, 2024 17:46

Merge branch 'main' into cyrus/feature/embedding

082b07b

cyrus2281 force-pushed the cyrus/feature/embedding branch from a6af36a to 082b07b Compare November 4, 2024 14:40

cyrus2281 added 3 commits November 4, 2024 09:46

Added init py

ec38d45

typo

d67af28

fixed embedding

b46e939

cyrus2281 force-pushed the cyrus/feature/embedding branch from 73104cc to b46e939 Compare November 4, 2024 15:11

gregmeldrum approved these changes Nov 4, 2024

cyrus2281 merged commit 3b4c99a into main Nov 4, 2024
4 checks passed

cyrus2281 deleted the cyrus/feature/embedding branch November 4, 2024 15:48

alimosaed reviewed Nov 4, 2024
