
Add LiteLLM as an agent for model connections #53

Merged

20 commits merged into main from ap/integrate_litellm on Nov 4, 2024
Conversation

@alimosaed commented Oct 21, 2024

What is the purpose of this change?

  • Easily connect to various LLMs using LiteLLM with minimal coding effort.

How is this accomplished?

  • Users can add a LiteLLM agent to a workflow (YAML file) and set the model parameters (e.g., model name, API key, and base URL), then run the workflow to connect to an LLM (see the sketch after this list).
  • Users can configure the following features for the agent: streaming/non-streaming responses, embedding/inference mode, and enabling/disabling chat history.
  • Users can enable a load balancer to distribute requests among multiple LLM models.
  • Refactored the chat-history logic and reused the shared methods in the LiteLLM and OpenAI components.
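
For illustration, a rough sketch of what such an agent entry might look like in a workflow YAML. Only the `load_balancer`/`litellm_params` shape comes from this PR's diff (quoted later in this conversation); the component name, other config keys, and environment-variable names are placeholders, not verbatim from this change:

```yaml
# Hypothetical sketch of a LiteLLM agent entry in a workflow YAML.
# Only the load_balancer/litellm_params shape comes from this PR's diff;
# all other keys and values are illustrative placeholders.
- component_name: llm_request
  component_module: litellm_chat_model   # see docs/components/litellm_chat_model.md
  component_config:
    load_balancer:
      - model_name: "gpt-4o"             # model alias used by the flow
        litellm_params:
          model: openai/gpt-4o           # provider/model, per LiteLLM naming
          api_key: ${OPENAI_API_KEY}
          api_base: ${OPENAI_API_BASE}
```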

Anything reviewers should focus on/be aware of?
The model name typically combines the model provider and the specific model name (e.g., azure/chatgpt-v-2). Find the exact names in LiteLLM's list of providers.
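
A few illustrative provider-prefixed names (apart from the azure example above, these are assumptions; check LiteLLM's providers list for the exact strings):

```yaml
litellm_params:
  model: azure/chatgpt-v-2   # Azure OpenAI deployment (example from above)
  # model: openai/gpt-4o     # an OpenAI model
  # model: anthropic/claude-3-5-sonnet-20240620   # an Anthropic model
```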

How to test?

  1. Install the Solace AI Connector
  2. Install the litellm Python module: `pip install litellm`
  3. Set the OpenAI keys and models in the 'example/llm/litellm_chat.yaml' file
  4. Run the flow: `cd src && python3 -m solace_ai_connector.main ../config.yaml` (pointing the command at the YAML file from step 3)
  5. Subscribe to the 'demo/question/response' topic and send a message to the 'demo/question' topic

Repeat steps 3 to 5 for 'litellm_embedding.yaml' and 'litellm_chat_with_history.yaml'
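
For step 5, the example transforms quoted later in this conversation read input.payload:text, which suggests the message payload is JSON with a text field. A hypothetical payload (shown as YAML):

```yaml
# Hypothetical payload for the demo/question topic; the `text` field
# matches the input.payload:text selector used in the example flows.
text: "What is the capital of France?"
```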

@alimosaed requested a review from efunneko October 21, 2024 19:36
@alimosaed self-assigned this Oct 21, 2024

gitstream-cm bot commented Oct 21, 2024

Please mark whether you used Copilot to assist coding in this PR

  • Copilot Assisted

@cyrus2281 left a comment

Added some comments

@efunneko (Collaborator) left a comment

A few things to address

docs/components/litellm_chat_model.md
# add any other parameters here
- model_name: "claude-3-5-sonnet" # model alias
  litellm_params:
    model: ${OPENAI_MODEL_NAME}

This example should use different model names and keys here
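
A hypothetical revision along those lines, giving each load-balancer entry its own provider, model, and credentials instead of reusing the OpenAI values (the environment-variable names are placeholders):

```yaml
- model_name: "gpt-4o" # model alias
  litellm_params:
    model: ${OPENAI_MODEL_NAME}
    api_key: ${OPENAI_API_KEY}
- model_name: "claude-3-5-sonnet" # model alias
  litellm_params:
    model: ${ANTHROPIC_MODEL_NAME}
    api_key: ${ANTHROPIC_API_KEY}
```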

        return response

    def prune_history(self, session_id, history):
        current_time = time.time()
I think it may be time to refactor this code, since I expect it is identical to other components

@alimosaed (Author) left a comment

  • Addressed comments
  • Replied to some comments

@cyrus2281 left a comment

LGTM

Comment on lines 86 to 95
    source_expression: |
      template:You are a helpful AI assistant. Please help with the user's request below:
      <user-question>
      {{text://input.payload:text}}
      </user-question>
    dest_expression: user_data.llm_input:messages.0.content
  - type: copy
    source_expression: static:user
    dest_expression: user_data.llm_input:messages.0.role
input_selection:
@cyrus2281 commented Nov 1, 2024

The embedding example should not modify the user query; it should be used to get back the vector.
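
A sketch of what a corrected transform might look like, passing the raw text through rather than templating it into a chat prompt (the destination expression is an assumption about the embedding component's expected input, not code from this PR):

```yaml
# Hypothetical fix: forward the raw user text for embedding instead of
# wrapping it in a chat template; the dest_expression key is assumed.
- type: copy
  source_expression: input.payload:text
  dest_expression: user_data.llm_input:items
```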

@cyrus2281 left a comment

Looks good, thanks.

SonarQube Quality Gate: failed

Failed condition: 40.35% Duplicated Lines (%) on New Code (greater than the 3% maximum)

See analysis details on SonarQube

@alimosaed requested a review from efunneko November 4, 2024 13:57
@efunneko (Collaborator) left a comment

Looks good to me. Thanks for all the refactoring. It is much better organized now.

from ...component_base import ComponentBase
from ....common.log import log

class ChatHistoryHandler(ComponentBase):

It feels strange that this inherits from ComponentBase since it isn't a component; however, I see that it does use a number of the parent's services (timer, kv_store, etc.). I am not necessarily saying we shouldn't do it, but I would be interested in hearing other opinions on it. @cyrus2281

@alimosaed merged commit 948554f into main Nov 4, 2024
2 of 4 checks passed
@efunneko deleted the ap/integrate_litellm branch November 5, 2024 16:50