-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: CohereGenerator #6034
feat: CohereGenerator #6034
Conversation
Signed-off-by: sunilkumardash9 <[email protected]>
Hello @sunilkumardash9! I'm going to take care of the review soon. In the meantime, let's fix the CI. Have you followed the installation instructions in our contributions guidelines? You need to install pre-commit hooks for the CI to run properly your tests. On top of that, after you installed the pre-commit hooks you will need to run Let me know if you have issues! |
Hi @ZanSara, I followed what you suggested. I got two failed checks for Ruff and Codespell. Is it a problem? Though running hooks individually runs fine. |
2. removed commented files in test-cohere_generators 3. removed unused imports Signed-off-by: sunilkumardash9 <[email protected]>
Hey @sunilkumardash9 , if Ruff and Codespell don't pass that means that there are some small issues with your code. Normally the error messages explain what's the issue and how to fix it, and if it's not clear they should be easy to Google. If you struggle with some of them you can share them here and I can help interpreting them. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few notes
2. remove dict casting of metadata in run Signed-off-by: sunilkumardash9 <[email protected]>
Signed-off-by: sunilkumardash9 <[email protected]>
@sunilkumardash9 I see that the CI is now failing because haystack/.github/workflows/tests_preview.yml Line 119 in aaee03a
It should fix the issue. You will need to do the same thing here:
to also make PyLint pass. |
Signed-off-by: sunilkumardash9 <[email protected]>
2. small change in doc string Signed-off-by: sunilkumardash9 <[email protected]>
@sunilkumardash9 I see there's typo in the dependencies list: |
2. changed api key env var from CO_API_KEY to COHERE_API_KEY Signed-off-by: sunilkumardash9 <[email protected]>
@sunilkumardash9 You can ignore the MyPy issue for now (we're working on it, it's a larger issue), while for the others, I forgot to make you update these lines as well:
The change is the same, you just need to add Don't forget to do |
Signed-off-by: sunilkumardash9 <[email protected]>
@ZanSara, I should have scrolled down a bit too. 😅 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few small fixes, overall looking good! I believe we will manage to merge this one soon
Args: | ||
api_key (str): The API key for the Cohere API. | ||
model_name (str): The name of the model to use. | ||
streaming_callback (Callable, optional): A callback function to be called with the streaming response. Defaults to None. | ||
api_base_url (str, optional): The base URL of the Cohere API. Defaults to "https://api.cohere.ai". |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We use a different style for docstrings: it's quite important for our automated API documentation tooling, so let's fix it.
Here is how it should look like:
Args: | |
api_key (str): The API key for the Cohere API. | |
model_name (str): The name of the model to use. | |
streaming_callback (Callable, optional): A callback function to be called with the streaming response. Defaults to None. | |
api_base_url (str, optional): The base URL of the Cohere API. Defaults to "https://api.cohere.ai". | |
Instantiates a `CohereGenerator` component. | |
:param api_key: The API key for the Cohere API. | |
:param model_name: The name of the model to use. | |
:param streaming_callback: A callback function to be called with the streaming response. Defaults to None. | |
:param api_base_url: The base URL of the Cohere API. Defaults to "https://api.cohere.ai". | |
:param kwargs: additional model parameters. These will be used during generation. |
In addition:
- for model_name, let's give some example model names.
- for kwargs, let's add a link to the Cohere documentation where these arguments are listed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking the same, should have stuck with the way it is in GptGenerator. For the Kwargs, I have added the params in doc_string.
replies: List[str] | ||
metadata: List[Dict[str, Any]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure we need these type definitions? Does mypy raises errors if you remove these two lines?
metadata = [{"finish_reason": response[0].finish_reason}] | ||
replies = [response[0].text] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If users want to receive multiple alternative responses from the model, this won't work. Let's leverage the fact that we can return a list:
metadata = [{"finish_reason": response[0].finish_reason}] | |
replies = [response[0].text] | |
metadata = [{"finish_reason": resp.finish_reason} for resp in response] | |
replies = [resp.text for resp in response] |
api_key: str, | ||
model: str = "command", | ||
streaming_callback: Optional[Callable] = None, | ||
api_base_url: str = API_BASE_URL, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the tests you're using cohere.COHERE_API_URL
. Can we use the same here instead of duplicating its content into a constant? In this way, if cohere updates this value we won't need to do anything and the component will still work.
def default_streaming_callback(chunk): | ||
""" | ||
Default callback function for streaming responses from Cohere API. | ||
Prints the tokens of the first completion to stdout as soon as they are received and returns the chunk unchanged. | ||
""" | ||
print(chunk.text, flush=True, end="") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can remove this function, because it's used only in the tests. To test for a streaming callback, you can define it in the tests themselves.
2. Added kwargs doc strings for CohereGenerator 3. removed type hints for metadata and replies 4. use COHERE_API_URL instead of hard coded URL. Signed-off-by: sunilkumardash9 <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @sunilkumardash9 ! Adding some cosmetic docstrings suggestions.
Co-authored-by: Daria Fokina <[email protected]>
Co-authored-by: Daria Fokina <[email protected]>
Co-authored-by: Daria Fokina <[email protected]>
Co-authored-by: Daria Fokina <[email protected]>
Co-authored-by: Daria Fokina <[email protected]>
Hello @sunilkumardash9, sorry for the late feedback. I'm going to take over this PR in order to fix the tests and merge it. Thank you for your contribution! |
Closing this PR in favor of #6395 |
Related Issues
CohereGenerator
#5984Proposed Changes:
CohereGenerator to generate Cohere LLM responses from user-provided prompts
How did you test it?
Unit tests
Checklist
fix:
,feat:
,build:
,chore:
,ci:
,docs:
,style:
,refactor:
,perf:
,test:
.