Evaluate is not working #420

Closed
jackchan0528 opened this issue Mar 23, 2024 · 2 comments · Fixed by #478

@jackchan0528

Following the doc here: https://github.com/NVIDIA/NeMo-Guardrails/blob/develop/nemoguardrails/eval/data/moderation/README.md

I unzipped the required text files into the eval/data/moderation folder and ran these commands:
nemoguardrails evaluate moderation --config=config --dataset-path .\eval\data\moderation\anthropic_harmful.txt --split harmful
nemoguardrails evaluate moderation --config=config --dataset-path .\eval\data\moderation\anthropic_helpful.txt --split helpful

For the harmful one, I got this error:

...\Lib\site-packages\langchain_openai\chat_models\base.py", line 165, in _convert_message_to_dict
    raise TypeError(f"Got unknown type {message}")
TypeError: Got unknown type Y

and for the helpful one, the error is:

...\Lib\encodings\cp1252.py", line 23, in decode
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 4455: character maps to <undefined>

There seem to be two main issues. First, the "Y" could be the answer from the rails ("Yes"), but it is not an instance of any message type handled by `_convert_message_to_dict()` in langchain_openai/chat_models/base.py, quoted below:
```python
def _convert_message_to_dict(message: BaseMessage) -> dict:
    """Convert a LangChain message to a dictionary.

    Args:
        message: The LangChain message.

    Returns:
        The dictionary.
    """
    message_dict: Dict[str, Any]
    if isinstance(message, ChatMessage):
        message_dict = {"role": message.role, "content": message.content}
    elif isinstance(message, HumanMessage):
        message_dict = {"role": "user", "content": message.content}
    elif isinstance(message, AIMessage):
        message_dict = {"role": "assistant", "content": message.content}
        if "function_call" in message.additional_kwargs:
            message_dict["function_call"] = message.additional_kwargs["function_call"]
            # If function call only, content is None not empty string
            if message_dict["content"] == "":
                message_dict["content"] = None
        if "tool_calls" in message.additional_kwargs:
            message_dict["tool_calls"] = message.additional_kwargs["tool_calls"]
            # If tool calls only, content is None not empty string
            if message_dict["content"] == "":
                message_dict["content"] = None
    elif isinstance(message, SystemMessage):
        message_dict = {"role": "system", "content": message.content}
    elif isinstance(message, FunctionMessage):
        message_dict = {
            "role": "function",
            "content": message.content,
            "name": message.name,
        }
    elif isinstance(message, ToolMessage):
        message_dict = {
            "role": "tool",
            "content": message.content,
            "tool_call_id": message.tool_call_id,
        }
    else:
        raise TypeError(f"Got unknown type {message}")
    if "name" in message.additional_kwargs:
        message_dict["name"] = message.additional_kwargs["name"]
    return message_dict
```

For the second issue, I believe the code needs to pass encoding="utf8" when it opens the dataset file.
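As a rough illustration of the encoding point, here is a minimal standalone sketch. It does not reproduce the NeMo Guardrails loader; `read_lines` and the sample file are hypothetical names made up for this example. It shows that a file containing a curly quote reads cleanly with an explicit UTF-8 encoding but fails under cp1252, the default that the traceback above suggests was in use on this Windows setup.

```python
import tempfile
from pathlib import Path


def read_lines(path, encoding="utf-8"):
    """Read non-empty lines with an explicit encoding rather than the platform default.

    Hypothetical helper for illustration; not part of nemoguardrails.
    """
    with open(path, encoding=encoding) as f:
        return [line.strip() for line in f if line.strip()]


# Build a sample file containing curly quotes. The UTF-8 encoding of the
# closing quote (U+201D) ends in byte 0x9d, the byte named in the traceback.
sample = Path(tempfile.mkdtemp()) / "sample.txt"
sample.write_bytes("She said \u201chello\u201d\n".encode("utf-8"))

# Reading with an explicit UTF-8 encoding succeeds on any platform.
lines = read_lines(sample)

# Reading the same bytes as cp1252 fails, because 0x9d has no mapping there.
try:
    read_lines(sample, encoding="cp1252")
    cp1252_failed = False
except UnicodeDecodeError:
    cp1252_failed = True
```

If the dataset files really are UTF-8, passing the encoding explicitly at the `open()` call makes the read independent of the machine's locale.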

@drazvan
Collaborator

drazvan commented Mar 26, 2024

Thanks @jackchan0528. @trebedea should be able to help with this early next week. Let me know if this is urgent and I can try to help as well.

drazvan added the bug and status: in progress labels Mar 26, 2024
trebedea added a commit that referenced this issue Apr 30, 2024
@trebedea
Collaborator

Thanks for reporting this, @jackchan0528. Evaluate was not working with chat LLMs from LangChain: the evaluation package was created before LangChain split off BaseChatModel as a separate base class for chat models.

The fix should solve your main problem. However, I was not able to reproduce the second one, the Unicode error.
Running the following works for me with no errors:

python process_anthropic_dataset.py --dataset-path anthropic_helpful.jsonl --split helpful
nemoguardrails evaluate moderation --config=config --dataset-path .\eval\data\moderation\anthropic_helpful.txt --split helpful

I used the test set from Anthropic HH (test.jsonl.gz).

I will close this; just reopen it if the problems persist.
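For readers who hit the same `TypeError`, the sketch below shows one way a message like "Got unknown type Y" can arise. It rests on an assumption the thread does not confirm: that a plain prompt string reached code expecting a list of message objects. The `HumanMessage` class, both functions, and the prompt text are hypothetical stand-ins written for this example, not the LangChain or NeMo Guardrails code.

```python
from dataclasses import dataclass


@dataclass
class HumanMessage:
    """Hypothetical stand-in for a chat message class; not the LangChain type."""

    content: str


def convert_message_to_dict(message):
    """Simplified analogue of the type dispatch quoted earlier in the thread."""
    if isinstance(message, HumanMessage):
        return {"role": "user", "content": message.content}
    raise TypeError(f"Got unknown type {message}")


def create_message_dicts(messages):
    # Iterating over a str yields single characters, so a bare prompt string
    # fails on its first character instead of being treated as one message.
    return [convert_message_to_dict(m) for m in messages]


prompt = "You are a helpful assistant."  # hypothetical prompt text

# Passing the raw string: the first "message" seen is the character "Y".
try:
    create_message_dicts(prompt)
    error_text = None
except TypeError as exc:
    error_text = str(exc)

# Wrapping the prompt in a message object goes through the dispatch cleanly.
wrapped = create_message_dicts([HumanMessage(content=prompt)])
```

Under this assumption, the "Y" would be the first character of a prompt rather than the model's answer, which would fit the explanation that the evaluation code predates the chat-model interface.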

drazvan pushed a commit that referenced this issue May 8, 2024
drazvan added a commit that referenced this issue May 8, 2024