Handle hallucinated tool execution requests #1244
f700e5b to 30f2ddb
Thanks a lot!
I'll have a closer look tomorrow, but for now I added a comment about the test
...ama/runtime/src/test/java/io/quarkiverse/langchain4j/ollama/OllamaChatLanguageModelTest.java
As for the tests, I would much prefer to have a tool test like the ones in the OpenAI module.
30f2ddb to 7191574
I moved the tests and rewrote them to match the WireMock style of the other Ollama tests, but I only saw your latest remark after I was done. Are you talking about [...]? I am not sure I understand the setup of those.
Right. It's similar to the other WireMock tests, and the idea is to be able to track whether a tool has been called or not.
7191574 to 16e7724
I have added a scenario test similar to the ones in the openai package. I had to make the called property static since I was unable to use an AtomicInteger like in [...]. I kept the other tests because I feel that asserting the filtering of the execution requests is necessary to ensure that hallucinated calls are actually filtered out. Without those other test cases, I think we would have blind spots.
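For readers following along, the tracking idea discussed above might look roughly like the sketch below. It is not the code from this PR: `ToolCallTracking`, `Calculator`, and `dispatch` are hypothetical names, and the real test wires the tool through Quarkus and a WireMock stub rather than a hand-written dispatcher.

```java
public class ToolCallTracking {

    // Hypothetical tool class; the real test would use a bean with tool-annotated methods.
    public static class Calculator {
        // Static so the test can read the flag without holding the instance
        // that the framework created.
        public static volatile boolean called = false;

        public int add(int a, int b) {
            called = true;
            return a + b;
        }
    }

    // Simplified stand-in for tool dispatch: runs the tool only when the
    // requested name matches the single registered tool ("add").
    public static void dispatch(String requestedTool, Calculator calculator) {
        if ("add".equals(requestedTool)) {
            calculator.add(1, 2);
        }
        // Any other (hallucinated) tool name is silently ignored.
    }
}
```

A test would reset `Calculator.called` to `false` before each scenario, trigger the chat, and then assert on the flag.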
16e7724 to 62bf751
Thanks!
62bf751 to 6adf4d5
Fixed/ran the formatting.
6adf4d5 to 2cbebc1
Fixed the sorting of imports.
Not sure why I needed to add the quarkus-junit5 dependency for my test to run.
Added behavior to filter out hallucinated toolExecutionRequests.
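As a rough illustration of the filtering idea (not the actual change in this PR), the sketch below drops any request whose tool name is not among the registered tools. `ToolRequest` is a hypothetical stand-in for the library's tool execution request type, and `filterKnown` is an assumed helper name.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class HallucinatedToolFilter {

    // Hypothetical stand-in for a tool execution request; not the langchain4j type.
    public record ToolRequest(String name, String arguments) {}

    // Keeps only the requests whose tool name is in the set of registered tools.
    // Requests for unknown (hallucinated) tools are dropped instead of causing a failure.
    public static List<ToolRequest> filterKnown(List<ToolRequest> requests,
                                                Set<String> knownToolNames) {
        return requests.stream()
                .filter(request -> knownToolNames.contains(request.name()))
                .collect(Collectors.toList());
    }
}
```

With the registered tools `{"add"}`, a response requesting both `add` and an invented tool such as `fly` would be reduced to the single `add` request.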
I had to use some reflection to write a test with a mocked client in OllamaChatLanguageModel. I am happy to hear suggestions.
I am also not happy with the assertions in the test "doesNotCrashIfToolNotPresent", but I could not find out how to get the content of the message builder from "chatResponseWithHalucinatedTool" to be present in the resulting ChatResponse. I think it has something to do with missing ToolExecutors, but adding breakpoints in the ToolExecutors present in the repo did not help since they were never hit.
Fixes #1232 (hallucinated toolExecutionRequests)