
fix: The LLM node setting with stream set to False. #9098

Closed

Conversation

Hisir0909
Contributor

Checklist:

Important

Please review the checklist below before submitting your pull request.

  • Please open an issue before creating a PR or link to an existing issue
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I ran dev/reformat(backend) and cd web && npx lint-staged(frontend) to appease the lint gods

Description

Fixed the issue where the LLM node returned no results when the configured model was set not to stream its responses.
Fixes #8998
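
For context, the failure mode was that the node only iterated a generator of chunks, while a model invoked with stream = False hands back a single result object, which fell through unhandled. A minimal standalone sketch of the shape of the fix (the dataclasses are simplified stand-ins for Dify's LLMResult and LLMResultChunk entities, and handle_invoke_result is illustrative rather than the exact patch):

```python
from collections.abc import Generator
from dataclasses import dataclass


# Simplified stand-ins for Dify's LLMResult / LLMResultChunk entities,
# kept minimal so the sketch runs standalone.
@dataclass
class Message:
    content: str


@dataclass
class LLMResult:
    message: Message


@dataclass
class Delta:
    message: Message


@dataclass
class LLMResultChunk:
    delta: Delta


def handle_invoke_result(
    invoke_result: "LLMResult | Generator[LLMResultChunk, None, None]",
) -> Generator[str, None, None]:
    """Yield text from either a blocking or a streaming LLM invocation."""
    if isinstance(invoke_result, LLMResult):
        # stream = False: the model runtime returns one complete result.
        yield invoke_result.message.content
        return
    # stream = True: the model runtime returns a generator of chunks.
    for chunk in invoke_result:
        yield chunk.delta.message.content
```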

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update, included: Dify Document
  • Improvement, including but not limited to code refactoring, performance optimization, and UI/UX improvement
  • Dependency upgrade

Testing Instructions

Please describe the tests that you ran to verify your changes, and provide instructions so we can reproduce them. Please also list any relevant details of your test configuration.

  • Gemini works correctly with stream = False.
  • Gemini works correctly with stream = True.

@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. 🐞 bug Something isn't working labels Oct 9, 2024
@CXwudi
Contributor

CXwudi commented Oct 9, 2024

The stream setting for Gemini in the workflow and chatflow works for me now. Since the change modifies the core part, I also tested with other providers such as OpenAI, Anthropic, and a few open-source LLMs from OpenRouter to make sure it doesn't introduce regressions.

The bug still exists in the chatbot. I believe the same fix applies to the chatbot, but it is totally fine if you just want to fix the bug in workflow and chatflow.

@Hisir0909
Contributor Author

> The stream setting for Gemini in the workflow and chatflow works for me now. Since the change modifies the core part, I also tested with other providers such as OpenAI, Anthropic, and a few open-source LLMs from OpenRouter to make sure it doesn't introduce regressions.
>
> The bug still exists in the chatbot. I believe the same fix applies to the chatbot, but it is totally fine if you just want to fix the bug in workflow and chatflow.

I've barely used this chatbot, so I've forgotten about it.😰

@Hisir0909
Contributor Author

In the CoT (chain-of-thought) agent runner, there is no handling of non-stream return values at all. In the FC (function-calling) runner, however, stream mode is determined by stream_tool_call. I think the stream flag shouldn't be used in the chatbot; updating the YAML description of stream is sufficient.
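
Roughly, the FC runner decides whether to stream from a feature flag on the model schema rather than from any user-facing setting. A simplified sketch; ModelFeature below is a stand-in for that flag, and agent_should_stream is a hypothetical helper, not Dify's actual function:

```python
from enum import Enum


# Stand-in for the feature flag declared on a model's schema.
class ModelFeature(Enum):
    STREAM_TOOL_CALL = "stream-tool-call"


def agent_should_stream(model_features: set[ModelFeature]) -> bool:
    # The FC agent runner streams only when the model declares that it
    # can stream tool calls; a user-set stream flag plays no part here.
    return ModelFeature.STREAM_TOOL_CALL in model_features
```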

@CXwudi
Contributor

CXwudi commented Oct 10, 2024

Sorry for the late reply. I agree that the stream setting is not likely to be used in a chatbot, given that users would usually expect streaming behavior on the UI/UX side. Hence, it is totally fine not to bother fixing it.

@laipz8200 laipz8200 closed this Oct 14, 2024
@Hisir0909
Contributor Author

@laipz8200 Why was it closed? We mean that fixing it under workflow and chatflow is correct; however, this stream setting should be disabled in agent mode, and the error can be ignored.

@laipz8200
Collaborator

> @laipz8200 Why was it closed? We mean that fixing it under workflow and chatflow is correct; however, this stream setting should be disabled in agent mode, and the error can be ignored.

Oops! Sorry for the misunderstanding.

@laipz8200 laipz8200 reopened this Oct 14, 2024
@Hisir0909
Contributor Author

> @laipz8200 Why was it closed? We mean that fixing it under workflow and chatflow is correct; however, this stream setting should be disabled in agent mode, and the error can be ignored.
>
> Oops! Sorry for the misunderstanding.

Thank you!

@laipz8200
Collaborator

laipz8200 commented Oct 14, 2024

Our CI configuration has been updated, could you sync your code with the main branch to pass the CI?

@Hisir0909
Contributor Author

> Our CI configuration has been updated, could you sync your code with the main branch to pass the CI?

Sorry, I just saw this; I've synced now.

@hjlarry
Contributor

hjlarry commented Oct 16, 2024

Hi @Hisir0909 @CXwudi, I hadn't seen this PR before, and I have already removed the stream config from the YAML in PR #9319, for two reasons:

  1. the error above
  2. none of the other models has a stream config; Dify itself controls whether to use stream mode

This PR seems to address the issue that the Gemini model's stream mode responds five times slower than when the stream mode is not enabled. The proposed solution is to let the user decide whether to activate the stream mode. However, I don't believe this is the right approach. The core issue is understanding why the Gemini model's stream mode is so slow. Stream mode should enhance UX, but currently, it detracts from it. This fundamental performance issue should be resolved by Google, not us.

@CXwudi
Contributor

CXwudi commented Oct 16, 2024

> none of the other models has a stream config; Dify itself controls whether to use stream mode

If that's so, then I believe this PR is not needed.

> This PR seems to address the issue that the Gemini model's stream mode responds five times slower than when the stream mode is not enabled. The proposed solution is to let the user decide whether to activate the stream mode. However, I don't believe this is the right approach. The core issue is understanding why the Gemini model's stream mode is so slow. Stream mode should enhance UX, but currently, it detracts from it. This fundamental performance issue should be resolved by Google, not us.

Uhhh, I think there is a misunderstanding here. We are not addressing performance issues, nor have we experienced any slowness from Gemini. This PR is mainly addressing #8998, which is a regression from #8678, as I mentioned in #8678 (comment).

@hjlarry
Contributor

hjlarry commented Oct 16, 2024

> Uhhh, I think there is a misunderstanding here. We are not addressing performance issues, nor have we experienced any slowness from Gemini. This PR is mainly addressing #8998, which is a regression from #8678, as I mentioned in #8678 (comment).

The performance issue is #8652, which #8678 seems to have resolved.

@CXwudi
Contributor

CXwudi commented Oct 16, 2024

I see, sorry that I wasn't aware of the root issue.

@Hisir0909
Contributor Author

> Uhhh, I think there is a misunderstanding here. We are not addressing performance issues, nor have we experienced any slowness from Gemini. This PR is mainly addressing #8998, which is a regression from #8678, as I mentioned in #8678 (comment).
>
> The performance issue is #8652, which #8678 seems to have resolved.

I'd like to mention that I originally raised the performance issue #8652, and I was willing to fix it. However, I didn't have time at that point, and someone else submitted PR #8678. Their approach simply added the stream flag in the YAML file without modifying the code. In Dify, all LLMs are designed to handle both stream and non-stream processing, but the stream parameter is rarely used. This PR now adapts to the stream flag in the YAML file. If you think this is unnecessary or the approach is problematic, I will close the PR.
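
To illustrate what adapting to the YAML stream flag means: the flag arrives mixed in with the other model parameters, so the node has to pop it out and use it as the stream argument of the invoke call instead of hardcoding streaming. split_stream_flag below is a hypothetical helper sketching that idea, not the PR's actual diff:

```python
from typing import Any


def split_stream_flag(
    model_parameters: dict[str, Any],
) -> tuple[dict[str, Any], bool]:
    """Pop a YAML-defined `stream` parameter out of the model parameters.

    The flag must not be forwarded to the provider as a generation
    parameter; it only selects blocking vs. streaming invocation.
    """
    params = dict(model_parameters)
    stream = bool(params.pop("stream", True))  # default to streaming
    return params, stream


# Usage sketch (surrounding names are illustrative):
# params, stream = split_stream_flag(completion_params)
# result = model_instance.invoke_llm(..., model_parameters=params, stream=stream)
```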

@hjlarry
Contributor

hjlarry commented Oct 16, 2024

> I'd like to mention that I originally raised the performance issue #8652, and I was willing to fix it. However, I didn't have time at that point, and someone else submitted PR #8678. Their approach simply added the stream flag in the YAML file without modifying the code. In Dify, all LLMs are designed to handle both stream and non-stream processing, but the stream parameter is rarely used. This PR now adapts to the stream flag in the YAML file. If you think this is unnecessary or the approach is problematic, I will close the PR.

Thank you for explaining the whole story from beginning to end. And the stream flag has already been removed from all the YAML files, so this PR no longer seems necessary. What do you think, @laipz8200?

@laipz8200
Collaborator

Thank you all for providing the context. If the performance issues with Gemini have been resolved, we won't need this PR anymore.

@Hisir0909
Contributor Author

> Thank you all for providing the context. If the performance issues with Gemini have been resolved, we won't need this PR anymore.

Clearly not. Only the 002 version of Gemini Flash behaves normally, because 002 defaults to having the review interface turned off (according to Google). The other Flash versions, including 8B, are very slow in stream mode.

@laipz8200
Collaborator

Since the stream option for the model has been removed, I think this PR is unnecessary. Let's wait for Gemini to fix this issue.

@Hisir0909 Hisir0909 closed this Oct 16, 2024