
How to fix Gemini stream mode? #8998

Closed
AAEE86 opened this issue Oct 3, 2024 · 9 comments

Labels
🐞 bug Something isn't working

@AAEE86
Contributor

AAEE86 commented Oct 3, 2024

@CXwudi Applying the following to config_kwargs:

config_kwargs.pop("stream", None)

silences the error, but the output is still not streamed. The _handle_generate_stream_response code seems to have a problem.

Originally posted by @AAEE86 in #8678 (comment)
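For context, a minimal sketch of that workaround as a standalone helper, assuming config_kwargs is the parameter dict built from model_parameters before it reaches the SDK's generation config (the helper name and surrounding details are assumptions, not the actual Dify code):

def build_config_kwargs(model_parameters: dict) -> dict:
    # Sketch only: copy so the caller's dict is left untouched
    config_kwargs = dict(model_parameters)
    # `stream` must not reach the generation config, which does not accept it;
    # whether to stream is decided separately by the invoke call
    config_kwargs.pop("stream", None)
    return config_kwargs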


dosubot bot commented Oct 3, 2024

To address the issue with the Gemini stream mode not resulting in stream output, you should focus on the _handle_chat_generate_stream_response method. This method is responsible for processing the streaming response. Here are a few things to check and potentially fix:

  1. Ensure the Response is Properly Iterated: The method iterates over the response object, which should be an Iterator. Make sure that the response object passed to this method is indeed an iterator and is yielding data as expected.

  2. Check the Data Parsing: The method attempts to parse each line of the response as JSON. If the response lines are not in the expected format, this could cause issues. Ensure that the response lines are correctly formatted JSON strings (a defensive parsing sketch follows the snippet below).

  3. Handle the End of Stream: The method checks for a line that equals "[DONE]" to determine the end of the stream. Ensure that this condition is correctly handled and that the stream is properly closed or returned when this line is encountered.

  4. Yielding the Stream Data: The method yields LLMResultChunk objects for each piece of data. Ensure that the choice["delta"]["content"] is not empty before yielding, as the current code skips empty content.

Here is a snippet of the relevant code for reference:

def _handle_chat_generate_stream_response(
    self,
    model: str,
    prompt_messages: list[PromptMessage],
    credentials: dict,
    response: Iterator,
) -> Generator:
    for line in response:
        if not line:
            continue
        line = line.decode("utf-8")
        # remove the first `data: ` prefix
        if line.startswith("data:"):
            line = line[5:].strip()
        try:
            data = json.loads(line)
        except Exception as e:
            if line.strip() == "[DONE]":
                return
        choices = data.get("choices", [])

        stop_reason = ""
        for choice in choices:
            if choice.get("finish_reason"):
                stop_reason = choice["finish_reason"]

            if len(choice["delta"]["content"]) == 0:
                continue
            yield LLMResultChunk(
                model=model,
                prompt_messages=prompt_messages,
                delta=LLMResultChunkDelta(
                    index=0,
                    message=AssistantPromptMessage(content=choice["delta"]["content"], tool_calls=[]),
                    finish_reason=stop_reason,
                ),
            )
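One detail worth noting in the snippet above: if json.loads fails on a line that is not "[DONE]", the loop falls through with data either unset or left over from the previous iteration. A self-contained sketch of a more defensive line parser (parse_stream_line is a hypothetical helper, not part of the codebase):

import json
from typing import Optional

def parse_stream_line(raw: bytes) -> Optional[dict]:
    """Defensively parse one line of the SSE-style stream shown above.

    Returns the decoded JSON payload, or None for anything that should not be
    processed (blank lines, the "[DONE]" sentinel, malformed JSON), so the
    caller never falls through with `data` unset or stale.
    """
    line = raw.decode("utf-8").strip()
    if not line:
        return None
    if line.startswith("data:"):
        line = line[5:].strip()
    if line == "[DONE]":
        return None
    try:
        return json.loads(line)
    except ValueError:  # not valid JSON: skip rather than crash
        return None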

Ensure that the response from the model is correctly formatted and that the streaming logic is correctly implemented in the _handle_chat_generate_stream_response method [1].

To continue talking to Dosu, mention @dosu.

@dosubot dosubot bot added the 🐞 bug Something isn't working label Oct 3, 2024
@Hisir0909
Contributor

We need to know how the stream parameter was initially passed in. I can override this stream inside the function to fix the stream=True case, but when stream is False the non-streaming return value still won't be displayed, because the external stream parameter defaults to True.

@Hisir0909
Contributor

When I changed the return value of _handle_generate_response to also be in stream format, the stream=False issue was also fixed. This indicates that the handling of the return value has a conditional branch based on the stream value. However, I'm currently unsure how to pass the original stream value to the modified function.
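A rough illustration of that idea: wrap the blocking result into a one-chunk generator so both code paths hand back the same shape. This reuses the LLMResultChunk / LLMResultChunkDelta entities from the snippet earlier in the thread; whether the delta accepts a usage field is an assumption.

# Sketch only: adapt a blocking LLMResult into the generator shape produced by
# the streaming path, so downstream code that only handles generators still
# receives the non-streaming output.
def _as_stream(self, model, prompt_messages, result):
    yield LLMResultChunk(
        model=model,
        prompt_messages=prompt_messages,
        delta=LLMResultChunkDelta(
            index=0,
            message=result.message,   # the assistant message from the blocking call
            usage=result.usage,       # assumption: usage rides on the final chunk
            finish_reason="stop",
        ),
    )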

@Hisir0909
Contributor

    def _handle_invoke_result(
        self, invoke_result: LLMResult | Generator
    ) -> Generator[RunEvent | ModelInvokeCompleted, None, None]:
        """
        Handle invoke result
        :param invoke_result: invoke result
        :return:
        """
        if isinstance(invoke_result, LLMResult):
            return

Hmm, okay, I found the place where it is handled. They didn't consider the stream=False case at all... Even though they require non-streaming handling when adding LLMs, they completely ignore the non-streaming return value.
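For illustration, a sketch of how that branch could emit a completion event instead of silently returning — the ModelInvokeCompleted field names here are guesses based on the discussion, not verified against the codebase:

# Sketch only: handle the blocking (stream=False) result instead of dropping it.
def _handle_invoke_result(
    self, invoke_result: LLMResult | Generator
) -> Generator[RunEvent | ModelInvokeCompleted, None, None]:
    if isinstance(invoke_result, LLMResult):
        # emit a single completion event carrying the full text and usage
        yield ModelInvokeCompleted(
            text=invoke_result.message.content,  # assumption: the generated text lives here
            usage=invoke_result.usage,
            finish_reason=None,
        )
        return
    # ... existing handling of the streaming generator continues here ...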

@AAEE86
Contributor Author

AAEE86 commented Oct 8, 2024

I don't quite understand these details. Can you fix this issue? @Hisir0909

@CXwudi
Contributor

CXwudi commented Oct 8, 2024

Hi @Hisir0909, I am just adding my two cents here: _handle_generate_response has nothing to do with it when stream=False. Based on

if stream:
    return self._handle_generate_stream_response(model, credentials, response, prompt_messages)
return self._handle_generate_response(model, credentials, response, prompt_messages)

_handle_generate_stream_response is the method that handles stream output. However, it correctly returns a generator type, which really confuses me.

@Hisir0909
Contributor


What do you mean? When stream=False, is _handle_generate_stream_response still being called here? My current solution is to use a local variable to override the function's stream parameter, because the LLM node always calls it with stream=True, while the user's actual stream setting from the YAML configuration ends up in model_parameters.
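A rough sketch of that local override — the _generate signature, parameter names, and the _request helper below are assumptions for illustration, not the actual provider code:

# Sketch only: the LLM node always invokes with stream=True, so take the user's
# real choice from model_parameters (where the YAML `stream` setting lands).
def _generate(self, model, credentials, prompt_messages, model_parameters,
              tools=None, stop=None, stream=True, user=None):
    stream = model_parameters.pop("stream", stream)  # local override of the argument
    response = self._request(model, credentials, prompt_messages, model_parameters, stream)
    if stream:
        return self._handle_generate_stream_response(model, credentials, response, prompt_messages)
    return self._handle_generate_response(model, credentials, response, prompt_messages)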

@CXwudi
Contributor

CXwudi commented Oct 9, 2024

I see, my bad. I misunderstood your previous statement.

@Hisir0909
Contributor

@AAEE86 @CXwudi Please take a look at my submission. Can it resolve the issue, and are my modifications reasonable? 🐸
