
.Net: OpenAI connector in ASP .NET Core buffers the response when using GetStreamingTextContentsAsync #5116

Closed
fabio-sp opened this issue Feb 22, 2024 · 4 comments

@fabio-sp

Describe the bug
In an ASP.NET Core controller I want to stream the token responses of a prompt issued to OpenAI back to the client. Despite yield-returning the individual response tokens, the response appears to be buffered server side until the entire response stream from OpenAI has been consumed, and only then is it returned to the client.

To Reproduce
Here is a simple endpoint implementation to reproduce the problem:

[HttpGet("open-ai")]
public async IAsyncEnumerable<string> GetWithOpenAI()
{
    var prompt = "Write a short poem about cats";
    var kernel = Kernel.CreateBuilder().AddOpenAIChatCompletion("model-id", "api-key").Build();
    var textCompletionService = kernel.GetRequiredService<ITextGenerationService>();

   await foreach (var textStreamingResult in textCompletionService.GetStreamingTextContentsAsync(prompt))
    {
        yield return textStreamingResult.Text;
    }
} 

The full example can be found here: https://github.com/fabio-sp/sk-streaming-sample-webapi

Expected behavior
The response is streamed to the client from the controller as it is produced, without waiting for the whole LLM response to complete before returning.

Platform

  • OS: Windows, Mac
  • IDE: Visual Studio, VS Code, JetBrains Rider
  • Language: C#
  • Source: NuGet package version 1.4.0

Additional context
The problem seems to affect both the AzureOpenAI and OpenAI connectors; I could not test the other connectors as I have no access to the other platforms. The issue also occurs when using the InvokePromptStreamingAsync method directly on the Kernel instance, or when using IChatCompletionService with the GetStreamingChatMessageContentsAsync method (see the sketch below). All the different tests I made can be found in the repository linked above.
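
For instance, a chat-based variant along these lines shows the same buffering (a minimal sketch mirroring the endpoint above; the route and method name are illustrative):

// Requires: using Microsoft.SemanticKernel.ChatCompletion;
[HttpGet("open-ai-chat")]
public async IAsyncEnumerable<string> GetWithOpenAIChat()
{
    var kernel = Kernel.CreateBuilder().AddOpenAIChatCompletion("model-id", "api-key").Build();
    var chatService = kernel.GetRequiredService<IChatCompletionService>();

    var history = new ChatHistory();
    history.AddUserMessage("Write a short poem about cats");

    // Buffered the same way; kernel.InvokePromptStreamingAsync(prompt) behaves alike.
    await foreach (var chunk in chatService.GetStreamingChatMessageContentsAsync(history))
    {
        yield return chunk.Content ?? string.Empty;
    }
}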

We were using SK version 1.0.0-beta-3, and with ITextCompletion we had no such problem; a lot has changed since then.

@markwallace-microsoft markwallace-microsoft added the .NET (Issue or Pull requests regarding .NET code) and triage labels Feb 22, 2024
@github-actions github-actions bot changed the title .NET: OpenAI connector in ASP .NET Core buffers the response when using GetStreamingTextContentsAsync .Net: OpenAI connector in ASP .NET Core buffers the response when using GetStreamingTextContentsAsync Feb 22, 2024
@Krzysztof318
Contributor

Similar to issue #4627

@stephentoub
Member

stephentoub commented Feb 22, 2024

This is likely this bug in the Azure SDK:
Azure/azure-sdk-for-net#41838

It was fixed in Azure/azure-sdk-for-net#41844, but a new build with the fix hasn't been published to NuGet yet.

In the meantime, try adding await Task.Yield() in your foreach loop and see if that improves the streaming.
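
Applied to the repro endpoint above, that workaround would look roughly like this (a sketch; only the Task.Yield() call is new):

await foreach (var textStreamingResult in textCompletionService.GetStreamingTextContentsAsync(prompt))
{
    // Force an asynchronous hop on each iteration so ASP.NET Core can
    // flush each chunk to the client instead of buffering the whole loop.
    await Task.Yield();
    yield return textStreamingResult.Text;
}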

@markwallace-microsoft markwallace-microsoft self-assigned this Feb 22, 2024
@markwallace-microsoft
Copy link
Member

@fabio-sp based on the response from Stephen, it looks like this isn't a Semantic Kernel issue, so I'm going to close this.

@fabio-sp
Copy link
Author

I can confirm that the workaround suggested by @stephentoub works fine. Thank you!
