Describe the bug
In an ASP.NET Core controller I want to stream the token responses of a prompt issued to OpenAI back to the client. Despite yield returning the individual response tokens, the response appears to be buffered server-side until the entire response stream from OpenAI is consumed, and only then is it returned to the client.
To Reproduce
Here is a simple endpoint implementation to reproduce the problem (the full example can be found at https://github.com/fabio-sp/sk-streaming-sample-webapi):
[HttpGet("open-ai")]
public async IAsyncEnumerable<string> GetWithOpenAI()
{
var prompt = "Write a short poem about cats";
var kernel = Kernel.CreateBuilder().AddOpenAIChatCompletion("model-id", "api-key").Build();
var textCompletionService = kernel.GetRequiredService<ITextGenerationService>();
await foreach (var textStreamingResult in textCompletionService.GetStreamingTextContentsAsync(prompt))
{
yield return textStreamingResult.Text;
}
}
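As a diagnostic, a minimal sketch (the minimal-API endpoint, route, model id, and API key below are illustrative placeholders, not from the original report) that bypasses MVC result serialization by writing each token directly to the response body and flushing explicitly; this can help tell whether the buffering happens in the connector or in the MVC pipeline:

using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.TextGeneration;

var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();

// Hypothetical diagnostic endpoint: bypasses MVC result serialization entirely.
app.MapGet("/open-ai-raw", async (HttpContext context) =>
{
    var kernel = Kernel.CreateBuilder().AddOpenAIChatCompletion("model-id", "api-key").Build();
    var textGenerationService = kernel.GetRequiredService<ITextGenerationService>();

    context.Response.ContentType = "text/plain; charset=utf-8";
    await foreach (var chunk in textGenerationService.GetStreamingTextContentsAsync("Write a short poem about cats"))
    {
        // Write and flush each token immediately so nothing sits in a server-side buffer.
        await context.Response.WriteAsync(chunk.Text ?? string.Empty);
        await context.Response.Body.FlushAsync();
    }
});

app.Run();

If tokens arrive incrementally with this endpoint but not with the controller action, the buffering is likely happening in the MVC serialization layer rather than in the connector.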
Expected behavior
The response is streamed to the client correctly, without waiting for the entire LLM response to complete before returning.
Platform
OS: Windows, Mac
IDE: Visual Studio, VS Code, JetBrains Rider
Language: C#
Source: NuGet package version 1.4.0
Additional context
The problem seems to affect both the AzureOpenAI and OpenAI connectors; I could not test the other connectors, as I have no access to those platforms. The issue also occurs when using the InvokePromptStreamingAsync method directly on the Kernel instance, or when using the IChatCompletionService with the GetStreamingChatMessageContentsAsync method. All the different tests I made can be found in the repository linked above.
We were using SK version 1.0.0-beta-3 with ITextCompletion and had no such problem; a lot has changed since then.
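For reference, a minimal sketch of the IChatCompletionService variant that shows the same buffering (the route, prompt, model id, and API key are illustrative placeholders):

using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;

[HttpGet("open-ai-chat")]
public async IAsyncEnumerable<string> GetWithOpenAIChat()
{
    var kernel = Kernel.CreateBuilder().AddOpenAIChatCompletion("model-id", "api-key").Build();
    var chatCompletionService = kernel.GetRequiredService<IChatCompletionService>();

    var chatHistory = new ChatHistory();
    chatHistory.AddUserMessage("Write a short poem about cats");

    // Chunks are yielded as they arrive, yet the client still receives the
    // whole response in one piece.
    await foreach (var chunk in chatCompletionService.GetStreamingChatMessageContentsAsync(chatHistory))
    {
        yield return chunk.Content ?? string.Empty;
    }
}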