
Azure OpenAI: remove SseReader's use of subtly blocking StreamReader.EndOfStream #41844

Merged (1 commit into main, Feb 8, 2024)

Conversation

trrwilson
Member

Problem

StreamReader.EndOfStream introduces a subtle but important bit of blocking I/O to the project's server-sent event reader that causes the underlying IAsyncEnumerator for streaming chat completions to invariantly report a completed state for its MoveNext. This causes issues in downstream consumption scenarios that block on this, e.g. via JsonSerializer:

await JsonSerializer.SerializeAsync(Console.OpenStandardOutput(), GetStrings(await client.GetChatCompletionsStreamingAsync(options)));

static async IAsyncEnumerable<string> GetStrings(IAsyncEnumerable<StreamingChatCompletionsUpdate> source)
{
    await foreach (var update in source)
    {
        //await Task.Yield();
        yield return update.ContentUpdate;
    }
}

Fix

Rather than use EndOfStream, we replace the explicit end-of-stream check with an implicit one performed via a null return from ReadLine/ReadLineAsync. This produces the desired behavior without inducing the blocking I/O that erroneously made streaming output appear synchronous.
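A minimal sketch of the pattern involved (illustrative only; the class and method names here are hypothetical and not the SDK's actual SseReader internals). StreamReader.EndOfStream may issue a synchronous read on the underlying stream just to decide whether data remains, whereas ReadLineAsync signals end-of-stream by returning null:

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

class SseReadSketch
{
    // Subtly blocking: EndOfStream can synchronously read from the
    // underlying stream (blocking the thread on a network stream) just
    // to decide whether more data exists.
    static async Task ReadBlockingAsync(StreamReader reader)
    {
        while (!reader.EndOfStream)
        {
            string line = await reader.ReadLineAsync();
            Console.WriteLine(line);
        }
    }

    // Fully async: a null return from ReadLineAsync is the implicit
    // end-of-stream signal, so no synchronous probe is needed.
    static async Task ReadAsync(StreamReader reader)
    {
        string line;
        while ((line = await reader.ReadLineAsync()) != null)
        {
            Console.WriteLine(line);
        }
    }

    static async Task Main()
    {
        var payload = "data: hello\n\ndata: world\n\n";
        using var reader = new StreamReader(
            new MemoryStream(Encoding.UTF8.GetBytes(payload)));
        await ReadAsync(reader);
    }
}
```

With an in-memory stream both variants behave the same; the difference only shows up when the underlying stream is a network stream whose next bytes have not yet arrived.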

Fixes #41838. Thanks/credit to @stephentoub for the investigation and fix confirmation.

@azure-sdk
Collaborator

API change check

API changes are not detected in this pull request.

Contributor

@stephentoub stephentoub left a comment


Thanks for the fix

@stephentoub
Contributor

stephentoub commented Feb 8, 2024

> invariantly report a completed state for its MoveNext. This causes issues in downstream consumption scenarios that block on this, e.g. via JsonSerializer:

The more general impact is scalability: even though it looks like it's async, it's actually doing synchronous I/O, blocking the thread while waiting for data to arrive rather than allowing the thread to do other things in the meantime.

What you cite is also impactful to streaming latency and is how I first noticed an issue: returning one of these IAsyncEnumerables from an ASP.NET minimal API results in that blocking behavior negatively impacting how buffering happens in the implicit JSON serialization, causing the streamed tokens to be batched rather than written to the wire individually.
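For concreteness, the ASP.NET scenario described above might look like the following sketch (the endpoint path and helper name are hypothetical). A minimal API that returns an IAsyncEnumerable<T> gets serialized as a streaming JSON array, so an enumerator that secretly blocks causes the tokens to be buffered and batched rather than flushed as they arrive:

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Builder;

var app = WebApplication.Create(args);

// ASP.NET serializes the IAsyncEnumerable<string> incrementally; if the
// enumerator performs hidden synchronous I/O (as EndOfStream did), the
// serializer's flushing degrades and tokens reach the wire in batches.
app.MapGet("/stream", () => GetTokensAsync());

app.Run();

static async IAsyncEnumerable<string> GetTokensAsync()
{
    for (int i = 0; i < 3; i++)
    {
        await Task.Delay(100); // simulate tokens arriving over time
        yield return $"token-{i}";
    }
}
```

This requires a project using the Web SDK (.NET 6 or later); the point is only that the serialization pipeline's flush cadence depends on the enumerator actually being asynchronous.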

@trrwilson trrwilson merged commit c7c2f4c into main Feb 8, 2024
19 checks passed
@trrwilson trrwilson deleted the user/travisw/aoai-ssereader-eos branch February 8, 2024 16:53
@joakimriedel

@trrwilson I'm trying to find a nightly build that includes this fix, but the feed from the documentation seems to carry only the beta.13 version, which doesn't include it. Could you point me to where the nightly builds are published?

@trrwilson
Member Author

@joakimriedel the package feeds seemingly prioritize the beta/alpha tagging in semantic versioning over recency, which I've always found a bit odd.

If you check the full list of versions, you'll find the nightly/alpha/dev builds beneath the published beta builds:
https://dev.azure.com/azure-sdk/public/_artifacts/feed/azure-sdk-for-net/NuGet/Azure.AI.OpenAI/versions/1.0.0-beta.13

E.g. here's the latest:
https://dev.azure.com/azure-sdk/public/_artifacts/feed/azure-sdk-for-net/NuGet/Azure.AI.OpenAI/overview/1.0.0-alpha.20240214.1

@joakimriedel

@trrwilson oh 🤦‍♂️ thanks a lot, that does make sense given the alpha < beta sort order.


Successfully merging this pull request may close these issues.

[BUG] Azure.Core.Sse.SseReader ends up erroneously blocking
5 participants