
Is a Server-Sent Events (SSE) stream over HTTP/2 supported? #207

Closed
SokichiFujita opened this issue Aug 17, 2023 · 9 comments
Labels
kind/support Adopter support requests.

Comments

@SokichiFujita

I'm trying to receive a streamed response based on Server-Sent Events (SSE) over HTTP/2 from OpenAI's chat API, with stream: true in the request and the Accept header set to text/event-stream.
The status code is 200, but the following fatal client error occurs. I would like to know whether SSE is supported.

Swift/ErrorType.swift:200: Fatal error: Error raised at top level: Client error - operationID: createChatCompletion, operationInput: Input(path: OpenAPITest.Operations.createChatCompletion.Input.Path(), query: OpenAPITest.Operations.createChatCompletion.Input.Query(), headers: OpenAPITest.Operations.createChatCompletion.Input.Headers(), cookies: OpenAPITest.Operations.createChatCompletion.Input.Cookies(), body: OpenAPITest.Operations.createChatCompletion.Input.Body.json(OpenAPITest.Components.Schemas.CreateChatCompletionRequest(model: OpenAPITest.Components.Schemas.CreateChatCompletionRequest.modelPayload(value1: nil, value2: Optional(gpt-3.5-turbo)), messages: [OpenAPITest.Components.Schemas.ChatCompletionRequestMessage(role: system, content: "You are the world no1 progrrammer", name: nil, function_call: nil), OpenAPITest.Components.Schemas.ChatCompletionRequestMessage(role: user, content: "Please write fizzbuzz", name: nil, function_call: nil)], functions: nil, function_call: nil, temperature: nil, top_p: nil, n: nil, stream: Optional(true), stop: nil, max_tokens: nil, presence_penalty: nil, frequency_penalty: nil, logit_bias: nil, user: nil))), request: path: /chat/completions, query: <nil>, method: HTTPMethod(value: OpenAPIRuntime.HTTPMethod.(unknown context at $1001d7918).OpenAPIHTTPMethod.POST), header fields: [accept: application/json, content-type: application/json; charset=utf-8], body (prefix): {
  "messages" : [
    {
      "content" : "",
      "role" : "system"
    },
    {
      "content" : "Please write fizzbuzz",
      "role" : "user"
    }
  ],
  "model" : "gpt-3.5-turbo",
  "stream" : true
}, baseURL: https://api.openai.com/v1, response: status: 200, header fields: [openai-organization: user-***, Date: Thu, 17 Aug 2023 08:41:09 GMT, x-ratelimit-limit-tokens: 90000, Content-Type: text/event-stream, cf-ray: ***-***, Server: cloudflare, Cache-Control: no-cache, must-revalidate, x-ratelimit-remaining-tokens: 89968, Alt-Svc: h3=":443"; ma=86400, openai-processing-ms: 6, x-ratelimit-reset-tokens: 21ms, x-ratelimit-remaining-requests: 3499, openai-version: 2020-10-01, Strict-Transport-Security: max-age=15724800; includeSubDomains, Access-Control-Allow-Origin: *, x-ratelimit-limit-requests: 3500, x-request-id: ***, cf-cache-status: DYNAMIC, x-ratelimit-reset-requests: 17ms], body: data: {"id":"chatcmpl-***","object":"chat.completion.chunk","created":1692261669,"model":"gpt-3.5-turbo-0613","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"id":"chatcmpl-***, underlying error: Unexpected Content-Type header: text/event-stream
@czechboy0
Contributor

czechboy0 commented Aug 17, 2023

Hi @SokichiFujita,

I took a look, and there are a few things interacting here.

  1. Neither OpenAPI 3.0.3 nor 3.1.0 explicitly support SSE. However, that doesn't mean that you can't successfully document APIs that stream response bodies.
  2. The OpenAI OpenAPI document seems incomplete. When you pass "stream": true in the request, it returns the content type text/event-stream, but that content type isn't documented in the OpenAPI document. For example, in the createChatCompletion operation, they should add something like:
responses:
  "200":
    description: OK
    content:
      application/json:
        schema:
          $ref: "#/components/schemas/CreateChatCompletionResponse"
+      text/event-stream:
+        schema:
+          type: string
+          format: binary

Once you do that, you will get a new case generated in the Output enum for the binary data. You can then process that raw data according to OpenAI's instructions: split the data on empty lines, then take each JSON event and decode it (e.g. as CreateChatCompletionStreamResponse) using JSONDecoder, and so on. But you'll have to do that manually, since there's no way to express this pattern in OpenAPI, so the generator can't really help much here.
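As a rough illustration of that manual step, here is a minimal Swift sketch. The function name and the simplified handling (one data: line per event, stop marker data: [DONE]) are assumptions based on OpenAI's stream format, not part of the generated code:

```swift
import Foundation

// Sketch of manual SSE parsing for OpenAI's simple framing:
// events separated by an empty line, each of the form "data: <json>",
// terminated by "data: [DONE]". `rawBody` is the buffered
// text/event-stream payload returned by the generated client.
func parseSSEEvents(from rawBody: String) -> [String] {
    rawBody
        // Events are separated by an empty line.
        .components(separatedBy: "\n\n")
        .map { $0.trimmingCharacters(in: .whitespacesAndNewlines) }
        // Keep only "data:" lines and strip the field name.
        .filter { $0.hasPrefix("data:") }
        .map { String($0.dropFirst("data:".count)).trimmingCharacters(in: .whitespaces) }
        // OpenAI signals the end of the stream with "data: [DONE]".
        .filter { $0 != "[DONE]" }
}

let body = "data: {\"x\":1}\n\ndata: {\"x\":2}\n\ndata: [DONE]\n\n"
let events = parseSSEEvents(from: body)
// events == ["{\"x\":1}", "{\"x\":2}"]
```

Each returned string is then a JSON document you can hand to JSONDecoder. Note this sketch ignores multi-line data fields and id:/event: fields, which the general SSE format allows but OpenAI does not use here.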

However, for now, you'll get a buffered data blob, until we address #9, which is planned for next month - at which point, you'd be able to fully asynchronously stream the data and parse it as it comes in.

To summarize:

  • OpenAI (or you, locally) need to update the OpenAPI document to include the second content type they return when "stream": true, and document the returned payload as binary data.
  • You then need to further process the raw data by splitting it into individual events, and parsing the JSON of those events (hopefully the events are documented in the OpenAPI document, so you can use one of the generated Codable types).
  • Finally, once we address Make request/response bodies an async sequence of bytes #9, you'll be able to fully asynchronously stream those events into your program, instead of waiting for the response to return all the data and getting back the full buffer (which happens today).

Hope this helps!

@czechboy0 czechboy0 added the kind/support Adopter support requests. label Aug 17, 2023
@czechboy0
Contributor

There are some OSS Swift libraries for parsing the Server-Sent Events/EventSource format. You could use one of them to parse the stream and get the individual events out, then feed each event into JSONDecoder, again for example when the event format is described by the CreateChatCompletionStreamResponse schema.

Closing this issue, but I'll file a new one to document how to document an SSE endpoint in OpenAPI.

@czechboy0 czechboy0 closed this as not planned Aug 17, 2023
@czechboy0
Contributor

Added this pattern to our docs: #208

czechboy0 added a commit that referenced this issue Aug 17, 2023
Document using Server-sent Events with OpenAPI

### Motivation

Inspired by #207. While OpenAPI doesn't provide extra support for Server-sent Events, it still makes sense to document what you can achieve today - turns out it's quite a lot.

### Modifications

Documented how to spell an OpenAPI operation that returns SSE.

### Result

Folks looking to use SSE can quickly see how to consume them (producing them would be a similar, inverse process, left as an exercise to the reader).

### Test Plan

N/A


Reviewed by: gjcairo

#208
@SokichiFujita
Author

SokichiFujita commented Aug 18, 2023

@czechboy0

Thank you for the very helpful and quick response. I understand the situation and the plan.

Regarding SSE, OpenAI does not provide the schema in their documentation. However, they use only the simple data: JSON_STRING form (they do not use id: or event:), and the stop word is data: [DONE]. While developing a macOS AI app, I wrote a stream parser for a private URLSession-based OpenAI API client. So I will try to replace my client with code generated by swift-openapi-generator.

Thank you.

@SokichiFujita
Author

SokichiFujita commented Aug 18, 2023

I just tried this, and it worked well with manual parsing. For now this solution seems to be enough for me, and I am looking forward to using your next release with async support. Thank you.

openapi.yaml

  /chat/completions:
    post:
      operationId: createChatCompletion
      tags:
        - OpenAI
      summary: Creates a model response for the given chat conversation.
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: "#/components/schemas/CreateChatCompletionRequest"
      responses:
        "200":
          description: OK
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/CreateChatCompletionResponse"
            text/event-stream:
              schema:
                $ref: "#/components/schemas/CreateChatCompletionStreamResponse"

    :

    CreateChatCompletionStreamResponse:
      type: string

A part of the response of generated code (before parsing):

Ok(headers: OpenAPITest.Operations.createChatCompletion.Output.Ok.Headers(), body: OpenAPITest.Operations.createChatCompletion.Output.Ok.Body.text("data: {\"id\":\"chatcmpl-***\",\"object\":\"chat.completion.chunk\",\"created\":1692339765,\"model\":\"gpt-3.5-turbo-0613\",\"choices\":[{\"index\":0,\"delta\":{\"role\":\"assistant\",\"content\":\"\"},\"finish_reason\":null}]}\n\ndata: {\"id\":\"chatcmpl-***\",\"object\":\"chat.completion.chunk\",\"created\":1692339765,\"model\":\"gpt-3.5-turbo-0613\",\"choices\":[{\"index\":0,\"delta\":{\"content\":\"Sure\"},\"finish_reason\":null}]}\n\n
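For the decoding step, a hypothetical Codable mirror of the chunk JSON above could look like the sketch below. The struct name ChatCompletionChunk is made up for illustration; if the chunk schema were fully described in the OpenAPI document, you would use the generated CreateChatCompletionStreamResponse type instead:

```swift
import Foundation

// Hypothetical Codable type mirroring the chunk JSON shown
// in the buffered output above (illustration only).
struct ChatCompletionChunk: Codable {
    struct Choice: Codable {
        struct Delta: Codable {
            var role: String?
            var content: String?
        }
        var index: Int
        var delta: Delta
        var finish_reason: String?
    }
    var id: String
    var object: String
    var created: Int
    var model: String
    var choices: [Choice]
}

// Decode one "data:" payload extracted from the buffered body.
let payload = #"{"id":"chatcmpl-1","object":"chat.completion.chunk","created":1692339765,"model":"gpt-3.5-turbo-0613","choices":[{"index":0,"delta":{"content":"Sure"},"finish_reason":null}]}"#
// try! is for brevity in this sketch; handle errors properly in real code.
let chunk = try! JSONDecoder().decode(ChatCompletionChunk.self, from: Data(payload.utf8))
// chunk.choices[0].delta.content == "Sure"
```

Concatenating the delta.content values across chunks reassembles the streamed message.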

@czechboy0
Contributor

Ah interesting - I didn't realize we generate a String underlying type for the bytes, not raw data (because the MIME type starts with text/). Added a note to #9 that we might need to consider streaming not just raw bodies, but also (at least some) text bodies.

@SokichiFujita
Author

Yes, for type: string there are several possible values of the corresponding format, such as text (when format is not defined), binary, byte, date, and so on. It's a little bit complex.

https://swagger.io/docs/specification/data-models/data-types/#object

@czechboy0
Contributor

Right, in this case we derive the underlying raw type not from the JSON schema, but from the content type. So even if you did

text/event-stream: {}

you'd get the same generated code.

@czechboy0
Contributor

FYI #494
