This project provides middleware that mimics the OpenAI API for the Livepeer AI gateway. It lets developers use Livepeer AI gateways through the openai
library, giving AI application developers the developer experience they are accustomed to.
The program runs an HTTP server connected to a Livepeer AI gateway. It takes user requests in OpenAI format, forwards them to the specified Livepeer AI Gateway, and returns the response in OpenAI format to the caller.
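For illustration, here is a minimal sketch of that round trip from a caller's point of view. It assumes the middleware listens on port 8080 (as in the Docker example below) and exposes the OpenAI-style /v1/chat/completions path; both are assumptions based on the API the middleware mimics, not taken from this repository.

package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
)

func main() {
	// OpenAI-format chat completion request sent to the middleware.
	body := []byte(`{
		"model": "meta-llama/Llama-3.1-8B-Instruct",
		"messages": [{"role": "user", "content": "Tell me more about the Livepeer network"}]
	}`)

	// The /v1/chat/completions path is an assumption based on the OpenAI API being mimicked.
	resp, err := http.Post("http://localhost:8080/v1/chat/completions", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// The middleware forwards the request to the Livepeer AI gateway and
	// returns the gateway's answer re-wrapped in OpenAI format.
	out, _ := io.ReadAll(resp.Body)
	fmt.Println(string(out))
}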
The common
package exposes helpers for transforming between OpenAI and Livepeer's AI Worker types.
func TransformRequest(openAIReq models.OpenAIRequest) (*worker.LlmGenerateFormdataRequestBody, error)
func TransformResponse(req *worker.LlmGenerateFormdataRequestBody, resp *http.Response) (*models.OpenAIResponse, error)
func TransformStreamResponse(chunk worker.LlmStreamChunk, streamID string) (models.OpenAIStreamResponse, error)
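As an illustrative sketch (not code from this repository), the helpers can be used roughly as follows. The import paths and the Role/Content fields of models.OpenAIMessage are assumptions that mirror the OpenAI schema; adjust them to the actual module layout.

package main

import (
	"fmt"

	"github.com/livepool-io/livepeer-openai-api-middleware/common" // assumed import path
	"github.com/livepool-io/livepeer-openai-api-middleware/models" // assumed import path
	"github.com/livepeer/ai-worker/worker"                         // assumed import path
)

func main() {
	// An OpenAI-style request as received from a client.
	openAIReq := models.OpenAIRequest{
		Model: "meta-llama/Llama-3.1-8B-Instruct",
		Messages: []models.OpenAIMessage{
			{Role: "system", Content: "You are a helpful assistant."}, // field names assumed
			{Role: "user", Content: "Tell me more about the Livepeer network"},
		},
	}

	// Convert it into the AI Worker request body the gateway expects.
	workerReq, err := common.TransformRequest(openAIReq)
	if err != nil {
		panic(err)
	}
	fmt.Printf("worker request: %+v\n", workerReq)

	// Convert a single gateway stream chunk back into an OpenAI stream response.
	chunk := worker.LlmStreamChunk{Chunk: "Livepeer is", TokensUsed: 3}
	streamResp, err := common.TransformStreamResponse(chunk, "chatcmpl-123")
	if err != nil {
		panic(err)
	}
	fmt.Printf("stream response: %+v\n", streamResp)
}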
It also exports a function to handle streaming responses and receive them in OpenAI format on a channel, for you to process as you see fit. The server mode returns them over SSE, but websockets or other transport methods can also be used.
func HandleStreamingResponse(ctx context.Context, resp *http.Response) (<-chan models.OpenAIStreamResponse, <-chan error)
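A sketch of draining those channels, assuming the same (assumed) import path as above; resp is the raw streaming response received from the gateway, and how that request is made is omitted here.

package example

import (
	"context"
	"fmt"
	"net/http"

	"github.com/livepool-io/livepeer-openai-api-middleware/common" // assumed import path
)

// consumeStream drains OpenAI-format chunks from a streaming gateway response
// and prints them; a real server would forward them over SSE, a websocket, etc.
func consumeStream(ctx context.Context, resp *http.Response) error {
	chunks, errs := common.HandleStreamingResponse(ctx, resp)
	for {
		select {
		case chunk, ok := <-chunks:
			if !ok {
				return nil // channel closed: stream finished
			}
			fmt.Printf("%+v\n", chunk)
		case err := <-errs:
			if err != nil {
				return err
			}
		case <-ctx.Done():
			return ctx.Err()
		}
	}
}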
import OpenAI from "openai"

const oai = new OpenAI({
  baseURL: "<MIDDLEWARE_SERVER>",
  apiKey: "<API_KEY>", // required by the openai client; the middleware may not need a real key
})

try {
  const stream = await oai.chat.completions.create({
    messages: [
      {
        role: "user",
        content: "Tell me more about the Livepeer network"
      }
    ],
    model: "meta-llama/Llama-3.1-8B-Instruct",
    stream: true
  })
  // log the reply as it streams in
  for await (const chunk of stream) {
    process.stdout.write(chunk.choices[0]?.delta?.content ?? "")
  }
} catch (err) {
  console.log(err)
}
- Forward LLM chat completion requests to a gateway using the LLM Pipeline.
- Stream support for LLM requests using server-sent-events (SSE).
- Forward completion requests for other types of pipelines/models.
- Go 1.21.x
- Docker (optional)
git clone https://github.com/livepool-io/livepeer-openai-api-middleware.git
Run the server
go run openai-api.go -gateway <LIVEPEER_GATEWAY>
- Install dependencies
  go mod download
- Build binary
  go build openai-api.go
- Run the program
  ./openai-api -gateway <LIVEPEER_GATEWAY>
- Build container
  docker build -t openai-api .
- Run container
  docker run -p 8080:8080 openai-api -gateway <LIVEPEER_GATEWAY>
The main difference between OpenAI and the Livepeer Gateway is that the Livepeer Gateway currently follows a schema similar to what Llama models expect.
OpenAI uses messages
for both the current prompt and the chat history, while the gateway has a separate field for the current prompt and the previous history.
The first message in the messages
array is equivalent to the system message.
The response also differs: OpenAI returns a JSON object, while the gateway returns the generated string directly.
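For illustration only (this is not the middleware's actual implementation), the mapping described above looks roughly like the sketch below. How the history is encoded into a single string is an assumption here; the type definitions that follow show the actual fields on both sides.

package main

import (
	"encoding/json"
	"fmt"
)

type message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

func main() {
	messages := []message{
		{Role: "system", Content: "You are a helpful assistant."},
		{Role: "user", Content: "What is Livepeer?"},
		{Role: "assistant", Content: "Livepeer is a decentralized video network."},
		{Role: "user", Content: "Tell me more about the Livepeer network"},
	}

	systemMsg := messages[0].Content            // first message -> system_msg
	prompt := messages[len(messages)-1].Content // latest user message -> prompt

	// Everything in between becomes the history field; JSON encoding is an assumption.
	historyJSON, _ := json.Marshal(messages[1 : len(messages)-1])

	fmt.Println("system_msg:", systemMsg)
	fmt.Println("prompt:", prompt)
	fmt.Println("history:", string(historyJSON))
}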
type OpenAIRequest struct {
Model string `json:"model"`
Messages []OpenAIMessage `json:"messages"`
MaxTokens int `json:"max_tokens,omitempty"`
Temperature float64 `json:"temperature,omitempty"`
TopP float64 `json:"top_p,omitempty"`
N int `json:"n,omitempty"`
Stop []string `json:"stop,omitempty"`
PresencePenalty float64 `json:"presence_penalty,omitempty"`
FrequencyPenalty float64 `json:"frequency_penalty,omitempty"`
User string `json:"user,omitempty"`
Stream bool `json:"stream,omitempty"`
}
type OpenAIResponse struct {
ID string `json:"id"`
Object string `json:"object"`
Created int64 `json:"created"`
Model string `json:"model"`
Choices []Choice `json:"choices"`
Usage Usage `json:"usage"`
}
type OpenAIStreamResponse struct {
ID string `json:"id"`
Object string `json:"object"`
Created int64 `json:"created"`
Model string `json:"model"`
Choices []StreamChoice `json:"choices"`
}
type BodyLlmGenerateLlmGeneratePost struct {
History *string `json:"history,omitempty"`
MaxTokens *int `json:"max_tokens,omitempty"`
ModelId *string `json:"model_id,omitempty"`
Prompt string `json:"prompt"`
Stream *bool `json:"stream,omitempty"`
SystemMsg *string `json:"system_msg,omitempty"`
Temperature *float32 `json:"temperature,omitempty"`
}
type LlmResponse struct {
Response string `json:"response"`
TokensUsed int `json:"tokens_used"`
}
type LlmStreamChunk struct {
Chunk string `json:"chunk,omitempty"`
TokensUsed int `json:"tokens_used,omitempty"`
Done bool `json:"done,omitempty"`
}