OpenAI-Compatible RESTful APIs & SDK

FastChat provides OpenAI-compatible RESTful APIs for its supported models (e.g. Vicuna). The following OpenAI APIs are supported:

Chat Completions. (Reference: https://platform.openai.com/docs/api-reference/chat)
Completions. (Reference: https://platform.openai.com/docs/api-reference/completions)
Embeddings. (Reference: https://platform.openai.com/docs/api-reference/embeddings)

RESTful API Server

First, launch the controller:

python3 -m fastchat.serve.controller

Then, launch the model worker(s):

python3 -m fastchat.serve.model_worker --model-name 'vicuna-7b-v1.1' --model-path /path/to/vicuna/weights

Finally, launch the RESTful API server:

python3 -m fastchat.serve.api_server --host localhost --port 8000

Test the API server

Chat Completions

curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "vicuna-7b-v1.1",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Text Completions

curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "vicuna-7b-v1.1",
    "prompt": "Once upon a time",
    "max_tokens": 41,
    "temperature": 0.5
  }'

Embeddings

curl http://localhost:8000/v1/create_embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "vicuna-7b-v1.1",
    "input": "Hello, can you tell me a joke"
  }'
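
The curl commands above can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_request` and `post_json` are hypothetical and not part of FastChat, and the URL and model name simply mirror the examples above. It assumes the server accepts the same JSON bodies shown in the curl commands.

```python
import json
import urllib.request


def build_request(base_url: str, path: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request equivalent to one of the curl commands above.

    Hypothetical helper for illustration; not part of FastChat.
    """
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON reply (needs a running API server)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Same payload as the Chat Completions curl example.
chat_request = build_request(
    "http://localhost:8000",
    "/v1/chat/completions",
    {
        "model": "vicuna-7b-v1.1",
        "messages": [{"role": "user", "content": "Hello!"}],
    },
)

# Uncomment once the controller, worker, and API server are running:
# print(post_json(chat_request))
```

Swapping the path and payload for those of the Text Completions or Embeddings examples gives the corresponding requests.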

Client SDK

Assuming the environment variable FASTCHAT_BASEURL is set to the API server URL (e.g., http://localhost:8000), you can use the following code to send a request to the API server:

import os
from fastchat import client

client.set_baseurl(os.getenv("FASTCHAT_BASEURL"))

completion = client.ChatCompletion.create(
  model="vicuna-7b-v1.1",
  messages=[
    {"role": "user", "content": "Hello!"}
  ]
)

print(completion.choices[0].message)

Machine Learning with Embeddings

You can use create_embedding to:

Build your own classifier, see fastchat/playground/test_embedding/test_classification.py
Evaluate text similarity, see fastchat/playground/test_embedding/test_sentence_similarity.py
Search for related texts, see fastchat/playground/test_embedding/test_semantic_search.py

To run these tests, you need to download the data here. You also need an OpenAI API key for comparison.

Run with:

cd playground/test_embedding
python3 test_classification.py

The script will train classifiers based on vicuna-7b, text-similarity-ada-001 and text-embedding-ada-002 and report the accuracy of each classifier.
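
To illustrate the text-similarity and search use cases, embeddings are commonly compared with cosine similarity. The sketch below is an assumption about one reasonable approach, not necessarily what the playground scripts do. It uses toy 3-dimensional vectors in place of real embeddings, and the names `cosine_similarity`, `query`, and `candidates` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for embeddings returned by the API.
query = [1.0, 0.0, 1.0]
candidates = {
    "doc_a": [1.0, 0.0, 1.0],  # same direction as the query
    "doc_b": [0.0, 1.0, 0.0],  # orthogonal to the query
    "doc_c": [1.0, 1.0, 0.0],  # partially overlapping
}

# Rank candidate texts from most to least similar to the query.
ranked = sorted(
    candidates,
    key=lambda name: cosine_similarity(query, candidates[name]),
    reverse=True,
)
print(ranked)
```

With real data, the vectors would come from the embeddings endpoint, and the ranking step is the core of a simple semantic search.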

Todos

Some features to be implemented:

Streaming
Support for additional parameters such as top_p and presence_penalty
Proper error handling (e.g. model not found)
Dict-like access to the return value in the client SDK
