FastChat provides OpenAI-compatible RESTful APIs for its supported models (e.g. Vicuna). The following OpenAI APIs are supported:
- Chat Completions. (Reference: https://platform.openai.com/docs/api-reference/chat)
- Completions. (Reference: https://platform.openai.com/docs/api-reference/completions)
- Embeddings. (Reference: https://platform.openai.com/docs/api-reference/embeddings)
First, launch the controller:
python3 -m fastchat.serve.controller
Then, launch the model worker(s):
python3 -m fastchat.serve.model_worker --model-name 'vicuna-7b-v1.1' --model-path /path/to/vicuna/weights
Finally, launch the RESTful API server:
python3 -m fastchat.serve.api_server --host localhost --port 8000
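The three launch steps above can also be scripted. The following is a minimal sketch, not part of FastChat: the helper names `build_commands` and `launch_all` are hypothetical, and the fixed delay between launches is an assumption (the services may need more or less time to come up on your machine). It only reuses the module names and flags shown in the commands above.

```python
import subprocess
import sys
import time


def build_commands(model_name, model_path, host="localhost", port=8000):
    """Return the controller, worker, and API server commands as argv lists."""
    py = sys.executable  # use the current interpreter instead of a hard-coded python3
    return [
        [py, "-m", "fastchat.serve.controller"],
        [py, "-m", "fastchat.serve.model_worker",
         "--model-name", model_name, "--model-path", model_path],
        [py, "-m", "fastchat.serve.api_server",
         "--host", host, "--port", str(port)],
    ]


def launch_all(commands, delay_seconds=10.0):
    """Start each command in order, pausing between launches (delay is a guess)."""
    procs = []
    for cmd in commands:
        procs.append(subprocess.Popen(cmd))
        time.sleep(delay_seconds)
    return procs
```

To use it, call `launch_all(build_commands("vicuna-7b-v1.1", "/path/to/vicuna/weights"))` and keep the returned process handles so you can call `terminate()` on each one when you are done.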
Test the API server:
curl http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "vicuna-7b-v1.1",
"messages": [{"role": "user", "content": "Hello!"}]
}'
curl http://localhost:8000/v1/completions \
-H "Content-Type: application/json" \
-d '{
"model": "vicuna-7b-v1.1",
"prompt": "Once upon a time",
"max_tokens": 41,
"temperature": 0.5
}'
curl http://localhost:8000/v1/create_embeddings \
-H "Content-Type: application/json" \
-d '{
"model": "vicuna-7b-v1.1",
"input": "Hello, can you tell me a joke"
}'
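The same requests can be sent from Python without extra dependencies. This is a minimal sketch using only the standard library. The helper names `build_request` and `post_json` are hypothetical, and `BASE_URL` assumes the API server was launched on localhost:8000 as shown above. The routes and payload fields are the ones used in the curl commands.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: server started with --host localhost --port 8000


def build_request(path, payload, base_url=BASE_URL):
    """Build a JSON POST request for a route such as /v1/chat/completions."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(path, payload, base_url=BASE_URL, timeout=60):
    """Send the request and return the decoded JSON response body."""
    req = build_request(path, payload, base_url)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Same payload as the first curl command above.
chat_payload = {
    "model": "vicuna-7b-v1.1",
    "messages": [{"role": "user", "content": "Hello!"}],
}
```

With the server running, `post_json("/v1/chat/completions", chat_payload)` sends the chat request. The completions and embeddings routes work the same way with their respective payloads.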
Assuming the environment variable FASTCHAT_BASEURL is set to the API server URL (e.g., http://localhost:8000), you can use the following code to send a request to the API server:
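import os

from fastchat import client

# Point the client at the API server; FASTCHAT_BASEURL must be set.
client.set_baseurl(os.getenv("FASTCHAT_BASEURL"))

# Send a single-turn chat request to the vicuna-7b-v1.1 worker.
completion = client.ChatCompletion.create(
    model="vicuna-7b-v1.1",
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)
# Print the first returned message.
print(completion.choices[0].message)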
You can use create_embedding to:
- Build your own classifier, see fastchat/playground/test_embedding/test_classification.py
- Evaluate text similarity, see fastchat/playground/test_embedding/test_sentence_similarity.py
- Search for related texts, see fastchat/playground/test_embedding/test_semantic_search.py
To run these tests, you need to download the data here. You also need an OpenAI API key for comparison.
Run with:
cd playground/test_embedding
python3 test_classification.py
The script will train classifiers based on vicuna-7b, text-similarity-ada-001 and text-embedding-ada-002 and report the accuracy of each classifier.
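For the text-similarity and semantic-search use cases listed above, the core operation is comparing embedding vectors. The sketch below is an illustration only and is not taken from the playground scripts. It assumes you have already extracted each embedding as a plain list of floats, and the function names `cosine_similarity` and `rank_by_similarity` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 means same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar to the query.

    doc_vecs maps a document id to its embedding vector.
    """
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

In practice, `query_vec` would be the embedding of the search query and `doc_vecs` the embeddings of the candidate texts, all produced by the same model so that the vectors are comparable.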
Some features to be implemented:
- Streaming
- Support for some parameters such as top_p and presence_penalty
- Proper error handling (e.g. model not found)
- The return value in the client SDK could be used like a dict