will.yang committed May 9, 2024
1 parent b81deb2, commit d59d017
Showing 27 changed files with 1,141 additions and 77 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
# CHANGELOG | ||
## v1.0.1 | ||
- Reduce memory usage during model conversion | ||
- Reduce memory usage during inference | ||
- Increase prefill speed | ||
- Reduce initialization time | ||
- Improve quantization accuracy | ||
- Add support for Gemma, ChatGLM3, MiniCPM, InternLM2, and Phi-3 | ||
- Add Server invocation | ||
- Add inference interruption interface | ||
- Add logprob and token_id to the return value | ||
|
||
## v1.0.0 | ||
- Supports the conversion and deployment of LLM models on RK3588/RK3576 platforms | ||
- Compatible with Hugging Face model architectures | ||
- Currently supports the models Llama, Qwen, Qwen2, and Phi-2 | ||
- Supports quantization with w8a8 and w4a16 precision |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,64 @@ | ||
Copyright (c) Rockchip Electronics Co., Ltd. | ||
All rights reserved. | ||
|
||
// Redistribution and use in source and binary forms, with or without | ||
// modification, are permitted provided that the following conditions are met: | ||
// | ||
// 1. Redistributions of source code must retain the above copyright notice, | ||
// this list of conditions and the following disclaimer. | ||
// | ||
// 2. Redistributions in binary form must reproduce the above copyright notice, | ||
// this list of conditions and the following disclaimer in the documentation | ||
// and/or other materials provided with the distribution. | ||
// | ||
// 3. Neither the name of the copyright holder nor the names of its contributors | ||
// may be used to endorse or promote products derived from this software without | ||
// specific prior written permission. | ||
// | ||
// 4. This Software may contain some Open Source Software. You may not redistribute | ||
// and/or modify such Open Source Software except in compliance with the applicable | ||
// Open Source License. | ||
|
||
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" | ||
// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE | ||
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE | ||
// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE | ||
// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR | ||
// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF | ||
// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS | ||
// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN | ||
// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) | ||
// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE | ||
// POSSIBILITY OF SUCH DAMAGE. | ||
|
||
The following Open Source Software have been modified by Rockchip Electronics Co., Ltd. | ||
---------------------------------------------------------------------------------------- | ||
1. ggml master | ||
Copyright (c) 2023-2024 The ggml authors | ||
All rights reserved. | ||
Licensed under the terms of the MIT License | ||
|
||
2. llama.cpp master | ||
Copyright (c) 2023-2024 The ggml authors | ||
All rights reserved. | ||
Licensed under the terms of the MIT License | ||
|
||
The terms of the MIT License: | ||
-------------------------------------------------------------------- | ||
Permission is hereby granted, free of charge, to any person obtaining a copy | ||
of this software and associated documentation files (the "Software"), to deal | ||
in the Software without restriction, including without limitation the rights | ||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | ||
copies of the Software, and to permit persons to whom the Software is | ||
furnished to do so, subject to the following conditions: | ||
|
||
The above copyright notice and this permission notice shall be included in all | ||
copies or substantial portions of the Software. | ||
|
||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | ||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | ||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | ||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | ||
SOFTWARE. |
Binary file not shown.
Binary file not shown.
File renamed without changes.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
# RKLLM-Server Demo | ||
## Before Running | ||
Before running the demo, prepare the following: | ||
- The converted RKLLM model file, already present on the board. | ||
- The IP address of the board, which you can check with the `ifconfig` command. | ||
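The IP lookup can be scripted from the host. The following is a minimal sketch, not part of this commit: the helper name `list_board_ips` is hypothetical, and it assumes the board's `ifconfig` prints IPv4 addresses in either the `inet 192.168.1.42` or the older `inet addr:192.168.1.42` layout.

```bash
#!/bin/bash
# Hypothetical helper: print every non-loopback IPv4 address found in
# `ifconfig` output read from stdin.
list_board_ips() {
    awk '$1 == "inet" {
        addr = $2
        sub(/^addr:/, "", addr)          # strip the old-style "addr:" prefix
        if (addr !~ /^127\./) print addr # skip loopback addresses
    }'
}

# Typical use from the host (assumes the board is reachable over adb):
#   adb shell ifconfig | list_board_ips
```

If the board has several interfaces, the helper prints one address per line and you still need to pick the one reachable from your client machine.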
|
||
## RKLLM-Server-Flask Demo | ||
### Build | ||
You can build and start the demo with a single command: | ||
```bash | ||
# ./build_rkllm_server_flask.sh [target_platform:rk3588/rk3576] [RKLLM-Server working directory on board] [transformed_rkllm_model_path on board] | ||
./build_rkllm_server_flask.sh rk3588 /user/data/rkllm_server /user/data/rkllm_server/model.rkllm | ||
``` | ||
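The build script takes three positional arguments and, as committed, does not validate them. A pre-flight check along the following lines could catch a wrong platform name or a missing argument before anything is pushed to the board. This is only a sketch: `check_build_args` is a hypothetical helper, and the absolute-path requirement is taken from the usage comment in the script header.

```bash
#!/bin/bash
# Hypothetical pre-flight check; not part of build_rkllm_server_flask.sh.
# Returns 0 when the three positional arguments look usable, 2 otherwise.
check_build_args() {
    if [ "$#" -ne 3 ]; then
        echo "usage: <rk3588|rk3576> <server_workdir_on_board> <model_path_on_board>" >&2
        return 2
    fi
    # Only the two platforms named in the usage line are accepted.
    case "$1" in
        rk3588|rk3576) ;;
        *) echo "unsupported target platform: $1" >&2; return 2 ;;
    esac
    # The script header describes the model path as an absolute path on the board.
    case "$3" in
        /*) ;;
        *) echo "model path must be absolute: $3" >&2; return 2 ;;
    esac
    return 0
}
```

A wrapper would call `check_build_args "$@" || exit $?` before handing the same arguments to the build script.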
### Access with API | ||
After RKLLM-Server-Flask is built and running, you can use `chat_api_flask.py` to query it and get answers from the RKLLM model. | ||
|
||
Note: check the IP address of the board with the `ifconfig` command and replace the IP address in `chat_api_flask.py` with it. | ||
|
||
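Replacing the address by hand is easy to forget, so it can be scripted. The sketch below assumes the client script stores the board address as a literal dotted-quad IPv4 string; that is an assumption about `chat_api_flask.py`, whose contents are not shown here. The helper name `set_board_ip` is hypothetical.

```bash
#!/bin/bash
# Hypothetical helper: replace every dotted-quad IPv4 literal in a file.
# Usage: set_board_ip <file> <new_ip>
set_board_ip() {
    local file=$1 new_ip=$2 tmp status=0
    tmp=$(mktemp) || return 1
    # Write to a temp file first, then copy back, to avoid relying on `sed -i`,
    # whose syntax differs between GNU and BSD sed.
    sed -E "s/[0-9]{1,3}(\.[0-9]{1,3}){3}/${new_ip}/g" "$file" > "$tmp" \
        && cat "$tmp" > "$file" || status=1
    rm -f "$tmp"
    return "$status"
}

# Example (run on the machine that holds the client script):
#   set_board_ip chat_api_flask.py 192.168.1.42
```

The pattern is deliberately blunt: it rewrites any four-part dotted number in the file, so review the result if the script contains other such strings.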
|
||
## RKLLM-Server-Gradio Demo | ||
### Build | ||
You can build and start the demo with a single command: | ||
```bash | ||
# ./build_rkllm_server_gradio.sh [target_platform:rk3588/rk3576] [RKLLM-Server working directory on board] [transformed_rkllm_model_path on board] | ||
./build_rkllm_server_gradio.sh rk3588 /user/data/rkllm_server /user/data/rkllm_server/model.rkllm | ||
``` | ||
### Access the Server | ||
After running the demo, you can access RKLLM-Server-Gradio in two ways: | ||
1. Open `http://[board_ip]:8080/` in your browser and chat with the RKLLM model in a visual interface. | ||
2. Run `chat_api_gradio.py` (update the IP address in the script first) to get answers from the RKLLM model. | ||
|
61 changes: 61 additions & 0 deletions
rkllm-runtime/examples/rkllm_server_demo/build_rkllm_server_flask.sh
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,61 @@ | ||
#!/bin/bash | ||
|
||
#*****************************************************************************************# | ||
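# This is a one-click setup script for the RKLLM-Server-Flask service. | ||
# Run it to automate deployment of the RKLLM-Server-Flask service on a Linux board. | ||
# Usage: ./build_rkllm_server_flask.sh [target platform: rk3588/rk3576] [RKLLM-Server working directory] [absolute path of the converted rkllm model on the board] | ||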
# example: ./build_rkllm_server_flask.sh rk3588 /user/data/rkllm_server /user/data/rkllm_server/model.rkllm | ||
#*****************************************************************************************# | ||
|
||
#################### Check whether pip and flask are installed on the board #################### | ||
# 1. Prepare the Flask environment on the board | ||
adb shell << EOF | ||
# Check whether pip3 is installed | ||
if ! command -v pip3 > /dev/null 2>&1; then | ||
    echo "-------- pip3 is not installed, installing... --------" | ||
    # Install pip3 | ||
    sudo apt update | ||
    sudo apt install python3-pip -y | ||
else | ||
    echo "-------- pip3 is already installed --------" | ||
fi | ||
# Check whether flask is installed | ||
if ! python3 -c "import flask" > /dev/null 2>&1; then | ||
    echo "-------- flask is not installed, installing... --------" | ||
    # Install flask | ||
    pip3 install flask==2.2.2 Werkzeug==2.2.2 -i https://pypi.tuna.tsinghua.edu.cn/simple | ||
else | ||
    echo "-------- flask is already installed --------" | ||
fi | ||
exit | ||
EOF | ||
|
||
#################### Push the files the server needs to the board #################### | ||
# 2. Check whether the target path on the board exists | ||
adb shell ls $2 > /dev/null 2>&1 | ||
if [ $? -ne 0 ]; then | ||
    # Create the path if it does not exist | ||
    adb shell mkdir -p $2 | ||
    echo "-------- rkllm_server working directory did not exist and has been created --------" | ||
else | ||
    echo "-------- rkllm_server working directory already exists --------" | ||
fi | ||
|
||
# 3. Update librkllmrt.so in ./rkllm_server/lib | ||
cp ../../runtime/Linux/librkllm_api/aarch64/librkllmrt.so ./rkllm_server/lib/ | ||
|
||
# 4. Push the files to the Linux board | ||
adb push ./rkllm_server $2 | ||
|
||
#################### Enter the board and start the server #################### | ||
# 5. Start the server on the board | ||
adb shell << EOF | ||
cd $2/rkllm_server/ | ||
python3 flask_server.py --target_platform $1 --rkllm_model_path $3 | ||
EOF |
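A note on why `$1`, `$2` and `$3` work inside the `adb shell << EOF` blocks: the heredoc delimiter is unquoted, so the host shell substitutes the variables before the text is sent to the board. The sketch below illustrates the difference between an unquoted and a quoted delimiter, using a local `sh` as a stand-in for `adb shell`; the variable name and values are made up for the demonstration.

```bash
#!/bin/bash
workdir=/from/host

# Unquoted delimiter: the host shell substitutes $workdir BEFORE the text
# reaches the child shell, so the child's own assignment has no effect on it.
unquoted=$(sh << EOF
workdir=/from/child
echo "$workdir"
EOF
)

# Quoted delimiter: the text is passed through untouched and the child shell
# expands $workdir itself.
quoted=$(sh << 'EOF'
workdir=/from/child
echo "$workdir"
EOF
)

echo "unquoted: $unquoted"   # unquoted: /from/host
echo "quoted:   $quoted"     # quoted:   /from/child
```

The same rule means any `$` that should be evaluated on the board rather than on the host has to be escaped as `\$` inside an unquoted heredoc.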
61 changes: 61 additions & 0 deletions
rkllm-runtime/examples/rkllm_server_demo/build_rkllm_server_gradio.sh
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,61 @@ | ||
#!/bin/bash | ||
|
||
#*****************************************************************************************# | ||
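# This is a one-click setup script for the RKLLM-Server-Gradio service. | ||
# Run it to automate deployment of the RKLLM-Server-Gradio service on a Linux board. | ||
# Usage: ./build_rkllm_server_gradio.sh [target platform: rk3588/rk3576] [RKLLM-Server working directory] [absolute path of the converted rkllm model on the board] | ||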
# example: ./build_rkllm_server_gradio.sh rk3588 /user/data/rkllm_server /user/data/rkllm_server/model.rkllm | ||
#*****************************************************************************************# | ||
|
||
#################### Check whether pip and gradio are installed on the board #################### | ||
# 1. Prepare the Gradio environment on the board | ||
adb shell << EOF | ||
# Check whether pip3 is installed | ||
if ! command -v pip3 > /dev/null 2>&1; then | ||
    echo "-------- pip3 is not installed, installing... --------" | ||
    # Install pip3 | ||
    sudo apt update | ||
    sudo apt install python3-pip -y | ||
else | ||
    echo "-------- pip3 is already installed --------" | ||
fi | ||
# Check whether gradio is installed | ||
if ! python3 -c "import gradio" > /dev/null 2>&1; then | ||
    echo "-------- Gradio is not installed, installing... --------" | ||
    # Install Gradio (quoted so the shell does not treat ">" as a redirection) | ||
    pip3 install "gradio>=4.24.0" -i https://pypi.tuna.tsinghua.edu.cn/simple/ | ||
else | ||
    echo "-------- Gradio is already installed --------" | ||
fi | ||
exit | ||
EOF | ||
|
||
#################### Push the files the server needs to the board #################### | ||
# 2. Check whether the target path on the board exists | ||
adb shell ls $2 > /dev/null 2>&1 | ||
if [ $? -ne 0 ]; then | ||
    # Create the path if it does not exist | ||
    adb shell mkdir -p $2 | ||
    echo "-------- rkllm_server working directory did not exist and has been created --------" | ||
else | ||
    echo "-------- rkllm_server working directory already exists --------" | ||
fi | ||
|
||
# 3. Update librkllmrt.so in ./rkllm_server/lib | ||
cp ../../runtime/Linux/librkllm_api/aarch64/librkllmrt.so ./rkllm_server/lib/ | ||
|
||
# 4. Push the files to the Linux board | ||
adb push ./rkllm_server $2 | ||
|
||
#################### Enter the board and start the server #################### | ||
# 5. Start the server on the board | ||
adb shell << EOF | ||
cd $2/rkllm_server/ | ||
python3 gradio_server.py --target_platform $1 --rkllm_model_path $3 | ||
EOF |
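A requirement such as `gradio>=4.24.0` has to be quoted whenever it passes through a shell, including the shell on the board that reads the `adb shell` heredoc. Otherwise `>` is parsed as an output redirection and the version constraint never reaches pip. The sketch below reproduces the effect locally; `show_args` is a hypothetical stand-in for `pip3 install` that only prints its arguments.

```bash
#!/bin/bash
# Stand-in for `pip3 install`: prints each argument it receives on its own line.
show_args() { printf '%s\n' "$@"; }

# Work in a scratch directory so the stray file is easy to see and discard.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Unquoted: ">=4.24.0" is parsed as "redirect stdout to a file named =4.24.0",
# so the command only ever sees the bare package name.
unquoted=$(show_args gradio>=4.24.0)

# Quoted: the full requirement reaches the command as a single argument.
quoted=$(show_args "gradio>=4.24.0")

ls "$scratch"            # lists the stray file: =4.24.0
echo "quoted: $quoted"   # quoted: gradio>=4.24.0
```

With the unquoted form, pip would install whatever the latest `gradio` is and leave a file named `=4.24.0` in the current directory on the board.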