From 88a1e81ca6a3c944d385dd33b5c3d9a25ac9156c Mon Sep 17 00:00:00 2001
From: Nikita Savelyev
Date: Sun, 17 Nov 2024 17:23:08 +0100
Subject: [PATCH] Add export command for compressed VLMs

---
 README.md | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 3073c4f9f4..fe7a6d3533 100644
--- a/README.md
+++ b/README.md
@@ -108,7 +108,11 @@ For more examples check out our [LLM Inference Guide](https://docs.openvino.ai/2
 ### Converting and compressing the model from Hugging Face library
 
 ```sh
-optimum-cli export openvino --model openbmb/MiniCPM-V-2_6 --trust-remote-code MiniCPM-V-2_6
+# (Basic) Download the MiniCPM-V-2_6 model and convert it to OpenVINO format with fp16 weights
+optimum-cli export openvino --model openbmb/MiniCPM-V-2_6 --trust-remote-code --weight-format fp16 MiniCPM-V-2_6
+
+# (Recommended) Same as above, but with compression: the language model is compressed to int4, other model components to int8
+optimum-cli export openvino --model openbmb/MiniCPM-V-2_6 --trust-remote-code --weight-format int4 MiniCPM-V-2_6
 ```
 
 ### Run generation using VLMPipeline API in Python
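
The hunk's trailing context shows the README's next section, "Run generation using VLMPipeline API in Python", which consumes the directory produced by the commands above. For context, a minimal sketch of that usage with openvino_genai's VLMPipeline; the image path dog.jpg, the prompt, and the generation parameters are placeholders, not part of this patch:

```python
import numpy as np
import openvino as ov
import openvino_genai
from PIL import Image

# Load the exported model directory produced by the optimum-cli commands above;
# use "GPU" instead of "CPU" to run on an Intel integrated or discrete GPU.
pipe = openvino_genai.VLMPipeline("./MiniCPM-V-2_6/", "CPU")

# Pack the image into an NHWC uint8 openvino.Tensor (dog.jpg is a placeholder path).
image = Image.open("dog.jpg").convert("RGB")
image_data = np.array(image.getdata()).reshape(1, image.size[1], image.size[0], 3).astype(np.uint8)
image_tensor = ov.Tensor(image_data)

prompt = "Can you describe the image?"
print(pipe.generate(prompt, image=image_tensor, max_new_tokens=100))
```

The same call works for either the fp16 or the int4 export, since the weight format is baked into the converted IR files rather than selected at load time.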