[Feature]: Option to override HuggingFace's configurations #5205
Comments
By default, the LLM model is downloaded from HuggingFace/ModelScope. Is there a way to load a model from a local filepath, a private repository, or S3 object storage? How can we load models from a local storage path (for models supported by vLLM) when deploying in a local environment?
Actually, this is already supported - just pass a local filepath instead of a repo name.
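For reference, a minimal sketch of what that looks like with vLLM's Python API, assuming the weights and config files are already in a local directory (the path below is a placeholder):

```python
from vllm import LLM

# Point vLLM at a local directory instead of a HuggingFace repo id.
# "/models/my-llm" is a placeholder for wherever the files were downloaded.
llm = LLM(model="/models/my-llm")

outputs = llm.generate(["Hello, my name is"])
print(outputs[0].outputs[0].text)
```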
I'm deploying through KServe (apiVersion: serving.kserve.io/v1beta1) with the model in object storage, but it's giving me an error.
Oh, I missed the part where you are using object storage. I only meant that local filepaths are supported.
So that means it doesn't support persistent-volume or object-storage paths - it can only support local filepaths.
Yes, that is true. I think supporting non-local filepaths would warrant its own PR/issue.
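Until non-local filepaths are supported natively, one possible workaround is to copy the model from object storage to local disk yourself and then hand vLLM the local directory. A rough sketch with boto3; the bucket, prefix, and paths below are placeholders:

```python
import os

import boto3
from vllm import LLM

s3 = boto3.client("s3")
bucket, prefix, local_dir = "my-bucket", "models/my-llm/", "/tmp/my-llm"

# Download every object under the prefix into a local directory.
# (For very large repos, list_objects_v2 paginates at 1000 keys.)
for obj in s3.list_objects_v2(Bucket=bucket, Prefix=prefix).get("Contents", []):
    key = obj["Key"]
    if key.endswith("/"):  # skip "directory" placeholder keys
        continue
    dest = os.path.join(local_dir, os.path.relpath(key, prefix))
    os.makedirs(os.path.dirname(dest), exist_ok=True)
    s3.download_file(bucket, key, dest)

# vLLM only ever sees a local filepath at this point.
llm = LLM(model=local_dir)
```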
This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!
Hi @DarkLight1337! Do you have a solution for this?
You can take a look at the code under
Thanks for the callout re #2547 @DarkLight1337 - it seems like maybe being able to override individual fields in
🚀 The feature, motivation and pitch
The configuration files on HuggingFace may have missing information (e.g. #2051) or contain bugs (e.g. #4008). In such cases, it may be necessary to provide/override the configuration files to enable the model to be loaded correctly. However, apart from chat templates, there is currently no method of doing so; we have to update the source HuggingFace repository directly. It may take time for the authors of those repositories to respond, especially if they are unofficial ones which are not as well-maintained.
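To make the pain point concrete, the workaround today is roughly: download a snapshot of the repo, hand-edit the offending file, and load from the local copy. A sketch under those assumptions (the repo id, field, and value below are placeholders, not taken from a real bug report):

```python
import json
import os

from huggingface_hub import snapshot_download
from vllm import LLM

# Materialize a plain local copy of the repo (local_dir avoids editing the HF cache).
local_dir = snapshot_download("some-org/some-model", local_dir="/tmp/patched-model")

# Patch the field that is missing or wrong in the upstream config.json.
config_path = os.path.join(local_dir, "config.json")
with open(config_path) as f:
    config = json.load(f)
config["max_position_embeddings"] = 4096  # placeholder fix
with open(config_path, "w") as f:
    json.dump(config, f, indent=2)

# Load the patched copy from the local filepath.
llm = LLM(model=local_dir)
```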
It would be great if we could provide our own config.json, tokenizer_config.json, etc., through the vLLM CLI to apply patches as necessary.

Related work
#1756 lets us specify alternative chat templates or provide a chat template when it is missing from tokenizer_config.json. However, it currently only applies to the OpenAI API-compatible server. #5049 will add a chat method to the main LLM entrypoint, but does not provide a built-in way to load the chat template automatically like in #1756.

Some vLLM models already have hardcoded patches to the HuggingFace config.json; these can be found under vllm/transformers_utils/configs.
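For illustration, such hardcoded patches generally follow the standard transformers pattern of a custom PretrainedConfig subclass that fills in or corrects fields; the class and field names below are hypothetical, not copied from the vLLM source:

```python
from transformers import PretrainedConfig


class PatchedFooConfig(PretrainedConfig):
    # Hypothetical model type; real examples live under vllm/transformers_utils/configs.
    model_type = "foo"

    def __init__(
        self,
        hidden_size=4096,
        num_attention_heads=32,
        max_position_embeddings=8192,  # field the upstream config.json omits
        **kwargs,
    ):
        self.hidden_size = hidden_size
        self.num_attention_heads = num_attention_heads
        self.max_position_embeddings = max_position_embeddings
        super().__init__(**kwargs)
```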