
[Feature]: Option to override HuggingFace's configurations #5205

Closed

DarkLight1337 opened this issue Jun 3, 2024 · 10 comments · Fixed by #5836

Comments

@DarkLight1337 (Member) commented Jun 3, 2024

🚀 The feature, motivation and pitch

The configuration files on HuggingFace may have missing information (e.g. #2051) or contain bugs (e.g. #4008). In such cases, it may be necessary to provide or override the configuration files so the model can be loaded correctly. However, apart from chat templates, there is currently no way to do so; we have to update the source HuggingFace repository directly. It may take time for the authors of those repositories to respond, especially if they are unofficial ones that are not as well maintained.

It would be great if we could provide our own config.json, tokenizer_config.json, etc., through the vLLM CLI to apply patches as necessary.
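Until then, the only general workaround is to patch a local copy by hand. A minimal sketch (assuming huggingface_hub is installed; the model and the overridden field below are examples only):

# Sketch of the manual workaround this feature would replace: download the
# repo, patch config.json on disk, then point vLLM at the patched directory.
# Assumes huggingface_hub is installed; the model and field below are examples only.
import json
from pathlib import Path

from huggingface_hub import snapshot_download

local_dir = Path(snapshot_download("facebook/opt-125m", local_dir="opt-125m-patched"))

config_path = local_dir / "config.json"
config = json.loads(config_path.read_text())
config["max_position_embeddings"] = 4096  # example override only
config_path.write_text(json.dumps(config, indent=2))

# Then: python -m vllm.entrypoints.api_server --model ./opt-125m-patched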

Related work

#1756 lets us specify alternative chat templates or provide a chat template when it is missing from tokenizer_config.json. However, it currently only applies to the OpenAI API-compatible server. #5049 will add a chat method to the main LLM entrypoint, but it does not provide a built-in way to load the chat template automatically as #1756 does.
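For reference, applying an externally supplied chat template boils down to something like the following sketch using the transformers API (the template string is illustrative only):

# Sketch: applying an externally supplied chat template with transformers,
# similar in spirit to what --chat-template enables in the OpenAI server.
# The template string below is illustrative only.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")

custom_template = (
    "{% for message in messages %}"
    "{{ message['role'] }}: {{ message['content'] }}\n"
    "{% endfor %}"
)

messages = [{"role": "user", "content": "Hello!"}]
prompt = tokenizer.apply_chat_template(
    messages, chat_template=custom_template, tokenize=False
)
print(prompt)  # "user: Hello!\n"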

Some vLLM models already have hardcoded patches to the HuggingFace config.json; these can be found under vllm/transformers_utils/configs.

@Suvralipi

By default, the model is downloaded from HuggingFace/ModelScope. Is there a way to load a model from a local filepath, a private repository, or S3 object storage? How can we load supported models from a local storage path when deploying in a local environment?

@DarkLight1337 (Member, Author)

By default, the model is downloaded from HuggingFace/ModelScope. Is there a way to load a model from a local filepath, a private repository, or S3 object storage? How can we load supported models from a local storage path when deploying in a local environment?

Actually, this is already supported - just pass a filepath to --model.
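For example (a minimal sketch; the path below is hypothetical and should contain config.json plus the weight files):

# Minimal sketch: load a model from a local directory instead of the HF Hub.
# The path is hypothetical; it must contain config.json and the weight files.
from vllm import LLM

llm = LLM(model="/mnt/models/opt-125m")
outputs = llm.generate(["Hello, my name is"])
print(outputs[0].outputs[0].text)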

@Suvralipi

Actually, this is already supported - just pass a filepath to --model.

I am trying to deploy the model using KServe with vLLM, with the InferenceService below. The PVC filepath contains the required model, which is accessible from other running pods. I get a similar error if I use an S3 filepath as well. Both paths contain the model and config files as expected.

apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: opt-125m-vllm
spec:
  predictor:
    containers:
      - args:
          - --port
          - "8080"
          - --model
          - "pvc://kubeflow-shared-pvc/llm-mlflow/opt-125m"
        command:
          - python3
          - -m
          - vllm.entrypoints.api_server
        env:
          - name: STORAGE_URI
            value: "pvc://kubeflow-shared-pvc/llm-mlflow/opt-125m"
          - name: PYTORCH_CUDA_ALLOC_CONF
            value: "max_split_size_mb:2048"
        image: kserve/vllmserver:latest
        name: kserve-container
        resources:
          limits:
            cpu: "4"
            memory: 8Gi
            nvidia.com/gpu: "1"
          requests:
            cpu: "1"
            memory: 8Gi
            nvidia.com/gpu: "1"

But it gives me the error below:
Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.8/dist-packages/vllm/entrypoints/api_server.py", line 78, in <module>
    engine = AsyncLLMEngine.from_engine_args(engine_args)
  File "/usr/local/lib/python3.8/dist-packages/vllm/engine/async_llm_engine.py", line 226, in from_engine_args
    engine_configs = engine_args.create_engine_configs()
  File "/usr/local/lib/python3.8/dist-packages/vllm/engine/arg_utils.py", line 147, in create_engine_configs
    model_config = ModelConfig(self.model, self.tokenizer,
  File "/usr/local/lib/python3.8/dist-packages/vllm/config.py", line 57, in __init__
    self.hf_config = get_config(model, trust_remote_code)
  File "/usr/local/lib/python3.8/dist-packages/vllm/transformers_utils/config.py", line 17, in get_config
    config = AutoConfig.from_pretrained(
  File "/usr/local/lib/python3.8/dist-packages/transformers/models/auto/configuration_auto.py", line 1007, in from_pretrained
    config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/transformers/configuration_utils.py", line 620, in get_config_dict
    config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/transformers/configuration_utils.py", line 696, in _get_config_dict
    raise EnvironmentError(
OSError: Can't load the configuration of 'pvc://kubeflow-shared-pvc/llm-mlflow/opt-125m'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'pvc://kubeflow-shared-pvc/llm-mlflow/opt-125m' is the correct path to a directory containing a config.json file

@DarkLight1337 (Member, Author)

Oh, I missed the part where you are using object storage. I only meant that local filepaths are supported.

@Suvralipi

Oh, I missed the part where you are using object storage. I only meant that local filepaths are supported.

So that means it doesn't support persistent volume or object storage paths, only local filepaths?

@DarkLight1337 (Member, Author)

Oh, I missed the part where you are using object storage. I only meant that local filepaths are supported.

So that means it doesn't support persistent volume or object storage paths, only local filepaths?

Yes, that is true. I think supporting non-local filepaths would warrant its own PR/issue.
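In the meantime, a possible workaround (a sketch only; the bucket, prefix, and paths below are hypothetical, and it assumes boto3 is installed) is to copy the model from object storage to local disk first and point --model at that directory:

# Sketch of a workaround: copy the model from S3 to local disk first,
# then point vLLM at the local directory. Bucket/prefix/paths are hypothetical.
import os

import boto3

bucket, prefix, local_dir = "my-models", "llm-mlflow/opt-125m", "/mnt/models/opt-125m"

s3 = boto3.client("s3")
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
    for obj in page.get("Contents", []):
        rel_path = os.path.relpath(obj["Key"], prefix)
        dest = os.path.join(local_dir, rel_path)
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        s3.download_file(bucket, obj["Key"], dest)

# Afterwards: python -m vllm.entrypoints.api_server --model /mnt/models/opt-125m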

@DarkLight1337 added the good first issue label Jun 14, 2024

This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!

@github-actions bot added the stale label Oct 26, 2024
@zwhe99 commented Oct 29, 2024

Hi @DarkLight1337! Do you have a solution for this?

@DarkLight1337 (Member, Author)

You can take a look at the code under vllm.transformers_utils and figure out how to pass user configs to override the configs loaded from HF there.
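For instance, one possible shape for such an override (a sketch only; the get_config_with_overrides helper and the hf_overrides argument are hypothetical, not existing vLLM API) would be to apply a user-supplied dict on top of the config that transformers loads:

# Sketch only: apply user-supplied overrides on top of the config loaded
# from the HF Hub. The `hf_overrides` argument is hypothetical, not vLLM API.
from typing import Any, Dict, Optional

from transformers import AutoConfig, PretrainedConfig


def get_config_with_overrides(
    model: str,
    trust_remote_code: bool = False,
    hf_overrides: Optional[Dict[str, Any]] = None,
) -> PretrainedConfig:
    config = AutoConfig.from_pretrained(model, trust_remote_code=trust_remote_code)
    for key, value in (hf_overrides or {}).items():
        setattr(config, key, value)  # patch missing/buggy fields in place
    return config


# Example: force a larger context window for a model whose config is missing it.
config = get_config_with_overrides(
    "facebook/opt-125m", hf_overrides={"max_position_embeddings": 4096}
)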

@K-Mistele (Contributor)

Thanks for the callout re #2547, @DarkLight1337. It seems like being able to override individual fields in config.json through engine/CLI args would be a good approach? If so, I think llama.cpp has a good reference implementation; it allows overriding GGUF key/value pairs with custom values.
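A rough sketch of how llama.cpp-style key=value overrides could be parsed on the CLI (the --config-override flag is hypothetical, not an existing vLLM option; value types are inferred from the text):

# Sketch of parsing llama.cpp-style key=value overrides from the CLI.
# The --config-override flag name is hypothetical, not an existing vLLM option.
import argparse
import json


def parse_override(item: str):
    key, _, raw = item.partition("=")
    try:
        value = json.loads(raw)  # handles ints, floats, bools, null, quoted strings
    except json.JSONDecodeError:
        value = raw  # fall back to a plain string
    return key, value


parser = argparse.ArgumentParser()
parser.add_argument("--config-override", action="append", default=[],
                    metavar="KEY=VALUE", help="Override a field in config.json")
args = parser.parse_args(["--config-override", "rope_theta=1000000",
                          "--config-override", "model_type=llama"])

overrides = dict(parse_override(item) for item in args.config_override)
print(overrides)  # {'rope_theta': 1000000, 'model_type': 'llama'}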
