# Allow trust_remote_code and use_fast to be specified in args for Hugging Face #2644
## trust_remote_code

Currently, attempting to set `trust_remote_code` in `args` for `HuggingFaceClient` fails because `trust_remote_code` is already hardcoded to `True`, resulting in a duplicate keyword argument error.

We fix this by only passing `trust_remote_code=True` to `AutoModelForCausalLM` if it is not specified in `args`. Eventually we will break backwards compatibility: we will stop passing `trust_remote_code=True` when it is unspecified, and instead require users to specify `trust_remote_code` in `args` if they need it. This is because the `trust_remote_code=True` default is a security risk.
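A minimal sketch of the pattern, assuming the fix boils down to deferring to the caller's `args` (the helper name `load_model` is hypothetical; the real change lives in `HuggingFaceClient`):

```python
from transformers import AutoModelForCausalLM


def load_model(pretrained_model_name_or_path: str, **kwargs):
    """Hypothetical sketch of the fix, not the actual HuggingFaceClient code."""
    # Preserve the old behavior for backwards compatibility: default
    # trust_remote_code to True, but only when the caller's `args` did not
    # already set it, so the two values no longer collide.
    kwargs.setdefault("trust_remote_code", True)
    return AutoModelForCausalLM.from_pretrained(pretrained_model_name_or_path, **kwargs)


# With args from model_deployments.yaml such as {"trust_remote_code": False},
# the user's value now wins instead of raising a duplicate keyword error.
model = load_model("gpt2", trust_remote_code=False)
```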
## use_fast

Currently, it is not possible to set `use_fast` for Hugging Face tokenizers, because two different Hugging Face tokenizers are instantiated for a run that uses `HuggingFaceClient`: one inside `HuggingFaceClient` and one inside `HuggingFaceTokenizer`. This is problematic because the first tokenizer is instantiated using the `args` from `model_deployments.yaml`, not the `args` from `tokenizer_configs.yaml`, whereas the second tokenizer is instantiated using the `args` from `tokenizer_configs.yaml`. Attempting to set `use_fast` in `model_deployments.yaml` results in an error.

We fix this by deleting the first tokenizer and only using the second tokenizer.
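A minimal sketch of the resulting flow, assuming a single tokenizer built from the `tokenizer_configs.yaml` args (the function name `create_tokenizer` is hypothetical; the real instantiation lives in `HuggingFaceTokenizer`):

```python
from transformers import AutoTokenizer


def create_tokenizer(pretrained_model_name_or_path: str, **kwargs):
    """Hypothetical sketch: with the HuggingFaceClient-side tokenizer removed,
    only HuggingFaceTokenizer instantiates a tokenizer, using the args from
    tokenizer_configs.yaml, so flags like use_fast pass through untouched."""
    return AutoTokenizer.from_pretrained(pretrained_model_name_or_path, **kwargs)


# e.g. with args {"use_fast": False} in tokenizer_configs.yaml, the run now
# gets the slow (Python) tokenizer instead of the flag being ignored:
tokenizer = create_tokenizer("gpt2", use_fast=False)
```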
Fixes #2639