Community contribution - optimum.exporters.onnx support for new models! #555
Hi! I'm trying to add support for VisualBERT, which works for VQA, VCR, NLVR and RPG. The problem comes when implementing the inputs property... What exactly does this property specify? In the guide, I see that these inputs are exactly the output keys of BERT's tokenizer, and the values are the tensor dimensions for each key of the tokenizer's output. This will vary task-wise, so I'd have to define different axes for each task. Is this OK? Thanks for the help!
EDIT: I see VisualBERT is implemented separately per task, but VisualBertForPreTraining is also provided for customized down-stream tasks. Should I implement a different configuration for each task?
EDIT II: I see this issue was previously in the transformers repo, and it seems the docs on how to add the ONNX configuration are written in a way that ignores the current optimum implementation. I have sorted out some of the difficulties that arise from this, assuming one ONNX config for the whole model. Can I help with an update for this guide?
Hi @mszsorondo, indeed the page https://huggingface.co/docs/transformers/serialization#export-to-onnx is a bit outdated. I'll do a PR to fix it. In your EDIT II, were you referring to this page? I'd recommend referring to https://huggingface.co/docs/optimum/main/en/exporters/onnx/usage_guides/contribute instead. If you see any issues or unclear steps in the guide, don't hesitate to open a PR!
As for VisualBERT, I guess you haven't picked the easiest one :) I don't think you need to implement a config for each task, since all tasks appear to take the same inputs. To implement the inputs property, you can refer to optimum/optimum/exporters/onnx/model_configs.py, lines 523 to 528 in 9ac1703.
You can very well do an if/else on the input/output keys (or axes) depending on the task, as is done for BART: optimum/optimum/exporters/onnx/model_configs.py, lines 382 to 389 in 9ac1703.
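For illustration, a task-dependent inputs property could look roughly like the sketch below. The class name, task name, and the visual_* inputs are assumptions made up for this example; this is not the actual VisualBERT implementation, only a mirror of the BART-style if/else mentioned above.

```python
# Hypothetical sketch of a task-dependent `inputs` property for an ONNX config.
# Class, task, and input names are illustrative only.
from typing import Dict

from optimum.exporters.onnx.config import TextEncoderOnnxConfig


class MyVisualBertOnnxConfig(TextEncoderOnnxConfig):
    @property
    def inputs(self) -> Dict[str, Dict[int, str]]:
        # Text inputs shared by every task: each entry maps a dynamic axis index to a name.
        common_inputs = {
            "input_ids": {0: "batch_size", 1: "sequence_length"},
            "attention_mask": {0: "batch_size", 1: "sequence_length"},
            "token_type_ids": {0: "batch_size", 1: "sequence_length"},
        }
        # Task-dependent extra inputs, mirroring BART's if/else on self.task.
        if self.task == "visual-question-answering":
            common_inputs["visual_embeds"] = {0: "batch_size", 1: "visual_sequence_length"}
            common_inputs["visual_attention_mask"] = {0: "batch_size", 1: "visual_sequence_length"}
        return common_inputs
```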
I think the piece where you will have the most work to do is extending the dummy input generators. They are meant to generate inputs for the model without using a preprocessor, and they help to flexibly generate inputs of various shapes (for export validation, for example). You would need to extend an existing one, or create a new input generator, to support the additional visual inputs; see the sketch below.
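To make that concrete, here is a rough sketch of what such a generator could look like. The class name, the visual_embedding_dim attribute on the normalized config, and the constructor arguments are assumptions for illustration, not the exact interface Optimum requires.

```python
# Rough sketch of a custom dummy input generator for visual inputs.
# Names, shapes, and the constructor signature are assumptions for illustration.
from optimum.utils.input_generators import DummyInputGenerator


class DummyVisualEmbeddingsGenerator(DummyInputGenerator):
    SUPPORTED_INPUT_NAMES = ("visual_embeds", "visual_attention_mask", "visual_token_type_ids")

    def __init__(self, task, normalized_config, batch_size=2, visual_seq_length=16, **kwargs):
        self.task = task
        self.batch_size = batch_size
        self.visual_seq_length = visual_seq_length
        # Assumes the normalized config exposes the visual embedding dimension.
        self.visual_embedding_dim = normalized_config.visual_embedding_dim

    def generate(self, input_name, framework="pt", **kwargs):
        if input_name == "visual_embeds":
            # Float tensor of shape (batch_size, visual_seq_length, visual_embedding_dim).
            return self.random_float_tensor(
                [self.batch_size, self.visual_seq_length, self.visual_embedding_dim],
                framework=framework,
            )
        # Masks / token type ids: integer tensors of shape (batch_size, visual_seq_length).
        return self.random_int_tensor(
            [self.batch_size, self.visual_seq_length], max_value=2, framework=framework
        )
```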
Thanks for your help @fxmarty
I was actually referring to the second guide (https://huggingface.co/docs/optimum/main/en/exporters/onnx/usage_guides/contribute); there are some minor issues with two function calls at the export step, plus one missing import. Submitted PR #662. I advanced with the inputs function and did the export step, and indeed got an error regarding the visual inputs.
Hi @michaelbenayoun! Is someone working on adding the Pegasus ONNX config? If not, I would like to look into it 😄 (under your guidance, since I haven't written an ONNXConfig yet).
Hi @bhavnicksm, @mht-sharma just merged the Pegasus ONNX config yesterday! #620
@fxmarty Still facing an issue.
I installed optimum directly from source using !pip install --quiet git+https://github.com/huggingface/optimum.git and tried to run inference with Pegasus using ORTModelForSeq2SeqLM, with the following code:
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from optimum.onnxruntime import ORTModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained("tuner007/pegasus_paraphrase")
model = AutoModelForSeq2SeqLM.from_pretrained("tuner007/pegasus_paraphrase")
ort_model = ORTModelForSeq2SeqLM.from_pretrained("tuner007/pegasus_paraphrase", from_transformers=True)
and it gives me the following error:
/usr/local/lib/python3.8/dist-packages/transformers/models/pegasus/modeling_pegasus.py:234: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if attn_weights.size() != (bsz * self.num_heads, tgt_len, src_len):
/usr/local/lib/python3.8/dist-packages/transformers/models/pegasus/modeling_pegasus.py:241: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if attention_mask.size() != (bsz, 1, tgt_len, src_len):
/usr/local/lib/python3.8/dist-packages/transformers/models/pegasus/modeling_pegasus.py:273: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if attn_output.size() != (bsz * self.num_heads, tgt_len, self.head_dim):
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-7-2e0907dfd025> in <module>
----> 1 ort_model = ORTModelForSeq2SeqLM.from_pretrained("tuner007/pegasus_paraphrase", from_transformers=True)
/usr/local/lib/python3.8/dist-packages/optimum/onnxruntime/modeling_ort.py in from_pretrained(cls, model_id, from_transformers, force_download, use_auth_token, cache_dir, subfolder, config, local_files_only, provider, session_options, provider_options, **kwargs)
555 `ORTModel`: The loaded ORTModel model.
556 """
--> 557 return super().from_pretrained(
558 model_id,
559 from_transformers=from_transformers,
/usr/local/lib/python3.8/dist-packages/optimum/modeling_base.py in from_pretrained(cls, model_id, from_transformers, force_download, use_auth_token, cache_dir, subfolder, config, local_files_only, **kwargs)
323
324 from_pretrained_method = cls._from_transformers if from_transformers else cls._from_pretrained
--> 325 return from_pretrained_method(
326 model_id=model_id,
327 config=config,
/usr/local/lib/python3.8/dist-packages/optimum/onnxruntime/modeling_seq2seq.py in _from_transformers(cls, model_id, config, use_auth_token, revision, force_download, cache_dir, subfolder, local_files_only, use_cache, provider, session_options, provider_options, use_io_binding, task)
1144 output_names.append(ONNX_DECODER_WITH_PAST_NAME)
1145 models_and_onnx_configs = get_encoder_decoder_models_for_export(model, onnx_config)
-> 1146 export_models(
1147 models_and_onnx_configs=models_and_onnx_configs,
1148 opset=onnx_config.DEFAULT_ONNX_OPSET,
/usr/local/lib/python3.8/dist-packages/optimum/exporters/onnx/convert.py in export_models(models_and_onnx_configs, output_dir, opset, output_names, device, input_shapes)
534
535 outputs.append(
--> 536 export(
537 model=submodel,
538 config=sub_onnx_config,
/usr/local/lib/python3.8/dist-packages/optimum/exporters/onnx/convert.py in export(model, config, output, opset, device, input_shapes)
605 f" got: {torch.__version__}"
606 )
--> 607 return export_pytorch(model, config, opset, output, device=device, input_shapes=input_shapes)
608
609 elif is_tf_available() and issubclass(type(model), TFPreTrainedModel):
/usr/local/lib/python3.8/dist-packages/optimum/exporters/onnx/convert.py in export_pytorch(model, config, opset, output, device, input_shapes)
368 # Export can work with named args but the dict containing named args has to be the last element of the args
369 # tuple.
--> 370 onnx_export(
371 model,
372 (dummy_inputs,),
/usr/local/lib/python3.8/dist-packages/torch/onnx/utils.py in export(model, args, f, export_params, verbose, training, input_names, output_names, operator_export_type, opset_version, do_constant_folding, dynamic_axes, keep_initializers_as_inputs, custom_opsets, export_modules_as_functions)
502 """
503
--> 504 _export(
505 model,
506 args,
/usr/local/lib/python3.8/dist-packages/torch/onnx/utils.py in _export(model, args, f, export_params, verbose, training, input_names, output_names, operator_export_type, export_type, opset_version, do_constant_folding, dynamic_axes, keep_initializers_as_inputs, fixed_batch_size, custom_opsets, add_node_names, onnx_shape_inference, export_modules_as_functions)
1527 _validate_dynamic_axes(dynamic_axes, model, input_names, output_names)
1528
-> 1529 graph, params_dict, torch_out = _model_to_graph(
1530 model,
1531 args,
/usr/local/lib/python3.8/dist-packages/torch/onnx/utils.py in _model_to_graph(model, args, verbose, input_names, output_names, operator_export_type, do_constant_folding, _disable_torch_constant_prop, fixed_batch_size, training, dynamic_axes)
1113
1114 try:
-> 1115 graph = _optimize_graph(
1116 graph,
1117 operator_export_type,
/usr/local/lib/python3.8/dist-packages/torch/onnx/utils.py in _optimize_graph(graph, operator_export_type, _disable_torch_constant_prop, fixed_batch_size, params_dict, dynamic_axes, input_names, module)
662
663 graph = _C._jit_pass_onnx(graph, operator_export_type)
--> 664 _C._jit_pass_onnx_lint(graph)
665 _C._jit_pass_lint(graph)
666
RuntimeError: Unable to cast from non-held to held instance (T& to Holder<T>) (#define PYBIND11_DETAILED_ERROR_MESSAGES or compile in debug mode for type information)
@bhavnicksm Can you open an issue in Optimum with your environment details? We can track it there!
@fxmarty Please re-open this. 🤗
Thanks!
I can look into ImageGPT, if it has not yet been claimed.
Feel free! Don't hesitate to ask any questions if needed.
Can I take TAPAS if it hasn't been claimed yet?
Hello, can I work on RegNet?
Yes to both, feel free!
Hi @michaelbenayoun, I went through the codebase recently and I think the list above may not be up to date. I found that a few models already have their own configurations in this file.
Thank you @hazrulakmal, I updated the list!
Hi, does Optimum support converting Llama (alpaca-lora) to ONNX? It would be great to get some insights on this.
Yes, this is supported and was introduced in #975. You'll need to have Optimum v1.8 to do it.
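For reference, a minimal way to try it is through the ONNX Runtime integration. The checkpoint name below is just an illustrative placeholder; any Llama-style causal LM should follow the same pattern.

```python
# Minimal sketch: export a Llama-style checkpoint to ONNX on the fly and run generation.
# The model ID is an illustrative placeholder; Llama support requires Optimum >= 1.8.
from transformers import AutoTokenizer
from optimum.onnxruntime import ORTModelForCausalLM

model_id = "my-org/my-llama-checkpoint"  # hypothetical checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Converts the PyTorch weights to ONNX during loading.
# On older Optimum versions, use from_transformers=True instead of export=True.
model = ORTModelForCausalLM.from_pretrained(model_id, export=True)

inputs = tokenizer("Hello, my name is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```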
Are you doing a PR that will be merged in Optimum? If so, you can add your model to _SUPPORTED_MODEL_TYPE:
_SUPPORTED_MODEL_TYPE = {
    ....,
    "custom": supported_task_mapping("text-classification", ...., onnx="CustomOnnxConfig")
}
But if you are not doing a PR that will be merged in Optimum, you can register your config dynamically:
register_for_onnx = TasksManager.create_register("onnx")

@register_for_onnx("model_type_here", "text-classification", ...)
class CustomOnnxConfig(TextEncoderOnnxConfig):
    ...
If you do it programmatically, I do not think you need to register anything.
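As a rough illustration of the programmatic route (no registration needed), something like the sketch below could work; the model class, config class, and checkpoint name are placeholders taken from the discussion above, following the pattern shown in the contribution guide.

```python
# Hypothetical sketch: export a custom model programmatically, without registering
# anything in TasksManager. Class names and the checkpoint path are placeholders.
from pathlib import Path

from optimum.exporters.onnx import export

from my_library.modeling import CustomBertForTokenClassification  # your own model class
from my_library.onnx_configs import CustomOnnxConfig              # your own OnnxConfig subclass

base_model = CustomBertForTokenClassification.from_pretrained("my-checkpoint")
onnx_config = CustomOnnxConfig(base_model.config, task="token-classification")

# Converts the PyTorch model to ONNX and returns the ONNX input/output names.
onnx_inputs, onnx_outputs = export(
    model=base_model,
    config=onnx_config,
    output=Path("model.onnx"),
    opset=onnx_config.DEFAULT_ONNX_OPSET,
)
```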
Alright, could you open a PR for your issue please?
Thank you for spending time on me! I think a PR will be a difficult thing to do, since I am not that proficient, and I don't think many people will want to use my architecture anyway. Maybe you can advise how to do it in code just for my library?
base_model = CustomBertForTokenClassification.from_pretrained("my-checkpoint")
Sorry, I meant a separate issue...
Thank you a lot, I'll delete my comments here since they are unrelated to the discussion. I asked on the discussion forum.
I can work on CvT, if it's open.
Hi @rishabbala, sounds good, let us know if you need any help! A good reference is https://huggingface.co/docs/optimum/main/en/exporters/onnx/usage_guides/contribute
According to the above list, export of BLOOM models to ONNX is already supported, right? Is export to ONNX already supposed to work for base models that have been fine-tuned with PEFT / LoRA? Using the ...
Hello, I would like to add ONNX exporter support for Funnel Transformer.
Hi @sidistic! Feel free to open a PR here and we'll help you if there is any issue 🙂
Hello @regisss! I have opened a PR. This is my first ever PR on an open source project, so I'm looking forward to hearing your advice and learning from you.
Hello, is anyone working on implementing this? If not, then I might look into it.
Hi, I'm trying to export ChatGLM2 & Qwen models to ONNX using HF Optimum.
I'm using this code to export ChatGLM2: https://gist.github.com/manishghop/9be5aee6ed3d7551c751cc5d9f7eb8c3
Thanks in advance.
Hey all, wanted to see if I could pick up the CANINE implementation. I saw @RaghavPrabhakar66 was doing some work on it in the previous issue thread, but didn't see an official PR for it.
@mattsthilaire For sure, feel free to open a PR!
Could you please add support for the Florence2 model?
@mattsthilaire Hi. Were you able to work on CANINE? |
Hey @ozancaglayan, unfortunately no. Last I left off on my local branch, I was getting shape mismatches in the upsampling part. Tried to troubleshoot using Netron to no avail. Happy to pass it off to you if you want to take a pass at it, or to pair up on it and see where we get.
@ozancaglayan Hi, I have opened a draft PR with some of the work done. When I try to run the model for QA tasks with ORTModelForQuestionAnswering, I get the following output and error:
Some weights of CanineForQuestionAnswering were not initialized from the model checkpoint at google/canine-s and are newly initialized: ['qa_outputs.bias', 'qa_outputs.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Framework not specified. Using pt to export the model.
Some weights of CanineForQuestionAnswering were not initialized from the model checkpoint at google/canine-s and are newly initialized: ['qa_outputs.bias', 'qa_outputs.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Using the export variant default. Available variants are:
- default: The default ONNX variant.
***** Exporting submodel 1/1: CanineForQuestionAnswering *****
Using framework PyTorch: 2.4.0+cu121
Overriding 1 configuration item(s)
- use_cache -> False
/home/raghav/.micromamba/envs/optimum/lib/python3.10/site-packages/transformers/models/canine/modeling_canine.py:604: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
chunk_end = min(from_seq_length, chunk_start + self.attend_from_chunk_width)
/home/raghav/.micromamba/envs/optimum/lib/python3.10/site-packages/transformers/models/canine/modeling_canine.py:612: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
chunk_end = min(to_seq_length, chunk_start + self.attend_to_chunk_width)
/home/raghav/.micromamba/envs/optimum/lib/python3.10/site-packages/transformers/models/canine/modeling_canine.py:1073: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
remainder_length = torch.fmod(torch.tensor(char_seq_length), torch.tensor(rate)).item()
/home/raghav/.micromamba/envs/optimum/lib/python3.10/site-packages/transformers/models/canine/modeling_canine.py:1073: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
remainder_length = torch.fmod(torch.tensor(char_seq_length), torch.tensor(rate)).item()
/home/raghav/.micromamba/envs/optimum/lib/python3.10/site-packages/transformers/models/canine/modeling_canine.py:1073: TracerWarning: Converting a tensor to a Python number might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
remainder_length = torch.fmod(torch.tensor(char_seq_length), torch.tensor(rate)).item()
/home/raghav/.micromamba/envs/optimum/lib/python3.10/site-packages/torch/onnx/_internal/jit_utils.py:314: UserWarning: Constant folding - Only steps=1 can be constant folded for opset >= 10 onnx::Slice op. Constant folding not applied. (Triggered internally at ../torch/csrc/jit/passes/onnx/constant_fold.cpp:179.)
_C._jit_pass_onnx_node_shape_type_inference(node, params_dict, opset_version)
/home/raghav/.micromamba/envs/optimum/lib/python3.10/site-packages/torch/onnx/utils.py:739: UserWarning: Constant folding - Only steps=1 can be constant folded for opset >= 10 onnx::Slice op. Constant folding not applied. (Triggered internally at ../torch/csrc/jit/passes/onnx/constant_fold.cpp:179.)
_C._jit_pass_onnx_graph_shape_type_inference(
/home/raghav/.micromamba/envs/optimum/lib/python3.10/site-packages/torch/onnx/utils.py:1244: UserWarning: Constant folding - Only steps=1 can be constant folded for opset >= 10 onnx::Slice op. Constant folding not applied. (Triggered internally at ../torch/csrc/jit/passes/onnx/constant_fold.cpp:179.)
_C._jit_pass_onnx_graph_shape_type_inference(
DEBUG: Input shapes: {'input_ids': torch.Size([1, 24]), 'token_type_ids': torch.Size([1, 24]), 'attention_mask': torch.Size([1, 24])}
DEBUG: PT Model Output: QuestionAnsweringModelOutput(loss=None, start_logits=tensor([[ 0.1626, 0.2406, 0.2992, 0.2548, 0.2493, 0.1242, -0.0853, 0.0104,
Traceback (most recent call last):
File "/home/raghav/Dev/huggingface/optimum/test.py", line 25, in <module>
model = ORTModelForQuestionAnswering.from_pretrained(
File "/home/raghav/Dev/huggingface/optimum/optimum/onnxruntime/modeling_ort.py", line 738, in from_pretrained
return super().from_pretrained(
File "/home/raghav/Dev/huggingface/optimum/optimum/modeling_base.py", line 424, in from_pretrained
return from_pretrained_method(
File "/home/raghav/Dev/huggingface/optimum/optimum/onnxruntime/modeling_ort.py", line 599, in _from_transformers
return cls._export(
File "/home/raghav/Dev/huggingface/optimum/optimum/onnxruntime/modeling_ort.py", line 668, in _export
return cls._from_pretrained(
File "/home/raghav/Dev/huggingface/optimum/optimum/onnxruntime/modeling_ort.py", line 554, in _from_pretrained
model = ORTModel.load_model(
File "/home/raghav/Dev/huggingface/optimum/optimum/onnxruntime/modeling_ort.py", line 397, in load_model
return ort.InferenceSession(
File "/home/raghav/.micromamba/envs/optimum/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 419, in __init__
self._create_inference_session(providers, provider_options, disabled_optimizers)
File "/home/raghav/.micromamba/envs/optimum/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 483, in _create_inference_session
sess.initialize_session(providers, provider_options, disabled_optimizers)
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Node (/canine/final_char_encoder/layer.0/attention/self/MatMul) Op (MatMul) [ShapeInferenceError] Incompatible dimensions
Code to reproduce it:
from transformers import (
AutoConfig,
AutoModelForQuestionAnswering,
AutoTokenizer,
)
from optimum.onnxruntime import ORTModelForQuestionAnswering
model_name = "google/canine-s"
config = AutoConfig.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
dummy_inputs = tokenizer("This is a sample input", return_tensors="pt")
input_shapes = {k: v.shape for k, v in dummy_inputs.items()}
print(f"DEBUG: Input shapes: {input_shapes}")
pt_model = AutoModelForQuestionAnswering.from_pretrained(model_name)
outputs = pt_model(**dummy_inputs)
print(f"DEBUG: PT Model Output: {outputs}")
model = ORTModelForQuestionAnswering.from_pretrained(
model_name,
export=True,
)
outputs = model(**dummy_inputs)
print(f"DEBUG: ONNX Model Output: {outputs}") |
Following what was done by @chainyo in Transformers, in the "ONNXConfig: Add a configuration for all available models" issue, the idea is to add support for exporting new models in optimum.exporters.onnx.
This issue is about the working group specially created for this task. If you are interested in helping out, reply here, take a look at this organization, or add ChainYo#3610 on Discord.
We want to contribute to Hugging Face's ONNX export implementation for all available models on the Hugging Face Hub. There are already a lot of architectures implemented for converting PyTorch models to ONNX, but we need more! We need them all!
Feel free to join us in this adventure! Join the org by clicking here
Here is a non-exhaustive list of the models available:
🛠️ next to a model indicates that a PR is in progress. If there is nothing next to a model, it means that the ONNX export does not yet support the model, and thus we need to add support for it.
If you need help implementing an unsupported model, here is a guide from the Hugging Face Optimum documentation.
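For new contributors, the typical shape of a model's configuration in optimum/exporters/onnx/model_configs.py looks roughly like the sketch below: a hypothetical BERT-like text encoder, adapted from the contribution guide. The class name and the opset/tolerance values are placeholders.

```python
# Hypothetical sketch of a new ONNX config for a BERT-like text encoder.
# The class name and the opset/tolerance values are placeholders.
from optimum.exporters.onnx.config import TextEncoderOnnxConfig
from optimum.utils import NormalizedTextConfig


class MyNewModelOnnxConfig(TextEncoderOnnxConfig):
    NORMALIZED_CONFIG_CLASS = NormalizedTextConfig  # maps config attributes to a common interface
    DEFAULT_ONNX_OPSET = 13
    ATOL_FOR_VALIDATION = 1e-4

    @property
    def inputs(self):
        # Dynamic axes for each ONNX input: axis index -> symbolic dimension name.
        return {
            "input_ids": {0: "batch_size", 1: "sequence_length"},
            "attention_mask": {0: "batch_size", 1: "sequence_length"},
        }
```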