
emotion2vec onnx #2291

Open · thewh1teagle opened this issue on Dec 11, 2024 · 4 comments
Labels: question (Further information is requested)

@thewh1teagle commented on Dec 11, 2024

Can you add ONNX support for emotion2vec?
I tried to add it myself. My understanding is that we need to implement an export method in model.py, add an export_meta.py file, and finally export with export.py. Is that correct?
If so, could you add support for it, or guide me on how to export it to ONNX?
Thanks.

Related: ddlBoJack/emotion2vec#55
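
For reference, other FunASR models wire up ONNX export through an export_meta.py that attaches export helpers onto the trained model; the sketch below follows that pattern, but the dummy input shape, axis names, and output name for emotion2vec are assumptions, not existing code:

```python
# export_meta.py (hypothetical, modeled on other FunASR export_meta files)
import types
import torch

def export_rebuild_model(model, **kwargs):
    # Attach the helpers that FunASR's export machinery looks for.
    model.export_dummy_inputs = types.MethodType(export_dummy_inputs, model)
    model.export_input_names = types.MethodType(export_input_names, model)
    model.export_output_names = types.MethodType(export_output_names, model)
    model.export_dynamic_axes = types.MethodType(export_dynamic_axes, model)
    return model

def export_dummy_inputs(self):
    # One second of 16 kHz audio as the tracing input (assumed shape).
    return (torch.randn(1, 16000),)

def export_input_names(self):
    return ["waveform"]

def export_output_names(self):
    return ["logits"]

def export_dynamic_axes(self):
    return {"waveform": {0: "batch", 1: "samples"}, "logits": {0: "batch"}}
```

With something like this registered, the export itself would presumably go through AutoModel's export entry point (e.g. model.export(type="onnx")), as other FunASR models do.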

@thewh1teagle added the question label on Dec 11, 2024
@LauraGPT (Collaborator) commented:

Sorry, I am busy right now. I expect the ONNX export process to be straightforward. You could refer to the docs: https://pytorch.org/docs/stable/onnx_torchscript.html#torch.onnx.export
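
For anyone following along, a minimal, self-contained example of the call shape those docs describe (the toy module and the output file name are placeholders):

```python
# Minimal torch.onnx.export example, per the linked TorchScript-exporter docs.
import torch
import torch.nn as nn

class Tiny(nn.Module):
    def forward(self, x):
        return torch.relu(x)

model = Tiny().eval()
dummy = torch.randn(1, 16000)  # tracing input: 1 s of 16 kHz audio
torch.onnx.export(
    model,
    (dummy,),
    "tiny.onnx",                             # output file path
    opset_version=13,
    input_names=["input"],
    output_names=["output"],
    dynamic_axes={"input": {1: "samples"}},  # allow variable-length input
)
```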

@thewh1teagle (Author) commented:

> Sorry, I am busy right now. I expect the ONNX export process to be straightforward. You could refer to the docs: https://pytorch.org/docs/stable/onnx_torchscript.html#torch.onnx.export

I tried the following:

git clone https://github.com/modelscope/FunASR
cd FunASR
uv venv -p 3.9
source .venv/bin/activate
uv pip install -e .

uv pip install torch torchvision torchaudio setuptools pydub onnx onnxconverter_common
uv run export.py

with export.py:

from funasr import AutoModel
import torch

# 1 s of dummy 16 kHz audio as the tracing input
dummy_input = torch.randn(1, 16000)
model = AutoModel(model="iic/emotion2vec_base_finetuned")
# export the underlying torch module (AutoModel itself is not an nn.Module)
torch.onnx.export(model.model, dummy_input, opset_version=13, input_names=['input'], output_names=['output'])

But there are issues with compute_mask_indices:

#2085

After omitting the invalid arguments, it failed with:

 uv run export.py
/Volumes/Internal/audio/emotion/FunASR/.venv/lib/python3.9/site-packages/urllib3/__init__.py:35: NotOpenSSLWarning: urllib3 v2 only supports OpenSSL 1.1.1+, currently the 'ssl' module is compiled with 'LibreSSL 2.8.3'. See: https://github.com/urllib3/urllib3/issues/3020
  warnings.warn(
funasr version: 1.2.0.
Check update of funasr, and it would cost few times. You may disable it by set `disable_update=True` in AutoModel
You are using the latest version of funasr-1.2.0
Downloading Model to directory: /Users/user/.cache/modelscope/hub/iic/emotion2vec_base_finetuned
2024-12-12 13:04:52,124 - modelscope - WARNING - Using branch: master as version is unstable, use with caution
Warning, miss key in ckpt: modality_encoders.AUDIO.decoder.blocks.0.0.weight, /Users/user/.cache/modelscope/hub/iic/emotion2vec_base_finetuned/emotion2vec_base.pt
Warning, miss key in ckpt: modality_encoders.AUDIO.decoder.blocks.0.0.bias, /Users/user/.cache/modelscope/hub/iic/emotion2vec_base_finetuned/emotion2vec_base.pt
Warning, miss key in ckpt: modality_encoders.AUDIO.decoder.blocks.1.0.weight, /Users/user/.cache/modelscope/hub/iic/emotion2vec_base_finetuned/emotion2vec_base.pt
Warning, miss key in ckpt: modality_encoders.AUDIO.decoder.blocks.1.0.bias, /Users/user/.cache/modelscope/hub/iic/emotion2vec_base_finetuned/emotion2vec_base.pt
Warning, miss key in ckpt: modality_encoders.AUDIO.decoder.blocks.2.0.weight, /Users/user/.cache/modelscope/hub/iic/emotion2vec_base_finetuned/emotion2vec_base.pt
Warning, miss key in ckpt: modality_encoders.AUDIO.decoder.blocks.2.0.bias, /Users/user/.cache/modelscope/hub/iic/emotion2vec_base_finetuned/emotion2vec_base.pt
Warning, miss key in ckpt: modality_encoders.AUDIO.decoder.blocks.3.0.weight, /Users/user/.cache/modelscope/hub/iic/emotion2vec_base_finetuned/emotion2vec_base.pt
Warning, miss key in ckpt: modality_encoders.AUDIO.decoder.blocks.3.0.bias, /Users/user/.cache/modelscope/hub/iic/emotion2vec_base_finetuned/emotion2vec_base.pt
Warning, miss key in ckpt: modality_encoders.AUDIO.decoder.proj.weight, /Users/user/.cache/modelscope/hub/iic/emotion2vec_base_finetuned/emotion2vec_base.pt
Warning, miss key in ckpt: modality_encoders.AUDIO.decoder.proj.bias, /Users/user/.cache/modelscope/hub/iic/emotion2vec_base_finetuned/emotion2vec_base.pt
Traceback (most recent call last):
  File "/Volumes/Internal/audio/emotion/FunASR/export.py", line 6, in <module>
    torch.onnx.export(model.model, dummy_input, opset_version=13, input_names=['input'], output_names=['output'])
  File "/Volumes/Internal/audio/emotion/FunASR/.venv/lib/python3.9/site-packages/torch/onnx/__init__.py", line 375, in export
    export(
  File "/Volumes/Internal/audio/emotion/FunASR/.venv/lib/python3.9/site-packages/torch/onnx/utils.py", line 502, in export
    _export(
  File "/Volumes/Internal/audio/emotion/FunASR/.venv/lib/python3.9/site-packages/torch/onnx/utils.py", line 1564, in _export
    graph, params_dict, torch_out = _model_to_graph(
  File "/Volumes/Internal/audio/emotion/FunASR/.venv/lib/python3.9/site-packages/torch/onnx/utils.py", line 1169, in _model_to_graph
    _set_input_and_output_names(graph, input_names, output_names)
  File "/Volumes/Internal/audio/emotion/FunASR/.venv/lib/python3.9/site-packages/torch/onnx/utils.py", line 1696, in _set_input_and_output_names
    set_names(list(graph.outputs()), output_names, "output")
  File "/Volumes/Internal/audio/emotion/FunASR/.venv/lib/python3.9/site-packages/torch/onnx/utils.py", line 1673, in set_names
    raise RuntimeError(
RuntimeError: number of output names provided (1) exceeded number of outputs (0)
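
That last error means the traced graph ended up with zero tensor outputs: torch.onnx.export only counts tensors returned from forward(), so a forward that returns a dict (or whose useful tensor is only reachable through kwargs-driven branches) leaves nothing to attach output names to. A common workaround is a thin wrapper whose forward returns a plain tensor; the extract_features call and the "x" key below are assumptions about the emotion2vec API, shown only as a sketch:

```python
import torch
import torch.nn as nn

class Emotion2vecExportWrapper(nn.Module):
    """Trace-friendly wrapper that returns a single tensor (a sketch)."""

    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, source):
        # Hypothetical call: run the upstream model without masking and
        # return only the feature tensor, so ONNX sees exactly one output.
        out = self.model.extract_features(source, padding_mask=None)
        return out["x"]

# Usage sketch:
# wrapped = Emotion2vecExportWrapper(model.model).eval()
# torch.onnx.export(wrapped, (torch.randn(1, 16000),), "emotion2vec.onnx",
#                   opset_version=13, input_names=["waveform"], output_names=["feats"])
```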

@altunenes commented:
I also encountered similar issues with the masking mechanism when I tried to convert to ONNX, so my conversions didn't preserve the complete model architecture :( I wonder what I'm missing :-)

@thewh1teagle (Author) commented:

I made some progress here: thewh1teagle@586e81d. I was able to export the model to ONNX, but currently it only accepts a fixed input size of a single 16,000-sample waveform. Maybe one of you could help modify it to accept dynamic input sizes and map the output to label scores?
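
A hedged sketch of those two pieces: declaring the sample axis dynamic at export time, and turning the raw output into per-label scores at inference time. The axis names, output shape, and label order are assumptions (the 9-class list below is what the emotion2vec finetuned model card appears to use; verify before relying on it):

```python
import numpy as np
import onnxruntime as ort

# 1) At export time, mark the sample axis as dynamic so any-length audio works:
# torch.onnx.export(wrapped, (dummy,), "emotion2vec.onnx", opset_version=13,
#                   input_names=["waveform"], output_names=["logits"],
#                   dynamic_axes={"waveform": {1: "samples"}})

# 2) At inference time, softmax the logits into label scores.
labels = ["angry", "disgusted", "fearful", "happy", "neutral",
          "other", "sad", "surprised", "unknown"]  # assumed 9-class order

session = ort.InferenceSession("emotion2vec.onnx")
waveform = np.random.randn(1, 48000).astype(np.float32)  # 3 s of 16 kHz audio
(logits,) = session.run(["logits"], {"waveform": waveform})

exp = np.exp(logits - logits.max(axis=-1, keepdims=True))  # stable softmax
scores = exp / exp.sum(axis=-1, keepdims=True)
for label, score in zip(labels, scores[0]):
    print(f"{label}: {score:.3f}")
```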
