How to decrease inference time of LiLT? #284

Open
piegu opened this issue Apr 29, 2023 · 3 comments

Comments

piegu commented Apr 29, 2023

Hi,

I'm using the Hugging Face libraries to run LiLT.
How can I decrease inference time? What code should I use?

I've already tried BetterTransformer (Optimum) and ONNX, but neither of them accepts the LiLT model.

  • BetterTransformer: NotImplementedError: The model type lilt is not yet supported to be used with BetterTransformer.
  • ONNX: KeyError: "lilt is not supported yet.

Thank you.

Note: I asked this question here, too: jpWang/LiLT#42

piegu changed the title from "How to improve inference time of LiLT?" to "How to decrease inference time of LiLT?" on Apr 30, 2023

piegu (Author) commented May 2, 2023

Issue opened in the Optimum library: huggingface/optimum#1024

bkocis commented Jun 27, 2023

Have you considered making a smaller model? What is your model size?

NielsRogge (Owner) commented Jul 3, 2023

One thing you can try (especially if you're using a multilingual model like https://huggingface.co/nielsr/lilt-xlm-roberta-base) is removing the token embeddings of languages that you don't need.

See this blog post for more info: https://medium.com/@coding-otter/reduce-your-transformers-model-size-by-removing-unwanted-tokens-and-word-embeddings-eec08166d2f9
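
A minimal sketch of the embedding-pruning idea, in plain PyTorch on a toy embedding table rather than on LiLT itself. The helper `prune_embedding` and the `keep_ids` list are hypothetical names chosen for illustration, not part of any library. On a real model you would presumably read the table with `model.get_input_embeddings()`, install the pruned one with `model.set_input_embeddings(...)`, and rebuild the tokenizer vocabulary so that it emits the new ids. This mainly shrinks the model; how much it helps latency would need to be measured.

```python
import torch
from torch import nn


def prune_embedding(embedding: nn.Embedding, keep_ids: list[int]):
    """Copy only the rows listed in keep_ids into a smaller embedding.

    Returns the pruned embedding and a dict mapping old token ids to new ones.
    (Hypothetical helper for illustration.)
    """
    keep = sorted(set(keep_ids))
    index = torch.tensor(keep, dtype=torch.long)
    pruned = nn.Embedding(len(keep), embedding.embedding_dim)
    with torch.no_grad():
        pruned.weight.copy_(embedding.weight[index])
    old_to_new = {old: new for new, old in enumerate(keep)}
    return pruned, old_to_new


# Toy stand-in for a large multilingual vocabulary (1000 tokens, 16 dims).
torch.manual_seed(0)
full = nn.Embedding(1000, 16)

# Assumption: special tokens plus the ids that actually occur in your corpus.
keep_ids = [0, 1, 2, 3, 17, 256, 999]
small, old_to_new = prune_embedding(full, keep_ids)

# The same tokens must produce the same vectors after remapping their ids.
old_input = [0, 17, 999, 2]
new_input = torch.tensor([old_to_new[i] for i in old_input])
assert torch.equal(small(new_input), full(torch.tensor(old_input)))

print(tuple(full.weight.shape), "->", tuple(small.weight.shape))
```

Any token whose id is not in `keep_ids` can no longer be represented, so the kept set should cover every special token and every token your documents can produce.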
