Missing deployment part on TensorRT #2
We did not open-source our code for TensorRT deployment. We are planning to deploy our model using TVM, which I think is a more suitable framework for an open-source project, but I cannot be sure of the exact date.
Thank you @kssteven418 for your answer. I don't know about the open-source thing... most of us are already using … That said, I understand your view: TVM is a great project, well run, with a community-first approach, so it makes sense to push a project that is not well known enough (IMO) in the NLP community (compared to ORT, for instance). Anyway, if possible, I would really appreciate any guidelines for running your model in a performant way on a GPU :-) (even if no code is provided)
@kssteven418 I've also been trying to export the model to ONNX (from PyTorch) for deploying on TRT. It seems like it needs a custom operator for the … I also agree with @pommedeterresautee's point on benchmarking the differences, so it'd be fantastic if you were able to share the deployment code or the custom ONNX ops for TRT. Thanks!
❓ Questions and Help
You make reference in the paper and on Hugging Face to a TensorRT deployment, but I can't find the code.
Do you plan to share it too?
As far as I know, the NVIDIA repo only has examples for their own models (all BERT-based); it's a bit hard to try it on our own without an example.