Added support for setGraphExecutorOptimize with torchscript models. #904
Conversation
…required model warmup for different batch sizes.
@frankfliu I created this as a draft pull request since it is my first pull request. Please have a look, and you can mark it ready for review.
After doing some testing, there is currently a bug with multi-threaded inference: the same multi-second delay appears after a few inferences. It does not happen single-threaded. I have fixed it by running … I propose that we add another parameter to … I am going to continue investigating to see if I can figure out why the value is not respected when set in the PtSymbolBlock on the first inference, and only when using more than one thread.
I was able to resolve this by calling … I think the best course of action would be to add a parameter to …
… since this is not respected in a multi-threaded environment.
It might be a good idea to simply let the developer use the JNI function however they want, rather than constraining it one way or the other. This flexibility will be needed in some situations, for example if you want to load one set of models on one thread with optimization and another set of models on a second thread without optimization. A global approach is much less flexible, and the per-thread approach doesn't force any changes on existing usage. I just reverted my changes to PtSymbolBlock, as I think flexibility is the way to go.
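The mixed per-thread usage described above can be sketched as follows. This is a minimal illustration, not DJL's actual implementation: `JniUtils` here is a hypothetical stub that merely records the flag each thread sets, standing in for the real JNI binding, to show that two loader threads can choose opposite optimization settings independently.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class PerThreadOptimization {
    // Hypothetical stub standing in for the real JNI binding; it only
    // records the flag per calling thread, to illustrate that the
    // setting is scoped to the thread rather than being global.
    static class JniUtils {
        static final Map<Long, Boolean> flagByThread = new ConcurrentHashMap<>();

        static void setGraphExecutorOptimize(boolean enabled) {
            flagByThread.put(Thread.currentThread().getId(), enabled);
        }
    }

    // One loader thread keeps optimization on, the other turns it off.
    public static Map<Long, Boolean> loadWithMixedSettings() throws InterruptedException {
        Thread optimized = new Thread(() -> JniUtils.setGraphExecutorOptimize(true));
        Thread unoptimized = new Thread(() -> JniUtils.setGraphExecutorOptimize(false));
        optimized.start();
        unoptimized.start();
        optimized.join();
        unoptimized.join();
        return JniUtils.flagByThread;
    }

    public static void main(String[] args) throws InterruptedException {
        // Both settings coexist, one per thread.
        System.out.println(loadWithMixedSettings());
    }
}
```

Because the flag lives in per-thread state, neither thread's choice disturbs the other, which is the flexibility argued for above.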
Change-Id: I59b7ef2e2b24543d34a9c15e73add232ef55afc6
Codecov Report
@@            Coverage Diff             @@
##            master     #904     +/-  ##
============================================
- Coverage      70.34%   70.32%   -0.02%
  Complexity      5085     5085
============================================
  Files            501      501
  Lines          22432    22437       +5
  Branches        2332     2335       +3
============================================
  Hits           15779    15779
- Misses          5412     5417       +5
  Partials        1241     1241
Continue to review full report at Codecov.
Description
This PR adds support for torch::jit::setGraphExecutorOptimize, which allows the user to avoid the model "warmup" period during which TorchScript optimizes the model on the GPU.
Users can disable the torchscript optimization with the following code:
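The original snippet does not survive in this excerpt. A sketch of the intended call, assuming the method is exposed through a DJL `JniUtils`-style utility class (replaced here by a minimal stub so the example is self-contained; the real call would forward to torch::jit::setGraphExecutorOptimize through JNI):

```java
public class DisableOptimization {
    // Minimal stub standing in for the JNI utility class.
    static class JniUtils {
        // TorchScript enables graph-executor optimization by default.
        static boolean graphExecutorOptimize = true;

        static void setGraphExecutorOptimize(boolean enabled) {
            graphExecutorOptimize = enabled;
        }
    }

    // Call this on each thread that should skip the warmup optimization.
    public static boolean disable() {
        JniUtils.setGraphExecutorOptimize(false);
        return JniUtils.graphExecutorOptimize;
    }

    public static void main(String[] args) {
        System.out.println(disable()); // prints false
    }
}
```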
Since this feature is enabled by default in TorchScript, optimization is disabled only when the method above is called. Because JNI maintains a separate environment per thread, the method must be called on each thread that uses a model for which optimization should be disabled.
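The per-thread requirement can be demonstrated with a worker pool. This is a sketch under the same assumption as above: `JniUtils` is a hypothetical stub recording which threads applied the setting, mirroring the fact that the flag must be set on every inference thread, not once globally.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ThreadedConfiguration {
    // Hypothetical stub: records which threads have applied the setting.
    static class JniUtils {
        static final Set<Long> configuredThreads = ConcurrentHashMap.newKeySet();

        static void setGraphExecutorOptimize(boolean enabled) {
            configuredThreads.add(Thread.currentThread().getId());
        }
    }

    // Runs one task per pool thread; each task disables optimization for
    // its own thread before doing inference. Returns how many distinct
    // threads ended up configured.
    public static int configureWorkers(int nThreads) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        CountDownLatch ready = new CountDownLatch(nThreads);
        CountDownLatch go = new CountDownLatch(1);
        for (int i = 0; i < nThreads; i++) {
            pool.submit(() -> {
                ready.countDown();
                try {
                    go.await(); // hold until all tasks occupy distinct threads
                } catch (InterruptedException ignored) {
                    Thread.currentThread().interrupt();
                }
                // Each worker must make the call itself.
                JniUtils.setGraphExecutorOptimize(false);
            });
        }
        ready.await();
        go.countDown();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return JniUtils.configuredThreads.size();
    }

    public static void main(String[] args) throws InterruptedException {
        // All four worker threads configure themselves independently.
        System.out.println(configureWorkers(4)); // prints 4
    }
}
```

The latches only exist to make the demonstration deterministic: they ensure all tasks are pinned to distinct pool threads before any of them sets the flag.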
This change is backwards compatible and does not alter the usage of any existing code.