Hello,
I am having trouble using the Triton server with the GPU of a Jetson Xavier AGX; the inference time feels too slow.
As far as I can tell the model is loaded on the GPU, but when I run a few inferences I get an average inference time of about 0.4 s. In comparison, when I load the model into the dusty-nv project https://github.com/dusty-nv/jetson-inference I get a much faster inference time of around 0.06 s. The model used is imageNet, the same as in the dusty-nv project.
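For reference, the jetson-inference baseline number comes from a loop roughly like the one below (network name, image path and iteration count are just placeholders for my setup, assuming the newer loadImage/cudaImage API):

```python
import time
import jetson.inference
import jetson.utils

# load the classification network and a test image (placeholders)
net = jetson.inference.imageNet("googlenet")
img = jetson.utils.loadImage("test.jpg")

# one warm-up call, then average over repeated classifications
net.Classify(img)
start = time.time()
for _ in range(100):
    class_idx, confidence = net.Classify(img)
print(f"average inference time: {(time.time() - start) / 100:.3f}s")
```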
When I add the ONNX-to-TensorRT execution accelerator as described in https://github.com/triton-inference-server/onnxruntime_backend, the inference time gets much worse, at about 2 s. The logs also look more like the CPU is being used, but I am struggling to understand what is going on since I am a novice.
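The accelerator was enabled with roughly the stanza from the onnxruntime_backend README in my config.pbtxt (the rest of the config is omitted here):

```
optimization { execution_accelerators {
  gpu_execution_accelerator : [ { name : "tensorrt" } ]
}}
```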
Versions on the device:
- CUDA: 10.2.89-1
- cuDNN: 8.0.0.180-1+cuda10.2
- TensorRT: 7.1.3.0-1+cuda10.2
- nvidia-jetpack: 4.5.1-b17
Installation guide used: https://github.com/triton-inference-server/server/releases/tag/v2.11.0
Logs from startup (quite long, since I'm not sure which part is relevant):
Model Configuration File:
I measured the Triton server timings with perf_analyzer and also cross-checked them with the gRPC client.
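This is roughly the gRPC client loop I used for the cross-check (model name, input/output names and shape are placeholders for my setup):

```python
import time
import numpy as np
import tritonclient.grpc as grpcclient

client = grpcclient.InferenceServerClient(url="localhost:8001")

# build one request with random data of the assumed input shape
data = np.random.rand(1, 3, 224, 224).astype(np.float32)
infer_input = grpcclient.InferInput("input_0", list(data.shape), "FP32")
infer_input.set_data_from_numpy(data)
output = grpcclient.InferRequestedOutput("output_0")

# one warm-up request, then average the latency over repeated calls
client.infer("imagenet_onnx", inputs=[infer_input], outputs=[output])
start = time.time()
for _ in range(50):
    client.infer("imagenet_onnx", inputs=[infer_input], outputs=[output])
print(f"average latency: {(time.time() - start) / 50:.3f}s")
```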
The JetPack version etc. is quite old because this is an older project based on https://github.com/dusty-nv/jetson-inference, and I would like to migrate it over to a Triton server solution.
Thanks in advance.