Ensemble model with shared memory #5418
-
Is your feature request related to a problem? Please describe.
Describe the solution you'd like
Describe alternatives you've considered
Additional context
Thanks!
Replies: 1 comment 3 replies
-
If you're asking whether an ensemble shares memory between models, the ensemble scheduler passes pointers to the tensors between models to avoid copies. However, there is a copy at the end of the ensemble for the final output.

The backends may also make copies during execution, unrelated to the ensemble. This can happen because of the dynamic batcher (if enabled, it copies while gathering and scattering inputs and outputs), the pinned memory manager (if used to improve performance), and models moving tensors between host and device memory. There are also other backend-specific situations; for example, moving data between models served by different backends can introduce copies.

CC: @GuanLuo @Tabrizian
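For context, the pointer-passing described above is driven by the ensemble's model configuration: each `output_map`/`input_map` pair names an intermediate tensor that the scheduler hands from one step to the next without copying. A minimal sketch of such a `config.pbtxt` follows; the model names (`preprocess`, `classifier`), tensor names, and shapes are hypothetical, not taken from this discussion.

```
name: "my_ensemble"
platform: "ensemble"
max_batch_size: 8
input [
  { name: "RAW_INPUT", data_type: TYPE_UINT8, dims: [ -1 ] }
]
output [
  { name: "SCORES", data_type: TYPE_FP32, dims: [ 1000 ] }
]
ensemble_scheduling {
  step [
    {
      model_name: "preprocess"
      model_version: -1
      input_map { key: "INPUT" value: "RAW_INPUT" }
      # "preprocessed_image" is an intermediate tensor: the scheduler
      # passes a pointer to it to the next step rather than copying it.
      output_map { key: "OUTPUT" value: "preprocessed_image" }
    },
    {
      model_name: "classifier"
      model_version: -1
      input_map { key: "INPUT" value: "preprocessed_image" }
      output_map { key: "OUTPUT" value: "SCORES" }
    }
  ]
}
```

Under this sketch, only `SCORES` is copied out at the end of the ensemble; whether additional copies occur inside each step depends on the backends involved, as noted above.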