[CPU] Add interface to release compiled model internal memory #26390
Conversation
For future releases, we need to add behavior tests for this feature:
- release_memory throws when called while inferences are running
- an inference request can execute without issues after release_memory was called
for (auto&& graph : m_graphs) {
    GraphGuard::Lock graph_lock{graph};
    auto ctx = graph_lock._graph.getGraphContext();
    ctx->getNetworkMemoryControl()->releaseMemory();
can we release memory only for graphs which don't have running inference requests?
In that case we would have to add a delayed release for the others. I'm not sure we really want to have nonuniform memory release across streams.
No blocking comments from my side
/**
 * @brief Release intermediate memory.
 *
 * This methods forces the Compiled model to release memory allocated for intermediate structures, e.g. caches,
methods -> method
@@ -77,6 +77,14 @@ bool Multinomial::needPrepareParams() const {
    return true;
}

void Multinomial::createPrimitive() {
    if (!m_const_inputs[NUM_SAMPLES_PORT]) {
        CPU_NODE_ASSERT(isDynamicNode(), "is static while the samples input is a variable");
Can't we just move the m_samples_count computation into execute()? I suppose it shouldn't introduce any overhead.
Probably yes. Actually, I'm going to prepare a list of leftovers for this PR, which will then be put into the backlog and planned accordingly, so I'll include this one too.
// !! Fallback to individual memory allocation !!
// if you like to check infer without reuse just call this function without arguments.
Just wondering if the same debug functionality is still available
Honestly, it isn't, since to have such an option we need to replace the memory manager used for dynamic tensor allocation. However, such a change is rather trivial.
This test has already been introduced, please refer to 297e65c
8c9d4be
Details: Port #26390 to master
Tickets: CVS-145873
Details:
This PR introduces an ov::CompiledModel level interface that allows releasing the memory allocated by the compiled model. In this PR the interface is only supported by the CPU plugin.
Tickets: