fix memory-related issues to enable ASAN tests #14223
Can you make them blocking as part of this PR?
@mxnet-label-bot add [Memory, pr-awaiting-review]
There still seems to be some problem with shutdown order: http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Fcentos-gpu/detail/PR-14223/3/pipeline#step-102-log-1650
@szha After digging into the CI crash, two more problems are addressed:
@marcoabreu ASAN tests are blocking in CI now.
Why can't I trigger the full CI process?
@arcadiaphy there seems to be some problem with the CI right now.
@szha Everything seems OK now; the only problem is that I have changed code in the mshadow and dmlc-core submodules. @marcoabreu The ASAN log looks clean too.
@arcadiaphy thanks! Feel free to PR those changes to the respective repos. Once merged, you can change the submodules to point to the new commits there.
@szha The submodule changes are merged, and the submodules now point to the new commits.
Thanks for the fix! I noticed that the patch is made to the mxnet-stable branch in dmlc-core. I don't think that is a good sign: we do not want to diverge from dmlc-core master. @szha what do you think?
I agree that the mxnet-stable branch should be merged back to master ASAP. @hcho3 informed me that he's taking a look now. For now, I think that effort can be handled separately from this PR.
* fix heap overflow
* fix memory leak of optimizer and executor
* uncomment memory pool free
* run cleanup in engine shutdown phase
* make asan tests blocking
* fix abort in mxnet shutdown, use forked submodules temporarily for tests
* trigger CI
* change submodule mshadow
* change submodule dmlc-core
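The "run cleanup in engine shutdown phase" item concerns shutdown ordering: resources should be released while the engine that owns them is still alive. The following is only a minimal sketch of that general idea in Python, not MXNet's actual implementation; the `Engine` class, `register_cleanup`, `shutdown`, and `demo` are all hypothetical names introduced for illustration.

```python
class Engine:
    """Toy engine holding cleanup callbacks (hypothetical, not MXNet's API)."""

    def __init__(self):
        self._cleanups = []
        self._stopped = False

    def register_cleanup(self, fn):
        # Refuse new resources once shutdown has completed.
        if self._stopped:
            raise RuntimeError("engine already stopped")
        self._cleanups.append(fn)

    def shutdown(self):
        # Calling shutdown twice is harmless.
        if self._stopped:
            return
        # Run callbacks in reverse registration order, so resources
        # created later (which may depend on earlier ones) go first.
        while self._cleanups:
            self._cleanups.pop()()
        self._stopped = True


def demo():
    log = []
    engine = Engine()
    engine.register_cleanup(lambda: log.append("free memory pool"))
    engine.register_cleanup(lambda: log.append("release executor"))
    engine.shutdown()
    engine.shutdown()  # second call does nothing
    log.append("engine stopped")
    return log
```

Running every cleanup inside `shutdown` means each resource is released at a well-defined point before the engine is considered stopped, instead of relying on whatever order destructors happen to run at process exit.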
Description
Continuing the discussion in #14176, this PR fixes memory-related issues detected by ASAN:
Currently the ASAN tests are non-blocking; after this PR, the checks are green.