This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Static memory allocation for cached_op #10817

Merged 28 commits on May 31, 2018

piiswrong
Contributor

Description

(Brief description on what this PR is about)

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
      • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
      • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
      • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
      • For user-facing API changes, the API doc string has been updated.
      • For new C++ functions in header files, their functionalities and arguments are documented.
      • For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on the test set, and a reference to the original paper if applicable
      • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • Feature1, tests, (and when applicable, API doc)
  • Feature2, tests, (and when applicable, API doc)

Comments

  • If this change is backward incompatible, explain why it must be made.
  • Interesting edge cases to note here

@@ -145,6 +145,11 @@ class OpStatePtr {
   void reset() {
     ptr_.reset();
   }
   /* \brief checks whether the managed object is managed only by the current
             OpStatePtr instance */
   bool unique() {

where is this used? why expose this detail?

@piiswrong piiswrong force-pushed the exec2 branch 2 times, most recently from b223a77 to 78d178a Compare May 15, 2018 00:33
@piiswrong piiswrong requested a review from szha as a code owner May 17, 2018 20:42
@piiswrong piiswrong force-pushed the exec2 branch 2 times, most recently from b6c263b to b1bf748 Compare May 23, 2018 20:34
@piiswrong piiswrong changed the title [WIP] Do Not Merge. Static memory allocation for cached_op Static memory allocation for cached_op May 31, 2018
@piiswrong piiswrong merged commit 2dbd143 into apache:master May 31, 2018
marcoabreu added a commit that referenced this pull request May 31, 2018
rahul003 pushed a commit to rahul003/mxnet that referenced this pull request Jun 1, 2018
rahul003 pushed a commit to rahul003/mxnet that referenced this pull request Jun 1, 2018
rahul003 pushed a commit to rahul003/mxnet that referenced this pull request Jun 4, 2018
rahul003 pushed a commit to rahul003/mxnet that referenced this pull request Jun 4, 2018
rahul003 pushed a commit to rahul003/mxnet that referenced this pull request Jun 4, 2018

ThomasDelteil commented Jun 12, 2018

@piiswrong I have mixed results with this PR. Did you benchmark your code?
I am running this notebook: https://github.com/ilkarman/DeepLearningFrameworks/blob/master/notebooks/Gluon_MultiGPU.ipynb

It runs DenseNet121 on a multi-label classification task across multiple GPUs. Batches have a fixed shape of (N, 3, 224, 224).

Here are my results:

  • Baseline:
    static_shape=False, static_alloc=False -> 8min28

  • Supposedly the most optimized setting, yet the slowest by far:
    static_shape=True, static_alloc=True -> 9min32

  • Only static_alloc; slightly faster, but within normal variation:
    static_shape=False, static_alloc=True -> 8min25

  • Only static_shape; slightly faster, and outside normal variation:
    static_shape=True, static_alloc=False -> 8min23
    Based on the docs this combination shouldn't be possible, yet no warning or error was thrown. The docs for static_shape say:

            Must also set static_alloc to True. Change of input shapes is still allowed but slower.
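
To make the comparison above easier to read, here is a minimal Python sketch that converts the four reported wall times to seconds and expresses each as a percent change against the baseline. It also includes a guard for the flag combination that the quoted docs rule out. The names `REPORTED`, `to_seconds`, `relative_change`, and `check_flags` are hypothetical helpers made up for this illustration; they are not part of MXNet, and the guard only shows the kind of check one might expect, not what the library actually does.

```python
# Wall times copied from the results above, keyed by
# (static_shape, static_alloc) and stored as (minutes, seconds).
REPORTED = {
    (False, False): (8, 28),  # baseline
    (True, True): (9, 32),
    (False, True): (8, 25),
    (True, False): (8, 23),
}


def to_seconds(minutes, seconds):
    """Convert a (minutes, seconds) pair to total seconds."""
    return 60 * minutes + seconds


def relative_change(flags, baseline=(False, False)):
    """Percent change in wall time versus the baseline (positive = slower)."""
    base = to_seconds(*REPORTED[baseline])
    run = to_seconds(*REPORTED[flags])
    return 100.0 * (run - base) / base


def check_flags(static_alloc, static_shape):
    """Hypothetical guard: reject static_shape without static_alloc.

    This mirrors the constraint stated in the quoted docs; it is an
    assumption about a possible check, not MXNet's implementation.
    """
    if static_shape and not static_alloc:
        raise ValueError("static_shape=True requires static_alloc=True")


if __name__ == "__main__":
    for (static_shape, static_alloc), _ in REPORTED.items():
        change = relative_change((static_shape, static_alloc))
        print(
            f"static_shape={static_shape}, static_alloc={static_alloc}: "
            f"{change:+.1f}% vs baseline"
        )
```

On these numbers, the configuration with both flags enabled comes out roughly 12.6% slower than the baseline, while the two single-flag runs are each within about 1% of it. A single run per configuration is not enough to separate such small differences from noise, so repeated runs would be needed before drawing conclusions.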

@marcoabreu

marcoabreu added a commit to marcoabreu/incubator-mxnet that referenced this pull request Jun 15, 2018
tqchen added a commit that referenced this pull request Jun 15, 2018
piiswrong added a commit to piiswrong/mxnet that referenced this pull request Jun 15, 2018
@tqchen tqchen mentioned this pull request Jun 15, 2018
7 tasks
zheng-da pushed a commit to zheng-da/incubator-mxnet that referenced this pull request Jun 28, 2018
zheng-da pushed a commit to zheng-da/incubator-mxnet that referenced this pull request Jun 28, 2018
XinYao1994 pushed a commit to XinYao1994/incubator-mxnet that referenced this pull request Aug 29, 2018