Fix for duplicate subgraph inputs/outputs #16131

samskalicky · 2019-09-09T19:23:30Z

Description

The Subgraph API can create duplicate tensors for both the inputs to a subgraph and the outputs from a subgraph. This happens when multiple nodes in a subgraph consume the same input, or when multiple nodes outside the subgraph consume a single subgraph output. The resulting tensor duplication causes OOM due to excessive memory usage.

Fix for duplicate inputs

Currently, inputs to a subgraph are duplicated when multiple nodes in the subgraph consume the same input (see diagram below). In this PR we change the behavior of the CutGraphInputs function to store a NodeEntry for each input to the subgraph in a map, and re-use that entry for each node in the subgraph that consumes that input (see diagram below). This prevents input nodes from being duplicated, and results in fewer copies of tensors at runtime and a much smaller memory footprint.
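The map-based reuse described above can be sketched in a few lines. This is a simplified Python model, not the C++ code in build_subgraph.cc: the `Entry` tuple is a stand-in for nnvm::NodeEntry, and the function name `cut_graph_inputs` and its naming scheme for input variables are hypothetical.

```python
from typing import Dict, List, Tuple

# Stand-in for nnvm::NodeEntry: (producer node name, output index).
Entry = Tuple[str, int]


def cut_graph_inputs(input_entries: List[Entry]) -> Tuple[List[str], List[Entry]]:
    """Map each consumed entry to one shared input variable.

    Returns the variable name each consumer should read from, plus the
    list of unique original entries (one per created input variable).
    """
    seen: Dict[Entry, str] = {}
    rewired: List[str] = []
    unique_inputs: List[Entry] = []
    for entry in input_entries:
        if entry not in seen:
            # First consumer of this entry: create a single input variable.
            seen[entry] = f"{entry[0]}{len(seen)}"
            unique_inputs.append(entry)
        # Later consumers reuse the variable instead of creating a copy.
        rewired.append(seen[entry])
    return rewired, unique_inputs


# sin0 and sin1 both consume output 0 of the top-level "data" node.
rewired, unique_inputs = cut_graph_inputs([("data", 0), ("data", 0)])
```

Under these assumptions both consumers end up reading from the same variable, so the subgraph has one input rather than two.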

Fix for duplicate outputs

Similar to the inputs, when an output from a subgraph is consumed by multiple nodes outside the subgraph, the output is duplicated for each consumer (see diagram below). In this PR we change the behavior of CreateSubgraphNode to collapse duplicate output dependencies to a single output by checking whether neighboring output entries are the same. If so, we do not add another output, preventing duplicates.

Then in the ConnectSubgraphOutputs function we create dependencies from each node outside the subgraph that consumes a subgraph output. If neighboring nodes share the same subgraph output, they point to the same subgraph output.
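The neighbor comparison and the reconnection step can be modeled roughly as follows. This is a hedged Python sketch, not the actual C++ implementation: `dedup_outputs` is a hypothetical helper, entries are plain tuples, and it collapses only adjacent duplicates, which matches the "neighboring" wording above. Whether the real code orders entries first is not stated here.

```python
from typing import List, Tuple

# Stand-in for nnvm::NodeEntry: (producer node name, output index).
Entry = Tuple[str, int]


def dedup_outputs(output_entries: List[Entry]) -> Tuple[List[Entry], List[int]]:
    """Collapse runs of identical neighboring output entries.

    Returns the unique outputs and, for each original consumer, the index
    of the subgraph output it should be connected to.
    """
    unique: List[Entry] = []
    index_map: List[int] = []
    for entry in output_entries:
        # Only add a new output when it differs from the previous one.
        if not unique or unique[-1] != entry:
            unique.append(entry)
        index_map.append(len(unique) - 1)
    return unique, index_map


# cos0 and cos1 both consume output 0 of _plus0.
unique, index_map = dedup_outputs([("_plus0", 0), ("_plus0", 0)])
```

In this model `index_map` plays the role of the reconnection step: consumers that share an entry receive the same output index.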

Example

Added an example graph with duplicate inputs/outputs to test_subgraph_op.py. Here's the original graph partitioned before this PR. Notice that it has two input nodes, data0 and data1, that both come from the same top-level data. Similarly, notice that it has two outputs coming from the same node, _plus0.

{
  "nodes": [
    {
      "op": "null", 
      "name": "data", 
      "attrs": {"__shape__": "(1,)"}, 
      "inputs": []
    }, 
    {
      "op": "_CachedOp", 
      "name": "_CachedOp0", 
      "inputs": [[0, 0, 0], [0, 0, 0]], 
      "subgraphs": [
        {
          "nodes": [
            {
              "op": "null", 
              "name": "data0", 
              "inputs": []
            }, 
            {
              "op": "sin", 
              "name": "sin0", 
              "inputs": [[0, 0, 0]]
            }, 
            {
              "op": "null", 
              "name": "data1", 
              "inputs": []
            }, 
            {
              "op": "sin", 
              "name": "sin1", 
              "inputs": [[2, 0, 0]]
            }, 
            {
              "op": "elemwise_add", 
              "name": "_plus0", 
              "inputs": [[1, 0, 0], [3, 0, 0]]
            }
          ], 
          "arg_nodes": [0, 2], 
          "node_row_ptr": [0, 1, 2, 3, 4, 5], 
          "heads": [[4, 0, 0], [4, 0, 0]]
        }
      ]
    }, 
    {
      "op": "cos", 
      "name": "cos0", 
      "inputs": [[1, 0, 0]]
    }, 
    {
      "op": "cos", 
      "name": "cos1", 
      "inputs": [[1, 1, 0]]
    }, 
    {
      "op": "elemwise_sub", 
      "name": "_minus0", 
      "inputs": [[2, 0, 0], [3, 0, 0]]
    }
  ], 
  "arg_nodes": [0], 
  "node_row_ptr": [0, 1, 3, 4, 5, 6], 
  "heads": [[4, 0, 0]], 
  "attrs": {"mxnet_version": ["int", 10700]}
}

Here's the same graph partitioned after this PR. Notice that now there is only one input node, data0, and both sin ops use it. Also notice that now there is only one output from the subgraph, and both cos ops use it.

{
  "nodes": [
    {
      "op": "null", 
      "name": "data", 
      "attrs": {
        "__profiler_scope__": "<unk>:", 
        "__shape__": "(1,)"
      }, 
      "inputs": []
    }, 
    {
      "op": "_CachedOp", 
      "name": "_CachedOp0", 
      "inputs": [[0, 0, 0]], 
      "subgraphs": [
        {
          "nodes": [
            {
              "op": "null", 
              "name": "data0", 
              "inputs": []
            }, 
            {
              "op": "sin", 
              "name": "sin0", 
              "attrs": {"__profiler_scope__": "<unk>:"}, 
              "inputs": [[0, 0, 0]]
            }, 
            {
              "op": "sin", 
              "name": "sin1", 
              "attrs": {"__profiler_scope__": "<unk>:"}, 
              "inputs": [[0, 0, 0]]
            }, 
            {
              "op": "elemwise_add", 
              "name": "_plus0", 
              "attrs": {"__profiler_scope__": "<unk>:"}, 
              "inputs": [[1, 0, 0], [2, 0, 0]]
            }
          ], 
          "arg_nodes": [0], 
          "node_row_ptr": [0, 1, 2, 3, 4], 
          "heads": [[3, 0, 0]]
        }
      ]
    }, 
    {
      "op": "cos", 
      "name": "cos0", 
      "attrs": {"__profiler_scope__": "<unk>:"}, 
      "inputs": [[1, 0, 0]]
    }, 
    {
      "op": "cos", 
      "name": "cos1", 
      "attrs": {"__profiler_scope__": "<unk>:"}, 
      "inputs": [[1, 0, 0]]
    }, 
    {
      "op": "elemwise_sub", 
      "name": "_minus0", 
      "attrs": {"__profiler_scope__": "<unk>:"}, 
      "inputs": [[2, 0, 0], [3, 0, 0]]
    }
  ], 
  "arg_nodes": [0], 
  "node_row_ptr": [0, 1, 2, 3, 4, 5], 
  "heads": [[4, 0, 0]], 
  "attrs": {"mxnet_version": ["int", 20000]}
}
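One way to check a partitioned graph for the duplication shown above is to count repeated entries in the subgraph node's "inputs" and in the subgraph's "heads". The sketch below is an illustration only: `duplicate_report` is a hypothetical helper, and the two JSON strings are trimmed copies of the graphs above that keep only the fields the check reads.

```python
import json


def duplicate_report(graph_json):
    """Count repeated input and output entries for each node that owns a subgraph."""
    report = {}
    for node in json.loads(graph_json)["nodes"]:
        for sub in node.get("subgraphs", []):
            inputs = [tuple(e) for e in node["inputs"]]
            heads = [tuple(e) for e in sub["heads"]]
            report[node["name"]] = {
                "duplicate_inputs": len(inputs) - len(set(inputs)),
                "duplicate_outputs": len(heads) - len(set(heads)),
            }
    return report


# Trimmed from the "before" graph: two identical inputs, two identical heads.
before = json.dumps({"nodes": [
    {"name": "data", "inputs": []},
    {"name": "_CachedOp0", "inputs": [[0, 0, 0], [0, 0, 0]],
     "subgraphs": [{"heads": [[4, 0, 0], [4, 0, 0]]}]},
]})

# Trimmed from the "after" graph: one input, one head.
after = json.dumps({"nodes": [
    {"name": "data", "inputs": []},
    {"name": "_CachedOp0", "inputs": [[0, 0, 0]],
     "subgraphs": [{"heads": [[3, 0, 0]]}]},
]})

report_before = duplicate_report(before)
report_after = duplicate_report(after)
```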

samskalicky · 2019-09-10T18:57:36Z

@zheng-da @reminisce could you please review?

samskalicky · 2019-09-10T18:57:59Z

@mxnet-label-bot add [pr-awaiting-review]

samskalicky · 2019-09-11T00:17:33Z

@ZhennanQin for review. Any suggestions on how to better fix the problem?

samskalicky · 2019-09-17T06:47:10Z

@ZhennanQin I've made the changes. Do they still look good? If so, we can start going through and updating all other subgraph properties.

ZhennanQin · 2019-09-17T06:51:02Z

LGTM. Please coordinate all backends with new ConnectSubgraphOutputs().

DickJC123 · 2019-10-30T22:32:16Z

I've noticed that MXNet does not preserve what I'm calling 'output independence' during its memory planning (issue forthcoming). While I'm talking here about the model as a whole, I've noticed many graph passes are applied to subgraphs as well, so perhaps the issue is relevant to this PR. Any update on status here?

samskalicky · 2019-10-31T01:21:00Z

Hi @DickJC123, sorry I haven't been able to prioritize finishing off this PR. IMO the problem is similar and endemic in MXNet, but not related to this PR. Please do file an issue, at the very least for posterity.

samskalicky · 2020-02-15T22:29:37Z

@mseth10 it looks like we somehow made an aux an arg; that's why the MKLDNN tests are failing:

in_args_map	arg_names	arg_names
data	data	data
conv1_weight	conv1_weight	conv1_weight
bn1_gamma	bn1_gamma	bn1_gamma
bn1_beta	bn1_beta	bn1_beta
conv2_weight	conv2_weight	conv2_weight
bn2_gamma	bn2_gamma	bn2_gamma
bn2_beta	bn2_beta	bn2_beta
		bn2_moving_mean

So this is why the error says

MXNetError: Check failed: arg_names.size() == in_args_map.size() (8 vs. 7) :

There's one extra name in arg_names: bn2_moving_mean.
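The mismatch behind that check can be reproduced with plain containers. This is only an illustration using the names from the table above: the list and dict are simplified stand-ins for the real arg_names and in_args_map structures, the placeholder values are arbitrary, and `unexpected_args` is a hypothetical helper.

```python
# Names taken from the table above; the containers are simplified stand-ins.
arg_names = [
    "data", "conv1_weight", "bn1_gamma", "bn1_beta",
    "conv2_weight", "bn2_gamma", "bn2_beta", "bn2_moving_mean",
]
# in_args_map lacks bn2_moving_mean, since it is an aux state, not an argument.
in_args_map = {name: None for name in arg_names[:-1]}  # placeholder values


def unexpected_args(names, args_map):
    """Return the names listed as arguments that have no entry in the map."""
    return [name for name in names if name not in args_map]


extra = unexpected_args(arg_names, in_args_map)
sizes = (len(arg_names), len(in_args_map))  # the "8 vs. 7" from the error
```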

HahTK · 2020-09-04T16:55:06Z

src/operator/subgraph/build_subgraph.cc

@@ -537,10 +537,11 @@ void FindOutputEntries(nnvm::Graph* g,
 */
 void CutGraphInputs(const std::vector<nnvm::NodeEntry*> &input_entries,
                    std::vector<nnvm::NodeEntry> *orig_entries,
+                    std::vector<nnvm::NodeEntry> *unique_inputs,


There is an alternative implementation that does not need to track the actual unique inputs separately.
A counter would allow us to correctly modify orig_entries into unique_entries.
This keeps the function signature unchanged and minimizes changes elsewhere.

It is already working in a private build. I add it here for consideration, if desired.

samskalicky · 2020-09-04

Thanks for the suggestion @HahTK, but we can't modify orig_entries, since we need it unmodified for ReattachGraphInputs:
https://github.com/apache/incubator-mxnet/blob/e249d71ad4621afcba7f0f3af77095a3f9a4bc83/src/operator/subgraph/build_subgraph.cc#L578
This is used to reject a subgraph from the reviewSubgraph API:
https://github.com/apache/incubator-mxnet/blob/e249d71ad4621afcba7f0f3af77095a3f9a4bc83/src/operator/subgraph/build_subgraph.cc#L657
The example you provided doesn't work in the latest version of MXNet, where you can reject a subgraph by returning nullptr from CreateSubgraphNode here: https://github.com/apache/incubator-mxnet/blob/e249d71ad4621afcba7f0f3af77095a3f9a4bc83/src/operator/subgraph/build_subgraph.cc#L630-L633

HahTK · 2020-09-04

The proposed change was tested on 1.5.1, and it seems that newer versions of MXNet have new features that need orig_entries to be preserved. So, acknowledged!

waytrue17

Looks good to me, thanks!

mseth10

LGTM. Thanks for the fix.

* fix for duplicate inputs
* fixed error
* fixed whitespace
* Remove duplicate outputs from subgraphs
* changed subgraph to create map of outputs
* added static_cast
* changed map<int,v> to vector
* sanity fix
* sanity2
* updated backends with new connectSubgraphOutputs API
* fixed map creation logic
* added updates for reattach function
* creating node only if it is not an input to subgraph
* creating object based on var_name only
* updating ConnectSubgraphOutputs for mkldnn_elemwisemul_post_quantize_property.h
* add debug prints to debug error in CI
* remove prints
* added prints to debug in the CI
* revert changes
* reverted changes
* deduplicaated inputs to subgraph
* deduplicated subgraph inputs
* simplified inputs
* cleaned up
* deduplicate outputs
* cleand up
* added deduplication to subgraph node outputs
* fixed prev compare
* fixed issue with inputs and added test
* fixd whitespace, removed prints

Co-authored-by: Sam Skalicky <[email protected]>
Co-authored-by: Ubuntu <[email protected]>
Co-authored-by: Ubuntu <[email protected]>
Co-authored-by: Manu Seth <[email protected]>
Co-authored-by: Ubuntu <[email protected]>

samskalicky · 2020-09-11T01:05:18Z

For the record, @mseth10, I finally found the issue that caused the "making an aux an arg" problem described in my 2020-02-15 comment above: #19112 (comment)

…9112)

* Fix for duplicate subgraph inputs/outputs (#16131)
* fix for duplicate inputs
* fixed error
* fixed whitespace
* Remove duplicate outputs from subgraphs
* changed subgraph to create map of outputs
* added static_cast
* changed map<int,v> to vector
* sanity fix
* sanity2
* updated backends with new connectSubgraphOutputs API
* fixed map creation logic
* added updates for reattach function
* creating node only if it is not an input to subgraph
* creating object based on var_name only
* updating ConnectSubgraphOutputs for mkldnn_elemwisemul_post_quantize_property.h
* add debug prints to debug error in CI
* remove prints
* added prints to debug in the CI
* revert changes
* reverted changes
* deduplicaated inputs to subgraph
* deduplicated subgraph inputs
* simplified inputs
* cleaned up
* deduplicate outputs
* cleand up
* added deduplication to subgraph node outputs
* fixed prev compare
* fixed issue with inputs and added test
* fixd whitespace, removed prints

Co-authored-by: Sam Skalicky <[email protected]>
Co-authored-by: Ubuntu <[email protected]>
Co-authored-by: Ubuntu <[email protected]>
Co-authored-by: Manu Seth <[email protected]>
Co-authored-by: Ubuntu <[email protected]>

* added flag to enable dedupe ondemand
* fixed dedup logic
* improved dedup logic
* fixed sanity
* propogated option
* check option in custom subgraph prop
* fixed options map
* fixed missing
* added dedup to subgraph_prop base class for testing
* added test for dedup
* added comments

Co-authored-by: Sam Skalicky <[email protected]>
Co-authored-by: Ubuntu <[email protected]>
Co-authored-by: Ubuntu <[email protected]>
Co-authored-by: Manu Seth <[email protected]>
Co-authored-by: Ubuntu <[email protected]>

marcoabreu added the pr-awaiting-review PR is waiting for code review label Sep 10, 2019

samskalicky changed the title ~~[WIP] fix for duplicate subgraph inputs/outputs~~ Fix for duplicate subgraph inputs/outputs Sep 11, 2019

ZhennanQin reviewed Sep 12, 2019 (src/operator/subgraph/subgraph_property.h)

ZhennanQin reviewed Sep 17, 2019 (src/operator/subgraph/build_subgraph.cc)

ZhennanQin approved these changes Sep 17, 2019

samskalicky changed the title ~~Fix for duplicate subgraph inputs/outputs~~ Fix for duplicate subgraph inputs/outputs Sep 18, 2019

samskalicky changed the title ~~Fix for duplicate subgraph inputs/outputs~~ [WIP] Fix for duplicate subgraph inputs/outputs Sep 18, 2019

samskalicky changed the title ~~[WIP] Fix for duplicate subgraph inputs/outputs~~ Fix for duplicate subgraph inputs/outputs Sep 18, 2019

DickJC123 mentioned this pull request Oct 31, 2019

Memory planner doesn't respect 'output independence'. More optimizations possible. #16685

Open

Sam Skalicky and others added 12 commits February 13, 2020 19:46

fix for duplicate inputs

b010ffa

fixed error

a6ad15a

fixed whitespace

00499c9

Remove duplicate outputs from subgraphs

87b665e

changed subgraph to create map of outputs

8045bab

added static_cast

82df19d

changed map<int,v> to vector

77f3f31

sanity fix

a4b42db

sanity2

c6077dd

updated backends with new connectSubgraphOutputs API

3631bc7

fixed map creation logic

fdd477b

added updates for reattach function

7a210d6

mseth10 force-pushed the subgraph_fix branch from 1e4fdc5 to 7a210d6 Compare February 13, 2020 22:52

creating node only if it is not an input to subgraph

a610f1c

Merge branch 'master' into subgraph_fix

b49b431

HahTK mentioned this pull request Aug 18, 2020

[RFC] v1.8.0 release #18800

Open

szha added this to the v1.8.0 milestone Aug 23, 2020

Ubuntu added 10 commits August 30, 2020 20:06

Merge branch 'master' of https://github.com/apache/incubator-mxnet in…

cd4f4d7

…to subgraph_fix

Merge branch 'master' of https://github.com/apache/incubator-mxnet in…

6a20b0e

…to subgraph_fix

revert changes

324f29b

reverted changes

e0abc55

deduplicaated inputs to subgraph

93a8f35

deduplicated subgraph inputs

d155801

simplified inputs

6deea77

cleaned up

270a8fc

deduplicate outputs

165bbd1

cleand up

5572d4c

samskalicky commented Aug 31, 2020 (src/operator/subgraph/subgraph_property.h)

added deduplication to subgraph node outputs

7672a5f

samskalicky commented Aug 31, 2020 (src/operator/subgraph/build_subgraph.cc)

fixed prev compare

e249d71

HahTK reviewed Sep 4, 2020

HahTK approved these changes Sep 4, 2020

waytrue17 approved these changes Sep 9, 2020

Ubuntu added 2 commits September 10, 2020 01:04

fixed issue with inputs and added test

4a49810

fixd whitespace, removed prints

af9f177

mseth10 approved these changes Sep 10, 2020

samskalicky merged commit 6b01dc2 into apache:master Sep 10, 2020

samskalicky mentioned this pull request Sep 10, 2020

[1.x] Backport Fix for duplicate subgraph inputs/outputs (#16131) #19112

Merged

samskalicky mentioned this pull request Sep 28, 2020

Add dedup flag to master from #19112 #19246

Merged
