Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RELAY] Re-wrote the Graph Partitioner to support multiple outputs #5143

Merged
merged 5 commits into from
Mar 31, 2020

Conversation

manupak
Copy link
Contributor

@manupak manupak commented Mar 24, 2020

This PR aims to include support for multiple outputs in the regions that get outlined as functions for different compiler backends in the existing Partition Graph Pass. Such regions are annotated and bounded by compiler_begin and compiler_end annotation ops.

This is a required step as step 4 in RFC - BYOC. Moreover, this feature is requested in prior discussions such as Improved graph partitioning algorithm

This PR uses the utility functions provided by the AnnotationRegionSet PR.

@manupak
Copy link
Contributor Author

manupak commented Mar 24, 2020

cc : @zhiics @comaniac @tqchen

/*! \brief Nodes in this subgraph. */
std::unordered_set<Expr, ObjectHash, ObjectEqual> nodes;
};
static const Op &compiler_begin_op = Op::Get("annotation.compiler_begin");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keep the original code style: static const Op&

Same for all the other places in the rest of PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Prolly feedback this to lint?

Copy link
Member

@masahi masahi Apr 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are still many places where the style is violated. I recommend running clang-format from your editor. We have our own .clang-format file at the root dir.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@masahi thanks for tip! fixed this in this follow-up PR

*
* Expected Usecase :
* This pass is intended to run as the last pass in a series of passes as follows :
* 1) Annotate Supported Single Ops - annotated each single op with supported backends.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. and 2) are inaccurate, we don't have these two annotations

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the use-case segment. This is not needed now; as things are still in discussion as how this pass would potentially fit in the bigger picture.


if (region->GetOutputs().size() == 1) {
// If there is only a single output; no need to add a tuplegetitem node
return Call(glob_func, param_expr);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to create another call, just return ret

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

int subgraph_id_{0};
std::unordered_set<std::shared_ptr<Subgraph>> subgraphs_;
/*!
* \brief Get the region an expression belongs to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

align "*", same to the other functions below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still not aligned, you want something like:

/*!
 * \brief
 */

Same to the other places as well

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hopefully done!

@@ -678,6 +678,81 @@ def expected():
check_result(mod, {"y": y_data}, (8, 8), np.log(np_add))


def test_multiple_outputs():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test case for something like conv2d + batch_norm?

Copy link
Contributor Author

@manupak manupak Mar 26, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure that I get you; Can you elaborate more as to how multiple outputs in regions are exercised in conv2d + batch_norm?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

batch_norm has 3 outputs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But isnt it returned as a tuple already ? Looks like a single output to me. Unless, you do some procesing afterwards in the region. Is that what you imply?

Copy link
Contributor

@comaniac comaniac Mar 26, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yes. The case in my mind is conv2d->BN->relu. In this case, the region output would be a new tuple of the rest 2 BN outputs and 1 relu output (relu could be any op tho).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the testcase

@manupak manupak force-pushed the partition_graph_multiout branch from 5cc9590 to a4653fd Compare March 26, 2020 12:02
return BaseFunc(nullptr);
}

/*!
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

align

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hopefully done!

return AnnotatedRegion(nullptr);
}

/*!
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

align

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hopefully done!

int subgraph_id_{0};
std::unordered_set<std::shared_ptr<Subgraph>> subgraphs_;
/*!
* \brief Get the region an expression belongs to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still not aligned, you want something like:

/*!
 * \brief
 */

Same to the other places as well


return mod

# print(create_merged_graph())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove the debugging stmt as well

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@manupak manupak force-pushed the partition_graph_multiout branch from 99a02f9 to beda3bc Compare March 27, 2020 14:07
manupak added 5 commits March 31, 2020 16:48
Input : A Relay module that have functions with disjoint annotated regions
        using compiler_begin and compiler_end. There could be multiple outputs.

Output : A Relay module with global functions for such disjoint annotated regions
         with calls inserted at the respective location

Dependencies : AnnotatedRegionSet Utility class.

Methodology :
      1) The AnnotatedRegionSet utility class is able to construct a collection of
         nodes that are bound by a give annotation -- here we use compiler_begin
         and compiler_end
      2) Initially, for each function in the module AnnotatedRegionSets are populated.
      3) Then, Vistor pass is traversed until a compiler_end node is encountered
         that belongs to a "region".
      4) When the first compiler_end of a given annotated region is found, a function is
         formed and inserted.
         a) if the region has multiple outputs, a Tuple node (capturing all outputs)
            is returned.
      5) Thereafter, if we encounter an another output of the same annotated region,
         it is important to note that the function is already formed. Therefore, it will
         lookup the function and add a TupleGetItemNode is inserted.
          a) We will use the location index of "rets" of each "Region" of AnnotatedRegionSet
             as TupleGetItemNode index.
      6) Therefore, functions will be created for all annotated regions. The name for each
         global function is created using "Region" id and the compiler name.

Change-Id: I1372f02a845b6d3da03b561763e03a378dca263c
    *removed the expected use-case as we are taking broken-down PR approach
    *code style fixes
    *some trivial one liners
    *code style changes for comments
    *renamed test case multiple outputs --> mixed single multiple outputs
        Since the existing test case checks for both single and multiple
        output scenarios
    *added a new test case with conv2d + batch_norm
    *some var name changes in the test
@manupak manupak force-pushed the partition_graph_multiout branch from beda3bc to ed2415f Compare March 31, 2020 15:51
Copy link
Member

@zhiics zhiics left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@comaniac comaniac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@zhiics zhiics merged commit 14ae3a6 into apache:master Mar 31, 2020
@zhiics
Copy link
Member

zhiics commented Mar 31, 2020

Thanks @manupa-arm @comaniac

trevor-m pushed a commit to trevor-m/tvm that referenced this pull request Apr 16, 2020
…pache#5143)

* [RELAY] Re-wrote the Graph Partitioner to support multiple outputs

Input : A Relay module that have functions with disjoint annotated regions
        using compiler_begin and compiler_end. There could be multiple outputs.

Output : A Relay module with global functions for such disjoint annotated regions
         with calls inserted at the respective location

Dependencies : AnnotatedRegionSet Utility class.

Methodology :
      1) The AnnotatedRegionSet utility class is able to construct a collection of
         nodes that are bound by a give annotation -- here we use compiler_begin
         and compiler_end
      2) Initially, for each function in the module AnnotatedRegionSets are populated.
      3) Then, Vistor pass is traversed until a compiler_end node is encountered
         that belongs to a "region".
      4) When the first compiler_end of a given annotated region is found, a function is
         formed and inserted.
         a) if the region has multiple outputs, a Tuple node (capturing all outputs)
            is returned.
      5) Thereafter, if we encounter an another output of the same annotated region,
         it is important to note that the function is already formed. Therefore, it will
         lookup the function and add a TupleGetItemNode is inserted.
          a) We will use the location index of "rets" of each "Region" of AnnotatedRegionSet
             as TupleGetItemNode index.
      6) Therefore, functions will be created for all annotated regions. The name for each
         global function is created using "Region" id and the compiler name.

Change-Id: I1372f02a845b6d3da03b561763e03a378dca263c

* [RELAY] Re-wrote the Graph Partitioner to support multiple outputs

    *removed the expected use-case as we are taking broken-down PR approach
    *code style fixes
    *some trivial one liners

* [RELAY] Re-wrote the Graph Partitioner to support multiple outputs

    *fixed an implicit copy to a move

* [RELAY] Re-wrote the Graph Partitioner to support multiple outputs

    *code style changes for comments
    *renamed test case multiple outputs --> mixed single multiple outputs
        Since the existing test case checks for both single and multiple
        output scenarios
    *added a new test case with conv2d + batch_norm
    *some var name changes in the test

* [RELAY] Re-wrote the Graph Partitioner to support multiple outputs

	*rebased
zhiics pushed a commit to neo-ai/tvm that referenced this pull request Apr 17, 2020
…pache#5143)

* [RELAY] Re-wrote the Graph Partitioner to support multiple outputs

Input : A Relay module that have functions with disjoint annotated regions
        using compiler_begin and compiler_end. There could be multiple outputs.

Output : A Relay module with global functions for such disjoint annotated regions
         with calls inserted at the respective location

Dependencies : AnnotatedRegionSet Utility class.

Methodology :
      1) The AnnotatedRegionSet utility class is able to construct a collection of
         nodes that are bound by a give annotation -- here we use compiler_begin
         and compiler_end
      2) Initially, for each function in the module AnnotatedRegionSets are populated.
      3) Then, Vistor pass is traversed until a compiler_end node is encountered
         that belongs to a "region".
      4) When the first compiler_end of a given annotated region is found, a function is
         formed and inserted.
         a) if the region has multiple outputs, a Tuple node (capturing all outputs)
            is returned.
      5) Thereafter, if we encounter an another output of the same annotated region,
         it is important to note that the function is already formed. Therefore, it will
         lookup the function and add a TupleGetItemNode is inserted.
          a) We will use the location index of "rets" of each "Region" of AnnotatedRegionSet
             as TupleGetItemNode index.
      6) Therefore, functions will be created for all annotated regions. The name for each
         global function is created using "Region" id and the compiler name.

Change-Id: I1372f02a845b6d3da03b561763e03a378dca263c

* [RELAY] Re-wrote the Graph Partitioner to support multiple outputs

    *removed the expected use-case as we are taking broken-down PR approach
    *code style fixes
    *some trivial one liners

* [RELAY] Re-wrote the Graph Partitioner to support multiple outputs

    *fixed an implicit copy to a move

* [RELAY] Re-wrote the Graph Partitioner to support multiple outputs

    *code style changes for comments
    *renamed test case multiple outputs --> mixed single multiple outputs
        Since the existing test case checks for both single and multiple
        output scenarios
    *added a new test case with conv2d + batch_norm
    *some var name changes in the test

* [RELAY] Re-wrote the Graph Partitioner to support multiple outputs

	*rebased
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants