-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RELAY] Re-wrote the Graph Partitioner to support multiple outputs #5143
Conversation
/*! \brief Nodes in this subgraph. */ | ||
std::unordered_set<Expr, ObjectHash, ObjectEqual> nodes; | ||
}; | ||
static const Op &compiler_begin_op = Op::Get("annotation.compiler_begin"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
keep the original code style: static const Op&
Same for all the other places in the rest of PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. Prolly feedback this to lint?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are still many places where the style is violated. I recommend running clang-format from your editor. We have our own .clang-format file at the root dir.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* | ||
* Expected Usecase : | ||
* This pass is intended to run as the last pass in a series of passes as follows : | ||
* 1) Annotate Supported Single Ops - annotated each single op with supported backends. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- and 2) are inaccurate, we don't have these two annotations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed the use-case segment. This is not needed now; as things are still in discussion as how this pass would potentially fit in the bigger picture.
|
||
if (region->GetOutputs().size() == 1) { | ||
// If there is only a single output; no need to add a tuplegetitem node | ||
return Call(glob_func, param_expr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need to create another call, just return ret
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
int subgraph_id_{0}; | ||
std::unordered_set<std::shared_ptr<Subgraph>> subgraphs_; | ||
/*! | ||
* \brief Get the region an expression belongs to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
align "*", same to the other functions below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
still not aligned, you want something like:
/*!
* \brief
*/
Same to the other places as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hopefully done!
@@ -678,6 +678,81 @@ def expected(): | |||
check_result(mod, {"y": y_data}, (8, 8), np.log(np_add)) | |||
|
|||
|
|||
def test_multiple_outputs(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a test case for something like conv2d + batch_norm
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure that I get you; Can you elaborate more as to how multiple outputs in regions are exercised in conv2d + batch_norm?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
batch_norm has 3 outputs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But isnt it returned as a tuple already ? Looks like a single output to me. Unless, you do some procesing afterwards in the region. Is that what you imply?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yes. The case in my mind is conv2d->BN->relu
. In this case, the region output would be a new tuple of the rest 2 BN outputs and 1 relu output (relu could be any op tho).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added the testcase
5cc9590
to
a4653fd
Compare
return BaseFunc(nullptr); | ||
} | ||
|
||
/*! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
align
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hopefully done!
return AnnotatedRegion(nullptr); | ||
} | ||
|
||
/*! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
align
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hopefully done!
int subgraph_id_{0}; | ||
std::unordered_set<std::shared_ptr<Subgraph>> subgraphs_; | ||
/*! | ||
* \brief Get the region an expression belongs to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
still not aligned, you want something like:
/*!
* \brief
*/
Same to the other places as well
|
||
return mod | ||
|
||
# print(create_merged_graph()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove the debugging stmt as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
99a02f9
to
beda3bc
Compare
Input : A Relay module that have functions with disjoint annotated regions using compiler_begin and compiler_end. There could be multiple outputs. Output : A Relay module with global functions for such disjoint annotated regions with calls inserted at the respective location Dependencies : AnnotatedRegionSet Utility class. Methodology : 1) The AnnotatedRegionSet utility class is able to construct a collection of nodes that are bound by a give annotation -- here we use compiler_begin and compiler_end 2) Initially, for each function in the module AnnotatedRegionSets are populated. 3) Then, Vistor pass is traversed until a compiler_end node is encountered that belongs to a "region". 4) When the first compiler_end of a given annotated region is found, a function is formed and inserted. a) if the region has multiple outputs, a Tuple node (capturing all outputs) is returned. 5) Thereafter, if we encounter an another output of the same annotated region, it is important to note that the function is already formed. Therefore, it will lookup the function and add a TupleGetItemNode is inserted. a) We will use the location index of "rets" of each "Region" of AnnotatedRegionSet as TupleGetItemNode index. 6) Therefore, functions will be created for all annotated regions. The name for each global function is created using "Region" id and the compiler name. Change-Id: I1372f02a845b6d3da03b561763e03a378dca263c
*removed the expected use-case as we are taking broken-down PR approach *code style fixes *some trivial one liners
*fixed an implicit copy to a move
*code style changes for comments *renamed test case multiple outputs --> mixed single multiple outputs Since the existing test case checks for both single and multiple output scenarios *added a new test case with conv2d + batch_norm *some var name changes in the test
beda3bc
to
ed2415f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks @manupa-arm @comaniac |
…pache#5143) * [RELAY] Re-wrote the Graph Partitioner to support multiple outputs Input : A Relay module that have functions with disjoint annotated regions using compiler_begin and compiler_end. There could be multiple outputs. Output : A Relay module with global functions for such disjoint annotated regions with calls inserted at the respective location Dependencies : AnnotatedRegionSet Utility class. Methodology : 1) The AnnotatedRegionSet utility class is able to construct a collection of nodes that are bound by a give annotation -- here we use compiler_begin and compiler_end 2) Initially, for each function in the module AnnotatedRegionSets are populated. 3) Then, Vistor pass is traversed until a compiler_end node is encountered that belongs to a "region". 4) When the first compiler_end of a given annotated region is found, a function is formed and inserted. a) if the region has multiple outputs, a Tuple node (capturing all outputs) is returned. 5) Thereafter, if we encounter an another output of the same annotated region, it is important to note that the function is already formed. Therefore, it will lookup the function and add a TupleGetItemNode is inserted. a) We will use the location index of "rets" of each "Region" of AnnotatedRegionSet as TupleGetItemNode index. 6) Therefore, functions will be created for all annotated regions. The name for each global function is created using "Region" id and the compiler name. Change-Id: I1372f02a845b6d3da03b561763e03a378dca263c * [RELAY] Re-wrote the Graph Partitioner to support multiple outputs *removed the expected use-case as we are taking broken-down PR approach *code style fixes *some trivial one liners * [RELAY] Re-wrote the Graph Partitioner to support multiple outputs *fixed an implicit copy to a move * [RELAY] Re-wrote the Graph Partitioner to support multiple outputs *code style changes for comments *renamed test case multiple outputs --> mixed single multiple outputs Since the existing test case checks for both single and multiple output scenarios *added a new test case with conv2d + batch_norm *some var name changes in the test * [RELAY] Re-wrote the Graph Partitioner to support multiple outputs *rebased
…pache#5143) * [RELAY] Re-wrote the Graph Partitioner to support multiple outputs Input : A Relay module that have functions with disjoint annotated regions using compiler_begin and compiler_end. There could be multiple outputs. Output : A Relay module with global functions for such disjoint annotated regions with calls inserted at the respective location Dependencies : AnnotatedRegionSet Utility class. Methodology : 1) The AnnotatedRegionSet utility class is able to construct a collection of nodes that are bound by a give annotation -- here we use compiler_begin and compiler_end 2) Initially, for each function in the module AnnotatedRegionSets are populated. 3) Then, Vistor pass is traversed until a compiler_end node is encountered that belongs to a "region". 4) When the first compiler_end of a given annotated region is found, a function is formed and inserted. a) if the region has multiple outputs, a Tuple node (capturing all outputs) is returned. 5) Thereafter, if we encounter an another output of the same annotated region, it is important to note that the function is already formed. Therefore, it will lookup the function and add a TupleGetItemNode is inserted. a) We will use the location index of "rets" of each "Region" of AnnotatedRegionSet as TupleGetItemNode index. 6) Therefore, functions will be created for all annotated regions. The name for each global function is created using "Region" id and the compiler name. Change-Id: I1372f02a845b6d3da03b561763e03a378dca263c * [RELAY] Re-wrote the Graph Partitioner to support multiple outputs *removed the expected use-case as we are taking broken-down PR approach *code style fixes *some trivial one liners * [RELAY] Re-wrote the Graph Partitioner to support multiple outputs *fixed an implicit copy to a move * [RELAY] Re-wrote the Graph Partitioner to support multiple outputs *code style changes for comments *renamed test case multiple outputs --> mixed single multiple outputs Since the existing test case checks for both single and multiple output scenarios *added a new test case with conv2d + batch_norm *some var name changes in the test * [RELAY] Re-wrote the Graph Partitioner to support multiple outputs *rebased
This PR aims to include support for multiple outputs in the regions that get outlined as functions for different compiler backends in the existing Partition Graph Pass. Such regions are annotated and bounded by compiler_begin and compiler_end annotation ops.
This is a required step as step 4 in RFC - BYOC. Moreover, this feature is requested in prior discussions such as Improved graph partitioning algorithm
This PR uses the utility functions provided by the AnnotationRegionSet PR.