-
Notifications
You must be signed in to change notification settings - Fork 624
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Common Subgraph Elimination #5752
Conversation
Signed-off-by: Michal Zientkiewicz <[email protected]>
Signed-off-by: Michal Zientkiewicz <[email protected]>
Signed-off-by: Michal Zientkiewicz <[email protected]>
Signed-off-by: Michal Zientkiewicz <[email protected]>
Signed-off-by: Michal Zientkiewicz <[email protected]>
…ons. Signed-off-by: Michał Zientkiewicz <[email protected]>
@@ -41,7 +41,8 @@ inline std::shared_ptr<Argument> DeserializeProtobufVectorImpl(const DaliProtoPr | |||
auto args = arg.extra_args(); | |||
std::vector<T> ret_val; | |||
for (auto& a : args) { | |||
const T& elem = DeserializeProtobuf(a)->Get<T>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After changing Argument::Get
to returning a reference, this became a bug: returning a reference to a field of the result of DeserializeProtobuf
- since we get a temporary shared_ptr
, it would be destroyed (along with the referenced field!) before the field reference could be used.
ref = pipe_out_cpu | ||
else: | ||
check_batch(pipe_out_cpu, ref) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test checked too little and took too long - now it's more accurate and much faster.
CI MESSAGE: [21516079]: BUILD STARTED |
- restore _module and _display name in the default schema - add preserve(_name) in Pipeline tests where CSE is unwanted - fix and extend Python tests - extend documentation Signed-off-by: Michał Zientkiewicz <[email protected]>
Signed-off-by: Michał Zientkiewicz <[email protected]>
Signed-off-by: Michał Zientkiewicz <[email protected]>
CI MESSAGE: [21520256]: BUILD STARTED |
CI MESSAGE: [21520256]: BUILD PASSED |
Signed-off-by: Michal Zientkiewicz <[email protected]>
CI MESSAGE: [21574171]: BUILD STARTED |
CI MESSAGE: [21574171]: BUILD PASSED |
class CSE { | ||
public: | ||
void Run(OpGraph &graph) { | ||
for (auto &node : graph.OpNodes()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OpNodes returns the nodes in topological order?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the graph is sorted by the Builder.
Signed-off-by: Michal Zientkiewicz <[email protected]>
CI MESSAGE: [21610648]: BUILD STARTED |
bool per_frame = false; //< The (tensor) argument can be expanded to multiple frames | ||
bool internal = false; //< The argument cannot be set by the user in Python | ||
bool hidden = false; //< The argument doesn't appear in the documentation | ||
bool ignore_cmp = false; //< Two operators can be considered equal if this argument differs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick
bool ignore_cmp = false; //< Two operators can be considered equal if this argument differs | |
bool ignore_cmp = false; //< Two operators can be considered equal even if this argument differs |
AddInternalArg("preserve_name", R"(When true, the operator cannot be renamed. | ||
This disables merging this operator with another one with a different name.)", | ||
false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be the case always when an operator was explicitly named by the user?
I understand that having two equal ops (Readers?) but with different names is not a clever way to use DALI, but it is possible for the user to do that. Let's say that for whatever reason I have two readers doing the same work, but named differently ("Reader1", "Reader2"). I wrapped the pipeline into something (an iterator?) that relies on a presence of "Reader2" in the pipeline. This is not a good code, but it works. And after CSE it doesn't
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's exactly how it works. If you specify a name manually, then preserve_name
is set to True
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see:
self._spec.AddArg("preserve_name", not self._autoname)
Thank you!
@@ -2333,3 +2332,61 @@ def def_ref(): | |||
(ref,) = ref_pipe.run() | |||
check_batch(cpu, ref, bs, 0, 0, "HWC") | |||
check_batch(gpu, ref, bs, 0, 0, "HWC") | |||
|
|||
|
|||
def test_cse(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about adding a test with External Source (to make sure it's not eliminated) and manually named operators?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Manually named operator: see line 2343.
ES - we already have tests with multiple external sources and they did fail before adding the fix, so I'd call it covered.
CI MESSAGE: [21610648]: BUILD PASSED |
Category:
New feature (non-breaking change which adds functionality)
Description:
This change introduces a graph-level optimization that removes common subgraphs.
The algorithm goes over graph nodes in topological order and computes a key from the operator's schema name, arguments and (renamed) inputs. If the operator is found, it's replaced with the previously found one and the output names are added to the renaming map. When subsequent operators are processed, the inputs of the replaced operator are renamed accordingly.
Example:
This optimization is particularly useful with constants, the use of
.shape
function and subscripts.In the following example:
The
bbox[0]
andbbox[2]
will be reused.Additional information:
Affected modules and functionalities:
Key points relevant for the review:
Tests:
Some Python tests needed to be adjusted.
Checklist
Documentation
DALI team only
Requirements
REQ IDs: N/A
JIRA TASK: DALI-4159