[RELAY][AUTOTVM] Extract tuning tasks from Relay programs #2181
Conversation
If someone has pointers on the current way to do relay inference, we can also add relay versions of the
I am also porting the
logger = logging.getLogger('autotvm')

def serialize_args(args):
We can remove this by adding custom json encoder and decoder in https://github.com/dmlc/tvm/blob/master/python/tvm/autotvm/record.py#L54
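For reference, the kind of custom encoder/decoder pair being suggested might look like the sketch below. This is illustrative only, not the actual record.py code: the TensorArg class and the "__tensor__" tag are hypothetical stand-ins for whatever argument objects need serializing.

```python
import json

class TensorArg:
    """Hypothetical stand-in for a tensor placeholder argument."""
    def __init__(self, shape, dtype):
        self.shape, self.dtype = shape, dtype

class ArgEncoder(json.JSONEncoder):
    """Encode TensorArg as a tagged dict so json.dumps can handle it."""
    def default(self, obj):
        if isinstance(obj, TensorArg):
            return {"__tensor__": True,
                    "shape": list(obj.shape),
                    "dtype": obj.dtype}
        return super().default(obj)

def arg_decoder(dct):
    """object_hook that rebuilds TensorArg from the tagged dict."""
    if dct.get("__tensor__"):
        return TensorArg(tuple(dct["shape"]), dct["dtype"])
    return dct

s = json.dumps([TensorArg((1, 3, 224, 224), "float32")], cls=ArgEncoder)
args = json.loads(s, object_hook=arg_decoder)
```

With a pair like this plugged into the record serialization, ad-hoc serialize/deserialize helpers become unnecessary.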
# NOTE: To add more ops, you only need to change the following lists
# relay op -> topi compute
self.op2topi = {
    relay.op.nn.conv2d: [topi.nn.conv2d, topi.nn.depthwise_conv2d_nchw,
This op2topi is the only part that differs between the TaskExtractEnv for nnvm and relay. I think we can move TaskExtractEnv to topi_integration.py and let the nnvm and relay parts pass in this argument.
The signatures are also a little different because nnvm and relay require different parameters (e.g., relay takes a param dict, nnvm requires shapes). How should we handle this?
I am referring to the class TaskExtractEnv, which is almost at the TOPI level.
Ok, good call, we can merge these.
One thing to notice is that we pass params to relay.build, which means we will do constant folding and call tvm functions. After that, the Python multiprocessing package becomes unusable and we cannot tune with multiprocessing. Did you notice this in your tuning? A possible workaround is to run relay.build in a separate thread.
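The suggested workaround might be sketched as below. The run_in_thread helper is hypothetical (not a TVM API): the idea is simply that the runtime-touching call happens in a worker thread, so the main process stays clean for fork-based multiprocessing afterwards. Error propagation is omitted for brevity.

```python
import threading

def run_in_thread(func, *args, **kwargs):
    """Run func in a fresh thread and return its result.
    run_in_thread is a hypothetical helper, not part of TVM."""
    result = {}

    def _target():
        result["value"] = func(*args, **kwargs)

    t = threading.Thread(target=_target)
    t.start()
    t.join()
    return result["value"]

# hypothetical usage: graph, lib, params = run_in_thread(relay.build, func, target)
```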
The CI breaks now after merging, in the test where nnvm is imported. Is this a circular import?
Is the issue that the CI script does not add NNVM to the pythonpath for relay tests?
https://github.com/dmlc/tvm/blob/master/tests/scripts/task_python_integration.sh
I think so, because this is a tvm test. That's why I put the original test in the nnvm folder.
Ok, I put it there because it is a "relay/autotvm" test, which doesn't really have anything to do with nnvm. I can move it to the nnvm directory.
No, we should not move it to nnvm. See my comment below
@@ -54,6 +54,7 @@ class TaskExtractEnv:
    def __init__(self):
We should do better decoupling here, as we will drop nnvm in the future. This can also solve the import problem:
- Move TaskExtractEnv to topi_integration.py.
- Move all import nnvm to nnvm_integration.py.
- Move all import relay to relay_integration.py.
- Test nnvm_integration.py in nnvm/test, and relay_integration.py in tvm/test.
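A minimal sketch of the proposed decoupling, assuming TaskExtractEnv takes the op-to-topi mapping as a constructor argument instead of importing any frontend itself (class shape and method names here are illustrative, not the actual TVM code):

```python
class TaskExtractEnv:
    """Frontend-agnostic task-extraction environment: it receives the
    op -> topi mapping, so it never imports nnvm or relay itself."""
    def __init__(self, op2topi):
        # frontend op name -> list of candidate topi compute functions
        self.op2topi = op2topi
        self.tasks = []

    def topi_candidates(self, op):
        return self.op2topi.get(op, [])

# each integration module (nnvm_integration.py / relay_integration.py)
# would then construct the env with its own mapping:
nnvm_env = TaskExtractEnv({"conv2d": ["topi.nn.conv2d",
                                      "topi.nn.depthwise_conv2d_nchw"]})
```

This keeps the frontend imports confined to the integration modules, which is what allows the tests to be split between nnvm/test and tvm/test.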
ok, I will try to refactor for that
Actually, is there any reason we use relay/nnvm functions to specify which operators to match? Could we just use the topi functions directly, or some syntax sugar otherwise? Currently this seems to introduce some circular import problems if we import relay at the top level (as relay does import autotvm). We can hack around this by moving the imports inside functions, but I am not sure that is ideal.
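The hack mentioned here is the deferred-import pattern: the frontend import lives inside the function body, so the module can be loaded without triggering the cycle. A toy sketch, with the stdlib math module standing in for the heavy frontend module (nnvm/relay):

```python
def extract_tasks(values):
    """Deferred import: `math` stands in for the frontend module here.
    The import only runs when the function is first called, after both
    modules have finished initializing, breaking the circular-import cycle."""
    import math
    return [math.floor(x) for x in values]
```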
Because one operator, such as conv, can correspond to many topi functions. We have to move import nnvm and import relay inside functions, as the old code does.
Overall this looks good to me. One question: is there a reason we need to fully build the Relay function with the tracing target? Isn't the set of operators just a statically inferable property? Can we not extract the type information and call sites from the program and use that as our tuning workload?
I'm not sure about the details here, but is TOPI a black box from the perspective of Relay? It seems we would still need to go through TOPI's dispatch system even if we had complete information on Relay's side. But overall this is just a case of "if it ain't broke, don't fix it", as this implementation is basically borrowed from the NNVM version.
We want to keep the autotvm tuning tasks at the TOPI level. Using tracing mode makes the extraction independent of the graph IR, so one version of the code works for both nnvm and relay. Besides, the dispatching mechanism from a relay op to topi is complicated. For example, one relay op can correspond to many topi functions (e.g., a conv op may call vanilla conv, group conv, or depthwise conv in topi). On VTA, one op can run on either the CPU or the FPGA.
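The one-to-many dispatch described above can be pictured as attribute-driven selection among candidate TOPI computes. This is only an illustrative sketch (plain strings, simplified rules), not the real dispatch logic:

```python
def candidate_topi(groups, in_channels):
    """Pick which topi compute a conv2d op would dispatch to, based on
    its attributes. Names are illustrative strings, not real imports."""
    if groups == 1:
        return "topi.nn.conv2d"                 # vanilla convolution
    if groups == in_channels:
        return "topi.nn.depthwise_conv2d_nchw"  # depthwise convolution
    return "topi.nn.group_conv2d_nchw"          # grouped convolution
```

Because the choice depends on runtime attributes like these, statically reading the Relay op set is not enough to know which TOPI tasks to tune.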
Not sure what the build error is here; do I need to do a rebase or just retrigger CI?
Looks good to me. Maybe I will do a follow-up refactor for VTA.
Sorry, can you add some test cases?
        topi.nn.group_conv2d_nchw],
    nnvm.sym.conv2d_transpose: [topi.nn.conv2d_transpose_nchw],
    nnvm.sym.dense: [topi.nn.dense],
}
Can we combine these two SYMBOL2TOPI?
Perhaps; @merrymercy, should we do this? There have also been discuss-board questions from people who didn't know to add conv2d_transpose to the tutorial example when tuning other networks.
@yzhliu is referring to combining L52 and L123. This is not related to the problem in the forum.
The reason we don't combine them currently seems to be that the imports here are tricky to do outside of a function. Should we just delete the extract_from_multiple_graphs function? I don't see it used often.
I am using it. Maybe we can create another function for these lists.
        topi.nn.group_conv2d_nchw],
    tvm.relay.op.nn.conv2d_transpose: [topi.nn.conv2d_transpose_nchw],
    tvm.relay.op.nn.dense: [topi.nn.dense],
}
Same for these two OP2TOPI.
@yzhliu please take another look at the PR.
Thanks @yzhliu @eqy @merrymercy, this is merged.
This is a copy-paste of the nnvm task extraction, adapted for Relay, for autotvm tuning.
CC @merrymercy @jroesch