[Relay][Any] Add shape func for dynamic shape #3606
Conversation
I'm interested. Why hasn't anyone commented on it?
Is this ready for review?
@junrushao1994 Yes, you can start reviewing this PR, except for the VM compiler part.
Except for the WIP parts, this LGTM.
src/relay/backend/vm/compiler.cc
Outdated
Emit(alloc);
unpacked_arg_regs.push_back(alloc.dst);
}
// if (const TensorTypeNode* ttype = ret_type.as<TensorTypeNode>()) {
Should we remove these lines?
Yes, still WIP for this file.
"Not enough dims in input shape for -3" | ||
out[dst_idx] = x[src_idx] * x[src_idx+1] | ||
src_idx += 2 | ||
dst_idx += 1 |
Are those only for the very ad-hoc reshape functions in MXNet?
Yes
This might be out of the scope of this PR, but can we think about removing the ad-hoc reshape semantics from the Relay core and possibly elaborating them away, e.g. into an MXNet dialect?
First, this PR mainly focuses on operators whose output types can be pre-computed once their input values are known (e.g. np.arange), not those whose output types are only known after the computation is done (e.g. np.unique). Wouldn't it be more informative to come up with another name?

Second, this kind of operator occurs only when the input values are scalar-like, such as a shape array (tuple of integers). In our previous practice, if they are known statically, we would prefer to put them into Attrs. However, in this PR, we bypass the type relation and use a pre-defined shape function that can be lowered to LLVM for code generation. Just wondering: would it be possible to have some mechanism that puts those scalars back into Attrs so that we can reuse the previous effort spent defining the TypeRelation? For example, imagine we have primitive types in Relay; then we allow one operator to produce a tvm::Array<tvm::Integer> as output. In the following operator, we let the system put these Integers back as attributes, and then we can reuse the TypeRelations we previously defined.
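A small illustration of the Attrs/TypeRelation point above (a hypothetical snippet, not code from this PR; relay.reshape and its newshape attribute are only used as a familiar example):

```python
from tvm import relay

x = relay.var("x", shape=(2, 3, 4), dtype="float32")

# Statically known scalars live in Attrs today, so the existing reshape
# TypeRelation can infer the output type Tensor[(6, 4), float32] at compile time.
y = relay.reshape(x, newshape=(6, 4))

# When the target shape is only produced at runtime (e.g. by another op),
# there is no attribute for a TypeRelation to read, which is where the
# runtime shape functions added in this PR come in.
```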
LGTM, only a few questions to clarify.
@jroesch @junrushao1994 @were Could you help review this PR again?
LGTM
def any_dims(ndim):
    shape = []
    for _ in range(ndim):
        shape.append(relay.Any())
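For context, a hedged end-to-end sketch of how such Any-shaped inputs are exercised (not code from this PR; API names reflect roughly this era of TVM, e.g. relay.Module is tvm.IRModule and ctx is a device in later releases):

```python
import numpy as np
import tvm
from tvm import relay

# The first dimension is unknown at compile time, mirroring any_dims above.
x = relay.var("x", shape=(relay.Any(), 2), dtype="float32")
y = relay.var("y", shape=(3, 2), dtype="float32")
mod = relay.Module()
mod["main"] = relay.Function([x, y], relay.add(x, y))

# Per the caveat in the PR description, keep the Any dimension broadcastable
# (actual extent 1) until #3389 lands.
x_np = np.random.uniform(size=(1, 2)).astype("float32")
y_np = np.random.uniform(size=(3, 2)).astype("float32")

for kind in ["debug", "vm"]:
    ex = relay.create_executor(kind, mod=mod, ctx=tvm.cpu(), target="llvm")
    res = ex.evaluate()(x_np, y_np)
    np.testing.assert_allclose(res.asnumpy(), x_np + y_np)
```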
It would be more convenient to have dimensions with -1 internally map to relay.Any rather than requiring it to be explicit. This would match the syntax of all other frameworks when it comes to dynamic shape.
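Purely to illustrate this suggestion, a hypothetical frontend-side helper (not part of this PR) could look like:

```python
from tvm import relay

def to_any_shape(shape):
    # Hypothetical helper: map the -1 convention used by other frameworks
    # to relay.Any() so users never have to spell Any out explicitly.
    return tuple(relay.Any() if dim == -1 else dim for dim in shape)

# (-1, 3, 224, 224) -> (Any, 3, 224, 224)
x = relay.var("x", shape=to_any_shape((-1, 3, 224, 224)), dtype="float32")
```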
I see the argument for having convenient ways to specify dynamic shapes from the frontends, but for analysis and transformation it is important that dynamic shapes are treated specially rather than overloading integers. Annoyingly, some of the Relay operators currently have the semantics of MXNet's operators baked in. My 2c is that we should make greater use of Relay dialects, in which such operators macro-expand into ones that are easy to operate on and analyze.
IMO we should remove those ad-hoc semantics, such as the ones in reshape.
src/relay/backend/compile_engine.cc
Outdated
@@ -283,6 +305,248 @@ class ScheduleGetter :
  Array<Operation> scalars_;
};

// The getter to get shape function from functor.
Can we possibly use a better name here? ShapeFuncGetter is not very descriptive IMO; MakeShapeFunc, GetShapeFunc?
Fixed
@@ -310,7 +310,124 @@ class Interpreter :
  return MakeClosure(func);
}

Value InvokePrimitiveOp(Function func,
Should we factor this code out somewhere?
I guess you mean the part that sets up inputs and outputs to call the packed function? There are some differences between invoking a shape function and invoking a primitive op, since a shape function may need either the data or the shape of an input.
Furthermore, this part of the code is coupled with the interpreter because it involves TVMValue, which is only used by the interpreter.
Okay, we can revisit.
Just left a few comments.
const auto *tuple_type = param->type_as<TupleTypeNode>();
CHECK(tuple_type);
for (Type field : tuple_type->fields) {
  const auto *ttype = field.as<TensorTypeNode>();
Can you support recursive tuples of tensors? It is not much work, and a lot of other pieces also support that.
There are quite a few places that don't support recursive tuples. I left a few TODOs in the code, though there could be more. Let's leave them to a future PR to fix this issue systematically.
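As a hedged illustration of what recursive tuple support means here, a hypothetical helper (not code from this PR) that flattens nested tuple types into their tensor leaves:

```python
from tvm import relay

def flatten_tensor_types(ty):
    # Hypothetical helper: collect TensorTypes from an arbitrarily nested
    # TupleType, e.g. ((T1, (T2, T3)), T4) -> [T1, T2, T3, T4].
    if isinstance(ty, relay.TupleType):
        flat = []
        for field in ty.fields:
            flat.extend(flatten_tensor_types(field))
        return flat
    assert isinstance(ty, relay.TensorType), "expected a tensor or tuple type"
    return [ty]
```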
CHECK(rtype);
for (size_t i = 0; i < rtype->fields.size(); ++i) {
  auto ttype = rtype->fields[i].as<TensorTypeNode>();
  CHECK(ttype);
Allow recursive tuples here as well.
<< "Shape function input sizes mismatch"; | ||
|
||
auto fset_shape_output = [&](size_t i, Type val_type) { | ||
const TensorTypeNode* rtype = val_type.as<TensorTypeNode>(); |
Recursive here as well. See gradient for how to handle recursive tuples.
* init shape func in interpreter and vm compiler * Update interpreter * fix * lint * lint * fix * remove hack * update * fix * fix * update * address comments & update for shape_of * fix lint * update * fix hybrid * lint * fix bug & add take shape func * lint * lint * update * fix flaky test * add todo
This PR aims to make interpreter and VM support dynamic shape.
The dynamic shape function is responsible for calculating the output shape of an op at runtime. It can have two different semantics (an illustrative registration sketch follows this list):

- For data-independent ops (`op.shape_data_dependant=false`), the shape function's inputs are the shapes of the op's input tensors;
- For data-dependent ops (`op.shape_data_dependant=true`), the shape function's inputs are the same data tensors that are fed to this op.
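For illustration, a minimal registration sketch, assuming a helper along the lines of `register_shape_func(op_name, data_dependant, shape_func)` and the hybrid script of this era (import paths and names may differ; `output_tensor`, `const_range`, and `int64` are hybrid-script intrinsics resolved by `@script`; this is not the exact code in the PR):

```python
from tvm.hybrid import script          # tvm.te.hybrid in later releases
from tvm.relay.op import op as _reg    # exact import path is an assumption

@script
def _elemwise_shape_func(shape):
    # Data-independent case: the argument is the *shape* of the op's input
    # tensor (a 1-D int64 tensor); the output shape is simply a copy of it.
    out = output_tensor((shape.shape[0],), "int64")
    for i in const_range(shape.shape[0]):
        out[i] = int64(shape[i])
    return out

def elemwise_shape_func(attrs, inputs, _):
    return [_elemwise_shape_func(inputs[0])]

# Hypothetical registration for an elementwise op:
#   _reg.register_shape_func("exp", False, elemwise_shape_func)
# False => data independent: the shape function receives input shapes.
# A data-dependent op (arange-like) would pass True and receive the input
# data tensors instead.
```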
Current progress:

Caveats:

- The current broadcast op in TOPI treats an `Any` dimension as 1, and therefore gives wrong outputs when the `Any` dimension is not 1. For example, `a : Tensor[Any] + b : Tensor[3, 2]` is allowed in both Relay and TOPI. When `a.shape=[1]` at runtime, TOPI will give the correct result. However, if `a.shape=[2]`, the result will be wrong. The solution to this issue needs to use the auto broadcast buffer ([Codegen] Support broadcast op with symbolic shape #3389). I plan to leave this to a follow-up PR.

@jroesch @zhiics @wweic @tqchen @MarisaKirisame @junrushao1994