
[Relay][Any] Add shape func for dynamic shape #3606

Merged · 22 commits · Sep 1, 2019

Conversation

icemelon (Member) commented Jul 23, 2019

This PR aims to make the interpreter and VM support dynamic shape.
A dynamic shape function is responsible for calculating the output shape of an op at runtime. It can have two different semantics (a sketch follows the list):

  • If the output shapes of an op are data dependent (op.shape_data_dependant=true), the shape function takes the same input data tensors as the op;
  • Otherwise, the shape function takes the shape tensors of the op's inputs.
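For illustration, a minimal sketch of what the two flavors can look like as hybrid script shape functions. The module paths, hybrid intrinsics, and the registration helper's signature below are my assumptions based on the TVM APIs of the time, not code taken from this PR:

from tvm import relay
from tvm.hybrid import script

@script
def _identity_shape_func(x_shape, ndim):
    # Shape-only case: the input is the shape tensor of the op's input,
    # so the output shape is computed from shapes alone.
    out = output_tensor((ndim,), "int64")
    for i in const_range(ndim):
        out[i] = x_shape[i]
    return out

@script
def _arange_shape_func(start, stop, step):
    # Data-dependent case: the inputs are the op's data tensors; the output
    # length depends on the actual values. (Simplified: the real formula
    # takes the ceiling of (stop - start) / step.)
    out = output_tensor((1,), "int64")
    out[0] = int64((stop[0] - start[0]) / step[0])
    return out

# Registration sketch: the boolean flag marks the shape function as
# data-dependent (True) vs. shape-only (False).
@relay.op.register_shape_func("arange", True)
def arange_shape_func(attrs, inputs, _):
    return [_arange_shape_func(*inputs)]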

Current progress:

  • The interpreter can run a Relay program with dynamic shapes as long as every op with a dynamic shape has a shape function defined (see the usage sketch after this list)
  • Wrote shape functions for broadcast ops, arange, concatenate, and reshape using hybrid script
  • The VM supports dynamic shapes both with and without op fusion enabled
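For example, a program like the following should now run end to end. This is a usage sketch based on the Relay Python API of that era; the executor plumbing is my assumption, not a test from this PR:

import numpy as np
import tvm
from tvm import relay

# a's only dimension is unknown until runtime; b has a static shape.
a = relay.var("a", shape=(relay.Any(),), dtype="float32")
b = relay.var("b", shape=(3,), dtype="float32")
func = relay.Function([a, b], a + b)

# "debug" runs the interpreter; "vm" would exercise the VM path instead.
ex = relay.create_executor("debug", ctx=tvm.cpu(), target="llvm")
a_np = np.random.rand(1).astype("float32")  # keep the Any dim equal to 1,
b_np = np.random.rand(3).astype("float32")  # see the broadcast caveat below
print(ex.evaluate(func)(a_np, b_np))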

Caveats:

  • Dynamic shape functions don't support CUDA yet; this needs to wait until the VM supports heterogeneous execution.
  • TOPI broadcast ops currently treat an Any dimension as 1, and therefore give wrong outputs when the Any dimension is not 1. For example, a : Tensor[Any] + b : Tensor[3, 2] is allowed in both Relay and TOPI. When a.shape=[1] at runtime, TOPI gives the correct result; however, if a.shape=[2], the result is wrong. Fixing this requires the auto broadcast buffer ([Codegen] Support broadcast op with symbolic shape #3389). I plan to leave this to a follow-up PR.

@jroesch @zhiics @wweic @tqchen @MarisaKirisame @junrushao1994

@icemelon changed the title from "[Any] Add shape func for dynamic shape" to "[Relay][Any] Add shape func for dynamic shape" on Jul 23, 2019
@a1010428282

I'm interested. Why hasn't anyone commented on it?

python/tvm/hybrid/parser.py (outdated, resolved)
@junrushao (Member)

Is this ready for review?

@icemelon (Member Author) commented Aug 2, 2019

@junrushao1994 Yes, you can start to review this PR except for the VM compiler part.

@were (Contributor) commented Aug 2, 2019

Aside from the WIP parts, this LGTM.

Emit(alloc);
unpacked_arg_regs.push_back(alloc.dst);
}
// if (const TensorTypeNode* ttype = ret_type.as<TensorTypeNode>()) {
Member:

Should we remove these lines?

Member Author:

Yes, still WIP for this file.

"Not enough dims in input shape for -3"
out[dst_idx] = x[src_idx] * x[src_idx+1]
src_idx += 2
dst_idx += 1
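For context, my reading of the MXNet-style special value being handled here (an illustration, not code from this diff): -3 in newshape merges the next two input dimensions into a single output dimension.

def merge_two_dims(in_shape, src_idx):
    # Mirrors out[dst_idx] = x[src_idx] * x[src_idx + 1] above:
    # e.g. in_shape (2, 3, 4) with newshape (-3, 4) yields output shape (6, 4).
    assert src_idx + 1 < len(in_shape), "Not enough dims in input shape for -3"
    return in_shape[src_idx] * in_shape[src_idx + 1]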
Member:

Are those only for the very ad-hoc reshape functions in MXNet?

Member Author:

Yes

Member:

This might be out of scope for this PR, but can we think about removing the ad-hoc reshape semantics from the Relay core and possibly elaborating them away in, e.g., an MXNet dialect?

@junrushao (Member) commented Aug 2, 2019

First, this PR mainly focuses on operators whose output types can be pre-computed once their input values are known (e.g. np.arange), not those whose output types are only known after the computation is done (e.g. np.unique). Wouldn't it be more informative to come up with another name?

Second, this kind of operator only occurs when the relevant inputs are scalar-like, e.g. a shape array (tuple of integers). In our previous practice, if they are known statically, we would prefer to put them into Attrs. However, in this PR, we bypass the type relation and use a pre-defined shape function that can be lowered to LLVM for code generation.

Just asking: would it be possible to have some mechanism to put those scalars back into Attrs so that we can reuse the previous effort of defining TypeRelations?

For example, imagine we have primitive types in Relay; then we allow one operator to produce tvm::Array<tvm::Integer> as output. Then, in the follow-up operator, we let the system put these Integers in as attributes, so we are able to reuse the TypeRelations we previously defined.

@zhiics (Member) left a comment:

LGTM, only a few questions to clarify.

src/relay/backend/vm/compiler.cc (resolved)
tests/python/relay/test_any.py (outdated, resolved)
src/relay/backend/compile_engine.h (resolved)
src/relay/backend/compile_engine.cc (outdated, resolved)
src/relay/backend/compile_engine.cc (resolved)
src/relay/backend/vm/compiler.cc (outdated, resolved)
src/relay/backend/vm/compiler.cc (outdated, resolved)
@icemelon (Member Author)

@jroesch @junrushao1994 @were Could you help review this PR again?

@junrushao (Member) left a comment:

LGTM

def any_dims(ndim):
    shape = []
    for _ in range(ndim):
        shape.append(relay.Any())
    return tuple(shape)
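For context, a usage sketch of this helper (my own illustration, not quoted from the test file):

from tvm import relay

# A 2-D tensor whose dimensions are both unknown until runtime.
x = relay.var("x", shape=any_dims(2), dtype="float32")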
Contributor:

It would be more convenient to have dimensions with -1 internally map to relay.Any rather than requiring it to be explicit. This would match the syntax of all other frameworks when it comes to dynamic shape.
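A hypothetical sketch of what such a convenience could look like (illustration only; nothing like this exists in the PR):

from tvm import relay

def to_shape(dims):
    # Map the conventional -1 "unknown" marker to relay.Any().
    return tuple(relay.Any() if d == -1 else d for d in dims)

# to_shape((-1, 3, 224, 224)) -> (Any, 3, 224, 224)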

Member:

I see the argument for having convenient ways to specify dynamic shapes from the frontends, but for analysis and transformation it is important that dynamic shapes are treated specially instead of overloading integers. Annoyingly, some of the Relay operators have the semantics of MXNet's operators baked in right now. My 2c is that we should make greater use of Relay dialects in which the operators macro-expand to ones that are easy to operate on and analyze.

Member:

IMO we should remove ad-hoc semantics like those in reshape.

tqchen previously requested changes Aug 22, 2019
include/tvm/relay/op.h (outdated, resolved)
@@ -283,6 +305,248 @@ class ScheduleGetter :
Array<Operation> scalars_;
};

// The getter to get shape function from functor.
Member:

Can we possibly use a better name here? ShapeFuncGetter is not very descriptive IMO; how about MakeShapeFunc or GetShapeFunc?

Member Author:

Fixed

@@ -310,7 +310,124 @@ class Interpreter :
return MakeClosure(func);
}

Value InvokePrimitiveOp(Function func,
Member:

Should we factor this code out somewhere?

Member Author:

I guess you mean the part that sets inputs and outputs to call the packed function? There are some differences between invoking a shape function and invoking primitive ops, since a shape function may need either data or shapes as inputs.
Furthermore, this part of the code is coupled with the interpreter, since it involves TVMValue, which is only used by the interpreter.

Member:

Okay, we can revisit.

@jroesch (Member) left a comment:

Just left a few comments.

@icemelon (Member Author)

@tqchen @jroesch could you take another look?

const auto *tuple_type = param->type_as<TupleTypeNode>();
CHECK(tuple_type);
for (Type field : tuple_type->fields) {
const auto *ttype = field.as<TensorTypeNode>();
Contributor:

Can you support recursive tuples of tensors? It is not much work, and a lot of other pieces also support that.

Member Author:

There are quite a few places that don't support recursive tuples. I left a few TODOs in the code, though there could be more. Let's leave this to a future PR to fix systematically.

CHECK(rtype);
for (size_t i = 0; i < rtype->fields.size(); ++i) {
auto ttype = rtype->fields[i].as<TensorTypeNode>();
CHECK(ttype);
Contributor:

allow recursive tuple

<< "Shape function input sizes mismatch";

auto fset_shape_output = [&](size_t i, Type val_type) {
const TensorTypeNode* rtype = val_type.as<TensorTypeNode>();
Contributor:

Recursive here as well; see gradient for how to handle recursive tuples.

@jroesch dismissed tqchen’s stale review September 1, 2019 01:50: "Tianqi on vacation"

@jroesch merged commit eef35a5 into apache:master on Sep 1, 2019
@icemelon deleted the any-sf branch on September 6, 2019 18:52
wweic pushed a commit to wweic/tvm that referenced this pull request Sep 16, 2019
* init shape func in interpreter and vm compiler
* Update interpreter
* fix
* lint
* lint
* fix
* remove hack
* update
* fix
* fix
* update
* address comments & update for shape_of
* fix lint
* update
* fix hybrid
* lint
* fix bug & add take shape func
* lint
* lint
* update
* fix flaky test
* add todo
wweic pushed a commit to neo-ai/tvm that referenced this pull request Sep 16, 2019