
[Relay] Higher-order reverse-mode automatic differentiation that works with control flow #2496

Merged: 1 commit merged into apache:master on Mar 4, 2019

Conversation

MarisaKirisame
Contributor

As promised, it is simpler than the first-order case, since using references and closures in the object language (Relay) instead of the metalanguage (C++) simplifies our code.
The reference code is also here, but it is on a separate PR (#2489). We can merge this after merging #2489.
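To illustrate the style being described, here is a toy sketch in plain Python (not the Relay IR or the actual C++ pass): a mutable reference cell holds the composed backward closure, and each primitive op prepends its own gradient contribution to it.

# Toy sketch only: plain Python stand-ins for Relay references and closures.
class Ref:
    """Stands in for a Relay reference (a mutable cell)."""
    def __init__(self, value):
        self.value = value

bp = Ref(lambda: None)          # the backward pass composed so far

class Dual:
    """A value paired with a gradient accumulator cell."""
    def __init__(self, value):
        self.value = value
        self.grad = Ref(0.0)

def mul(a, b):
    out = Dual(a.value * b.value)
    old_bp = bp.value
    def new_bp():
        # add this op's contributions, then run the previously recorded steps
        a.grad.value += out.grad.value * b.value
        b.grad.value += out.grad.value * a.value
        old_bp()
    bp.value = new_bp
    return out

x = Dual(3.0)
y = mul(x, x)                   # y = x * x
y.grad.value = 1.0              # seed the output gradient
bp.value()                      # run the reverse pass
print(x.grad.value)             # 6.0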
@ZihengJiang @junrushao1994 @masahi @reminisce can you guys review?

@junrushao
Member

So excited to see we reached the point of having higher-order AD!! Thanks Marisa!

Will review the code on Friday night.

@masahi
Member

masahi commented Jan 23, 2019

Is there a usage example?

@MarisaKirisame
Contributor Author

MarisaKirisame commented Jan 23, 2019

test_ad is the usage example. The mode does not change the interface, only what is generated (the type and semantics are still the same!).
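For a concrete picture, here is a minimal usage sketch in the spirit of test_ad; it assumes the relay.ir_pass.gradient and infer_type entry points that appear later in this thread, and details of the real test file may differ.

from tvm import relay
from tvm.relay.ir_pass import gradient, infer_type

t = relay.TensorType((10, 10), "float32")
x = relay.var("x", t)
func = infer_type(relay.Function([x], relay.multiply(x, x)))  # f(x) = x * x

# The interface stays the same: back_func still takes x, and returns the
# original output together with a tuple of gradients w.r.t. the inputs.
back_func = infer_type(gradient(func))
print(back_func)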
@masahi

for (const ADValue& adval : args) {
call_args.push_back(adval->get<ADTensor>().forward);
}
auto orig = CallNode::make(op_ref, call_args, attrs, type_args);
Contributor

Is it possible to use the real original node instead of a reconstruction? Reconstructing a node may lead to losing some information, e.g. the inferred type checked_type_.

Contributor Author

Maybe, but it would require a big change in the code structure. If such a case comes up, I will do it.

Contributor

I need checked_type_ in the integration with the tensor expression AD, mostly for finding out the number of outputs of the original operation. However, I think I can get this information from other sources. Would passing and reassigning just checked_type_ be dangerous in this case?

Contributor Author

@sgrechanik-h Can I just rerun type inference? Right now every pass destroys checked_type_ and rebuilds it from type inference.

Contributor

@MarisaKirisame Not sure what you mean, but rerunning type inference sounds like a bit of an overkill, and I'm not sure it can be done before calling the FPrimalGradient attribute. If the checked_type_ must be reset after running the differentiation pass, then one of the solutions could be setting it before calling FPrimalGradient to the original value and then resetting it to nullptr after FPrimalGradient has finished, but this feels kinda hacky.

(Also, currently I think that in my particular case the proper solution would be to fix the signature of FTVMCompute so that it accepts input types, not only the out_type. And this is not connected to the automatic differentiation pass.)

Contributor Author

@sgrechanik-h all passes (FuseOps, AD, ANF, GNF, DeadCodeElimination, FoldScaleAxis) remove the type annotation and rerun it, AFAIK. I am not sure why it is an AD-specific issue.

Contributor

@MarisaKirisame I think some passes may benefit from using type information, and, of course, they should use it before erasing it (or before recreating the node; I don't think checked_type_ gets literally erased anywhere). In the case of the code we are currently discussing, the node is recreated (and thus type information is erased) before calling the FPrimalGradient function, which could use type information if it were still there. I don't insist on fixing it if it's difficult or unnatural, because I have only one case where this might be useful, and in that single case it would be better to fix a completely different part of Relay.

Contributor Author

My other passes use type info too, but we just rerun type inference, and we are encoding that (rerunning type inference) into the pass manager as well.
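A minimal sketch of that convention, assuming the relay.ir_pass helpers used elsewhere in this thread: the caller reruns infer_type after each pass so that checked_type_ is repopulated on the freshly built nodes.

from tvm import relay
from tvm.relay.ir_pass import dead_code_elimination, gradient, infer_type

t = relay.TensorType((10, 10), "float32")
x = relay.var("x", t)
func = infer_type(relay.Function([x], relay.multiply(x, x)))

back_func = gradient(func)         # rewritten nodes carry no checked_type_ yet
back_func = infer_type(back_func)  # rerunning type inference rebuilds them
back_func = infer_type(dead_code_elimination(back_func))
print(back_func)                   # fully typed again, ready for later passes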

Contributor

@reminisce reminisce left a comment

Some minor comments.

src/relay/ir/alpha_equal.cc (outdated, resolved)
src/relay/ir/alpha_equal.cc (outdated, resolved)
src/relay/ir/alpha_equal.cc (outdated, resolved)
src/relay/pass/gradient.cc (resolved)
src/relay/pass/gradient.cc (outdated, resolved)
Member

@merrymercy merrymercy left a comment

Where is the test case for control flow?
Besides, neither this version nor the old first-order AD supports Tuple/TupleGetItem.

@tqchen
Member

tqchen commented Feb 20, 2019

@merrymercy can you open an issue to track the first order AD tuple support?

@MarisaKirisame
Contributor Author

@merrymercy this version supports TupleGetItem via ExprMutator; there is just no need for any extra code for it.
I will add a test for tuple and control flow.

@ZihengJiang ZihengJiang self-assigned this Feb 22, 2019
@merrymercy
Member

merrymercy commented Feb 23, 2019

@MarisaKirisame How about using tuples as arguments and return values? Some ops use tuples as arguments, e.g. concatenate.

This example will crash

fn (%tup: Tuple[Tensor[(10, 10), float32], Tensor[(10, 10), float32]]) {
    %tup.0
}

@MarisaKirisame
Contributor Author

@merrymercy I have written a test case using tuple, and a test case using ADT, higher-order functions, closures, pattern matching (control flow), and recursion. Does that address your issue?

@MarisaKirisame
Contributor Author

@merrymercy can you review?

@merrymercy
Member

  1. Could you add my test case? This example still crashes.

from tvm import relay
from tvm.relay.ir_pass import gradient

def test_tuple_arg():
    shape = (10, 10)
    dtype = 'float32'
    t = relay.TensorType(shape, dtype)
    x = relay.var("x", t)
    y = relay.var("y", t)
    tup = relay.var('tup', relay.TupleType([t, t]))
    func = relay.Function([tup], relay.TupleGetItem(tup, 0))
    print(func)
    back_func = relay.ir_pass.infer_type(gradient(func))
    back_func = relay.ir_pass.dead_code_elimination(back_func)
    print(back_func)

We should support tuples as arguments and return values in both first-order and higher-order AD.

  2. I found that the generated back_func is very complicated and sometimes redundant (in both the first-order and higher-order cases). How do we execute it efficiently? Do we need more optimization passes, or a more powerful runtime?

@@ -85,10 +85,10 @@ using ADValue = std::shared_ptr<ADValueNode>;

/*! \brief AD over a program which generates a tensor output. */
Member

@merrymercy merrymercy Feb 27, 2019

What if the program generates a tuple of tensors as output?
ADFunction and ADTensor cannot cover this case.

@MarisaKirisame
Contributor Author

@merrymercy

  1. The test case will not work. The interface can only use Tensor as of now, but you can use whatever you want inside. Is there any need for it? You can always flatten it before passing in (see the sketch below). I would prefer to do this in a separate issue, as I am really busy working on the partial evaluator.
  2. I am working on a partial evaluator pass which will take care of this right now.
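A hypothetical sketch of the flattening workaround from point 1, reusing the relay.ir_pass helpers from the test above; the variable names are illustrative.

from tvm import relay
from tvm.relay.ir_pass import gradient, infer_type

t = relay.TensorType((10, 10), "float32")
a = relay.var("a", t)
b = relay.var("b", t)
# Instead of fn(%tup) { %tup.0 }, pass the tuple fields as separate tensor
# parameters, which the current interface can differentiate.
flat_func = infer_type(relay.Function([a, b], a))
back_func = infer_type(gradient(flat_func))
print(back_func)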

@merrymercy
Member

Some operators use tuples as arguments (e.g. concatenate) and as return values (e.g. split).
We have to use tuples because we don't know the number of arguments in advance.

I am happy to leave it to the next PR, but this feature is necessary.

@MarisaKirisame
Contributor Author

@merrymercy can we leave it to the next PR then? I am working on the Partial Evaluator, and it needs this branch as a test case. Fixing this branch means less rebasing.

@MarisaKirisame
Contributor Author

@merrymercy can you approve if you give it a thumbs up?

add test

remove dead code

stash

do it

add more test
@ZihengJiang ZihengJiang merged commit eae76b3 into apache:master Mar 4, 2019
bwasti pushed a commit to facebookexperimental/tvm that referenced this pull request Mar 6, 2019
wweic pushed a commit to neo-ai/tvm that referenced this pull request Mar 9, 2019
wweic pushed a commit to neo-ai/tvm that referenced this pull request Mar 12, 2019
wweic pushed a commit to neo-ai/tvm that referenced this pull request Mar 12, 2019
@MarisaKirisame MarisaKirisame deleted the ad branch March 28, 2019 03:24