
[RELAY][FRONTEND] Tensorflow frontend. #2216

Merged 24 commits on Feb 5, 2019
Conversation

srkreddy1238
Contributor

Thanks for contributing to TVM! Please refer to guideline https://docs.tvm.ai/contribute/ for useful information and tips. After the pull request is submitted, please request code reviews from Reviewers.

@srkreddy1238 srkreddy1238 force-pushed the tf-relay branch 4 times, most recently from 79df1dc to 90b99db Compare December 6, 2018 15:06
@srkreddy1238
Contributor Author

@jroesch @Huyuwei & @tqchen welcome to review.
Except for LSTM, it is now equivalent to the NNVM frontend.

LSTM is failing with some strange errors, which differ between python2 (LLVM/FCmp) and python3 (schedule_injective). I am working on it.

@srkreddy1238 srkreddy1238 changed the title [WIP][RELAY][FRONTEND] Tensorflow frontend. [RELAY][FRONTEND] Tensorflow frontend. Dec 11, 2018
@srkreddy1238 srkreddy1238 force-pushed the tf-relay branch 2 times, most recently from 7a04cdf to 6f9be59 Compare December 31, 2018 15:21
Member

@jroesch jroesch left a comment


Looks good. I think we should try to remove all references to NNVM from Relay code so we don't confuse new users.

Files with review comments:
- python/tvm/relay/frontend/tensorflow.py
- tests/python/frontend/tensorflow/test_forward.py
- tutorials/relay/from_tensorflow.py
@srkreddy1238 srkreddy1238 force-pushed the tf-relay branch 2 times, most recently from 901af11 to 9a704fd Compare January 7, 2019 16:32
@srkreddy1238
Contributor Author

@jroesch thanks for the review.
I cleaned up the docstrings and moved testing.tf to relay. You may take another look.
#2382 is a dependency for this PR. Everything now works well except LSTM.

I came across a strange error for LSTM that I have yet to debug.

Does the log below ring any bells?

Traceback (most recent call last):
  File "/home/srk/.local/lib/python3.6/site-packages/tvm-0.5.dev0-py3.6-linux-x86_64.egg/tvm/relay/backend/compile_engine.py", line 78, in lower
    return _backend._CompileEngineLower(self, key)
  File "/home/srk/.local/lib/python3.6/site-packages/tvm-0.5.dev0-py3.6-linux-x86_64.egg/tvm/_ffi/_ctypes/function.py", line 185, in __call__
    ctypes.byref(ret_val), ctypes.byref(ret_tcode)))
  File "/home/srk/.local/lib/python3.6/site-packages/tvm-0.5.dev0-py3.6-linux-x86_64.egg/tvm/_ffi/base.py", line 72, in check_call
    raise TVMError(py_str(_LIB.TVMGetLastError()))
tvm._ffi.base.TVMError: TVMCall CFunc Error:
Traceback (most recent call last):
  File "/home/srk/.local/lib/python3.6/site-packages/tvm-0.5.dev0-py3.6-linux-x86_64.egg/tvm/_ffi/_ctypes/function.py", line 55, in cfun
    rv = local_pyfunc(*pyargs)
  File "/home/srk/.local/lib/python3.6/site-packages/tvm-0.5.dev0-py3.6-linux-x86_64.egg/tvm/relay/op/op.py", line 186, in schedule_injective
    return topi.generic.schedule_injective(outputs)
  File "<decorator-gen-62>", line 2, in schedule_injective
  File "/home/srk/.local/lib/python3.6/site-packages/tvm-0.5.dev0-py3.6-linux-x86_64.egg/tvm/target.py", line 273, in dispatch_func
    return generic_func_node(*args)
  File "/home/srk/.local/lib/python3.6/site-packages/tvm-0.5.dev0-py3.6-linux-x86_64.egg/tvm/target.py", line 135, in __call__
    return _api_internal._GenericFuncCallFunc(self, *args)
  File "/home/srk/.local/lib/python3.6/site-packages/tvm-0.5.dev0-py3.6-linux-x86_64.egg/tvm/_ffi/_ctypes/function.py", line 185, in __call__
    ctypes.byref(ret_val), ctypes.byref(ret_tcode)))
  File "/home/srk/.local/lib/python3.6/site-packages/tvm-0.5.dev0-py3.6-linux-x86_64.egg/tvm/_ffi/base.py", line 72, in check_call
    raise TVMError(py_str(_LIB.TVMGetLastError()))
tvm._ffi.base.TVMError: TVMCall CFunc Error:
Traceback (most recent call last):
  File "/home/srk/.local/lib/python3.6/site-packages/tvm-0.5.dev0-py3.6-linux-x86_64.egg/tvm/_ffi/_ctypes/function.py", line 55, in cfun
    rv = local_pyfunc(*pyargs)
  File "/home/srk/.local/lib/python3.6/site-packages/topi-0.5.dev0-py3.6.egg/topi/x86/injective.py", line 26, in schedule_injective
    if len(s[x].op.axis) >= 5:
  File "/home/srk/.local/lib/python3.6/site-packages/tvm-0.5.dev0-py3.6-linux-x86_64.egg/tvm/_ffi/_ctypes/node.py", line 59, in __getattr__
    "'%s' object has no attribute '%s'" % (str(type(self)), name))
AttributeError: '<class 'tvm.tensor.PlaceholderOp'>' object has no attribute 'axis'
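For context, the failing guard in `schedule_injective` assumes every op in the schedule carries loop axes, but a `PlaceholderOp` (a bare input) does not. A minimal pure-python stand-in illustrating the failure mode and a defensive access pattern (class names here only mimic `tvm.tensor`; this is a sketch, not the real API):

```python
class PlaceholderOp:
    """Stand-in for tvm.tensor.PlaceholderOp: graph inputs carry no loop axes."""


class ComputeOp:
    """Stand-in for tvm.tensor.ComputeOp: computed stages do carry axes."""
    def __init__(self, axis):
        self.axis = axis


def op_axes(op):
    # Defensive access: placeholders have no 'axis' attribute, so a schedule
    # must skip them instead of raising AttributeError as in the log above.
    return list(getattr(op, "axis", []))
```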

@jroesch
Member

jroesch commented Jan 9, 2019

It looks like a bug in the schedule; maybe fusion is introducing a pattern the master schedule is not written for? Do you know which operator this is failing for? We should probably modify the compile engine to dump more debug information when an exception happens while scheduling.
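That suggestion could be sketched as a thin wrapper around the lowering call, attaching the Relay function text to the error (function name and signature are hypothetical; the actual change would live in `compile_engine.py`):

```python
def lower_with_debug(lower_fn, func_text):
    """Run a lowering callable; on failure, attach the offending Relay
    function's text to the error so the failing operator is visible."""
    try:
        return lower_fn()
    except Exception as err:
        msg = ("{}\n\nError during compile func\n"
               "--------------------------\n{}\n"
               "--------------------------".format(err, func_text))
        raise RuntimeError(msg)
```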

@srkreddy1238
Contributor Author

@jroesch
The situation above is caused by the Concatenate operator receiving a list with a single tensor. For now I have handled it in the frontend.
We could decide to make Concatenate handle this situation gracefully, or leave it as is.
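The frontend workaround described above can be sketched as follows (a minimal illustration of the idea, with a hypothetical helper name, not the actual PR code):

```python
def concat_or_passthrough(tensors, concat_fn):
    """Skip the concatenate op entirely when the input list holds a single
    tensor, so a lone placeholder never reaches the injective schedule."""
    if len(tensors) == 1:
        return tensors[0]
    return concat_fn(tensors)
```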

@srkreddy1238
Contributor Author

@jroesch
The Concatenate issue still persists. It's reproducible when placeholders are passed to concatenate.

Traceback (most recent call last):
  File "./tests/python/frontend/tensorflow/test_forward.py", line 1059, in <module>
    test_forward_ptb()
  File "./tests/python/frontend/tensorflow/test_forward.py", line 871, in test_forward_ptb
    params, m = _get_tvm_graph_module(graph_def)
  File "./tests/python/frontend/tensorflow/test_forward.py", line 817, in _get_tvm_graph_module
    graph, lib, params = relay.build(sym, target, params=params)
  File "/home/srk/.local/lib/python3.6/site-packages/tvm-0.5.dev0-py3.6-linux-x86_64.egg/tvm/relay/build_module.py", line 241, in build
    graph_json, lowered_funcs, params = graph_gen.codegen(func)
  File "/home/srk/.local/lib/python3.6/site-packages/tvm-0.5.dev0-py3.6-linux-x86_64.egg/tvm/relay/backend/graph_runtime_codegen.py", line 349, in codegen
    self.heads = self.visit(func.body)
  File "/home/srk/.local/lib/python3.6/site-packages/tvm-0.5.dev0-py3.6-linux-x86_64.egg/tvm/relay/expr_functor.py", line 27, in visit
    res = self.visit_call(expr)
  File "/home/srk/.local/lib/python3.6/site-packages/tvm-0.5.dev0-py3.6-linux-x86_64.egg/tvm/relay/backend/graph_runtime_codegen.py", line 235, in visit_call
    cached_func = self.compile_engine.lower(func, self.target)
  File "/home/srk/.local/lib/python3.6/site-packages/tvm-0.5.dev0-py3.6-linux-x86_64.egg/tvm/relay/backend/compile_engine.py", line 86, in lower
    raise RuntimeError(msg)
RuntimeError: Traceback (most recent call last):
  File "/home/srk/.local/lib/python3.6/site-packages/tvm-0.5.dev0-py3.6-linux-x86_64.egg/tvm/relay/backend/compile_engine.py", line 78, in lower
    return _backend._CompileEngineLower(self, key)
  File "/home/srk/.local/lib/python3.6/site-packages/tvm-0.5.dev0-py3.6-linux-x86_64.egg/tvm/_ffi/_ctypes/function.py", line 185, in __call__
    ctypes.byref(ret_val), ctypes.byref(ret_tcode)))
  File "/home/srk/.local/lib/python3.6/site-packages/tvm-0.5.dev0-py3.6-linux-x86_64.egg/tvm/_ffi/base.py", line 72, in check_call
    raise TVMError(py_str(_LIB.TVMGetLastError()))
tvm._ffi.base.TVMError: TVMCall CFunc Error:
Traceback (most recent call last):
  File "/home/srk/.local/lib/python3.6/site-packages/tvm-0.5.dev0-py3.6-linux-x86_64.egg/tvm/_ffi/_ctypes/function.py", line 55, in cfun
    rv = local_pyfunc(*pyargs)
  File "/home/srk/.local/lib/python3.6/site-packages/tvm-0.5.dev0-py3.6-linux-x86_64.egg/tvm/relay/op/op.py", line 186, in schedule_injective
    return topi.generic.schedule_injective(outputs)
  File "<decorator-gen-62>", line 2, in schedule_injective
  File "/home/srk/.local/lib/python3.6/site-packages/tvm-0.5.dev0-py3.6-linux-x86_64.egg/tvm/target.py", line 273, in dispatch_func
    return generic_func_node(*args)
  File "/home/srk/.local/lib/python3.6/site-packages/tvm-0.5.dev0-py3.6-linux-x86_64.egg/tvm/target.py", line 135, in __call__
    return _api_internal._GenericFuncCallFunc(self, *args)
  File "/home/srk/.local/lib/python3.6/site-packages/tvm-0.5.dev0-py3.6-linux-x86_64.egg/tvm/_ffi/_ctypes/function.py", line 185, in __call__
    ctypes.byref(ret_val), ctypes.byref(ret_tcode)))
  File "/home/srk/.local/lib/python3.6/site-packages/tvm-0.5.dev0-py3.6-linux-x86_64.egg/tvm/_ffi/base.py", line 72, in check_call
    raise TVMError(py_str(_LIB.TVMGetLastError()))
tvm._ffi.base.TVMError: TVMCall CFunc Error:
Traceback (most recent call last):
  File "/home/srk/.local/lib/python3.6/site-packages/tvm-0.5.dev0-py3.6-linux-x86_64.egg/tvm/_ffi/_ctypes/function.py", line 55, in cfun
    rv = local_pyfunc(*pyargs)
  File "/home/srk/.local/lib/python3.6/site-packages/topi-0.5.dev0-py3.6.egg/topi/x86/injective.py", line 26, in schedule_injective
    if len(s[x].op.axis) >= 5:
  File "/home/srk/.local/lib/python3.6/site-packages/tvm-0.5.dev0-py3.6-linux-x86_64.egg/tvm/_ffi/_ctypes/node.py", line 59, in __getattr__
    "'%s' object has no attribute '%s'" % (str(type(self)), name))
AttributeError: '<class 'tvm.tensor.PlaceholderOp'>' object has no attribute 'axis'


Error during compile func
--------------------------
fn (%p0: Tensor[(1, 10000), float32],
    %p1: Tensor[(1, 2, 1, 200), float32],
    %p2: Tensor[(1, 2, 1, 200), float32])
    -> Tuple[Tensor[(1, 10000), float32], Tensor[(2, 2, 1, 200), float32]] {
  %0 = (%p1, %p2)
  %1 = concatenate(%0) # ty=Tensor[(2, 2, 1, 200), float32]
  %2 = (%p0, %1)
  %2
}
--------------------------

@srkreddy1238
Contributor Author

@jroesch #2412 fixes the above problem.

But LLVM hits the issue below when fusion is enabled. Disabling fusion works fine, though.

Traceback (most recent call last):
  File "tests/python/frontend/tensorflow/test_forward.py", line 1060, in <module>
    _test_forward_ptb()
  File "tests/python/frontend/tensorflow/test_forward.py", line 872, in _test_forward_ptb
    params, m = _get_tvm_graph_module(graph_def)
  File "tests/python/frontend/tensorflow/test_forward.py", line 818, in _get_tvm_graph_module
    graph, lib, params = relay.build(sym, target, params=params)
  File "/home/srk/.local/lib/python3.6/site-packages/tvm-0.5.dev0-py3.6-linux-x86_64.egg/tvm/relay/build_module.py", line 242, in build
    mod = _tvm_build_module(lowered_funcs, target=target, target_host=target_host)
  File "/home/srk/.local/lib/python3.6/site-packages/tvm-0.5.dev0-py3.6-linux-x86_64.egg/tvm/build_module.py", line 592, in build
    mhost = codegen.build_module(fhost_all, str(target_host))
  File "/home/srk/.local/lib/python3.6/site-packages/tvm-0.5.dev0-py3.6-linux-x86_64.egg/tvm/codegen.py", line 20, in build_module
    return _Build(lowered_func, target)
  File "/home/srk/.local/lib/python3.6/site-packages/tvm-0.5.dev0-py3.6-linux-x86_64.egg/tvm/_ffi/_ctypes/function.py", line 185, in __call__
    ctypes.byref(ret_val), ctypes.byref(ret_tcode)))
  File "/home/srk/.local/lib/python3.6/site-packages/tvm-0.5.dev0-py3.6-linux-x86_64.egg/tvm/_ffi/base.py", line 72, in check_call
    raise TVMError(py_str(_LIB.TVMGetLastError()))
tvm._ffi.base.TVMError: [12:16:42] /home/srk/work/DMLC/tvm/src/codegen/llvm/llvm_module.cc:173: LLVM module verification failed with the following errors: 
Invalid operand types for FCmp instruction
  %293 = fcmp oeq i8* %34, %292

@srkreddy1238
Contributor Author

All test cases pass now. The issue above, specific to the PTB model, is unrelated to the frontend.
I have disabled optimisation only for that test case for now.
@Huyuwei @zhreshold welcome to review.

@srkreddy1238
Contributor Author

@jroesch, welcome to review further.

@tqchen
Member

tqchen commented Jan 18, 2019

@srkreddy1238
Contributor Author

@kazum thanks for the review. You may take another look and conclude.

@srkreddy1238
Contributor Author

@ZihengJiang & @yzhliu, please consider this for the 0.5 release if possible.

@tqchen
Member

tqchen commented Feb 3, 2019

@kazum
Contributor

kazum commented Feb 4, 2019

@srkreddy1238 My previous comments are not addressed in the latest version.

@srkreddy1238
Contributor Author

@kazum handled now :)

Contributor

@kazum kazum left a comment


Looks good, thanks!

Member

@yzhliu yzhliu left a comment


Sorry @srkreddy1238, I missed your message. I am personally very willing to bring this into the v0.5 release; on the other hand, we've already started voting on a commit id. What's your thought? @tqchen @ZihengJiang

@yongwww
Member

yongwww commented Feb 5, 2019

Good to see this PR approved; it would be great to see it merged soon.

@srkreddy1238
Contributor Author

@yzhliu no problem :)

@yzhliu
Member

yzhliu commented Feb 5, 2019

Thanks everyone for contributing and reviewing. Let's merge.

@yzhliu yzhliu added the status: accepted label and removed the status: need review and status: need update labels Feb 5, 2019
@yzhliu yzhliu merged commit 2f859d7 into apache:master Feb 5, 2019
libing4752 pushed a commit to libing4752/tvm that referenced this pull request Feb 18, 2019
* [RELAY][FRONTEND] Tensorflow frontend support.
* LSTM removed for a while.
* basic ops are good.
* nn wip
* wip
* python2.7 corrections.
* NN ops are good.
* e2e models working good
* all good except LSTM
* rebase, tutorials and CI trigger.
* CI errors.
* enable opt_level=3
* Docstrings cleanup. testing.tf utils moved to relay from nnvm.
* tutorials update.
* LSTM work good now.
* Rebase
* CI error
* enable PTB.
* rebase.
* tutorials
* Update python/tvm/relay/frontend/tensorflow.py

Co-Authored-By: srkreddy1238 <[email protected]>

* review comments.
* CI fix.
* review comments.
merrymercy pushed a commit to merrymercy/tvm that referenced this pull request Feb 18, 2019
wweic pushed a commit to neo-ai/tvm that referenced this pull request Feb 20, 2019
@yzhliu yzhliu mentioned this pull request Mar 2, 2019
@srkreddy1238 srkreddy1238 deleted the tf-relay branch January 24, 2020 04:38
FranckQC pushed a commit to FranckQC/tvm that referenced this pull request Jul 26, 2024
This solved the issue with LWP that appears with maxpool.

The problem was that the LWP handler was forgetting to save p0 (used by the handler). This predicate register needs to be saved too, just like r0-r5, as it had been decided that it was the responsibility of the handler to save everything (even these theoretically caller-saved registers).
Said differently, since it had been decided that calling the LWP handler would not follow the normal ABI, and that the LWP handler would save everything it touches (even normally caller-saved registers like r0-r15 and p0-3), then it absolutely needs to save the predicate registers too (in particular p0, which was causing the issue).

The issue appeared only with maxpool because it's the only one that had a state saved in p0 before calling the LWP handler. And this call destroyed the content of what it had saved, making it subsequently branch to different portions of the code.

Fix: Allocate 32 bytes (instead of 24 previously), in order to save p3:0, and I save those at the bottom of the stack. Restore it at the end of the LWP handler.
quic-sanirudh pushed a commit that referenced this pull request Jul 28, 2024
* Fix LWP assembly handler (predicate register) (#2216)


* Remove trailing spaces

---------

Co-authored-by: Slama, Franck <[email protected]>
7 participants