
[WIP] Merge apache/incubator-tvm #87

Merged 267 commits on Mar 2, 2020
Conversation

alexwong

Thanks for contributing to TVM! Please refer to the guideline https://docs.tvm.ai/contribute/ for useful information and tips. After the pull request is submitted, please request code reviews from Reviewers by @-mentioning them in the pull request thread.

Don't review yet; I will be taking another look tonight. I encountered most merge conflicts in the conv2d schedules due to the relay op strategy PR. Will request reviews once I think it is ready.

@alexwong
Author

Guess the linter was upgraded? Our original PyTorch parser has some errors, so we must wait until #88 is through to remove it.

@zhiics

zhiics commented Feb 27, 2020

Sure, let's remove it. We will pull the updated version. Is there any difference in the user APIs?

@alexwong
Author

alexwong commented Feb 27, 2020

> Sure, let's remove it. We will pull the updated version. Is there any difference in the user APIs?

Yes, we will change from_pytorch_neo to from_pytorch.
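The rename could be shipped with a deprecated alias so downstream callers have a release to migrate. A minimal sketch of that pattern — the function names come from this thread, but the stub body and signature here are hypothetical, not the real converter:

```python
import warnings

def from_pytorch(script_module, input_infos):
    """Convert a TorchScript module to a Relay module (stand-in body).

    The real conversion logic would live here; this stub just echoes
    its inputs so the alias behavior can be demonstrated.
    """
    return {"module": script_module, "inputs": input_infos}

def from_pytorch_neo(*args, **kwargs):
    """Deprecated alias kept temporarily so callers have time to migrate."""
    warnings.warn("from_pytorch_neo is deprecated; use from_pytorch",
                  DeprecationWarning, stacklevel=2)
    return from_pytorch(*args, **kwargs)
```

Callers of the old name keep working but see a `DeprecationWarning` until they switch over.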

tqchen and others added 27 commits February 27, 2020 23:41
This PR moves a few base types from relay to the ir sub-folder.
These types will serve as a common type system across the stack.

Notably, we want to be able to use the same FuncType for all function signatures.
I tried to make a minimum move to bring the necessary dependencies for a FuncType.
We can discuss what additional things we want to move as a follow-up.

Notably, because the TensorType will have a dependency on the low-level Expr,
we will need to break type.h into two files and introduce a
tensor_type.h (or leave them in relay for now).
TVM_REGSISTER_API is an alias of TVM_REGISTER_GLOBAL.
In the spirit of simplifying redirections, this PR removes
the original TVM_REGSISTER_API macro and directly uses TVM_REGISTER_GLOBAL.

This type of refactor will also simplify the IDE navigation tools
such as FFI navigator to provide better code reading experiences.
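The value of a single canonical registration macro can be pictured with a small Python registry sketch — the decorator name and error behavior here are illustrative, not TVM's actual FFI machinery:

```python
# Global registry mapping a canonical name to a function, in the spirit
# of what a TVM_REGISTER_GLOBAL-style macro does at the C++ level.
_GLOBAL_FUNCS = {}

def register_global(name):
    """Register a function under exactly one canonical name.

    Having a single registration point (no aliases) keeps lookups and
    code-navigation tooling simple.
    """
    def _register(func):
        if name in _GLOBAL_FUNCS:
            raise ValueError(f"Global function {name} already registered")
        _GLOBAL_FUNCS[name] = func
        return func
    return _register

def get_global_func(name):
    """Look up a previously registered global function by name."""
    return _GLOBAL_FUNCS[name]

@register_global("sub")
def _sub(x, y):
    return x - y
```

With one macro and one registry, a tool like FFI navigator only has to recognize a single registration pattern to jump from a name to its definition.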

Move EnvFunc's definition to node.
Rationale: printer is a common infra that is shared across all nodes.
…et_body_typed (apache#4623)

Previously we supported a limited case of function type deduction, and in many places
we had to supply the type twice during set_body_typed (once in the template parameter, again in the lambda signature).

This PR improves the deduction by enabling automatic function signature deduction.

```
TVM_REGISTER_GLOBAL("sub")
.set_body_typed([](int x, int y) -> int { return x - y; });
```

Unfortunately, because of a template conflict, we cannot support the original case
where both the type signature and the lambda are supplied through set_body_typed.

This PR refactors the existing registration to the new style.
…4618)

* Support empty tensor

* Fix schedule

* Refactor

* Minor fix

* Fix pylint

* Merge cpp and python is_empty_shape
* [CONV] Asymmetric padding

* fix lint error

* update for legalize, rocm and cudnn

* add more test cases

* change more symmetric padding

* change conv2d winograd tests according to original cases

* remove 'alter_op_layout.h' header in bitserial.cc
* [REFACTOR][IR] Introduce SeqStmt to replace Block

ir::Block was used to represent a sequence of Stmts in the original low-level IR.
The nested ir::Block structure is not really friendly for recursive visits,
especially when the statements are unrolled.

This PR introduces a SeqStmt that directly stores a sequence of statements in an Array container.
The new SeqStmt will be used as a replacement for the original Block structure.

* [REFACTOR] Migrate use of Block to SeqStmt.

* [REFACTOR] Remove Block

* Add more comments per yizhi's comment
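The Block-to-SeqStmt motivation can be sketched in a few lines of Python: a right-leaning nested pair structure is awkward to walk, while one flat array holds the whole sequence. The tuple encoding below is a stand-in for the IR nodes, not TVM's actual classes:

```python
def flatten_block(stmt):
    """Flatten a nested ('block', first, rest) tree into a flat list.

    This mimics the Block -> SeqStmt migration: instead of a nested
    two-child Block chain that recursive visitors must unwind, SeqStmt
    keeps the entire statement sequence in a single array.
    """
    if isinstance(stmt, tuple) and stmt[0] == "block":
        return flatten_block(stmt[1]) + flatten_block(stmt[2])
    return [stmt]

# A chain of three nested Blocks holding four statements...
nested = ("block", "a", ("block", "b", ("block", "c", "d")))
# ...becomes one flat sequence, like a SeqStmt.
seq = flatten_block(nested)
```

A visitor over the flat form is a plain loop over `seq`, with no special-casing for the nesting.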
* Improve commentary for operator fusion.

* Attempt to clarify what well formed checker is doing
…pache#4632)

* As a result of backwards incompatible changes released in pillow 7.0,
   torchvision crashes if you just "pip install pillow", as we do in
   a few places.

 * This patch sets pillow<7 to be installed in Dockerfiles and support
   material as tutorials and documentation.
* Fix typos on Docker image versions that we are currently running
   as part of CI

 * Add version comment in the same pattern for ci_lint image
…ontend (apache#4630)

* Make Relay Keras frontend support networks created using
   Tensorflow (1.13) Keras implementation (tf.Keras)
 * Modify Keras frontend tests to run from a class rather than a
   function based script
 * Adjust Keras frontend tests to run with both 'Keras' and 'tf.Keras'
 * Change "TestKeras.test_forward_merge" to validate instances by
   class name rather than instance type
…che#4637)

* [RUNTIME][DSO] Improve TVMBackendPackedCFunc to allow return value.

Previously the signature of LibraryModule's PackedFunc did not support a return value.
This wasn't a limitation for our current use case but could become one
as we start to generate more interesting functions.

This feature also starts to get interesting as we move toward a unified
object protocol and start to pass objects around.
This PR enhances the function signature to allow return values.

We also created two macros TVM_DLL_EXPORT_PACKED_FUNC and TVM_DLL_EXPORT_TYPED_FUNC
to allow manual creation of functions that can be loaded by a LibraryModule.

Examples are added in apps/dso_plugin_module.
The change to TVMBackendPackedCFunc is backward compatible,
as previous functions will simply ignore the return value field.

* address review comments
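The shape of the enhanced signature can be pictured with a small C sketch — an argument array plus out-parameters for the return value. The field names and type codes here are illustrative stand-ins, not TVM's exact ABI:

```c
#include <assert.h>

/* Sketch of a packed-function signature with an explicit return slot,
 * mirroring the enhancement described above: arguments come in through
 * an array, and the result goes out through ret_value/ret_type_code. */
typedef int (*PackedCFunc)(void** args, int* type_codes, int num_args,
                           void* ret_value, int* ret_type_code);

/* A callee that reads two int arguments and writes an int result into
 * the return slot. Old-style callers that pass a null slot still work,
 * which is what makes the signature change backward compatible. */
static int sub_packed(void** args, int* type_codes, int num_args,
                      void* ret_value, int* ret_type_code) {
  (void)type_codes;
  if (num_args != 2) return -1;
  int x = *(int*)args[0];
  int y = *(int*)args[1];
  if (ret_value != 0) {      /* slot may be absent: simply skip the write */
    *(int*)ret_value = x - y;
    *ret_type_code = 0;      /* hypothetical "int" type code */
  }
  return 0;
}
```

A caller that predates the return slot passes null and is unaffected; a new-style caller reads the result out of `ret_value`.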
* [REFACTOR][IR] Variable -> VarNode

* [REFACTOR][IR] Add/Sub/Mul/Div -> AddNode/SubNode etc.

* [REFACTOR][IR] Min/Max/FloorDiv/FloorMod -> MinNode/MaxNode etc.

* [REFACTOR][IR] EQ/NE/LT/LE/GT/GE/Select -> EQNode/NENode etc.

* [REFACTOR][IR] Add Node suffix to Select/Call/Load/Ramp/Shuffle/Let

* [REFACTOR][IR] Add node suffix to IntImm/UIntImm/FloatImm/StringImm

* [REFACTOR][IR] Add Node suffix to Any, AttrStmt, AssertStmt

* [REFACTOR][IR] Add Node suffix to Store/Provide/Allocate/Free

* [REFACTOR][IR] Add Node suffix to ProducerConsumer

* Fix lint

* style updates, test fixes
* [RUNTIME] Fix windows build after the latest dso module change.

Switch to shared_ptr to get around a problem in latest MSVC.

* [CI] Add github action for win mac build.
* [AutoTVM] Use vm compile in extracting task from relay

* update

* restructure vm compiler to reduce task extraction time

* x

* fix

* update doc

* udpate doc

* lint
* Added 1D pooling to Topi

* Added 1D pooling relay op and tests.

* Added onnx parsing and tests for maxpool1d and averagepool1d

* formatting

* moved partial import.

* Fixed typo.
* [REFACTOR] relay::Module Def -> TypeDef

The term Def was not very clear about what the object of interest is (it could be a function def or a type def).
Change the term to TypeDef to be more explicit.

* Update include/tvm/relay/module.h

Co-Authored-By: Wei Chen <[email protected]>

Co-authored-by: Wei Chen <[email protected]>
Laurawly and others added 14 commits February 27, 2020 23:41
* get_valid_count accuracy issue fixed for individual tests but not for all tests running together

* minor fix

* initialize valid_count and PrefixSum buffers

* test updated

* update relay test as well

* update document

* fix lint

* address comment

* fix lint

* correct atomicAdd identifier name
* relay op strategy

fix lint

bitpack strategy

bitserial_dense (neo-ai#6)

* update strategy

* address comments

fix a few topi test

Dense strategy (neo-ai#5)

* dense

* add bifrost; remove comments

* address comment

Refactor x86 conv2d_NCHWc (neo-ai#4)

* Refactor x86 conv2d

* Add x86 depthwise_conv2d_NCHWc

* Add back topi x86 conv2d_nchw

* Merge x86 conv2d_nchw and conv2d_NCHWc

* Minor fix for x86 conv2d

fix more strategy

Add x86 conv2d_NCHWc_int8 strategy (neo-ai#8)

* Add x86 conv2d_NCHWc_int8 strategy

* Remove contrib_conv2d_nchwc_int8

* Fix generic conv2d_NCHWc for int8

* Fix topi arm_cpu conv2d_NCHWc_int8

update x86 conv2d

enable specify relay ops to be tuned for autotvm

add cuda conv2d strategy

add conv2d strategy for rocm

add conv2d strategy for hls

add conv2d strategy for arm cpu

add conv2d strategy for mali

add conv2d strategy for bifrost

add conv2d strategy for intel graphics

clean up and fix lint

remove template keys from autotvm

remove 2 in the func name

address comments

fix

* fix bugs

* lint

* address comments

* add name to op implement

* Modify topi tests (neo-ai#9)

* Add pooling, reorg, softmax and vision

* Add lrn

* fix topi test

* fix more topi test

* lint

* address comments

* x

* fix more tests & bugs

* Modify more tests (neo-ai#10)

* Modify tests for bitserial_conv2d, bitserial_dense, bitserial_conv2d_rasp and bnn

* Minor fix

* More minor fix

* fix more test

* try to update vta using strategy

* fix cpptest

* x

* fix rebase err

* Fix two tests (neo-ai#11)

* change autotvm log format

* lint

* minor fix

* try fix vta test

* fix rebase err

* tweak

* tmp hack for vta pass

* fix tutorial

* fix

* fix more tutorials

* fix vta tutorial

* minor

* address comments

* fix

* address comments

* fix cpptest

* fix docs

* change data structure name and api

* address comments

* lint

* fix rebase err

* updates

* fix winograd test

* fix doc

* rebase

* upgrade tophub version number

* fix bug

* re-enable vta tsim test after tophub is upgraded

* fix vta test to use the correct args so the config can be found in tophub

Co-authored-by: Yao Wang <[email protected]>
GaussianDropout & GaussianNoise are active only during training time. This can be skipped during inference.
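A frontend converter can implement this by filtering training-only layers out of the graph before conversion. The layer names below are the Keras ones mentioned above; the converter loop itself is a simplified sketch, not the actual Relay Keras frontend:

```python
# Layers that are active only during training and are identity functions
# at inference time, so a converter can safely drop them.
TRAINING_ONLY = {"GaussianDropout", "GaussianNoise"}

def convert_layers(layers):
    """Keep only the layers that affect inference, preserving order."""
    return [name for name in layers if name not in TRAINING_ONLY]

model = ["Dense", "GaussianNoise", "Dense", "GaussianDropout", "Softmax"]
inference_model = convert_layers(model)
```

After filtering, only the layers that contribute to the inference-time computation remain.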
…e#4883)

* Use opencv resize method for preprocessing of image in darknet

* Use opencv resize method for preprocessing of image in darknet

* Fix pylint issues
* Add a PyTorch to Relay parser

* Add alexnet, googlenet, mnasnet, shufflenet wip

* Fix lint

* Remove fix for shufflenet

* Lower check

* Pull changes from neo-ai/tvm changes

* Remove commented out section

* Use infer_shape everywhere

* Change back to using trace instead of path in from_pytorch

* Parse state_dict to add param names

* Umbrella single_op under test_forwards

* Remove print and cleanup call

* Check if update to test broke CI

* Retrigger CI

* Add back in updated tests

* Try splitting up tests

* First pass at flexible typing, implemented for ones

* Add int32 for all ops

* Remove print statements

* Fix lint

* Broad except

* Add other tensor types

* Temporarily use old tests

* Retrigger CI

* Lower type names

* Use numpy to convert in dense op

* Fix lint

* Remove print

* Need to cleanup but verify int32 works for add

* Rough tests for different types, a lot of types are not supported on CPU

* Probably doesn't build, need to save work as I have to switch branches (constantly)

* Parse param type

* Remove print stmt in parser

* Clean up some code

* Working on float32 for bn

* Add resnet18 double type

* Fix lint

* Temporarily move PT tests first

* Temporarily add back refactored tests to fix mem issue

* Add more type test and temp remove some tests

* Comment out tests, hopefully CI prints a trace

* Get stack trace

* Remove operator dict key, rename op_name to node_id, remove dead code

* Make relay map a list

* Remove some hacky string stuff

* Move to PyTorch 1.4

* Remove input_type as param

* Remove _get_fill_value, fix full ops

* Remove unused code and combine ops for identity and none

* Remove fn_param

* Clean up main loop

* Remove useless if/else for outputs

* Remove ir_names, only used once

* Remove some string hacking

* Remove string parsing to get output name

* Fix bug with output sizes of nodes

* Use attributeNames in parse ops

* Remove continue and add_op in parse_op

* Do this everywhere, use assert instead of explicitly type casting

* Remove unnecessary swap

* Slight refactor for elemwise input parse

* Use a copy of graph everywhere

* Rename nid_to_node_name

* Refactor parse import prereqs

* Clean up input node kind check

* Clean up conditionals

* Clean up add_op

* Cleanup type for ones and zeros op

* Fix lint

* Add torch install to CI

* Actually use torch

* Try moving import torch to only where it's needed

* Import torch for CI

* Use take op for select

* Temporarily add ignore for jit inline pass for CI

* Use CompleteTensorType, might be a PT 1.2 only thing

* Use different types in elemwise op

* Use float16 ones

* Fix float16 test

* Remove the temp docker changes

* Remove temp test

* Temporarily comment out original tests

* Remove file

* Empty cache after each test

* Add some prints and lower input sizes

* Try using no grad

* Trying to globally set grad off

* Use no grad for torchvision

* Remove xfail tests

* Remove VGG and AlexNet due to some issues

* Combine pooling tests

* Remove extra test file

* Remove single op, remove larger pooling tests

* Remove maxpool3

* Remove debug prints

* Remove inference call and add no_grad in measure latency

* Use standard string start char

* Remove redundant infer_shape in slice

* Convert most to checks to just expr

* Remove extra paren

* More refactor of isinstance

* Add helper for creating typed constants

* Assert instead of return when no matching type

* Remove network variants

* Add no_grad when forward, remove detach, fix lint

* Change isinstance to expr in transpose

* Use opnotimplemented, refactor

* Fix full ops, remove duplicate tests

* Never use shape field unless we know the type

* Remove comma, retrigger CI

* Add paren, retrigger CI

* Use inline if-else for flags

* Throw exception instead of assert

* Remove version check for CI

* Check version when doing inline pass

* Fix lint

* Lower more input sizes

* Add new line, conv2d only accepts weight as expr

* Use tvm.runtime.ndarray

* Remove change to torch version install

* Try no grad for mobilenet

* Fix lint

* Fix lint again

* Revert to last passing

* Delete test files

* Ignore lint

* Revert back

* Comment out mobilenet

* Clean up compare compiled and baseline outputs

* Use IRModule

* Add todos

* Refactor use_bias

* Add todo for fix conv op channels

* Change input to data type

* Remove todo

* Handle channel multiplier > 1
… args and output (apache#4934)

* Support int args and no extra buffers

* Fixes

* remove testing code

* fix style

* more style

* use const args

* style

Co-authored-by: Jon Soifer <[email protected]>
- llvm::StringRef to std::string conversion is explicit now.

Signed-off-by: Wei Pan <[email protected]>
* remove unnecessary spliting in the cached chunk

* remove unnecessary spliting in the cached chunk
* Initial TEDD for publishing.

* 1. Fix lint issues.
  2. Print intrin.body instead of intrin.name in Schedule Tree.
  3. Add examples to top level APIs' comments.
  4. Top level APIs don't print Dot string by default, unless outputdotstring is True.

* Fix more lint issues.

* Update top level API argument names and use raw strings to avoid Python lint warnings in the tests.

* Disable TEDD verification, but keep TE construction.

* Stop importing tedd to avoid failure.

* Separate data extraction and visualization.
  1. Add API tedd.dump_json(schedule) to dump a json string for the schedule data for visualization.
  2. Update tests.
  3. Add a tutorial.
  4. Add range information to IterVars.

* Update TEDD about InferBound failure.  1. TEDD doesn't call inferbound for DFG. 2. Update tutorial about the InferBound failure.

* 1. Import IPython only if SVG is requested; this is required to fix a tutorial publishing failure.
  2. Fix test about IPython availability check.
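The dump-then-visualize split can be sketched with plain JSON serialization, in the spirit of the tedd.dump_json(schedule) API mentioned above. The dict layout below is purely illustrative, not TEDD's actual schedule format:

```python
import json

def dump_json(schedule):
    """Serialize a schedule-like description to a JSON string.

    Separating data extraction (this step) from visualization lets a
    browser-side or notebook-side viewer render the schedule without
    importing any of the compiler machinery.
    """
    return json.dumps(schedule, sort_keys=True)

# A toy schedule description with one stage and one itervar range.
sched = {"stages": [{"name": "compute",
                     "itervars": [{"var": "i", "range": [0, 16]}]}]}
blob = dump_json(sched)
```

A visualizer then consumes `blob` independently; round-tripping through `json.loads` recovers the same structure.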
…he#4925)

* [DOCS] Fix Sphinx Warnings: the target found for cross-reference warnings

* Fix the warning: undefined label
* bump up dev version

* update
* Add a tutorial for PyTorch

* Fix sphinx formatting, add version support

* Remove space

* Remove version check

* Some refactoring

* Use no grad

* Rename input

* Update cat img source
* call graph for relay

* CallGraphEntryNode->CallGraphEntry, __getitem__->print_var

* fix typos
@zhiics

zhiics commented Feb 28, 2020

@alexwong The error in the CI is fixed by apache#4947.
Let's rebase onto this PR (not the more recent one, because it is huge). Thanks.

@alexwong
Author

alexwong commented Feb 29, 2020

Let me rewrite the order of commits to get my changes (ones not in upstream) on top, and we can merge on Monday.

nhynes and others added 10 commits February 28, 2020 16:42
)

* [Frontend][TFLite] Add parser support for square operator

* Add parser implementation
* Add relevant tests
* Note: 'square' is a unary elemwise operator, but it's added separately
  in the parser since there is no Relay 'square' op
  and instead we have to use 'multiply'

* Change relay operation from 'multiply' to 'power'

* Remove a redundant line as requested
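Since Relay has no dedicated 'square' op, the parser lowers it through an existing op; the PR moved that lowering from 'multiply' to 'power'. A minimal numeric sketch of the two equivalent choices (plain Python stand-ins, not Relay calls):

```python
def square_via_multiply(x):
    """Lower square(x) as multiply(x, x) - the parser's original choice."""
    return x * x

def square_via_power(x):
    """Lower square(x) as power(x, 2) - the choice this PR switched to."""
    return x ** 2
```

Both lowerings agree numerically for any input; 'power' avoids duplicating the input expression in the graph.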
* [VTA] YoloV3 Support

Issue:
YoloV3 uses some operators and logic that are not well supported by the
existing VTA logic, like nn.pad, upsample, and a 255-channel output.

Solution:
Add the related logic so that darknet YoloV3 can run on VTA

* Fix small (0- or 1-height/width) detection frame issue.

* add yolov3-tiny tutorial

* add os import

* address review comments.

* rename tutorial file with a short name.

* rename deploy_vision_on_vta.py into deploy_classification.py.

* address review comment, fix pylint error in deploy_detection.py
* [TUTORIAL] Fix tedd tutorial after strategy change

* Remove scale, remove link to external gdoc
@zhiics zhiics merged commit 70055ab into neo-ai:dev Mar 2, 2020