-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Porting schedules (except convolutions) to C++ #763
Conversation
How about "power" operator in elementwise category? |
power would definitely be useful. Would it make sense to define two?
|
yes, two versions of pow make sense. |
Fixed issue where TVM_DECLARE_INTRIN_UNARY used the PureExtern flag instead of PureIntrinsic. Added softmax CUDA schedule.
global_pool and pool ops. Extended pad to allow specifying pad_value. Fixed issue where pad would throw if padding was zero in all dimensions.
…s and schedules. Added test of broadcast ops
…ixed corner case in default schedules of attempting to fuse 0 length axes. Added tests for reduce ops.
…dded split_sections op to cover the other mode of the python split op. Added tests.
topi/include/topi/reduction.h
Outdated
Tensor CommReduceIdx(const Tensor& data, | ||
const std::vector<int>& axis, | ||
FCommReduce func, | ||
bool keepdims = false) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any update wrt this comment?
topi/include/topi/nn/dilate.h
Outdated
* \return The output tensor. | ||
*/ | ||
inline Tensor dilate(const Tensor& x, | ||
std::vector<int> strides, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change most of the std::vector arguments to Array
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How should an array of ints be represented? Should it be an Array<Expr>
filled with constants? (Array<int>
fails to compile)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Array of note that strides can be passed in as symbolic expressions, although not frequently used
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry array of what? Github may have broken your comment. Are you meaning Array of Expr?
I see that strides could be expressions, but nn/dilate.h:69 relies on being able to evaluate strides[i]
. This and surrounding lines are:
if (strides[i] != 1) {
index_tuple.push_back(indices[i] / strides[i]);
not_zero.push_back((indices[i] % strides[i]) == 0);
} else {
index_tuple.push_back(indices[i]);
}
Would a correct implementation of this with changing the signature to Array<Expr> strides
be this?
if (IsConstInt(strides[i]) && GetConstInt(strides[i]) == 1) {
index_tuple.push_back(indices[i]);
} else {
index_tuple.push_back(indices[i] / strides[i]);
not_zero.push_back((indices[i] % strides[i]) == 0);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, let us do that
topi/include/topi/detail/extern.h
Outdated
using FExtern = std::function<Expr(Array<Buffer>, Array<Buffer>)>; | ||
|
||
/*! \brief Create tensors representing the result of invoking an external function */ | ||
Array<Tensor> make_extern(const Array<Array<Expr>>& out_shapes, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
>> -> > >
note the space, for backward compatibility for MSVC
topi/include/topi/detail/extern.h
Outdated
} | ||
|
||
/*! \brief Pack a buffer object to be used as an argument to a PackedFunc */ | ||
Expr pack_buffer(Buffer buf) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this belongs to a detail namespace. Add more comments to argument and return value.
This functions is used to create a DLTensor structure on heap to be able to pass a symbolic buffer as arguments to TVM PackedFunc
topi/include/topi/detail/fuse.h
Outdated
namespace topi { | ||
using namespace tvm; | ||
|
||
/*! \brief Fuse all of the given args */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
document all arguments
topi/include/topi/detail/pad_utils.h
Outdated
namespace topi { | ||
using namespace tvm; | ||
|
||
/*! \brief Get padding size for each side given padding height and width */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
document all arguments, and what is expected in return
Thanks for all the effort to bring this set of changes! I like the structure overall and the code looks good as I skim through. Since there are a lot of changes. Please take a pass over the code to make sure we have good readability of the code, put in comments delibrately in places that you think requires explaination so it is easier for others to understand and improve upon this. |
Ok I've fixed up the comments on some of the more obscure bits of the code. Let me know if there's anything else that needs sorting. I'd like to get some kind of schedule registry working like there is in the python code but I think that's best left for a separate PR ;) |
Thanks! this is merged |
* Ported injective schedules to C++. Added some elementwise ops. * Fix lint errors * Added reduction ops and schedules * Fix lint errors * Fix lint errors * Fix lint errors * Added transform ops * Fix lint errors * Fix lint errors * Added softmax, log_softmax, leaky_relu and flatten ops. Fixed issue where TVM_DECLARE_INTRIN_UNARY used the PureExtern flag instead of PureIntrinsic. Added softmax CUDA schedule. * Fix lint * Fix lint * Added binary_dense, batch_norm_inference, dense, dilate, scale_shift_*, global_pool and pool ops. Extended pad to allow specifying pad_value. Fixed issue where pad would throw if padding was zero in all dimensions. * Fix lint * Fix lint * Added CUDA schedules for dense, pool and global_pool * Added extern schedules for generic and CUDA * Fix lint * Added x86 binary schedules * Fix lint * Added rocm dense schedule. Added rocBLAS and cuBLAS support to dense ops * Added pow ops. Added x86 default and injective schedules * Fix lint * Fix lint * Fix lint * Fix lint * Fix lint * Fix indent * Removed schedules directory * Changed left_shift, right_shift to operators. Changed pad_value in pad() to remove pointer usage * Fixed usage of pad in nn/pooling.h. Fixed declaration of operator>> * Fixed comments for shift operators * Added comments to utility functions * Added TOPI C++ library, exporting broadcast_add op * Fix lint * Share libinfo.py with TVM * Fix lint * Add other broadcast ops * Fix lint * Fix imports in topi * Fix lib names * Fixed build issue where windows builds don't apply correct definitions * Removed TVM_EXPORTS from topi library * Attempted CI build fix * Add topi lib to tvm_multilib * Fix Jenkinsfile * Added TOPI build target to Makefile * Fix nn op namespaces. * Fix lint * Renamed TOPI lib to libtvm_topi * Removed _ffi/base.py * Remove _ffi from topi, now shared with tvm. * Make libtvm_topi loading optional * Fix compiler warnings * Fix lint * Fix lint * Fix lint * Fix build error by making new libs argument to Target optional * Added C++ Target type interop. Added registration of remaining C++ ops and schedules. Added test of broadcast ops * Fix lint * Fix lint * Fix compile error * Fix compiler warnings * Fix compiler warnings * Fixed int vector interop. Fixed argmin incorrectly invoking argmax. Fixed corner case in default schedules of attempting to fuse 0 length axes. Added tests for reduce ops. * Refactored reduce builders * Fixed typos in topi.cc. Added basic test. * Fixed padding size error. Added dense, dilate, pooling tests * Fixed issue where clip would output a different dtype to the input. Added split_sections op to cover the other mode of the python split op. Added tests. * Changed extension type numbers to avoid clash with NNVM * Fix lint * Fix compiler warnings * Removed use of std::vector from the public TOPI API * Fix lint * Add TOPI C++ tests to CI * Fixed detail namespacing. Improved comments.
* Ported injective schedules to C++. Added some elementwise ops. * Fix lint errors * Added reduction ops and schedules * Fix lint errors * Fix lint errors * Fix lint errors * Added transform ops * Fix lint errors * Fix lint errors * Added softmax, log_softmax, leaky_relu and flatten ops. Fixed issue where TVM_DECLARE_INTRIN_UNARY used the PureExtern flag instead of PureIntrinsic. Added softmax CUDA schedule. * Fix lint * Fix lint * Added binary_dense, batch_norm_inference, dense, dilate, scale_shift_*, global_pool and pool ops. Extended pad to allow specifying pad_value. Fixed issue where pad would throw if padding was zero in all dimensions. * Fix lint * Fix lint * Added CUDA schedules for dense, pool and global_pool * Added extern schedules for generic and CUDA * Fix lint * Added x86 binary schedules * Fix lint * Added rocm dense schedule. Added rocBLAS and cuBLAS support to dense ops * Added pow ops. Added x86 default and injective schedules * Fix lint * Fix lint * Fix lint * Fix lint * Fix lint * Fix indent * Removed schedules directory * Changed left_shift, right_shift to operators. Changed pad_value in pad() to remove pointer usage * Fixed usage of pad in nn/pooling.h. Fixed declaration of operator>> * Fixed comments for shift operators * Added comments to utility functions * Added TOPI C++ library, exporting broadcast_add op * Fix lint * Share libinfo.py with TVM * Fix lint * Add other broadcast ops * Fix lint * Fix imports in topi * Fix lib names * Fixed build issue where windows builds don't apply correct definitions * Removed TVM_EXPORTS from topi library * Attempted CI build fix * Add topi lib to tvm_multilib * Fix Jenkinsfile * Added TOPI build target to Makefile * Fix nn op namespaces. * Fix lint * Renamed TOPI lib to libtvm_topi * Removed _ffi/base.py * Remove _ffi from topi, now shared with tvm. * Make libtvm_topi loading optional * Fix compiler warnings * Fix lint * Fix lint * Fix lint * Fix build error by making new libs argument to Target optional * Added C++ Target type interop. Added registration of remaining C++ ops and schedules. Added test of broadcast ops * Fix lint * Fix lint * Fix compile error * Fix compiler warnings * Fix compiler warnings * Fixed int vector interop. Fixed argmin incorrectly invoking argmax. Fixed corner case in default schedules of attempting to fuse 0 length axes. Added tests for reduce ops. * Refactored reduce builders * Fixed typos in topi.cc. Added basic test. * Fixed padding size error. Added dense, dilate, pooling tests * Fixed issue where clip would output a different dtype to the input. Added split_sections op to cover the other mode of the python split op. Added tests. * Changed extension type numbers to avoid clash with NNVM * Fix lint * Fix compiler warnings * Removed use of std::vector from the public TOPI API * Fix lint * Add TOPI C++ tests to CI * Fixed detail namespacing. Improved comments.
Operators
Generic schedules
CUDA schedules
Other schedules