This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Custom Operator Random Number Generator Support #17762

Merged: 28 commits, Apr 8, 2020

Contributor

@rondogency rondogency commented Mar 4, 2020

Description

Add random number generator support for custom operator libraries.

Design

MXNet passes its initialized and seeded random number generator states, located on CPU and GPU, to the custom library. The library can then use those states to generate values that are deterministic for a given seed passed to MXNet. The intended workflow is:

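import mxnet as mx

# some_custom_random_op is a placeholder for an operator from a custom library
mx.random.seed(128)
r1 = mx.nd.some_custom_random_op(data)
mx.random.seed(128)
r2 = mx.nd.some_custom_random_op(data)
# NDArray comparison is elementwise, so reduce it before asserting
assert (r1 == r2).asnumpy().all()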

This PR does not

Let a custom library generate exactly the same sequence of random numbers as MXNet
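
The workflow and the non-goal above can be illustrated with a minimal, self-contained sketch. It does not use MXNet: `HostFramework`, `custom_random_op`, and `builtin_random_op` are hypothetical names, and Python's standard `random.Random` stands in for the seeded states that MXNet hands to the library. The point is only that reseeding reproduces a custom operator's output, while two operators that consume the same seeded state in different ways are not expected to agree with each other.

```python
import random


class HostFramework:
    """Hypothetical stand-in for MXNet: owns the seeded generator state."""

    def __init__(self):
        self._state = random.Random()

    def seed(self, value):
        # Analogous in spirit to mx.random.seed(value): reset the shared state.
        self._state.seed(value)

    def run(self, op, data):
        # The framework passes its seeded state into the operator.
        return op(data, self._state)


def custom_random_op(data, rng):
    """Hypothetical custom-library op: adds uniform noise in [0, 1)."""
    return [x + rng.random() for x in data]


def builtin_random_op(data, rng):
    """Hypothetical built-in op: also uniform in [0, 1), but it draws
    from the state with a different algorithm."""
    return [x + rng.getrandbits(32) / 2**32 for x in data]


host = HostFramework()
data = [1.0, 2.0, 3.0]

# Same seed, same custom op: outputs match.
host.seed(128)
r1 = host.run(custom_random_op, data)
host.seed(128)
r2 = host.run(custom_random_op, data)

# Same seed, different consumer of the state: outputs are not expected to match.
host.seed(128)
r3 = host.run(builtin_random_op, data)
```

Here `r1` and `r2` are identical because the state was reset to the same seed before each call, whereas `r3` differs even though it started from the same seeded state, which mirrors the stated non-goal.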

Comments

This is a continuation of the custom operator project (#15921 and #17270).

@samskalicky
Contributor

Thanks for this @rondogency! This is going to be a great addition for customOps.

A couple suggestions on the PR description:

Add random number generator support for custom operator libraries.

Custom library operators can now call the MXNet random number generator, through the OpResource class, to get random numbers in forward/backward computation.

include/mxnet/lib_api.h (outdated, resolved)
src/c_api/c_api.cc (outdated, resolved)
Member

@eric-haibin-lin eric-haibin-lin left a comment

Great. How are we testing this feature?

@samskalicky
Contributor

Great. How are we testing this feature?

I think @rondogency's plan is to write a dropout operator.
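
As a rough illustration of why a dropout operator is a natural test for this feature, here is a minimal sketch in plain Python. It is not the operator from this PR: the `dropout` function and its parameters are hypothetical, and `random.Random` stands in for the seeded state that MXNet would supply. Dropout output depends entirely on the random draws, so equal seeds should give equal masks and different seeds should not.

```python
import random


def dropout(data, p, rng):
    """Inverted dropout (hypothetical sketch): zero each element with
    probability p and scale the survivors by 1 / (1 - p).
    `rng` is the seeded generator state supplied by the caller."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    scale = 1.0 / (1.0 - p)
    return [x * scale if rng.random() >= p else 0.0 for x in data]


data = [1.0] * 10000

# Same seed twice, then a different seed.
out_a = dropout(data, 0.5, random.Random(128))
out_b = dropout(data, 0.5, random.Random(128))
out_c = dropout(data, 0.5, random.Random(129))

# Fraction of elements that survived; should be close to 1 - p.
kept_fraction = sum(1 for v in out_a if v != 0.0) / len(out_a)
```

A test along these lines would check that `out_a` equals `out_b`, that `out_c` differs, and that `kept_fraction` is close to `1 - p`.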

@rondogency
Contributor Author

In the CMake file, the line "target_compile_options(subgraph_lib PUBLIC -shared)" can be removed, since the library is already added as a shared library.

Member

@wkcn wkcn left a comment

Great Work! LGTM : )

src/c_api/c_api.cc (outdated, resolved)
src/c_api/c_api.cc (outdated, resolved)
Contributor

@samskalicky samskalicky left a comment

Left a couple of minor items. If you don't want to do them here in this PR, we can do them in #17885 instead. But while the Windows CI runs are blocking the merge anyway, we might as well just do them here ;-)

Other than that, LGTM. Thanks @rondogency for another great contribution!!!

@samskalicky
Contributor

@mxnet-label-bot add [pr-awaiting-merge]

@lanking520 lanking520 added the pr-awaiting-merge Review and CI is complete. Ready to Merge label Mar 28, 2020
@wkcn
Member

wkcn commented Apr 4, 2020

@mxnet-bot run ci [unix-gpu, centos-cpu]

@mxnet-bot

Jenkins CI successfully triggered : [centos-cpu, unix-gpu]

@wkcn
Member

wkcn commented Apr 5, 2020

@mxnet-bot run ci [unix-gpu]

@mxnet-bot

Jenkins CI successfully triggered : [unix-gpu]

@rondogency
Contributor Author

@mxnet-bot run ci [unix-gpu]

@mxnet-bot

Jenkins CI successfully triggered : [unix-gpu]

@rondogency
Contributor Author

@mxnet-bot run ci [unix-gpu]

@mxnet-bot

Jenkins CI successfully triggered : [unix-gpu]

@rondogency
Contributor Author

@mxnet-bot run ci [unix-gpu, windows-gpu]

@mxnet-bot

Jenkins CI successfully triggered : [windows-gpu, unix-gpu]

@leezu leezu merged commit 16ddc6d into apache:master Apr 8, 2020
@rondogency rondogency deleted the custom_op_rng branch April 8, 2020 00:47
samskalicky pushed a commit to samskalicky/incubator-mxnet that referenced this pull request Apr 15, 2020
Add random number generator support for custom operator libraries.

Design: MXNet passes its initialized and seeded random number generator states, located on CPU and GPU, to the custom library. The library can then use those states to generate values that are deterministic for a given seed passed to MXNet. The intended workflow is:

mx.random.seed(128)
r1 = mx.nd.some_custom_random_op(data)
mx.random.seed(128)
r2 = mx.nd.some_custom_random_op(data)
assert (r1 == r2).asnumpy().all()

This PR does not let a custom library generate exactly the same sequence of random numbers as MXNet.

This is a continuation of the custom operator project apache#15921 and apache#17270
pengzhao-intel pushed a commit that referenced this pull request Apr 16, 2020
…18069)

* Dynamic subgraph compile support (#17623)

This PR adds support for passing the NDArrays from the existing optimize_for API down to the reviewSubgraph function in an external library. It also adds a new API for HybridBlock called optimize_for that can partition the model without running a forward pass.

Feature changes

    Adds new API to HybridBlock optimize_for that partitions the model but does not call the cachedOp
    Modifies the subgraph library example to optionally require args to be provided
    Adds annotation on subgraph inputs for the name of the original param so that inputs can be mapped and passes annotations to input nodes of subgraphs
    Adds support for tensors in MKLDNN format, calls Reorder2Default

New tests

    Adds a new test to partition operators that directly consume params
    add a new model to test where ops to be partitioned have args/params

Bug Fixes

    fixes bug in passing ids vector by value instead of by reference
    fixes bug in passing copies of attributes instead of by reference
    fixes bug where _cached_graph was not updated after partitioning
    fixes memory leak where user-specified attributes on subgraph ops were not freed if subgraph was rejected
    fixes problem incorrectly indexing into shape/dtype maps when annotating the graph

Docs

    Updates the README doc with the latest changes described above

* Adding sparse support to MXTensor for custom operators (#17569)

* Added enum for sparse storage

* Add structure for Dense and Sparse

* redesign the data structure for MXSparse

* pull out aux data from sparse NDArray

* Added more sparse arguments to API interface

* Passed sparse from c_api to lib_api.h and set in MXTensor

* Fix indent

* fix segfault

* Fix NDArray to MXTensor errors

* Add a sample of sparse(CSR) transpose

* Make CSR transpose temporarily work by hardcoding

* Fixed sparse output size(Refined)

* Add tests for symbolic and stateful ops

* Added a sample for row sparse transpose

* Added real row sparse transpose

* Fix output size issue by adding lambda for CheckAndAlloc()

* Fix mixed storage formats error

* Added infer storage type function

* resolve comments

* Set inferSType as optional function

* Resolve comments

* Add error messages

* Resolve comments

* verify transpose ops results

* fix sanity check

* update MX_LIBRARY_VERSION to 5

* Custom Operator Random Number Generator Support (#17762)

Add random number generator support for custom operator libraries.

Design: MXNet passes its initialized and seeded random number generator states, located on CPU and GPU, to the custom library. The library can then use those states to generate values that are deterministic for a given seed passed to MXNet. The intended workflow is:

mx.random.seed(128)
r1 = mx.nd.some_custom_random_op(data)
mx.random.seed(128)
r2 = mx.nd.some_custom_random_op(data)
assert (r1 == r2).asnumpy().all()

This PR does not let a custom library generate exactly the same sequence of random numbers as MXNet.

This is a continuation of the custom operator project #15921 and #17270

Co-authored-by: guanxinq
Co-authored-by: Ziyi Mu
pengzhao-intel pushed a commit that referenced this pull request Apr 16, 2020
Labels
pr-awaiting-merge Review and CI is complete. Ready to Merge

7 participants