Add support for training apis to support custom ops #16601

Merged
merged 19 commits into from
Jul 14, 2023

@baijumeswani baijumeswani commented Jul 5, 2023

Until now, the training API has not supported registering custom op libraries.

One of the models we are working with requires onnxruntime-extensions. onnxruntime-extensions is implemented as a custom op library and must be registered with ONNX Runtime before an ONNX model containing its custom op nodes can be executed.

This pull request adds the tooling needed to register a custom op library for both artifact generation and graph execution.

This pull request also adds support for executing models with string tensor inputs.

@baijumeswani baijumeswani added the training issues related to ONNX Runtime training; typically submitted using template label Jul 5, 2023
@baijumeswani baijumeswani requested review from askhade and pengwa July 5, 2023 19:22
    try:
        yield
    finally:
        _GLOBAL_CUSTOM_OP_LIBRARY = None

Check notice

Code scanning / CodeQL

Unused global variable

The global variable '_GLOBAL_CUSTOM_OP_LIBRARY' is not used.
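
For context, the reviewed fragment looks like the tail of a context manager that sets a module-level variable and resets it on exit. The sketch below is a minimal, standard-library-only illustration of that pattern, not the code from this pull request: the function names `custom_op_library` and `active_custom_op_library` are hypothetical, and only the variable name `_GLOBAL_CUSTOM_OP_LIBRARY` comes from the snippet above. It also shows one common reason a notice like this can appear: without a `global` declaration, an assignment inside a function binds a local name instead of the module-level one.

```python
import contextlib

# Module-level slot holding the path of the active custom op library (None when unset).
_GLOBAL_CUSTOM_OP_LIBRARY = None


@contextlib.contextmanager
def custom_op_library(path):
    """Hypothetical helper: make `path` the active custom op library inside the block."""
    # Without this declaration, the assignments below would create a local variable
    # and the module-level value would never change.
    global _GLOBAL_CUSTOM_OP_LIBRARY
    _GLOBAL_CUSTOM_OP_LIBRARY = path
    try:
        yield
    finally:
        # Always reset, even if the body of the `with` block raises.
        _GLOBAL_CUSTOM_OP_LIBRARY = None


def active_custom_op_library():
    """Hypothetical reader: return the currently active library path, or None."""
    return _GLOBAL_CUSTOM_OP_LIBRARY
```

Code that needs the library path would call `active_custom_op_library()` inside a `with custom_op_library(...)` block; outside the block the value is `None` again.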

@pengwa pengwa left a comment


I am curious why custom ops are needed in on-device training scenarios. Does the product team want to write their own operator?


baijumeswani commented Jul 10, 2023

> I am curious how custom ops are needed in on device training scenarios? the product team want to write their own operator?

One of our partner teams needs the op StringToHashFast, which is defined in onnxruntime-extensions. To execute an op defined in the onnxruntime-extensions library, the library must be registered with ONNX Runtime as a custom op library.

This pull request allows us to register any user-provided custom op library (such as onnxruntime-extensions).
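
As a rough illustration of what "registering a custom op library" means on the inference side, the sketch below wraps the `register_custom_ops_library` method of ONNX Runtime's Python `SessionOptions`. The helper name `register_custom_op_library` is hypothetical, and the training API entry points added by this pull request are not shown in this thread, so this should not be read as the training-side interface. The helper accepts any object exposing that method, which keeps it importable without ONNX Runtime installed.

```python
import os


def register_custom_op_library(session_options, library_path):
    """Hypothetical helper: register a custom op shared library on a SessionOptions-like object."""
    # Fail early with a clear message instead of a loader error from the runtime.
    if not os.path.isfile(library_path):
        raise FileNotFoundError(f"custom op library not found: {library_path}")
    # register_custom_ops_library is the ONNX Runtime Python inference API call.
    session_options.register_custom_ops_library(library_path)
    return session_options


# Usage sketch (assumes onnxruntime and onnxruntime-extensions are installed;
# "model.onnx" is a placeholder path):
#   import onnxruntime as ort
#   from onnxruntime_extensions import get_library_path
#   opts = register_custom_op_library(ort.SessionOptions(), get_library_path())
#   session = ort.InferenceSession("model.onnx", opts)
```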


@pengwa pengwa left a comment


Here are some comments.

pengwa previously approved these changes Jul 14, 2023

@pengwa pengwa left a comment


Looks good; there is one minor comment.

orttraining/orttraining/python/training/api/module.py (outdated, resolved)
@baijumeswani baijumeswani merged commit 9889f0f into main Jul 14, 2023
@baijumeswani baijumeswani deleted the baijumeswani/training-artifacts-with-custom-ops branch July 14, 2023 18:15
baijumeswani added a commit that referenced this pull request Jul 21, 2023
This pull request contains a few changes:

1. Adds support for string ort values.
2. Fixes the training minimal build (which was broken by #16601) by
putting custom op registration behind #ifdefs.
3. Fixes the iOS pod package generation (which was also broken by
#16601) by explicitly providing the paths to be copied during pod creation.
jchen351 pushed a commit that referenced this pull request Aug 12, 2023