Expose the Onnx runtime option for setting the number of threads #5962

Merged · 5 commits · Oct 12, 2021

This diff shows changes from 2 of the 5 commits.
10 changes: 7 additions & 3 deletions src/Microsoft.ML.OnnxTransformer/OnnxCatalog.cs
@@ -83,6 +83,8 @@ public static OnnxScoringEstimator ApplyOnnxModel(this TransformsCatalog catalog
/// <param name="modelFile">The path of the file containing the ONNX model.</param>
/// <param name="gpuDeviceId">Optional GPU device ID to run execution on, <see langword="null" /> to run on CPU.</param>
/// <param name="fallbackToCpu">If GPU error, raise exception or fallback to CPU.</param>
/// <param name="interOpNumThreads">Controls the number of threads used to parallelize the execution of the graph (across nodes).</param>
/// <param name="intraOpNumThreads">Controls the number of threads to use to run the model.</param>
/// <example>
/// <format type="text/markdown">
/// <![CDATA[
@@ -95,9 +97,11 @@ public static OnnxScoringEstimator ApplyOnnxModel(this TransformsCatalog catalog
string inputColumnName,
string modelFile,
int? gpuDeviceId = null,
-bool fallbackToCpu = false)
-=> new OnnxScoringEstimator(CatalogUtils.GetEnvironment(catalog), new[] { outputColumnName }, new[] { inputColumnName },
-modelFile, gpuDeviceId, fallbackToCpu);
+bool fallbackToCpu = false,
+int? interOpNumThreads = default,
+int? intraOpNumThreads = default)
+=> new OnnxScoringEstimator(CatalogUtils.GetEnvironment(catalog), new[] { outputColumnName }, new[] { inputColumnName },
+modelFile, gpuDeviceId, fallbackToCpu, interOpNumThreads: interOpNumThreads, intraOpNumThreads: intraOpNumThreads);

/// <summary>
/// Create a <see cref="OnnxScoringEstimator"/>, which applies a pre-trained Onnx model to the <paramref name="inputColumnName"/> column.

Review comment from a maintainer (Member), on the new overload:

This is where the API compat check is failing. You will either need to make a new method here, or make an Onnx options class that encapsulates these.

@eerhardt do you think an onnx options class would be better?

Reply (Member):

See #1798. The established pattern is:

  1. A "simple" public API that takes all the required arguments.
  2. An "advanced" overload that takes the required arguments plus an Options class.

And if (1) is sufficient, we can opt out of (2).

So, yes, I believe the new options should force us to create the Options class here.
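For concreteness, here is a rough sketch of the options-bag shape that the #1798 pattern describes. The OnnxOptions name and its field set are illustrative only, not code from this PR or the final API.

// Illustrative sketch, not this PR's API: an options bag for a hypothetical
// "advanced" ApplyOnnxModel overload, following the pattern from #1798.
using System.Collections.Generic;

public sealed class OnnxOptions
{
    public int? GpuDeviceId;                           // null => run on CPU
    public bool FallbackToCpu;                         // on a GPU error, fall back instead of throwing
    public int? InterOpNumThreads;                     // parallelism across graph nodes
    public int? IntraOpNumThreads;                     // parallelism within a single node
    public IDictionary<string, int[]> ShapeDictionary; // optional custom input shapes
    public int RecursionLimit = 100;                   // Protobuf CodedInputStream recursion limit
}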
25 changes: 20 additions & 5 deletions src/Microsoft.ML.OnnxTransformer/OnnxTransform.cs
@@ -90,6 +90,12 @@ internal sealed class Options : TransformInputBase

[Argument(ArgumentType.AtMostOnce, HelpText = "Protobuf CodedInputStream recursion limit.", SortOrder = 6)]
public int RecursionLimit = 100;

+[Argument(ArgumentType.AtMostOnce, HelpText = "Controls the number of threads used to parallelize the execution of the graph (across nodes).", SortOrder = 7)]
+public int? InterOpNumThreads = null;
+
+[Argument(ArgumentType.AtMostOnce, HelpText = "Controls the number of threads used to run the model.", SortOrder = 8)]
+public int? IntraOpNumThreads = null;
}

/// <summary>
@@ -244,7 +250,8 @@ private OnnxTransformer(IHostEnvironment env, Options options, byte[] modelBytes
Host.CheckNonWhiteSpace(options.ModelFile, nameof(options.ModelFile));
Host.CheckIO(File.Exists(options.ModelFile), "Model file {0} does not exist.", options.ModelFile);
// Because we cannot delete the user file, ownModelFile should be false.
-Model = new OnnxModel(options.ModelFile, options.GpuDeviceId, options.FallbackToCpu, ownModelFile: false, shapeDictionary: shapeDictionary, options.RecursionLimit);
+Model = new OnnxModel(options.ModelFile, options.GpuDeviceId, options.FallbackToCpu, ownModelFile: false, shapeDictionary: shapeDictionary, options.RecursionLimit,
+options.InterOpNumThreads, options.IntraOpNumThreads);
}
else
{
@@ -309,8 +316,11 @@ internal OnnxTransformer(IHostEnvironment env, string modelFile, int? gpuDeviceI
/// <param name="fallbackToCpu">If GPU error, raise exception or fallback to CPU.</param>
/// <param name="shapeDictionary"></param>
/// <param name="recursionLimit">Optional, specifies the Protobuf CodedInputStream recursion limit. Default value is 100.</param>
/// <param name="interOpNumThreads">Controls the number of threads used to parallelize the execution of the graph (across nodes).</param>
/// <param name="intraOpNumThreads">Controls the number of threads to use to run the model.</param>
internal OnnxTransformer(IHostEnvironment env, string[] outputColumnNames, string[] inputColumnNames, string modelFile, int? gpuDeviceId = null, bool fallbackToCpu = false,
-IDictionary<string, int[]> shapeDictionary = null, int recursionLimit = 100)
+IDictionary<string, int[]> shapeDictionary = null, int recursionLimit = 100,
+int? interOpNumThreads = null, int? intraOpNumThreads = null)
: this(env, new Options()
{
ModelFile = modelFile,
@@ -319,7 +329,9 @@ internal OnnxTransformer(IHostEnvironment env, string[] outputColumnNames, strin
GpuDeviceId = gpuDeviceId,
FallbackToCpu = fallbackToCpu,
CustomShapeInfos = shapeDictionary?.Select(pair => new CustomShapeInfo(pair.Key, pair.Value)).ToArray(),
-RecursionLimit = recursionLimit
+RecursionLimit = recursionLimit,
+InterOpNumThreads = interOpNumThreads,
+IntraOpNumThreads = intraOpNumThreads
})
{
}
@@ -856,9 +868,12 @@ internal OnnxScoringEstimator(IHostEnvironment env, string modelFile, int? gpuDe
/// <param name="fallbackToCpu">If GPU error, raise exception or fallback to CPU.</param>
/// <param name="shapeDictionary"></param>
/// <param name="recursionLimit">Optional, specifies the Protobuf CodedInputStream recursion limit. Default value is 100.</param>
/// <param name="interOpNumThreads">Controls the number of threads used to parallelize the execution of the graph (across nodes).</param>
/// <param name="intraOpNumThreads">Controls the number of threads to use to run the model.</param>
internal OnnxScoringEstimator(IHostEnvironment env, string[] outputColumnNames, string[] inputColumnNames, string modelFile,
-int? gpuDeviceId = null, bool fallbackToCpu = false, IDictionary<string, int[]> shapeDictionary = null, int recursionLimit = 100)
-: this(env, new OnnxTransformer(env, outputColumnNames, inputColumnNames, modelFile, gpuDeviceId, fallbackToCpu, shapeDictionary, recursionLimit))
+int? gpuDeviceId = null, bool fallbackToCpu = false, IDictionary<string, int[]> shapeDictionary = null, int recursionLimit = 100,
+int? interOpNumThreads = null, int? intraOpNumThreads = null)
+: this(env, new OnnxTransformer(env, outputColumnNames, inputColumnNames, modelFile, gpuDeviceId, fallbackToCpu, shapeDictionary, recursionLimit, interOpNumThreads, intraOpNumThreads))
{
}

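Taken together, a minimal usage sketch of the extended entry point as it appears in this diff. The column names and model path are hypothetical placeholders, not values from the PR.

// Usage sketch: score an ONNX model while capping ONNX Runtime's thread pools.
// "output", "input", and "model.onnx" are placeholders for illustration.
var mlContext = new MLContext();
var pipeline = mlContext.Transforms.ApplyOnnxModel(
    outputColumnName: "output",
    inputColumnName: "input",
    modelFile: "model.onnx",
    gpuDeviceId: null,         // run on CPU
    fallbackToCpu: false,
    interOpNumThreads: 1,      // serialize execution across graph nodes
    intraOpNumThreads: 4);     // allow up to 4 threads within a node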
21 changes: 18 additions & 3 deletions src/Microsoft.ML.OnnxTransformer/OnnxUtils.cs
@@ -165,8 +165,11 @@ public OnnxVariableInfo(string name, OnnxShape shape, Type typeInOnnxRuntime, Da
/// no longer needed.</param>
/// <param name="shapeDictionary"></param>
/// <param name="recursionLimit">Optional, specifies the Protobuf CodedInputStream recursion limit. Default value is 100.</param>
/// <param name="interOpNumThreads">Controls the number of threads used to parallelize the execution of the graph (across nodes).</param>
/// <param name="intraOpNumThreads">Controls the number of threads to use to run the model.</param>
public OnnxModel(string modelFile, int? gpuDeviceId = null, bool fallbackToCpu = false,
-bool ownModelFile = false, IDictionary<string, int[]> shapeDictionary = null, int recursionLimit = 100)
+bool ownModelFile = false, IDictionary<string, int[]> shapeDictionary = null, int recursionLimit = 100,
+int? interOpNumThreads = null, int? intraOpNumThreads = null)
{
// If we don't own the model file, _disposed should be false to prevent deleting user's file.
_disposed = false;
@@ -181,15 +184,27 @@ public OnnxModel(string modelFile, int? gpuDeviceId = null, bool fallbackToCpu =
catch (OnnxRuntimeException)
{
if (fallbackToCpu)
-_session = new InferenceSession(modelFile);
+{
+    var sessionOptions = new SessionOptions()
+    {
+        InterOpNumThreads = interOpNumThreads.GetValueOrDefault(),
+        IntraOpNumThreads = intraOpNumThreads.GetValueOrDefault()
+    };
+    _session = new InferenceSession(modelFile, sessionOptions);
+}
else
// If called from OnnxTransform, the exception is caught and rethrown
throw;
}
}
else
{
-_session = new InferenceSession(modelFile);
+var sessionOptions = new SessionOptions()
+{
+    InterOpNumThreads = interOpNumThreads.GetValueOrDefault(),
+    IntraOpNumThreads = intraOpNumThreads.GetValueOrDefault()
+};
+_session = new InferenceSession(modelFile, sessionOptions);
}

try
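One subtlety in the blocks above: GetValueOrDefault() maps a null thread count to 0, and ONNX Runtime interprets 0 as "choose a default". If that implicit coupling is undesirable, a small helper could set the properties only when the caller actually supplied a value. The helper below is a hypothetical alternative for illustration, not code from this PR.

// Hypothetical helper: leave SessionOptions at its defaults unless the caller
// explicitly supplied a thread count, instead of relying on 0 meaning "default".
private static SessionOptions CreateSessionOptions(int? interOpNumThreads, int? intraOpNumThreads)
{
    var sessionOptions = new SessionOptions();
    if (interOpNumThreads.HasValue)
        sessionOptions.InterOpNumThreads = interOpNumThreads.Value;
    if (intraOpNumThreads.HasValue)
        sessionOptions.IntraOpNumThreads = intraOpNumThreads.Value;
    return sessionOptions;
}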