[CoreML] more performance flags #22975

wejoncy · 2024-11-29T12:05:38Z

Description

Refactor the Unsqueeze implementation.
Add more flags to boost performance.
Add a profiling flag.

Motivation and Context

This reverts commit 5249880.

skottmckay · 2024-12-04T04:16:45Z

onnxruntime/core/providers/coreml/builders/impl/batch_norm_op_builder.cc

@@ -151,7 +151,7 @@ bool BatchNormalizationOpBuilder::IsOpSupportedImpl(const Node& node, const OpBu
    return false;
  }

-#if defined(TARGET_OS_IOS) && defined(TARGET_CPU_X86_64)
+#if defined(TARGET_OS_IOS) && defined(TARGET_CPU_X86_64) && TARGET_OS_IOS && TARGET_CPU_X86_64


What values do these #defines have that means we need to check if they're defined and the value is non-zero?

If it's iOS and x86_64 that's only the simulator right?

Yes. TARGET_CPU_X86_64 is defined in <TargetConditionals.h> as either 0 or 1, so checking defined() alone is not enough.
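To illustrate the point in the reply above, here is a minimal sketch of why a `defined()` check alone passes even when the macro's value is 0. The `DEMO_*` macros and both helper functions are hypothetical stand-ins so the snippet compiles on any platform; they are not part of the PR or of `<TargetConditionals.h>`.

```cpp
// Hypothetical stand-ins for TARGET_OS_IOS / TARGET_CPU_X86_64.
// Per the reply above, the real macros are defined as either 0 or 1;
// here both are 0, mimicking a build that is neither iOS nor x86_64.
#define DEMO_TARGET_OS_IOS 0
#define DEMO_TARGET_CPU_X86_64 0

// Old check: only asks whether the macros exist, so it is true even for 0.
constexpr bool DefinedOnlyCheck() {
#if defined(DEMO_TARGET_OS_IOS) && defined(DEMO_TARGET_CPU_X86_64)
  return true;
#else
  return false;
#endif
}

// New check: the macros must exist and also evaluate to non-zero.
constexpr bool DefinedAndNonZeroCheck() {
#if defined(DEMO_TARGET_OS_IOS) && defined(DEMO_TARGET_CPU_X86_64) && \
    DEMO_TARGET_OS_IOS && DEMO_TARGET_CPU_X86_64
  return true;
#else
  return false;
#endif
}
```

With both stand-ins set to 0, the old check still selects the simulator-only branch, while the new check skips it.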

skottmckay · 2024-12-04T04:48:49Z

onnxruntime/core/providers/coreml/builders/impl/squeeze_op_builder.cc

+    // it's impossible to have empty axes input for unsqueeze
+    if (!axes.empty()) {
+      // coreml squeeze op does support negative axes


nit: can we clarify these comments? This code is now hit for both squeeze and unsqueeze, so I assume both handle negative axes. And don't both squeeze and unsqueeze require axes to be meaningful, since otherwise they're no-ops (and we should drop them in level 1 if we don't already)?

According to the ONNX spec, Squeeze can have empty axes (it is an optional input), which means dropping all dimensions that are equal to 1.
For Unsqueeze, however, axes is required.
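The semantics described in the reply above can be sketched as a pair of shape functions. This is only an illustration of the ONNX behaviour as stated in this thread, not the CoreML builder code from the PR; `Shape`, `NormalizeAxis`, `SqueezeShape`, and `UnsqueezeShape` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

using Shape = std::vector<int64_t>;

// Map a possibly negative axis into the range [0, rank).
int64_t NormalizeAxis(int64_t axis, int64_t rank) {
  if (axis < -rank || axis >= rank) throw std::out_of_range("axis out of range");
  return axis < 0 ? axis + rank : axis;
}

// Squeeze: empty axes means "drop every dimension equal to 1".
Shape SqueezeShape(const Shape& input, const std::vector<int64_t>& axes) {
  const int64_t rank = static_cast<int64_t>(input.size());
  std::vector<bool> drop(input.size(), false);
  if (axes.empty()) {
    for (std::size_t i = 0; i < input.size(); ++i) drop[i] = (input[i] == 1);
  } else {
    for (int64_t axis : axes) {
      const auto a = static_cast<std::size_t>(NormalizeAxis(axis, rank));
      if (input[a] != 1) throw std::invalid_argument("squeezed dimension must be 1");
      drop[a] = true;
    }
  }
  Shape out;
  for (std::size_t i = 0; i < input.size(); ++i) {
    if (!drop[i]) out.push_back(input[i]);
  }
  return out;
}

// Unsqueeze: axes is required and refers to positions in the output shape.
Shape UnsqueezeShape(const Shape& input, const std::vector<int64_t>& axes) {
  if (axes.empty()) throw std::invalid_argument("Unsqueeze requires axes");
  const int64_t out_rank = static_cast<int64_t>(input.size() + axes.size());
  std::vector<bool> is_new(static_cast<std::size_t>(out_rank), false);
  for (int64_t axis : axes) {
    const auto a = static_cast<std::size_t>(NormalizeAxis(axis, out_rank));
    if (is_new[a]) throw std::invalid_argument("duplicate axis");
    is_new[a] = true;
  }
  Shape out;
  std::size_t next = 0;  // index of the next input dimension to copy
  for (std::size_t i = 0; i < is_new.size(); ++i) {
    out.push_back(is_new[i] ? 1 : input[next++]);
  }
  return out;
}
```

For example, `SqueezeShape({1, 3, 1, 5}, {})` drops both unit dimensions, while `UnsqueezeShape({3, 5}, {})` throws because the required axes are missing.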

skottmckay · 2024-12-04T04:49:49Z

onnxruntime/core/providers/coreml/model/model.mm

@@ -300,14 +301,56 @@ Status GetMLMultiArrayCopyInfo(const MLMultiArray* _Nonnull array,
  return Status::OK();
 }

+void ProfileComputePlan(NSURL* compileUrl, MLModelConfiguration* config) {
+#if defined(__APPLE__) && defined(__clang__) && __clang_major__ >= 15


Do we need the #if defined... if we have the @available?

I tried to pass this CI https://github.com/microsoft/onnxruntime/actions/runs/12118598969/job/33783428976?pr=22975

But it didn't work

skottmckay · 2024-12-04T05:02:58Z

onnxruntime/core/providers/coreml/model/model.mm

+// Set the specialization strategy to FastPrediction  for macOS 10.15+
+#if defined(__APPLE__) && defined(__clang__) && __clang_major__ >= 15
+      if (HAS_COREML8_OR_LATER) {


why do we need to check the clang version?

is this just for macOS or does it set it for iOS as well?

Do you have suggestions on passing the clang-tidy https://github.com/microsoft/onnxruntime/actions/runs/12118598969/job/33783428976?pr=22975?

skottmckay · 2024-12-04T05:06:03Z

include/onnxruntime/core/providers/coreml/coreml_provider_factory.h

+static const char* const kCoremlProviderOption_SpecializationStrategy = "SpecializationStrategy";
+static const char* const kCoremlProviderOption_ProfileComputePlan = "ProfileComputePlan";
+static const char* const kCoremlProviderOption_AllowLowPrecisionAccumulationOnGPU = "AllowLowPrecisionAccumulationOnGPU";


Is there documentation on how/when/why these options should be used and what is valid input for them?

Added some notes based on https://developer.apple.com/documentation.
So far they are not precise enough to indicate how much gain each option will give.
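As a rough illustration of how the three new option keys might be collected before being handed to the CoreML execution provider, here is a minimal sketch. The key strings come from the header quoted above and `"FastPrediction"` comes from the code comment in model.mm; the `"0"`/`"1"` values and the helper name `MakeCoreMLPerfOptions` are assumptions, since this thread does not state what input is valid.

```cpp
#include <string>
#include <unordered_map>

using ProviderOptions = std::unordered_map<std::string, std::string>;

// Hypothetical helper: builds a key/value map for the new CoreML options.
// Passing the map to a session is outside the scope of this sketch.
ProviderOptions MakeCoreMLPerfOptions(bool profile_compute_plan) {
  ProviderOptions options;
  // Strategy name taken from the comment in model.mm quoted above.
  options["SpecializationStrategy"] = "FastPrediction";
  // Assumption: boolean-style options are encoded as "0" or "1".
  options["ProfileComputePlan"] = profile_compute_plan ? "1" : "0";
  options["AllowLowPrecisionAccumulationOnGPU"] = "1";
  return options;
}
```

Keeping the options in a plain string map makes it easy to log or diff the configuration when comparing performance runs.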

more performace flag

0d547a2

wejoncy requested review from skottmckay and edgchen1 and removed request for skottmckay November 29, 2024 12:05

wejoncy marked this pull request as ready for review November 29, 2024 12:05

wejoncy added 2 commits December 1, 2024 19:18

fix

364b897

MLComputePlan

bf095dd

wejoncy force-pushed the jicwen/coreml_flag branch from 218fe66 to bf095dd Compare December 2, 2024 05:41

debug

5249880

wejoncy force-pushed the jicwen/coreml_flag branch from d64d79c to 5249880 Compare December 2, 2024 08:07

wejoncy added 3 commits December 2, 2024 01:24

Revert "debug"

74953ca

This reverts commit 5249880.

handle x64_cpu bug

37e77a5

Update squeeze_op_builder.cc

d57d62f

skottmckay reviewed Dec 4, 2024


add comments for new flag

e641750
