Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FFI][RUNTIME] Introduce runtime boxed types for int/float/bool #16183

Merged
merged 12 commits into from
Aug 5, 2024

Conversation

Lunderberg
Copy link
Contributor

Prior to this commit, int, float, and bool arguments from Python were converted to IntImm, FloatImm, and Bool. These are subtypes of PrimExpr, and should only be used at compile-time. By automatically applying this conversion as part of the FFI, these types are required to be present whenever a primitive is converted to a tvm::ObjectRef.

This can become especially fragile for an end-user when storing objects into a TVM container. Because TVM containers require all contents to be ObjectRef subclasses, an automatic conversion may be applied on storing into a container, resulting in an unexpected type being retrieved from the container. For example, this currently occurs in Relax when extracting a R.Prim from a R.Tuple.

This commit introduces a Box<T> type for storage of boxed primitives at runtime, distinct from the IR types. This was based on discussion on #15983, which would have further cemented the IntImm type in libtvm_runtime.so, rather than starting the process of separating it.

  • Primitive arguments provided to a PackedFunc that requires an ObjectRef will be converted to the corresponding boxed type. (e.g. Passing a Python int to a C++ function accepting ObjectRef produces a Box<int64_t>.

  • Boxed primitives provided to a PackedFunc that requires an unboxed primitive will be converted to the corresponding primitive.

  • PackedFunc return values of ObjectRef are converted to the corresponding primitive, if present. (e.g. If a tuple_getitem with static return type ObjectRef returns a Box<int64_t>, it will be unwrapped to a python int.)

Together, these three rules provide backwards compatibility for existing PackedFunc definitions, while avoiding exposing the user to any container-induced type conversions betweeen primitive types and ObjectRef.

Copy link
Member

@tqchen tqchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR, love the direction we are going with dedicated boxed types. I made some quick comment. The changes wrt to PackedFunc should have benefit from more pairs of eyes to review as they are centeral to most FFI routines

include/tvm/runtime/container/boxed_primitive.h Outdated Show resolved Hide resolved
include/tvm/runtime/container/boxed_primitive.h Outdated Show resolved Hide resolved
include/tvm/runtime/container/boxed_primitive.h Outdated Show resolved Hide resolved
include/tvm/runtime/container/boxed_primitive.h Outdated Show resolved Hide resolved
include/tvm/runtime/container/boxed_primitive.h Outdated Show resolved Hide resolved
include/tvm/runtime/container/boxed_primitive.h Outdated Show resolved Hide resolved
Copy link
Member

@tqchen tqchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spent sometime to go through packed_func.h logic. Given there are some restructuring, this would need some more careful checks (as they impact everything).

return static_cast<double>(value_.v_int64);
if (auto opt = TryAsBool()) {
return opt.value();
} else if (auto opt = TryAsInt()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should go in the order of

  • TryAsFloat, TryAsInt, TryAsBool

Prioritize the most likely ones

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would any bool value also work as an int value? Maybe that's the reason for the ordering given.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in this case order does not matter for correctness, mainly as an optimization as the original intended type as the most freq path

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I just looked at how they work. Yeah, order doesn't matter then.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call, and reordered here and following conversions. For each conversion operator, it now checks the expected type first, and only falls back to allowed conversions if the expected type isn't present.

Would any bool value also work as an int value? Maybe that's the reason for the ordering given.

Nope, no specific reason for the ordering. These specific Try* functions don't apply any conversions, so the order doesn't matter for correctness.

include/tvm/runtime/packed_func.h Outdated Show resolved Hide resolved
include/tvm/runtime/packed_func.h Outdated Show resolved Hide resolved
// Helper function to reduce duplication in the variable integer
// conversions. This is publicly exposed, as it can be useful in
// specializations of PackedFuncValueConverter.
if (auto opt = FromBoxed<double>()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling convention note: it would be simpler to require that TVMArgValue and TVMRetValue always NOT hold Boxed values. This would help to reduce the overall cost of the codegen function handling.

That does mean when we assign Boxed Object into TVMArg and TVMRet, they should be unboxed. I know that we can make an exception for bool for now until we introduce a specific bool POD code.

@@ -2129,6 +2315,42 @@ struct PackedFuncValueConverter<::tvm::runtime::String> {
}
};

template <typename T>
struct PackedFuncValueConverter<Array<T>> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel automatic conversion here is a bit overkill. Recursive automatic conversion would be too much overhead whe passing through FFI. The developer should ensure the correct typings of the internal contained values

I know it may bring some convenience, but in this case I think we should not do it.

Suggestion: Remove the conversion logic from this PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need these conversions in order to preserve backwards compatibility with existing user-written code. Otherwise, a user-written function that accepts Array<PrimExpr> could no longer be called with a python list [1,2,3].

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding performance, I think we should be good there as well. The conversion is applied instead of the existing recursive check in ObjectTypeChecker<Array<T>>, so no additional recursive walks. The mutation is implemented in terms of Array::Map, which is copy-on-write, so no allocations occur if the user has provided the expected types (the common case).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should likely check some of those cases, and update quite a few of them to Array<runtime::Int>. But agree some might be useful especially for the mixed symbolic shape and static shape case

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added some comments below on optimizing for freq case

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, and going through the comments now. I like the idea of updating the call sites over time, so that eventually the backwards compatibility conversion would only be triggered for externally-defined PackedFunc instances.


if constexpr (std::is_base_of_v<ContainerType, NDArray::ContainerType> ||
std::is_base_of_v<NDArray::ContainerType, ContainerType>) {
if (ptr && (std::is_base_of_v<NDArray::ContainerType, ContainerType> ||
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: would be nice to keep the originals tructure of if (ptr != nullptr), they helps to keep the structure of the code, lift out a CSE and also makes the diff more readable

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mainly restructured it in order to make sure the if constexpr checks were on the outermost conditional, so that they would be guaranteed to be applied to the entire contents. I agree on the readability and repetition though, and will take a look to see if there's a way to make it more readable overall.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does it look in the latest commit? I moved the repeated if (ptr) structure out to an early return for if (ptr == nullptr), which preserves the if constexpr structure.

Copy link
Member

@tqchen tqchen Nov 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually i have a quick question, can we do the follows?

if (ptr != nullptr) {
    if constexpr (condition) {
    
    }
}

I feel it should still allow compiler to eliminate the content and would be equivalent. the main reason to put ptr != nullptr first is that it is a more likely path

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can do, and updated. I mostly pulled it out to avoid having too many levels of nesting for ease of readability.

I also realized that the checks may not be intuitive to future maintainers, and added a comment describing which cases are handled at C++ compile-time and which are handled at runtime.

@tqchen
Copy link
Member

tqchen commented Nov 29, 2023

cc @junrushao

@tqchen tqchen changed the title [FFI] Separate runtime types from IR types for int/float/bool [FFI] Introduce runtime boxed types for int/float/bool Nov 29, 2023
@tqchen tqchen changed the title [FFI] Introduce runtime boxed types for int/float/bool [FFI][RUNTIME] Introduce runtime boxed types for int/float/bool Nov 29, 2023
@tqchen tqchen added the runtime:c++ C++ runtime: src/runtime, excluding crt, RPC, object system, ffi. label Nov 29, 2023
@tqchen tqchen self-assigned this Nov 29, 2023
@tqchen
Copy link
Member

tqchen commented Nov 29, 2023

Just highlight some of the main comments that are high level

  • naming: we can probably go with runtime::Int, runtime::Float, runtime::Bool as they are more understandable terms
    • we should remove tvm::Integer and tvm::bool after migrating their usage to boxed values(in later PR)
  • convention: It is cleaner to have a convention where runtime::Int and runtime::Float always get unboxed when passing into TVMArg and TVMRet.
    • The codegen layer that expects int/float then do not need to deal with object.
    • That does mean the FFI layer needs to explicitly unbox when setting boxed values into them (seems the current impl is already doing so for TVMRet, need to double check if that is the case for argument setting as well)
    • That means we do not need to do TryFromBox when we convert arguments

@Lunderberg
Copy link
Contributor Author

Thanks for the PR, love the direction we are going with dedicated boxed types.

Thank you!

The changes wrt to PackedFunc should have benefit from more pairs of eyes to review as they are centeral to most FFI routines

Definitely agreed, and the more eyes the better. Given that this PR has the potential to break any use of the PackedFunc interface, the surface area is very broad.

naming: we can probably go with runtime::Int, runtime::Float, runtime::Bool

I like this readability improvement, and will update accordingly

convention: It is cleaner to have a convention where runtime::Int and runtime::Float always get unboxed when passing into TVMArg and TVMRet.

Agreed. Assuming I have it implemented correctly, this is handled in the TVMArgsSetter::SetObject and TVMRetValue::operator=(TObjectRef) implementations, so a boxed primitive is always unboxed when storing in the FFI type.

The conversions that occur when reading from a TVMArgsSetter or TVMRetValue are to convert into some form of ObjectRef that is constructible from the primitive, based on the signature of the type receiving the argument.

@Lunderberg Lunderberg force-pushed the ffi_boxed_primitives_for_runtime branch 2 times, most recently from e844ced to c214b76 Compare November 29, 2023 20:47
include/tvm/runtime/packed_func.h Outdated Show resolved Hide resolved
// special handling for unwrapping boxed primitives,
// PackedFunc, runtime::Module, etc, which should be checked
// before delegating to the array element's
// PackedFuncValueConverter implementation.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do not quite get what we are intending to handle, maybe state we would like to allow conversion to classes like PrimExpr, so would like to trigger that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. I changed the comment to describe the use case that it enables.


return untyped_array.Map([](ObjectRef item) {
TVMRetValue item_val;
item_val = std::move(item);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

have same quick short cut

if (auto* ptr = item->as<T::ContainerType>()) {
   return GetRef<T>(ptr);
} 

the fallback conversion can remain the same?

@@ -2129,6 +2315,42 @@ struct PackedFuncValueConverter<::tvm::runtime::String> {
}
};

template <typename T>
struct PackedFuncValueConverter<Array<T>> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should likely check some of those cases, and update quite a few of them to Array<runtime::Int>. But agree some might be useful especially for the mixed symbolic shape and static shape case

@Lunderberg Lunderberg force-pushed the ffi_boxed_primitives_for_runtime branch 2 times, most recently from eaaea52 to 440568e Compare February 8, 2024 03:29
@Lunderberg Lunderberg force-pushed the ffi_boxed_primitives_for_runtime branch from cd03dbc to 9c19c12 Compare February 14, 2024 19:34
@Lunderberg Lunderberg force-pushed the ffi_boxed_primitives_for_runtime branch from 4bda95c to b0bc242 Compare April 22, 2024 23:29
@Lunderberg
Copy link
Contributor Author

Lunderberg commented Jun 14, 2024

After slowly working through all the unit tests and transforms that needed to be updated for this change, this PR is now ready for review! I've squashed the incremental commits on the PR branch that went into this PR, since the (very) long history of this dev branch wasn't that useful. (This should also prevent a recurrence of #16549, by having a shorter PR branch.)

Prior to this commit, the `Array::Map` member function could only be
applied to nullable object types.  This was due to the internal use of
`U()` as the default value for initializing the output `ArrayNode`, where
`U` is the return type of the mapping function.  This default
constructor is only available for nullable types, and would result in
a compile-time failure for non-nullable types.

This commit replaces `U()` with `ObjectRef()` in `Array::Map`,
removing this limitation.  Since all items in the output array are
overwritten before returning to the calling scope, initializing the
output array with `ObjectRef()` does not violate type safety.
Prior to this commit, `int`, `float`, and `bool` arguments from Python
were converted to `IntImm`, `FloatImm`, and `Bool`.  These are
subtypes of `PrimExpr`, and should only be used at compile-time.  By
automatically applying this conversion as part of the FFI, these types
are required to be present whenever a primitive is converted to a
`tvm::ObjectRef`.

This can become especially fragile for an end-user when storing
objects into a TVM container.  Because TVM containers require all
contents to be `ObjectRef` subclasses, an automatic conversion may be
applied on storing into a container, resulting in an unexpected type
being retrieved from the container.  For example, this currently
occurs in Relax when extracting a `R.Prim` from a `R.Tuple`.

This commit introduces a `Box<T>` type for storage of boxed primitives
at runtime, distinct from the IR types.

* Primitive arguments provided to a PackedFunc that requires an
  `ObjectRef` will be converted to the corresponding boxed type.
  (e.g. Passing a Python `int` to a C++ function accepting `ObjectRef`
  produces a `Box<int64_t>`.

* Boxed primitives provided to a PackedFunc that requires an unboxed
  primitive will be converted to the corresponding primitive.

* PackedFunc return values of `ObjectRef` are converted to the
  corresponding primitive, if present.  (e.g. If a `tuple_getitem`
  with static return type `ObjectRef` returns a `Box<int64_t>`, it
  will be unwrapped to a python `int`.)

Together, these three rules provide backwards compatibility for
existing PackedFunc definitions, while avoiding exposing the user to
any container-induced type conversions betweeen primitive types and
`ObjectRef`.
@Lunderberg Lunderberg force-pushed the ffi_boxed_primitives_for_runtime branch from 8f34004 to bc7cdcc Compare June 14, 2024 19:25
@Lunderberg Lunderberg changed the base branch from main to main-before-unity-switch June 14, 2024 19:26
@Lunderberg Lunderberg changed the base branch from main-before-unity-switch to main June 14, 2024 19:26
@@ -207,6 +209,7 @@ typedef DLTensor* TVMArrayHandle;
*/
typedef union {
int64_t v_int64;
bool v_bool;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

likely v_bool is not needed and we can simply store value in v_int64 , this can help to reduce the switch cases, and also make all value types as 64 bit, might help kernel passing handling to reduce cases as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call, and implemented in #17240.

Lunderberg added a commit to Lunderberg/tvm that referenced this pull request Aug 5, 2024
This is a follow-up to apache#16183, which
added handling of boolean values in the TVM FFI.  The initial
implementation added both a new type code (`kTVMArgBool`) and a new
`TVMValue::v_bool` variant.  This commit removes the
`TVMValue::v_bool` variant, since the `kTVMArgBool` type code is
sufficient to handle boolean arguments.

Removing the `TVMValue::v_bool` variant also makes all `TVMValue`
variants be 64-bit (assuming a 64-bit CPU).  This can simplify
debugging in some cases, since it prevents partial values from
inactive variants from being present in memory.
@Lunderberg Lunderberg deleted the ffi_boxed_primitives_for_runtime branch August 5, 2024 14:17
Lunderberg added a commit to Lunderberg/tvm that referenced this pull request Aug 5, 2024
This is a follow-up PR to apache#16183,
which updated the FFI with explicit integer types.  As part of that
change, many internal functions were updated to accept non-IR
types (e.g. `Array<runtime::Int>` instead of `Array<IntImm>`).  For
backwards compatibility with callees that provided the IR types, a
specialization of `PackedFuncValueConverter` unwrapped the `IntImm`
into a `runtime::Int`.

This commit removes the backwards-compatibility specialization of
`PackedFuncValueConverter`.  Breakages that are found in CI as a
result will then be updated at the caller side, removing the need for
the backwards-compatibility handler altogether.
@tqchen
Copy link
Member

tqchen commented Aug 7, 2024

We just find out some a perf regression introduced by this PR, specifically, during LLM decode function calling overhead(before the first kernel launch) goes up to 1.4 ms. The likely cause is due to increased PackedFunc calling overhead in general. Likely this is due to some of the automatic conversion logic introduced that increased the overhead per PackedFunc call.

Such overhead can impact the perf of downstream to optimize for low latency. As a temp measure, I am going to first revert this PR.

The changes of the PR(boxed types and bool) are valuable. Given the smart auto conversion might introduce extra overhead and the regression we see. I think it is helpful to isolate the PR into two pieces, with one that introduces the boxed types/bool, and another one that introduce a possiblity more conservative version of automatic conversion.

Thanks @Lunderberg for great effort and sorry for the extra trouble. As PackedFunc call gets into the center of our runtime execution, being able to reduce the execution time is now become an important topic, so we need to be more careful balancing the runtime logics

@Lunderberg
Copy link
Contributor Author

Thank you, and I'm looking into where the performance overhead comes from. Unfortunately, I'm not sure if the de-coupling of the boxed types from the automatic conversions is possible.

  1. Since the FFI is used during runtime, the default FFI conversions must not use IR types. Therefore, a function that accepts ObjectRef, and is passed a python integer, has the argument converted to runtime::Int instead of the previous IntImm.
  2. A python list must be converted to an tvm::Array to be passed through the FFI. Since Array holds an ObjectRef, this requires converting any primitive values into ObjectRef types. With (1), this would produce an array of runtime::Int rather than the previous array of IntImm.
  3. For a function that accepts Array<IntImm>, existing callers in python could pass the array [1,2,3], and this usage should not be broken. With the new default conversion for [1,2,3], this would raise an error.

Breaking any one of these three steps would avoid the error, but each comes with downsides:

  • Break step (3): Automatic conversion from Array<runtime::Int> to Array<IntImm>. This is what was implemented in the PR, but results in a two-step conversion.
  • Break step (2): Change Array to hold a TVMRetValue instead of ObjectRef. This would avoid the repeated nested conversion, since the C++ API could ide, but would significantly change the C++ side of the API.
  • Break step (1): An explicit change to the Python code to convert to IntImm before calling the FFI. This would work in some cases where the desired C++ type is known (e.g. the shape of a TIR buffer), but wouldn't work in others (e.g. the target attributes, which are stored as ObjectRef, and whose desired type isn't known until they are used).

@Lunderberg
Copy link
Contributor Author

Looking into the origin of the performance difference, since it may just result from an increased number of FFI calls, and (maybe) could be resolved without removing the automatic conversions.

@tqchen
Copy link
Member

tqchen commented Aug 7, 2024

maybe it is not too bad to do manual conversion to the array of ir types in python wrapper then leave ffi part lightweight (per your sugggsted option 3) the objectref stored target attribute are mostly intended as boxed int rather than Intimm so we should be fine

tqchen added a commit that referenced this pull request Aug 7, 2024
…ol" (#17252)

Revert "[FFI][RUNTIME] Introduce runtime boxed types for int/float/bool (#16183)"

This reverts commit 5f22be4.
@Lunderberg
Copy link
Contributor Author

First results, looks like the implicit conversions aren't necessarily causing the overhead. I benchmarked constructing a TIR buffer, with the shape/strides either provided as a

On main (after the FFI changes), the runtime is about the same, regardless of how the shape/strides are provided.
image

On 5a67a00bcb (last commit before the FFI changes), it's about 80% slower 's about a 40% overhead when providing python integers ([1,2,3]) as compared to providing TIR IntImm ([T.int32(1), T.int32(2), T.int32(3)]).

image

This is rather surprising, because I expected the overhead to be present after the FFI changes, but this benchmark suggests that there's less overhead.

Next up, seeing how the FFI changes impact Relax R.call_tir operators.

@Lunderberg
Copy link
Contributor Author

Hmm, the same thing is occurring for this test case. I made a Relax module that computes x = subtract_one(add_one(x)), for a large number of iterations. (10/100/1000 iterations in the benchmarks shown below.)

(Current main, after the FFI changes)
image

(Before the FFI changes)
image

Since the effect isn't showing up for any smaller benchmarks, next for this investigation will probably need to be a full model execution.

@vinx13
Copy link
Member

vinx13 commented Aug 7, 2024

Here is an example, the number of weights are large enough to see the difference (1.2ms vs 6.2ms) https://gist.github.com/vinx13/ea7a8c785d8d0d5ae5318e8ace085db2

srkreddy1238 pushed a commit to srkreddy1238/tvm that referenced this pull request Aug 8, 2024
@Lunderberg
Copy link
Contributor Author

Thank you for the example, @vinx13, and I can reproduce the difference with and without the FFI changes. My numbers roughly agree with yours (1.1 ms vs 5.2 ms), so I can start narrowing down on the cause.

@Lunderberg
Copy link
Contributor Author

FOUND IT!

The root cause was in the type validation, but ended up being pretty easily solveable. When converting from the untyped ArrayNode* to the typed Array<T>, we validate that each ObjectRef in the array has the correct type T. However, because each element in the untyped ArrayNode* must still be some type of ObjectRef, this check is unnecessary.

In the test case, this performance overhead largely occurred during the calls to "vm.builtin.tuple_getitem". For each call, the validation function iterated over the entire Array<ObjectRef> argument. This overhead can be removed by adding a compile-time check on the type parameter's type. I've updated the conversions for both Array<T> and Map<T,U> to skip type checks that will always succeed.

With this addition, the FFI update went from being 5x slower than baseline (increase from 1 ms to 5 ms) to being 30% faster than baseline (decrease from 1ms to 700 us). I'll make a PR shortly that re-applied the FFI changes, with the additional fix.

Lunderberg added a commit to Lunderberg/tvm that referenced this pull request Aug 8, 2024
Initially introduced in apache#16183,
these changes were reverted in
apache#17252 due to performance
degredation in some Relax models.  This could occur when a model
contained a large number of calls to `"vm.builtin.tuple_getitem"`,
which may occur when model weights are provided as a tuple.

This PR re-applies the changes from
apache#16183, but with the performance
degredation resolved.  The root cause was unnecessary type-checking
when converting from an untyped `tvm::ArrayNode*` to the typed
`tvm::Array<T>`, in the case where `T` is `ObjectRef`.
vinx13 added a commit to vinx13/tvm that referenced this pull request Aug 9, 2024
tqchen pushed a commit that referenced this pull request Aug 12, 2024
* Revert "Revert "[FFI][RUNTIME] Introduce runtime boxed types for int/float/bool" (#17252)"

This reverts commit 11be832.

* [FFI] Re-introduce the boxed primitive values

Initially introduced in #16183,
these changes were reverted in
#17252 due to performance
degredation in some Relax models.  This could occur when a model
contained a large number of calls to `"vm.builtin.tuple_getitem"`,
which may occur when model weights are provided as a tuple.

This PR re-applies the changes from
#16183, but with the performance
degredation resolved.  The root cause was unnecessary type-checking
when converting from an untyped `tvm::ArrayNode*` to the typed
`tvm::Array<T>`, in the case where `T` is `ObjectRef`.

* Correct typo from T to U
Lunderberg added a commit to Lunderberg/tvm that referenced this pull request Aug 19, 2024
This is a follow-up to apache#16183, which
added handling of boolean values in the TVM FFI.  The initial
implementation added both a new type code (`kTVMArgBool`) and a new
`TVMValue::v_bool` variant.  This commit removes the
`TVMValue::v_bool` variant, since the `kTVMArgBool` type code is
sufficient to handle boolean arguments.

Removing the `TVMValue::v_bool` variant also makes all `TVMValue`
variants be 64-bit (assuming a 64-bit CPU).  This can simplify
debugging in some cases, since it prevents partial values from
inactive variants from being present in memory.
tqchen pushed a commit that referenced this pull request Aug 22, 2024
)

* [FFI][Runtime] Use TVMValue::v_int64 to represent boolean values

This is a follow-up to #16183, which
added handling of boolean values in the TVM FFI.  The initial
implementation added both a new type code (`kTVMArgBool`) and a new
`TVMValue::v_bool` variant.  This commit removes the
`TVMValue::v_bool` variant, since the `kTVMArgBool` type code is
sufficient to handle boolean arguments.

Removing the `TVMValue::v_bool` variant also makes all `TVMValue`
variants be 64-bit (assuming a 64-bit CPU).  This can simplify
debugging in some cases, since it prevents partial values from
inactive variants from being present in memory.

* Update MakePackedAPI, less special handling required for boolean
Lunderberg added a commit to Lunderberg/tvm that referenced this pull request Aug 26, 2024
This is a follow-up PR to apache#16183,
which updated the FFI with explicit integer types.  As part of that
change, many internal functions were updated to accept non-IR
types (e.g. `Array<runtime::Int>` instead of `Array<IntImm>`).  For
backwards compatibility with callees that provided the IR types, a
specialization of `PackedFuncValueConverter` unwrapped the `IntImm`
into a `runtime::Int`.

This commit removes the backwards-compatibility specialization of
`PackedFuncValueConverter`.  Breakages that are found in CI as a
result will then be updated at the caller side, removing the need for
the backwards-compatibility handler altogether.
Lunderberg added a commit to Lunderberg/tvm that referenced this pull request Sep 16, 2024
This is a follow-up PR to apache#16183,
which updated the FFI with explicit integer types.  As part of that
change, many internal functions were updated to accept non-IR
types (e.g. `Array<runtime::Int>` instead of `Array<IntImm>`).  For
backwards compatibility with callees that provided the IR types, a
specialization of `PackedFuncValueConverter` unwrapped the `IntImm`
into a `runtime::Int`.

This commit removes the backwards-compatibility specialization of
`PackedFuncValueConverter`.  Breakages that are found in CI as a
result will then be updated at the caller side, removing the need for
the backwards-compatibility handler altogether.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
runtime:c++ C++ runtime: src/runtime, excluding crt, RPC, object system, ffi.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants