GH-37753: [C++][Gandiva] Add external function registry support #38116

Merged on Nov 8, 2023 (17 commits)

niyue
Contributor

@niyue niyue commented Oct 7, 2023

Rationale for this change

This PR enhances Gandiva by adding support for an external function registry, so that developers can author third-party functions without modifying Gandiva's core codebase. See #37753 for more details. In this PR, an external function must be compiled into LLVM IR to be integrated.

What changes are included in this PR?

Two new APIs are added to FunctionRegistry:

  /// \brief register a set of functions into the function registry from a given bitcode
  /// file
  arrow::Status Register(const std::vector<NativeFunction>& funcs,
                         const std::string& bitcode_path);

  /// \brief register a set of functions into the function registry from a given bitcode
  /// buffer
  arrow::Status Register(const std::vector<NativeFunction>& funcs,
                         std::shared_ptr<arrow::Buffer> bitcode_buffer);

Developers can use these two APIs to register external functions. Typically, developers register the function metadata (funcs) for all functions in an LLVM bitcode file, giving either the path to the bitcode file or an arrow::Buffer containing the bitcode.
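
As a minimal sketch of what an external function might look like before it is turned into bitcode, consider the source file below. The function name, its signature, and the file names in the command are illustrative assumptions, not taken from this PR. With a standard clang toolchain, a file like this can be compiled to LLVM bitcode with something like `clang++ -emit-llvm -c multiply_by_two.cc -o multiply_by_two.bc`, and the resulting `.bc` path (or its contents in a buffer) is what the two Register overloads above expect.

```cpp
#include <cstdint>

// Hypothetical external function; the name and signature are illustrative.
// C linkage keeps the symbol name unmangled, so the name recorded in the
// function metadata can match the symbol in the bitcode.
extern "C" int64_t multiply_by_two_int32(int32_t value) {
  // Widen before multiplying so large int32 inputs do not overflow.
  return static_cast<int64_t>(value) * 2;
}
```

The matching function metadata (a NativeFunction entry) still has to describe the same name, argument types, and return type; that part is not shown here.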

The overall flow looks like this:
image

Are these changes tested?

Unit tests are added to verify this enhancement.

Are there any user-facing changes?

This PR adds some new ways to interface with the library:

  • The Configuration class now accepts a customized function registry: developers can register their own external functions in a registry and have Gandiva use it as the function registry.
  • The FunctionRegistry class has the two new APIs mentioned above.
  • A newly instantiated FunctionRegistry no longer has any built-in functions registered in it. A new function, GANDIVA_EXPORT std::shared_ptr<FunctionRegistry> default_function_registry();, retrieves the default function registry, which contains all the Gandiva built-in functions.
    • The behavior of libraries that depend on the Gandiva C++ library changes accordingly, for example the Gandiva::FunctionRegistry class in Gandiva's Ruby binding.
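
The behavior change described in the list above can be modeled with a small, self-contained sketch. Everything here is a toy stand-in (ToyRegistry, toy_default_registry, ToyConfiguration are hypothetical names, not Gandiva classes), and the fallback to the default registry in the toy configuration is an assumption made for illustration.

```cpp
#include <cstddef>
#include <memory>
#include <set>
#include <string>
#include <utility>

// Toy stand-in for a function registry: it starts out empty.
class ToyRegistry {
 public:
  void Register(const std::string& name) { names_.insert(name); }
  bool Contains(const std::string& name) const { return names_.count(name) > 0; }
  std::size_t size() const { return names_.size(); }

 private:
  std::set<std::string> names_;
};

// Plays the role of a "default registry" accessor: a shared instance that
// is populated with built-in functions exactly once.
std::shared_ptr<ToyRegistry> toy_default_registry() {
  static std::shared_ptr<ToyRegistry> registry = [] {
    auto r = std::make_shared<ToyRegistry>();
    r->Register("add");
    r->Register("multiply");
    return r;
  }();
  return registry;
}

// Toy configuration that accepts a custom registry and (by assumption)
// falls back to the default one when none is given.
class ToyConfiguration {
 public:
  explicit ToyConfiguration(
      std::shared_ptr<ToyRegistry> registry = toy_default_registry())
      : registry_(std::move(registry)) {}
  const ToyRegistry& registry() const { return *registry_; }

 private:
  std::shared_ptr<ToyRegistry> registry_;
};
```

In this model, code that previously relied on a fresh registry containing the built-ins has to switch to the default-registry accessor, which is the kind of adjustment the Ruby binding note refers to.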

Notes

  • Performance
    • Code generation time grows with the number of externally added function bitcodes (the more functions are added, the slower codegen becomes), even if an external function is not used in the given expression at all. This is not a new issue; it applies to built-in functions as well (the more built-in functions there are, the slower codegen becomes). In my limited testing, the cause is that llvm::Linker::linkModule takes a non-trivial amount of time and runs for every IR loaded, while RemoveUnusedFunctions runs only after that, so it does not reduce the time spent in linkModule. We may have to load only the necessary IR (primarily by calling linkModule only for that IR), but more metadata may be needed to tell which functions can be found in which IR. This could be improved in a separate PR; please advise if anyone has ideas for improving it. Thanks.
  • Integration with other programming languages via LLVM IR/bitcode
    • So far I have only added an external C++ function to the codebase, for unit testing purposes. A Rust-based function is possible, but when I tried it I found another issue (Rust has a std lib that needs to be processed with a different approach). I will explore other languages such as Zig later.
    • Non-pre-compiled functions may require a different approach to get the function pointer; we may discuss and work on that in a separate PR later. Another issue, [C++][Gandiva] Allow registering external C functions #38589, was logged for this.
  • The discussion thread on the dev mailing list: https://lists.apache.org/thread/lm4sbw61w9cl7fsmo7tz3gvkq0ox6rod
  • Closes: [C++][Gandiva] Support external function registry #37753
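
The selective-loading idea from the performance note can be sketched as follows. This is only an illustration of the extra metadata it would need (a function-to-module index); the type and function names are hypothetical, and nothing like this is part of the PR.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical metadata: which bitcode module defines each function.
using FunctionToModule = std::map<std::string, std::string>;

// Returns the modules that would have to be linked for an expression that
// uses the given functions. Modules that only define unused functions are
// skipped, so no link step would be spent on them.
std::set<std::string> ModulesToLink(
    const FunctionToModule& index,
    const std::vector<std::string>& used_functions) {
  std::set<std::string> modules;
  for (const auto& fn : used_functions) {
    auto it = index.find(fn);
    if (it != index.end()) {
      modules.insert(it->second);
    }
  }
  return modules;
}
```

A real implementation would also have to account for functions in one module calling functions in another, which this sketch ignores.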

@niyue
Contributor Author

niyue commented Oct 8, 2023

@kou this PR is ready for review. The failing CI checks don't seem to be related to my change. Could you please help? Thanks.

github-actions bot added the awaiting changes label and removed the awaiting review label Oct 10, 2023
github-actions bot added the awaiting change review label and removed the awaiting changes label Oct 12, 2023
@js8544
Collaborator

js8544 commented Oct 12, 2023

@pitrou Do you also want to take a look at this new API design?

github-actions bot added the awaiting changes label and removed the awaiting change review label Oct 12, 2023
@js8544
Collaborator

js8544 commented Oct 13, 2023

It seems simpler for users to call just one API because I don't see any use scenario that users would only call one of the two APIs (FunctionRegistry::Add, LLVMExternalIRStore::Add)

For now, yes, but the LLVM JIT engine also supports calling function pointers directly, without the need for IR. Gandiva already uses this mechanism (it calls them function stubs). For example, Gandiva functions like random and regexp_replace are not precompiled to IR.

If a user wants to add a function to Gandiva via function pointers, not precompiled IR, they will need to:

  1. Add the signature to the function registry with the flag kNeedsFunctionHolder.
  2. Inherit from FunctionHolder and implement their function in operator(). For example, RandomGeneratorHolder.
  3. Wrap it in a Gandiva standard form. For example, gdv_fn_random.
  4. Register this exported function to ExportedFuncsRegistry.
  5. Register their holder to FunctionHolderRegistry.

So with this approach, a user only needs to call FunctionRegistry::Add and a bunch of other registrations, but not LLVMExternalIRStore::Add, which is why I suggest we separate the APIs. And as you mentioned there is also the dynamic library approach which also doesn't require IRs, but IMO it won't be as convenient as using function stubs.
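
To make the holder-plus-stub pattern in steps 2 and 3 concrete, here is a minimal self-contained sketch. CounterHolder and toy_fn_counter are hypothetical names chosen for illustration; the sketch only shows the general shape (a stateful object with operator(), and a C-linkage wrapper that receives the object's address as an integer), not Gandiva's actual classes or registration calls.

```cpp
#include <cstdint>

// Toy holder: carries state that must survive across calls, the way a
// function holder would for a stateful function.
class CounterHolder {
 public:
  explicit CounterHolder(int64_t start) : next_(start) {}
  // The function's logic lives in operator().
  int64_t operator()() { return next_++; }

 private:
  int64_t next_;
};

// Stub with C linkage. It takes the holder's address as a plain integer,
// which is a convenient way to pass an object through generated code, and
// forwards the call to the holder. Assumes pointers fit in 64 bits.
extern "C" int64_t toy_fn_counter(int64_t holder_ptr) {
  auto* holder = reinterpret_cast<CounterHolder*>(holder_ptr);
  return (*holder)();
}
```

The remaining steps (registering the signature, the exported stub, and the holder) are library-specific and are left out of this sketch.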
cc @kou @niyue

@kou
Member

kou commented Oct 13, 2023

Thanks. I didn't know the existing mechanism.

It seems that we need to extend the existing FunctionHolderRegistry (e.g. by adding FunctionHolderRegistry::Add()) to support an external function that is implemented as a normal function (not precompiled IR). Users would then need to call FunctionRegistry::Add(), ExportedFuncsRegistry::Register() (ah, we may want to rename FunctionRegistry::Add() to FunctionRegistry::Register()) and FunctionHolderRegistry::Add(). (Note that this is out of scope for this PR.)

For that case, I propose that we add using FunctionHolderMaker = std::add_pointer<result<FunctionHolder>(const FunctionNode&)>::type; FunctionRegistry::Add(std::vector<NativeFunction> funcs, FunctionHolderMaker maker) (an ExportedFuncsRegistry-related argument may be needed too) instead of extending FunctionHolderRegistry.
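
As a small illustration of what such an alias means, the sketch below applies std::add_pointer to a function type built from stand-in types (ToyNode, ToyHolder, and ToyHolderMaker are hypothetical and replace the Gandiva types in the proposal). The alias is simply a pointer to a function with that signature.

```cpp
#include <string>
#include <type_traits>

// Stand-ins for the node and holder types in the proposal.
struct ToyNode {
  std::string name;
};
struct ToyHolder {
  std::string built_for;
};

// Same shape as the proposed alias: a pointer to a function that builds a
// holder from a node.
using ToyHolderMaker = std::add_pointer<ToyHolder(const ToyNode&)>::type;

// The alias is exactly the plain function-pointer type.
static_assert(
    std::is_same<ToyHolderMaker, ToyHolder (*)(const ToyNode&)>::value,
    "add_pointer on a function type yields a function pointer");

// A free function with a matching signature can be used as a maker.
ToyHolder MakeToyHolder(const ToyNode& node) { return ToyHolder{node.name}; }
```

One consequence of using a plain function pointer is that free functions and captureless lambdas can serve as makers, while lambdas with captures cannot.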

@niyue
Contributor Author

niyue commented Oct 16, 2023

as you mentioned there is also the dynamic library approach which also doesn't require IRs, but IMO it won't be as convenient as using function stubs

I see. I am aware of the function stubs approach, but previously I didn't realize this could be the direct API for addressing the non pre-compiled function registration issue.

Since I've already spotted some performance issues with large IR functions (when constructing LLVM modules), I am interested in later making this a public, official approach for extending Gandiva with non-pre-compiled functions. Instead of rushing to merge this PR, I would like to spend more time making the registration APIs for IR-based functions and non-IR-based functions (the latter are not in the scope of this PR) consistent.

Internally, I think we could still keep function metadata registration as a separate API, but what would be the most meaningful external registration APIs for users? Since users typically need to provide function metadata as well as a function implementation, do we think the following set of APIs is okay for that purpose? (We don't have to put these APIs in the FunctionRegistry class if we think that gives it too much responsibility.)

  1. for pre-compiled IR functions, FunctionRegistry::Add(std::vector<NativeFunction> funcs, const std::string& bitcode_path)
  2. for non pre-compiled functions, FunctionRegistry::Add(std::vector<NativeFunction> funcs, FunctionHolderMaker maker)
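
The idea of registering metadata and implementation together through two overloads can be modeled with a toy registry. All names here (ToySignature, ToyStubHolder, ToyMaker, ToyImplRegistry) are hypothetical stand-ins, and the sketch says nothing about how the real registry stores its entries.

```cpp
#include <map>
#include <string>
#include <variant>
#include <vector>

struct ToySignature {  // stand-in for function metadata
  std::string name;
};
struct ToyStubHolder {};                 // stand-in for a function holder
using ToyMaker = ToyStubHolder (*)();    // stand-in for a holder maker

class ToyImplRegistry {
 public:
  // Overload 1: functions whose implementation lives in pre-compiled IR.
  void Add(const std::vector<ToySignature>& funcs,
           const std::string& bitcode_path) {
    for (const auto& f : funcs) impl_[f.name] = bitcode_path;
  }

  // Overload 2: functions implemented as normal (non-pre-compiled) code.
  void Add(const std::vector<ToySignature>& funcs, ToyMaker maker) {
    for (const auto& f : funcs) impl_[f.name] = maker;
  }

  // Reports which kind of implementation was registered for a name.
  // Throws std::out_of_range if the name was never registered.
  bool IsPrecompiled(const std::string& name) const {
    return std::holds_alternative<std::string>(impl_.at(name));
  }

 private:
  std::map<std::string, std::variant<std::string, ToyMaker>> impl_;
};
```

Either way, a caller supplies the signatures and the implementation source in a single call, which is the core concept being proposed.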

cc @kou @js8544

@js8544
Collaborator

js8544 commented Oct 16, 2023

This looks very reasonable. Having a new class that orchestrates these registration processes ensures both convenience and modularity.

@kou
Member

kou commented Oct 16, 2023

I like the API set. We may need to improve the signatures but the core concept (users can register function metadata and function implementation at once) will not be changed.

github-actions bot added the awaiting change review and awaiting changes labels and removed the awaiting changes and awaiting change review labels Oct 18, 2023
Member

@kou kou left a comment

+1

I've updated the GLib part for the new API.

github-actions bot added the awaiting merge label and removed the awaiting change review label Nov 2, 2023
@niyue
Contributor Author

niyue commented Nov 3, 2023

@kou I've updated the code according to the review comments, please help to see if there is anything else that needs to be revised. Thanks.

Member

@kou kou left a comment

I'll merge this in a few days if nobody objects.

github-actions bot added the awaiting changes label and removed the awaiting merge label Nov 4, 2023
…since developers can guarantee they are correct.
github-actions bot added the awaiting change review label and removed the awaiting changes label Nov 5, 2023
@kou
Member

kou commented Nov 6, 2023

@niyue Could you update the PR description before we merge this?

@niyue
Contributor Author

niyue commented Nov 6, 2023

@kou I've updated the PR description to reflect its current status. But it is not very detailed and we may still have additional documentation for describing the APIs and usages for this. Please let me know if there is anything in the description that needs to be updated.

@kou
Member

kou commented Nov 6, 2023

Thanks!

we may still have additional documentation for describing the APIs and usages for this.

OK. Could you open a new issue for this to defer the documentation task?

@niyue
Contributor Author

niyue commented Nov 6, 2023

Could you open a new issue for this to defer the documentation task?

Sure. I submitted #38594

@kou
Member

kou commented Nov 8, 2023

I'll merge this.

@kou kou merged commit bbb610e into apache:main Nov 8, 2023
37 of 38 checks passed
kou removed the awaiting change review label Nov 8, 2023

After merging your PR, Conbench analyzed the 5 benchmarking runs that have been run so far on merge-commit bbb610e.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 1 possible false positive for unstable benchmarks that are known to sometimes produce them.

JerAguilon pushed a commit to JerAguilon/arrow that referenced this pull request Nov 9, 2023
…apache#38116)

# Rationale for this change

This PR enhances Gandiva by adding support for an external function registry, so that developers can author third-party functions without modifying Gandiva's core codebase. See apache#37753 for more details. In this PR, an external function must be compiled into LLVM IR to be integrated.

# What changes are included in this PR?
Two new APIs are added to `FunctionRegistry`:
```C++
  /// \brief register a set of functions into the function registry from a given bitcode
  /// file
  arrow::Status Register(const std::vector<NativeFunction>& funcs,
                         const std::string& bitcode_path);

  /// \brief register a set of functions into the function registry from a given bitcode
  /// buffer
  arrow::Status Register(const std::vector<NativeFunction>& funcs,
                         std::shared_ptr<arrow::Buffer> bitcode_buffer);
```
Developers can use these two APIs to register external functions. Typically, developers register the function metadata (`funcs`) for all functions in an LLVM bitcode file, giving either the path to the bitcode file or an `arrow::Buffer` containing the bitcode.

The overall flow looks like this:
![image](https://github.com/apache/arrow/assets/27754/b2b346fe-931f-4253-b198-4c388c57a56b)

# Are these changes tested?

Unit tests are added to verify this enhancement.

# Are there any user-facing changes?

This PR adds some new ways to interface with the library:
* The `Configuration` class now accepts a customized function registry: developers can register their own external functions in a registry and have Gandiva use it as the function registry.
* The `FunctionRegistry` class has the two new APIs mentioned above.
* A newly instantiated `FunctionRegistry` no longer has any built-in functions registered in it. A new function, `GANDIVA_EXPORT std::shared_ptr<FunctionRegistry> default_function_registry();`, retrieves the default function registry, which contains all the Gandiva built-in functions.
    * The behavior of libraries that depend on the Gandiva C++ library changes accordingly, for example the `Gandiva::FunctionRegistry` class in Gandiva's Ruby binding.

# Notes
* Performance
    * Code generation time grows with the number of externally added function bitcodes (the more functions are added, the slower codegen becomes), even if an external function is not used in the given expression at all. This is not a new issue; it applies to built-in functions as well (the more built-in functions there are, the slower codegen becomes). In my limited testing, the cause is that `llvm::Linker::linkModule` takes a non-trivial amount of time and runs for every IR loaded, while `RemoveUnusedFunctions` runs only after that, so it does not reduce the time spent in `linkModule`. We may have to load only the necessary IR (primarily by calling `linkModule` only for that IR), but more metadata may be needed to tell which functions can be found in which IR. This could be improved in a separate PR; please advise if anyone has ideas for improving it. Thanks.
* Integration with other programming languages via LLVM IR/bitcode
    * So far I have only added an external C++ function to the codebase, for unit testing purposes. A Rust-based function is possible, but when I tried it I found another issue (Rust has a std lib that needs to be processed with a different approach). I will explore other languages such as Zig later.
    * Non-pre-compiled functions may require a different approach to get the function pointer; we may discuss and work on that in a separate PR later. Another issue, apache#38589, was logged for this.
* The discussion thread on the dev mailing list: https://lists.apache.org/thread/lm4sbw61w9cl7fsmo7tz3gvkq0ox6rod
     * I previously submitted another PR (apache#37787) that introduced a JSON-based function registry; after discussion, I will close that PR and use this one instead.
* Closes: apache#37753

Lead-authored-by: Yue Ni <[email protected]>
Co-authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
loicalleyne pushed a commit to loicalleyne/arrow that referenced this pull request Nov 13, 2023
…apache#38116)

kou pushed a commit that referenced this pull request Nov 17, 2023
…8632)

### Rationale for this change
This PR enhances Gandiva by supporting the registration of external C functions in its function registry, so that developers can author third-party functions with complex dependencies and expose them as C functions for use in Gandiva expressions. See GH-38589 for more details.

### What changes are included in this PR?
This PR primarily adds a new API to the `FunctionRegistry` so that developers can use it to register external C functions:
```C++
arrow::Status Register(
      NativeFunction func, void* c_function_ptr,
      std::optional<FunctionHolderMaker> function_holder_maker = std::nullopt);
```

### Are these changes tested?
* The changes are tested via unit tests in this PR. The unit tests include several C functions written in C++, and they confirm that such functions can be used by Gandiva after being registered with the new API mentioned above.
* Additionally, I wrote some Rust-based functions locally, integrated them into a C++ program using the new registration API, and verified that this approach works. That work is not included in this PR.

### Are there any user-facing changes?
Several new APIs are added to the `FunctionRegistry` class:
```C++
  /// \brief register a C function into the function registry
  /// @param func the registered function's metadata
  /// @param c_function_ptr the function pointer to the
  /// registered function's implementation
  /// @param function_holder_maker this will be used as the function holder if the
  /// function requires a function holder
  arrow::Status Register(
      NativeFunction func, void* c_function_ptr,
      std::optional<FunctionHolderMaker> function_holder_maker = std::nullopt);

  /// \brief get a list of C functions saved in the registry
  const std::vector<std::pair<NativeFunction, void*>>& GetCFunctions() const;

  const FunctionHolderMakerRegistry& GetFunctionHolderMakerRegistry() const;
```
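
As a rough illustration of the type-erasure idea behind a `void* c_function_ptr` parameter, the sketch below stores a C function's address as `void*` next to a name and later casts it back to its real type before calling it. The names (`toy_add_one`, `ToyCFunctions`, `RegisterToyCFunction`, `CallInt32Unary`) are hypothetical, and the string key stands in for the function metadata; this is not the registry's actual implementation.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical C function to expose; C linkage keeps the symbol unmangled.
extern "C" int32_t toy_add_one(int32_t x) { return x + 1; }

// Toy list of (name, erased function pointer) pairs.
using ToyCFunctions = std::vector<std::pair<std::string, void*>>;

// Records the function under a name; the pointer's real type is erased.
void RegisterToyCFunction(ToyCFunctions& registry, std::string name, void* fn) {
  registry.emplace_back(std::move(name), fn);
}

// Restores the real signature before calling. The caller must know the
// signature; in a real registry that knowledge comes from the metadata.
// Converting between function pointers and void* is conditionally
// supported in C++, but works on mainstream platforms.
int32_t CallInt32Unary(void* fn, int32_t arg) {
  auto typed = reinterpret_cast<int32_t (*)(int32_t)>(fn);
  return typed(arg);
}
```

Because the pointer carries no type information, a mismatch between the declared metadata and the real signature cannot be detected at this level, so the metadata has to be written carefully.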

* Closes: #38589

### Notes
* This PR is related to #38116, which adds the initial support for registering LLVM IR based external functions into Gandiva.

Authored-by: Yue Ni <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
dgreiss pushed a commit to dgreiss/arrow that referenced this pull request Feb 19, 2024
…apache#38116)

dgreiss pushed a commit to dgreiss/arrow that referenced this pull request Feb 19, 2024
…ns (apache#38632)
