-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-37848: [C++][Gandiva] Migrate LLVM JIT engine from MCJIT to ORC v2/LLJIT #39098
Conversation
|
|
||
jit_builder.setJITTargetMachineBuilder(std::move(jtmb)); | ||
if (object_cache.has_value()) { | ||
jit_builder.setCompileFunctionCreator( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the LLJIT API for customizing the object cache.
if (exec_engine == nullptr) { | ||
return Status::CodeGenError("Could not instantiate llvm::ExecutionEngine: ", | ||
builder_error); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move the logic into GetTargetMachineBuilder
and BuildJIT
functions.
std::unique_ptr<llvm::TargetMachine> machine( | ||
target->createTargetMachine(target_triple, cpu, features.getString(), {}, {})); | ||
|
||
module->setDataLayout(machine->createDataLayout()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Part of the logic here is moved into GetTargetMachineBuilder
function.
The setDataLayout
is called in line 265 within VerifyAndLinkModule
function.
f06a8cf
to
a2cbdd5
Compare
auto types = engine->types(); | ||
llvm::IRBuilder<>* builder = engine->ir_builder(); | ||
llvm::LLVMContext* context = engine->context(); | ||
std::string BuildVecAdd(Engine* gdv_engine) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The engine
param is renamed to gdv_engine
since it shadows the engine
member variable in this class which introduces unnecessary confusion here
cpp/src/gandiva/engine.h
Outdated
/// \return arrow::Result containing the created engine | ||
static Result<std::unique_ptr<Engine>> Make( | ||
const std::shared_ptr<Configuration>& config, bool cached, | ||
std::optional<std::reference_wrapper<GandivaObjectCache>> object_cache = std::nullopt); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this PR has to change the Engine::Make
and LLVMGenerator::Make
signatures anyway, I change the factory method to return an arrow::Result
instead of arrow::Status
so that it is simpler to use and the optional parameter like object_cache
can be used.
82ac6a2
to
83b1980
Compare
python/pyarrow/gandiva.pyx
Outdated
@@ -186,10 +186,6 @@ cdef class Projector(_Weakrefable): | |||
self.pool = pool | |||
return self | |||
|
|||
@property | |||
def llvm_ir(self): | |||
return self.projector.get().DumpIR().decode() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have to remove the llvm_ir
API from python binding, since the Configuration
is not exposed via the make_projector
and make_filter
APIs, but we may consider adding it back if we can expose the Configuration
as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we keep this? This was introduced by #6417 .
There are still two checks failing, it is likely they failed for the same reason.
The cause is the |
For one of the failed CI check [1], it reported:
Do you have any idea how I can view the log file and the core dump file stack? Thanks. [1] https://github.com/apache/arrow/actions/runs/7115811560/job/19372808518?pr=39098 |
d055df3
to
994d83c
Compare
Are you working on macOS, right? (I'll try it later on my local Linux machine.) |
Thanks. I will give it a try later. I found that the crash in |
@kou about the Docker approach, do you have any idea why the Docker image tries to use HTTP_PROXY/HTTPS_PROXY |
Oh, really? How did you confirm it? $ docker run -it --rm amd64/ubuntu:20.04
root@6d3439e4a60d:/# env
HOSTNAME=6d3439e4a60d
PWD=/
HOME=/root
LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.zst=01;31:*.tzst=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.wim=01;31:*.swm=01;31:*.dwm=01;31:*.esd=01;31:*.jpg=01;35:*.jpeg=01;35:*.mjpg=01;35:*.mjpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.m4a=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.oga=00;36:*.opus=00;36:*.spx=00;36:*.xspf=00;36:
TERM=xterm
SHLVL=1
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
_=/usr/bin/env |
Please ignore my comment above about the proxy stuff. It is likely I set up such proxy long time ago system wide for Docker but forgot it completely after a long time 🤦 |
std::string fn_name = "codegen"; | ||
// LLVM 10 doesn't like the expr function name to be the same as the module name when | ||
// LLJIT is used | ||
std::string fn_name = "llvm_gen_test_add_expr"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a weird behavior of LLVM LLJIT. The LLVM module is created with name codegen
, if later a function is created with the same name codegen
, it can be successfully created, when looking up its function pointer, a successful symbol will be returned, however, the function address inside the symbol is 0, making it a nullptr and cannot be used.
- This issue happened for LLVM 10 (not sure if it is Linux specific), and the same code doesn't have such an issue when using LLVM 14.0.6 on macOS
- This issue is specific for ORC v2/LLJIT, and MCJIT doesn't have such an issue
To workaround this issue, I have to:
- rename the function name in the test case to be more specific and avoid conflict
- rename the LLVM module name to be more specific and add some random suffix (the object's address) to avoid conflict, and it is very unlikely to encounter such a conflict in reality after this change
UPDATE:
I reported a bug to LLVM, llvm/llvm-project#74903
…ing IR can be done.
d3deae2
to
db819dc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
@jorisvandenbossche Could you take a look at Cython changes? @js8544 Do you want to review this before we merge this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I do have a question regarding the API changes.
b484dfb
to
8b88f6d
Compare
After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit 83cba25. There were no benchmark performance regressions. 🎉 The full Conbench report has more details. It also includes information about 4 possible false positives for unstable benchmarks that are known to sometimes produce them. |
…ORC v2/LLJIT (apache#39098) ### Rationale for this change Gandiva currently employs MCJIT as its internal JIT engine. However, LLVM has introduced a newer JIT API known as ORC v2/LLJIT since LLVM 7.0, and it has several advantage over MCJIT, in particular, MCJIT is not actively maintained, and is slated for eventual deprecation and removal. ### What changes are included in this PR? * This PR replaces the MCJIT JIT engine with the ORC v2 engine, using the `LLJIT` API. * This PR adds a new JIT linker option `JITLink` (https://llvm.org/docs/JITLink.html), which can be used together with `LLJIT`, for LLVM 14+ on Linux/macOS platform. It is turned off by default but could be turned on with environment variable `GANDIVA_USE_JIT_LINK` ### Are these changes tested? Yes, they are covered by existing unit tests ### Are there any user-facing changes? * `Configuration` class has a new option called `dump_ir`. If users would like to call `DumpIR` API of `Projector` and `Filter`, they have to set the `dump_ir` option first. * Closes: apache#37848 Authored-by: Yue Ni <[email protected]> Signed-off-by: Sutou Kouhei <[email protected]>
…ORC v2/LLJIT (apache#39098) ### Rationale for this change Gandiva currently employs MCJIT as its internal JIT engine. However, LLVM has introduced a newer JIT API known as ORC v2/LLJIT since LLVM 7.0, and it has several advantage over MCJIT, in particular, MCJIT is not actively maintained, and is slated for eventual deprecation and removal. ### What changes are included in this PR? * This PR replaces the MCJIT JIT engine with the ORC v2 engine, using the `LLJIT` API. * This PR adds a new JIT linker option `JITLink` (https://llvm.org/docs/JITLink.html), which can be used together with `LLJIT`, for LLVM 14+ on Linux/macOS platform. It is turned off by default but could be turned on with environment variable `GANDIVA_USE_JIT_LINK` ### Are these changes tested? Yes, they are covered by existing unit tests ### Are there any user-facing changes? * `Configuration` class has a new option called `dump_ir`. If users would like to call `DumpIR` API of `Projector` and `Filter`, they have to set the `dump_ir` option first. * Closes: apache#37848 Authored-by: Yue Ni <[email protected]> Signed-off-by: Sutou Kouhei <[email protected]>
…ORC v2/LLJIT (apache#39098) ### Rationale for this change Gandiva currently employs MCJIT as its internal JIT engine. However, LLVM has introduced a newer JIT API known as ORC v2/LLJIT since LLVM 7.0, and it has several advantage over MCJIT, in particular, MCJIT is not actively maintained, and is slated for eventual deprecation and removal. ### What changes are included in this PR? * This PR replaces the MCJIT JIT engine with the ORC v2 engine, using the `LLJIT` API. * This PR adds a new JIT linker option `JITLink` (https://llvm.org/docs/JITLink.html), which can be used together with `LLJIT`, for LLVM 14+ on Linux/macOS platform. It is turned off by default but could be turned on with environment variable `GANDIVA_USE_JIT_LINK` ### Are these changes tested? Yes, they are covered by existing unit tests ### Are there any user-facing changes? * `Configuration` class has a new option called `dump_ir`. If users would like to call `DumpIR` API of `Projector` and `Filter`, they have to set the `dump_ir` option first. * Closes: apache#37848 Authored-by: Yue Ni <[email protected]> Signed-off-by: Sutou Kouhei <[email protected]>
…CJIT to ORC v2/LLJIT (apache#39098)" This reverts commit 83cba25.
…e from MCJIT to ORC v2/LLJIT (apache#39098)"" This reverts commit 2f34923.
…CJIT to ORC v2/LLJIT (apache#39098)" This reverts commit 83cba25.
…CJIT to ORC v2/LLJIT (apache#39098)" This reverts commit 83cba25.
…CJIT to ORC v2/LLJIT (apache#39098)" This reverts commit 83cba25.
…CJIT to ORC v2/LLJIT (apache#39098)" This reverts commit 83cba25.
…CJIT to ORC v2/LLJIT (apache#39098)" This reverts commit 83cba25.
Rationale for this change
Gandiva currently employs MCJIT as its internal JIT engine. However, LLVM has introduced a newer JIT API known as ORC v2/LLJIT since LLVM 7.0, and it has several advantage over MCJIT, in particular, MCJIT is not actively maintained, and is slated for eventual deprecation and removal.
What changes are included in this PR?
LLJIT
API.JITLink
(https://llvm.org/docs/JITLink.html), which can be used together withLLJIT
, for LLVM 14+ on Linux/macOS platform. It is turned off by default but could be turned on with environment variableGANDIVA_USE_JIT_LINK
Are these changes tested?
Yes, they are covered by existing unit tests
Are there any user-facing changes?
Configuration
class has a new option calleddump_ir
. If users would like to callDumpIR
API ofProjector
andFilter
, they have to set thedump_ir
option first.