[Python][Gandiva] Could not create LLJIT instance: Symbols not found: [ llvm_orc_registerEHFrameSectionWrapper ] #39695

Open
kou opened this issue Jan 19, 2024 · 12 comments

Comments

Member

kou commented Jan 19, 2024

Describe the bug, including details regarding any error messages, version, and platform.

I'm not sure whether this is a problem in Gandiva/PyArrow or in my environment.

I'm verifying Apache Arrow 15.0.0 RC1. I got the following failures:

=================================== FAILURES ===================================
____________________________ test_tree_exp_builder _____________________________

    @pytest.mark.gandiva
    def test_tree_exp_builder():
        import pyarrow.gandiva as gandiva
    
        builder = gandiva.TreeExprBuilder()
    
        field_a = pa.field('a', pa.int32())
        field_b = pa.field('b', pa.int32())
    
        schema = pa.schema([field_a, field_b])
    
        field_result = pa.field('res', pa.int32())
    
        node_a = builder.make_field(field_a)
        node_b = builder.make_field(field_b)
    
        assert node_a.return_type() == field_a.type
    
        condition = builder.make_function("greater_than", [node_a, node_b],
                                          pa.bool_())
        if_node = builder.make_if(condition, node_a, node_b, pa.int32())
    
        expr = builder.make_expression(if_node, field_result)
    
        assert expr.result().type == pa.int32()
    
        config = gandiva.Configuration(dump_ir=True)
>       projector = gandiva.make_projector(
            schema, [expr], pa.default_memory_pool(), "NONE", config)

pyarrow/tests/test_gandiva.py:51: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
pyarrow/gandiva.pyx:625: in pyarrow.gandiva.make_projector
    cpdef make_projector(Schema schema, children, MemoryPool pool,
pyarrow/gandiva.pyx:664: in pyarrow.gandiva.make_projector
    check_status(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   raise convert_status(status)
E   pyarrow.lib.ArrowException: CodeGenError in Gandiva: Could not create LLJIT instance: Symbols not found: [ llvm_orc_registerEHFrameSectionWrapper ]

pyarrow/error.pxi:91: ArrowException
----------------------------- Captured stderr call -----------------------------
/tmp/arrow-15.0.0.lJ9Vt/apache-arrow-15.0.0/cpp/src/gandiva/cache.cc:50: Creating gandiva cache with capacity of 500
/tmp/arrow-15.0.0.lJ9Vt/apache-arrow-15.0.0/cpp/src/gandiva/engine.cc:265: Detected CPU Name : znver2
/tmp/arrow-15.0.0.lJ9Vt/apache-arrow-15.0.0/cpp/src/gandiva/engine.cc:266: Detected CPU Features: [ +prfchw -cldemote +avx +aes +sahf +pclmul -xop +crc32 +xsaves -avx512fp16 -sm4 +sse4.1 -avx512ifma +xsave -avx512pf +sse4.2 -tsxldtrk -ptwrite -widekl -sm3 -invpcid +64bit +xsavec -avx512vpopcntdq +cmov -avx512vp2intersect -avx512cd +movbe -avxvnniint8 -avx512er -amx-int8 -kl -sha512 -avxvnni -rtm +adx +avx2 -hreset -movdiri -serialize -vpclmulqdq -avx512vl -uintr +clflushopt -raoint -cmpccxadd +bmi -amx-tile +sse -gfni -avxvnniint16 -amx-fp16 +xsaveopt +rdrnd -avx512f -amx-bf16 -avx512bf16 -avx512vnni +cx8 -avx512bw +sse3 -pku +fsgsbase +clzero +mwaitx -lwp +lzcnt +sha -movdir64b +wbnoinvd -enqcmd -prefetchwt1 -avxneconvert -tbm -pconfig -amx-complex +ssse3 +cx16 +bmi2 +fma +popcnt -avxifma +f16c -avx512bitalg +rdpru +clwb +mmx +sse2 +rdseed -avx512vbmi2 -prefetchi +rdpid -fma4 -avx512vbmi -shstk -vaes -waitpkg -sgx +fxsr -avx512dq +sse4a]
__________________________________ test_table __________________________________

    @pytest.mark.gandiva
    def test_table():
        import pyarrow.gandiva as gandiva
    
        table = pa.Table.from_arrays([pa.array([1.0, 2.0]), pa.array([3.0, 4.0])],
                                     ['a', 'b'])
    
        builder = gandiva.TreeExprBuilder()
        node_a = builder.make_field(table.schema.field("a"))
        node_b = builder.make_field(table.schema.field("b"))
    
        sum = builder.make_function("add", [node_a, node_b], pa.float64())
    
        field_result = pa.field("c", pa.float64())
        expr = builder.make_expression(sum, field_result)
    
>       projector = gandiva.make_projector(
            table.schema, [expr], pa.default_memory_pool())

pyarrow/tests/test_gandiva.py:82: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
pyarrow/gandiva.pyx:625: in pyarrow.gandiva.make_projector
    cpdef make_projector(Schema schema, children, MemoryPool pool,
pyarrow/gandiva.pyx:664: in pyarrow.gandiva.make_projector
    check_status(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   raise convert_status(status)
E   pyarrow.lib.ArrowException: CodeGenError in Gandiva: Could not create LLJIT instance: Symbols not found: [ llvm_orc_registerEHFrameSectionWrapper ]

pyarrow/error.pxi:91: ArrowException
_________________________________ test_filter __________________________________

    @pytest.mark.gandiva
    def test_filter():
        import pyarrow.gandiva as gandiva
    
        table = pa.Table.from_arrays([pa.array([1.0 * i for i in range(10000)])],
                                     ['a'])
    
        builder = gandiva.TreeExprBuilder()
        node_a = builder.make_field(table.schema.field("a"))
        thousand = builder.make_literal(1000.0, pa.float64())
        cond = builder.make_function("less_than", [node_a, thousand], pa.bool_())
        condition = builder.make_condition(cond)
    
        assert condition.result().type == pa.bool_()
    
        config = gandiva.Configuration(dump_ir=True)
>       filter = gandiva.make_filter(table.schema, condition, config)

pyarrow/tests/test_gandiva.py:109: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
pyarrow/gandiva.pyx:672: in pyarrow.gandiva.make_filter
    cpdef make_filter(Schema schema, Condition condition,
pyarrow/gandiva.pyx:700: in pyarrow.gandiva.make_filter
    check_status(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   raise convert_status(status)
E   pyarrow.lib.ArrowException: CodeGenError in Gandiva: Could not create LLJIT instance: Symbols not found: [ llvm_orc_registerEHFrameSectionWrapper ]

pyarrow/error.pxi:91: ArrowException
_________________________________ test_in_expr _________________________________

    @pytest.mark.gandiva
    def test_in_expr():
        import pyarrow.gandiva as gandiva
    
        arr = pa.array(["ga", "an", "nd", "di", "iv", "va"])
        table = pa.Table.from_arrays([arr], ["a"])
    
        # string
        builder = gandiva.TreeExprBuilder()
        node_a = builder.make_field(table.schema.field("a"))
        cond = builder.make_in_expression(node_a, ["an", "nd"], pa.string())
        condition = builder.make_condition(cond)
>       filter = gandiva.make_filter(table.schema, condition)

pyarrow/tests/test_gandiva.py:129: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
pyarrow/gandiva.pyx:672: in pyarrow.gandiva.make_filter
    cpdef make_filter(Schema schema, Condition condition,
pyarrow/gandiva.pyx:700: in pyarrow.gandiva.make_filter
    check_status(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   raise convert_status(status)
E   pyarrow.lib.ArrowException: CodeGenError in Gandiva: Could not create LLJIT instance: Symbols not found: [ llvm_orc_registerEHFrameSectionWrapper ]

pyarrow/error.pxi:91: ArrowException
_________________________________ test_boolean _________________________________

    @pytest.mark.gandiva
    def test_boolean():
        import pyarrow.gandiva as gandiva
    
        table = pa.Table.from_arrays([
            pa.array([1., 31., 46., 3., 57., 44., 22.]),
            pa.array([5., 45., 36., 73., 83., 23., 76.])],
            ['a', 'b'])
    
        builder = gandiva.TreeExprBuilder()
        node_a = builder.make_field(table.schema.field("a"))
        node_b = builder.make_field(table.schema.field("b"))
        fifty = builder.make_literal(50.0, pa.float64())
        eleven = builder.make_literal(11.0, pa.float64())
    
        cond_1 = builder.make_function("less_than", [node_a, fifty], pa.bool_())
        cond_2 = builder.make_function("greater_than", [node_a, node_b],
                                       pa.bool_())
        cond_3 = builder.make_function("less_than", [node_b, eleven], pa.bool_())
        cond = builder.make_or([builder.make_and([cond_1, cond_2]), cond_3])
        condition = builder.make_condition(cond)
    
>       filter = gandiva.make_filter(table.schema, condition)

pyarrow/tests/test_gandiva.py:250: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
pyarrow/gandiva.pyx:672: in pyarrow.gandiva.make_filter
    cpdef make_filter(Schema schema, Condition condition,
pyarrow/gandiva.pyx:700: in pyarrow.gandiva.make_filter
    check_status(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   raise convert_status(status)
E   pyarrow.lib.ArrowException: CodeGenError in Gandiva: Could not create LLJIT instance: Symbols not found: [ llvm_orc_registerEHFrameSectionWrapper ]

pyarrow/error.pxi:91: ArrowException
__________________________________ test_regex __________________________________

    @pytest.mark.gandiva
    def test_regex():
        import pyarrow.gandiva as gandiva
    
        elements = ["park", "sparkle", "bright spark and fire", "spark"]
        data = pa.array(elements, type=pa.string())
        table = pa.Table.from_arrays([data], names=['a'])
    
        builder = gandiva.TreeExprBuilder()
        node_a = builder.make_field(table.schema.field("a"))
        regex = builder.make_literal("%spark%", pa.string())
        like = builder.make_function("like", [node_a, regex], pa.bool_())
    
        field_result = pa.field("b", pa.bool_())
        expr = builder.make_expression(like, field_result)
    
>       projector = gandiva.make_projector(
            table.schema, [expr], pa.default_memory_pool())

pyarrow/tests/test_gandiva.py:311: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
pyarrow/gandiva.pyx:625: in pyarrow.gandiva.make_projector
    cpdef make_projector(Schema schema, children, MemoryPool pool,
pyarrow/gandiva.pyx:664: in pyarrow.gandiva.make_projector
    check_status(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   raise convert_status(status)
E   pyarrow.lib.ArrowException: CodeGenError in Gandiva: Could not create LLJIT instance: Symbols not found: [ llvm_orc_registerEHFrameSectionWrapper ]

pyarrow/error.pxi:91: ArrowException
_____________________________ test_filter_project ______________________________

    @pytest.mark.gandiva
    def test_filter_project():
        import pyarrow.gandiva as gandiva
        mpool = pa.default_memory_pool()
        # Create a table with some sample data
        array0 = pa.array([10, 12, -20, 5, 21, 29], pa.int32())
        array1 = pa.array([5, 15, 15, 17, 12, 3], pa.int32())
        array2 = pa.array([1, 25, 11, 30, -21, None], pa.int32())
    
        table = pa.Table.from_arrays([array0, array1, array2], ['a', 'b', 'c'])
    
        field_result = pa.field("res", pa.int32())
    
        builder = gandiva.TreeExprBuilder()
        node_a = builder.make_field(table.schema.field("a"))
        node_b = builder.make_field(table.schema.field("b"))
        node_c = builder.make_field(table.schema.field("c"))
    
        greater_than_function = builder.make_function("greater_than",
                                                      [node_a, node_b], pa.bool_())
        filter_condition = builder.make_condition(
            greater_than_function)
    
        project_condition = builder.make_function("less_than",
                                                  [node_b, node_c], pa.bool_())
        if_node = builder.make_if(project_condition,
                                  node_b, node_c, pa.int32())
        expr = builder.make_expression(if_node, field_result)
    
        # Build a filter for the expressions.
>       filter = gandiva.make_filter(table.schema, filter_condition)

pyarrow/tests/test_gandiva.py:359: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
pyarrow/gandiva.pyx:672: in pyarrow.gandiva.make_filter
    cpdef make_filter(Schema schema, Condition condition,
pyarrow/gandiva.pyx:700: in pyarrow.gandiva.make_filter
    check_status(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   raise convert_status(status)
E   pyarrow.lib.ArrowException: CodeGenError in Gandiva: Could not create LLJIT instance: Symbols not found: [ llvm_orc_registerEHFrameSectionWrapper ]

pyarrow/error.pxi:91: ArrowException

Note that the Gandiva tests written in C++ passed.

Component(s)

C++ - Gandiva, Python

Member Author

kou commented Jan 19, 2024

@niyue Have you seen this error?

Contributor

niyue commented Jan 19, 2024

I haven't seen this issue before. I will try to reproduce it later. What is the difference between your environment and the environments we have in Arrow CI? Are you using LLVM 17? If so, this may be related to llvm/llvm-project#74671.

Member Author

kou commented Jan 19, 2024

Yes. I'm using LLVM 17. I'll try LLVM 16.

Member Author

kou commented Jan 20, 2024

I tried LLVM 16 and it worked. Thanks!

Can we solve this? Or should we reject LLVM 17?

Contributor

niyue commented Jan 20, 2024

I pulled the latest main branch (55afcf0) tonight and tried LLVM 14 + Ubuntu 20.04 + Docker for Mac, and it works (one test case is skipped):

pyarrow/tests/test_gandiva.py ....s.......

I am not sure what the exact cause of this issue is yet. I will try again with LLVM 17 to see if it works.

Contributor

niyue commented Jan 20, 2024

I could reproduce the same issue using LLVM 17 + Ubuntu 23.10:

FAILED pyarrow/tests/test_gandiva.py::test_tree_exp_builder - pyarrow.lib.ArrowException: CodeGenError in Gandiva: Could not create LLJIT instance: Symbols not found: [ llvm_orc_registerEHFrameSectionWrapper ]

So far I have spent a few hours with CMake trying to use LLVM's export_executable_symbols CMake function, but I haven't figured out how to address the issue yet. I will try again tomorrow to see if there is another approach.

Member Author

kou commented Jan 20, 2024

Thanks!

Contributor

niyue commented Jan 21, 2024

@kou Sorry, I still haven't been able to figure it out. I tried export_executable_symbols and setting the ENABLE_EXPORTS property on the gandiva library, on arrow_python, and on arrow_python's Cython gandiva extension module, but none of them helped. I have run out of ideas for now. I will keep experimenting on and off over the next few days, but I am afraid this may not be resolved quickly.

Member Author

kou commented Jan 21, 2024

Thanks! No problem. I'll also take a look at this later.

Member Author

kou commented Jan 26, 2024

llvm_orc_registerEHFrameSectionWrapper needs to be visible in the target process.
In the PyArrow case, llvm_orc_registerEHFrameSectionWrapper needs to be visible in the python process.
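
A rough way to check this visibility from inside Python is to ask the dynamic loader whether the symbol resolves in the process-wide namespace. This is only a sketch for POSIX-like platforms: the helper name `symbol_visible` is made up for illustration, and it assumes `ctypes.CDLL(None)` behaves like `dlopen(NULL)`, which searches the main program plus libraries loaded with RTLD_GLOBAL.

```python
import ctypes


def symbol_visible(name: str) -> bool:
    """Return True if `name` resolves in the global namespace of this process.

    Hypothetical helper for illustration; not part of PyArrow or LLVM.
    """
    # ctypes.CDLL(None) wraps dlopen(NULL): the handle covers the main
    # program, its startup dependencies, and anything loaded with RTLD_GLOBAL.
    # Libraries loaded with RTLD_LOCAL are not searched through this handle.
    process = ctypes.CDLL(None)
    try:
        getattr(process, name)  # triggers a dlsym() lookup
    except AttributeError:
        return False
    return True


if __name__ == "__main__":
    print(symbol_visible("llvm_orc_registerEHFrameSectionWrapper"))
```

If this prints False after `import pyarrow.gandiva`, that would be consistent with the "Symbols not found" error above, although it does not by itself prove that the JIT performs the same lookup.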

Here are possible solutions for this case, but all of them are subtle:

  1. Use os.RTLD_GLOBAL to import pyarrow.gandiva:

import os
import sys
# dlopen flags are bit flags, so combine them with | rather than +
sys.setdlopenflags(sys.getdlopenflags() | os.RTLD_GLOBAL)
import pyarrow.gandiva

Python uses os.RTLD_LOCAL by default, so symbols in libLLVM-17.so aren't visible in the python process.

  2. Load libLLVM-17.so into the python process

Use LD_PRELOAD:

$ LD_PRELOAD=/lib/llvm-17/lib/libLLVM-17.so python ...

Use ctypes and os.RTLD_GLOBAL:

import pyarrow.gandiva as gandiva

import ctypes
import os
import sys
# dlopen flags are bit flags, so combine them with | rather than +
ctypes.CDLL("libLLVM-17.so.1", sys.getdlopenflags() | os.RTLD_GLOBAL)

Note: We can't use the ENABLE_EXPORTS approach used by llvm/llvm-project@2ad8e6e because we don't build Python ourselves, and Python isn't linked against libLLVM.so.
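
Because changing the dlopen flags affects every extension module imported afterwards, the first workaround could be scoped with a small context manager. The following is a minimal sketch, not something PyArrow provides: `global_dlopen_flags` is a hypothetical name, and it assumes a Unix platform where `sys.getdlopenflags()` and `os.RTLD_GLOBAL` are available.

```python
import contextlib
import os
import sys


@contextlib.contextmanager
def global_dlopen_flags():
    """Temporarily add RTLD_GLOBAL to the flags used to load extension modules.

    Hypothetical helper for illustration; not part of PyArrow.
    """
    old_flags = sys.getdlopenflags()
    # Flags are a bit mask, so OR in RTLD_GLOBAL instead of adding it.
    sys.setdlopenflags(old_flags | os.RTLD_GLOBAL)
    try:
        yield
    finally:
        # Restore the previous flags so later imports are unaffected.
        sys.setdlopenflags(old_flags)


# Usage (requires a PyArrow build with Gandiva):
#     with global_dlopen_flags():
#         import pyarrow.gandiva
```

Only modules imported for the first time inside the `with` block are loaded with the modified flags, so the import of pyarrow.gandiva has to happen there before anything else has imported it.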

Member Author

kou commented Jan 26, 2024

#39622 (comment) is similar, but it is a different case.
It uses static linking, and the executable is a test program.
So we can use the ENABLE_EXPORTS approach for that case: #39622 (comment)

Contributor

niyue commented Jan 31, 2024

LD_PRELOAD=/lib/llvm-17/lib/libLLVM-17.so python

That is much more difficult than I thought. It makes using Gandiva from Python much harder, since it requires LLVM 17 itself to be installed. I will try to summarize this issue and consult the LLVM community to see if there is an alternative approach.
