Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[R][Python] Segfault on session exit after materializing ALTREP character vectors imported from Python #34897

Closed
paleolimbot opened this issue Apr 5, 2023 · 1 comment · Fixed by #35812
Assignees
Labels
Component: Python Component: R Critical Fix Bugfixes for security vulnerabilities, crashes, or invalid data. Priority: Critical Type: bug
Milestone

Comments

@paleolimbot
Copy link
Member

Describe the bug, including details regarding any error messages, version, and platform.

As described and reproduced nicely in rpy2/rpy2-arrow#11

This is almost certainly a result of #34489 since it seems to involve fully materialized arrays. Otherwise, I'm not sure exactly what would cause this specific to shipping arrays over the C Data interface. The traceback reported is below (I will try to reproduce as well):

I suspect this might have to do with the fact that the session shutting down might lead to some things happening with the GIL and R is trying to release that memory at a very inconvenient time.

Thread 1 "python" received signal SIGSEGV, Segmentation fault.
0x00005555557868b0 in new_threadstate (interp=0x0, init=1) at ../Python/pystate.c:616
616	../Python/pystate.c: No such file or directory.
(gdb) up
#1  0x00005555557a4ee2 in PyThreadState_New (interp=<optimized out>)
    at ../Python/pystate.c:684
684	../Python/pystate.c: No such file or directory.
(gdb) up
#2  PyGILState_Ensure () at ../Python/pystate.c:1504
1504	in ../Python/pystate.c
(gdb) up
#3  0x00007fffe0ee6003 in arrow::py::NumPyBuffer::~NumPyBuffer() ()
   from /opt/software/python/py310_env/lib/python3.10/site-packages/pyarrow/libarrow_python.so
(gdb) up
#4  0x00007fffe0ecb56d in std::_Sp_counted_ptr_inplace<arrow::ArrayData, std::allocator<arrow::ArrayData>, (__gnu_cxx::_Lock_policy)2>::_M_dispose() ()
   from /opt/software/python/py310_env/lib/python3.10/site-packages/pyarrow/libarrow_python.so
(gdb) up
#5  0x00007fffe0ecb4d5 in std::_Sp_counted_ptr_inplace<arrow::ArrayData, std::allocator<arrow::ArrayData>, (__gnu_cxx::_Lock_policy)2>::_M_dispose() ()
   from /opt/software/python/py310_env/lib/python3.10/site-packages/pyarrow/libarrow_python.so
(gdb) up
#6  0x00007fffde32f04a in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() ()
   from /opt/software/python/py310_env/lib/python3.10/site-packages/pyarrow/libarrow.so.1100
(gdb) up
#7  0x00007fffdee02a4c in arrow::(anonymous namespace)::ReleaseExportedArray(ArrowArray*) ()
   from /opt/software/python/py310_env/lib/python3.10/site-packages/pyarrow/libarrow.so.1100
(gdb) up
#8  0x00007fffd0aab6f7 in std::_Sp_counted_ptr_inplace<arrow::(anonymous namespace)::ImportedArrayData, std::allocator<arrow::(anonymous namespace)::ImportedArrayData>, (__gnu_cxx::_Lock_policy)2>::_M_dispose() ()
   from /usr/local/packages/R/4.2/lib/R/library/arrow/libs/arrow.so
(gdb) up
#9  0x00007fffd04eda4a in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x55555c8b2a70)
    at /usr/include/c++/11/bits/shared_ptr_base.h:168
168		    _M_dispose();
(gdb) 

Component(s)

R

@paleolimbot
Copy link
Member Author

Weston pointed me at some code in the Python package that checks if Python is finalizing before attempting to do something that might modify reference counts:

https://github.com/apache/arrow/blob/main/python/pyarrow/src/arrow/python/udf.cc#L52-L58

From the R side, our external pointers have an option to not run the finalizer when the session shuts down ( https://github.com/r-lib/cpp11/blob/main/inst/include/cpp11/external_pointer.hpp#L58 ). We could/should pass false here since we aren't using the finalizers to do anything that will have an effect outside the process.

@paleolimbot paleolimbot added this to the 13.0.0 milestone May 30, 2023
thisisnic pushed a commit to thisisnic/arrow that referenced this issue Jun 6, 2023
…ot own any Array references (apache#35812)

This was identified and 99% debugged by @ lgautier on rpy2/rpy2-arrow#11 . Thank you!

I have no idea why this does anything; however, the `RStringViewer` class *was* holding on to an unnecessary Array reference and this seemed to fix the crash for me. Maybe a circular reference? The reprex I was using (provided by @ lgautier) was:

Install fresh deps:

```bash
pip3 install pandas pyarrow rpy2-arrow
R -e 'install.packages("arrow", repos = "https://cloud.r-project.org/")'
```

Run this python script:

```python
import pandas as pd
import pyarrow
from rpy2.robjects.packages import importr
import rpy2.robjects
import rpy2_arrow.arrow as pyra
base = importr('base')
nanoarrow = importr('nanoarrow')

code = """
    function(df) {
        # df$col1  # no segfault on exit
        # I(df$col1)  # no segfault on exit
        # df$col2  # no segfault on exit
        I(df$col2)  # segfault on exit
    }
"""
rfunction = rpy2.robjects.r(code)

pd_df = pd.DataFrame({
    "col1": range(10),
    "col2":["a" for num in range(10)]
})
pd_tbl = pyarrow.Table.from_pandas(pd_df)
r_tbl = pyra.pyarrow_table_to_r_table(pd_tbl)
r_df = base.as_data_frame(nanoarrow.as_nanoarrow_array_stream(r_tbl))

output = rfunction(r_df)
print(output)
```

Before this PR (installing R/arrow from main) I get:

```
(.venv) dewey@ Deweys-Mac-mini 2023-05-29_rpy % python reprex-arrow.py
 [1] "a" "a" "a" "a" "a" "a" "a" "a" "a" "a"

zsh: segmentation fault  python reprex-arrow.py
```

After this PR I get:

```
(.venv) dewey@ Deweys-Mac-mini 2023-05-29_rpy % python reprex-arrow.py
 [1] "a" "a" "a" "a" "a" "a" "a" "a" "a" "a"
```

(with no segfault)

I wonder if this also will help with apache#35391 since it's also a segfault involving the Python <-> R bridge.
* Closes: apache#34897

Authored-by: Dewey Dunnington <[email protected]>
Signed-off-by: Dewey Dunnington <[email protected]>
thisisnic pushed a commit to thisisnic/arrow that referenced this issue Jun 13, 2023
…ot own any Array references (apache#35812)

This was identified and 99% debugged by @ lgautier on rpy2/rpy2-arrow#11 . Thank you!

I have no idea why this does anything; however, the `RStringViewer` class *was* holding on to an unnecessary Array reference and this seemed to fix the crash for me. Maybe a circular reference? The reprex I was using (provided by @ lgautier) was:

Install fresh deps:

```bash
pip3 install pandas pyarrow rpy2-arrow
R -e 'install.packages("arrow", repos = "https://cloud.r-project.org/")'
```

Run this python script:

```python
import pandas as pd
import pyarrow
from rpy2.robjects.packages import importr
import rpy2.robjects
import rpy2_arrow.arrow as pyra
base = importr('base')
nanoarrow = importr('nanoarrow')

code = """
    function(df) {
        # df$col1  # no segfault on exit
        # I(df$col1)  # no segfault on exit
        # df$col2  # no segfault on exit
        I(df$col2)  # segfault on exit
    }
"""
rfunction = rpy2.robjects.r(code)

pd_df = pd.DataFrame({
    "col1": range(10),
    "col2":["a" for num in range(10)]
})
pd_tbl = pyarrow.Table.from_pandas(pd_df)
r_tbl = pyra.pyarrow_table_to_r_table(pd_tbl)
r_df = base.as_data_frame(nanoarrow.as_nanoarrow_array_stream(r_tbl))

output = rfunction(r_df)
print(output)
```

Before this PR (installing R/arrow from main) I get:

```
(.venv) dewey@ Deweys-Mac-mini 2023-05-29_rpy % python reprex-arrow.py
 [1] "a" "a" "a" "a" "a" "a" "a" "a" "a" "a"

zsh: segmentation fault  python reprex-arrow.py
```

After this PR I get:

```
(.venv) dewey@ Deweys-Mac-mini 2023-05-29_rpy % python reprex-arrow.py
 [1] "a" "a" "a" "a" "a" "a" "a" "a" "a" "a"
```

(with no segfault)

I wonder if this also will help with apache#35391 since it's also a segfault involving the Python <-> R bridge.
* Closes: apache#34897

Authored-by: Dewey Dunnington <[email protected]>
Signed-off-by: Dewey Dunnington <[email protected]>
@raulcd raulcd added the Critical Fix Bugfixes for security vulnerabilities, crashes, or invalid data. label Sep 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment