[R] Segfault on flight test #35391

Closed
nealrichardson opened this issue May 2, 2023 · 5 comments

@nealrichardson
Member

Describe the bug, including details regarding any error messages, version, and platform.

This is happening pretty regularly on main now in the Ubuntu "force tests" job. As a result, I haven't seen #35238 in a while, because this segfault happens before we get to that point :/

cc @paleolimbot

...
Start test: flight_get
/arrow/cpp/src/arrow/status.cc:134: Invalid: Signal stop source already set up
  'test-python-flight.R:56:5' [success]
End test: flight_get

Start test: flight_put with RecordBatch
/arrow/cpp/src/arrow/status.cc:134: Invalid: Signal stop source already set up
  'test-python-flight.R:62:5' [success]
End test: flight_put with RecordBatch

Start test: flight_put with overwrite = FALSE
  'test-python-flight.R:66:5' [success]
/arrow/cpp/src/arrow/status.cc:134: Invalid: Signal stop source already set up

 *** caught segfault ***
address 0xb8, cause 'memory not mapped'
An irrecoverable exception occurred. R is aborting now ...
Segmentation fault (core dumped)
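
A log like this can be scanned mechanically to find which test was running when the process died: it is the test that has a `Start test:` line but no matching `End test:` line. The following is only a minimal sketch, assuming the `Start test:` / `End test:` markers look exactly as in the excerpt above; the helper name `unfinished_tests` is hypothetical and not part of arrow or testthat.

```python
# Excerpt of the CI log quoted in this issue.
LOG = """\
Start test: flight_get
/arrow/cpp/src/arrow/status.cc:134: Invalid: Signal stop source already set up
  'test-python-flight.R:56:5' [success]
End test: flight_get

Start test: flight_put with RecordBatch
/arrow/cpp/src/arrow/status.cc:134: Invalid: Signal stop source already set up
  'test-python-flight.R:62:5' [success]
End test: flight_put with RecordBatch

Start test: flight_put with overwrite = FALSE
  'test-python-flight.R:66:5' [success]
/arrow/cpp/src/arrow/status.cc:134: Invalid: Signal stop source already set up

 *** caught segfault ***
address 0xb8, cause 'memory not mapped'
"""

START = "Start test: "
END = "End test: "


def unfinished_tests(log_text):
    """Return the names of tests that were started but never ended.

    Hypothetical helper: it relies only on the marker lines visible
    in the log excerpt above.
    """
    started = []
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        if line.startswith(START):
            started.append(line[len(START):])
        elif line.startswith(END):
            name = line[len(END):]
            if name in started:
                started.remove(name)
    return started


print(unfinished_tests(LOG))  # ['flight_put with overwrite = FALSE']
```

For this excerpt the only unfinished test is `flight_put with overwrite = FALSE`, which matches where the segfault appears in the log.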

Component(s)

R

@h-vetinari
Contributor

The flight tests are also segfaulting in python (on osx), see here

@nealrichardson
Member Author

ICYMI @lidavidm @jorisvandenbossche

@jorisvandenbossche
Member

Other potentially related issue:

@nealrichardson
Member Author

That sounds like the same thing--didn't realize it had been happening for so long. It does seem more frequent now though, or has re-emerged after being fixed.

paleolimbot added a commit that referenced this issue May 30, 2023
… any Array references (#35812)

This was identified and 99% debugged by @lgautier on rpy2/rpy2-arrow#11 . Thank you!

I have no idea why this does anything; however, the `RStringViewer` class *was* holding on to an unnecessary Array reference and this seemed to fix the crash for me. Maybe a circular reference? The reprex I was using (provided by @lgautier) was:

Install fresh deps:

```bash
pip3 install pandas pyarrow rpy2-arrow
R -e 'install.packages("arrow", repos = "https://cloud.r-project.org/")'
```

Run this python script:

```python
import pandas as pd
import pyarrow
from rpy2.robjects.packages import importr
import rpy2.robjects
import rpy2_arrow.arrow as pyra
base = importr('base')
nanoarrow = importr('nanoarrow')

code = """
    function(df) {
        # df$col1  # no segfault on exit
        # I(df$col1)  # no segfault on exit
        # df$col2  # no segfault on exit
        I(df$col2)  # segfault on exit
    }
"""
rfunction = rpy2.robjects.r(code)

pd_df = pd.DataFrame({
    "col1": range(10),
    "col2":["a" for num in range(10)]
})
pd_tbl = pyarrow.Table.from_pandas(pd_df)
r_tbl = pyra.pyarrow_table_to_r_table(pd_tbl)
r_df = base.as_data_frame(nanoarrow.as_nanoarrow_array_stream(r_tbl))

output = rfunction(r_df)
print(output)
```

Before this PR (installing R/arrow from main) I get:

```
(.venv) dewey@Deweys-Mac-mini 2023-05-29_rpy % python reprex-arrow.py
 [1] "a" "a" "a" "a" "a" "a" "a" "a" "a" "a"

zsh: segmentation fault  python reprex-arrow.py
```

After this PR I get:

```
(.venv) dewey@Deweys-Mac-mini 2023-05-29_rpy % python reprex-arrow.py
 [1] "a" "a" "a" "a" "a" "a" "a" "a" "a" "a"
```

(with no segfault)

I wonder if this will also help with #35391, since it's also a segfault involving the Python <-> R bridge.
* Closes: #34897

Authored-by: Dewey Dunnington <[email protected]>
Signed-off-by: Dewey Dunnington <[email protected]>
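
The commit message describes a viewer object that held on to an Array reference it did not need. The real change is in the R package's C++ code and is not shown in this thread, so the following is only a loose Python analogy of the ownership difference, not the actual fix. All class and function names (`Array`, `OwningViewer`, `BorrowingViewer`, `is_released_after_drop`) are hypothetical. It shows that a helper holding a strong reference keeps the array alive after its owner drops it, while a helper holding a non-owning (weak) reference does not.

```python
import gc
import weakref


class Array:
    """Hypothetical stand-in for an Arrow array; not the real class."""

    def __init__(self, values):
        self.values = list(values)


class OwningViewer:
    """Keeps a strong reference, so it extends the array's lifetime."""

    def __init__(self, array):
        self.array = array

    def view(self, i):
        return self.array.values[i]


class BorrowingViewer:
    """Keeps only a weak reference, so it never extends the lifetime."""

    def __init__(self, array):
        self._array_ref = weakref.ref(array)

    def view(self, i):
        array = self._array_ref()
        if array is None:
            raise ReferenceError("array was already released")
        return array.values[i]


def is_released_after_drop(viewer_cls):
    """Report whether the array is freed once its owner drops it,
    while a viewer of the given class is still alive."""
    array = Array(["a"] * 10)
    probe = weakref.ref(array)   # observes the array without owning it
    viewer = viewer_cls(array)
    assert viewer.view(0) == "a"
    del array                    # the "owner" lets go of the array
    gc.collect()
    return probe() is None       # viewer is still in scope here


print(is_released_after_drop(OwningViewer))     # False
print(is_released_after_drop(BorrowingViewer))  # True
```

In this analogy, an owning viewer that outlives its intended scope delays destruction of the array, possibly until process exit. Whether that is the actual mechanism behind the crash is not established in this thread.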
thisisnic pushed a commit to thisisnic/arrow that referenced this issue Jun 6, 2023
…ot own any Array references (apache#35812)

This was identified and 99% debugged by @ lgautier on rpy2/rpy2-arrow#11 . Thank you!

I have no idea why this does anything; however, the `RStringViewer` class *was* holding on to an unnecessary Array reference and this seemed to fix the crash for me. Maybe a circular reference? The reprex I was using (provided by @ lgautier) was:

Install fresh deps:

```bash
pip3 install pandas pyarrow rpy2-arrow
R -e 'install.packages("arrow", repos = "https://cloud.r-project.org/")'
```

Run this python script:

```python
import pandas as pd
import pyarrow
from rpy2.robjects.packages import importr
import rpy2.robjects
import rpy2_arrow.arrow as pyra
base = importr('base')
nanoarrow = importr('nanoarrow')

code = """
    function(df) {
        # df$col1  # no segfault on exit
        # I(df$col1)  # no segfault on exit
        # df$col2  # no segfault on exit
        I(df$col2)  # segfault on exit
    }
"""
rfunction = rpy2.robjects.r(code)

pd_df = pd.DataFrame({
    "col1": range(10),
    "col2":["a" for num in range(10)]
})
pd_tbl = pyarrow.Table.from_pandas(pd_df)
r_tbl = pyra.pyarrow_table_to_r_table(pd_tbl)
r_df = base.as_data_frame(nanoarrow.as_nanoarrow_array_stream(r_tbl))

output = rfunction(r_df)
print(output)
```

Before this PR (installing R/arrow from main) I get:

```
(.venv) dewey@ Deweys-Mac-mini 2023-05-29_rpy % python reprex-arrow.py
 [1] "a" "a" "a" "a" "a" "a" "a" "a" "a" "a"

zsh: segmentation fault  python reprex-arrow.py
```

After this PR I get:

```
(.venv) dewey@ Deweys-Mac-mini 2023-05-29_rpy % python reprex-arrow.py
 [1] "a" "a" "a" "a" "a" "a" "a" "a" "a" "a"
```

(with no segfault)

I wonder if this also will help with apache#35391 since it's also a segfault involving the Python <-> R bridge.
* Closes: apache#34897

Authored-by: Dewey Dunnington <[email protected]>
Signed-off-by: Dewey Dunnington <[email protected]>
thisisnic pushed a commit to thisisnic/arrow that referenced this issue Jun 13, 2023
…ot own any Array references (apache#35812)

This was identified and 99% debugged by @ lgautier on rpy2/rpy2-arrow#11 . Thank you!

I have no idea why this does anything; however, the `RStringViewer` class *was* holding on to an unnecessary Array reference and this seemed to fix the crash for me. Maybe a circular reference? The reprex I was using (provided by @ lgautier) was:

Install fresh deps:

```bash
pip3 install pandas pyarrow rpy2-arrow
R -e 'install.packages("arrow", repos = "https://cloud.r-project.org/")'
```

Run this python script:

```python
import pandas as pd
import pyarrow
from rpy2.robjects.packages import importr
import rpy2.robjects
import rpy2_arrow.arrow as pyra
base = importr('base')
nanoarrow = importr('nanoarrow')

code = """
    function(df) {
        # df$col1  # no segfault on exit
        # I(df$col1)  # no segfault on exit
        # df$col2  # no segfault on exit
        I(df$col2)  # segfault on exit
    }
"""
rfunction = rpy2.robjects.r(code)

pd_df = pd.DataFrame({
    "col1": range(10),
    "col2":["a" for num in range(10)]
})
pd_tbl = pyarrow.Table.from_pandas(pd_df)
r_tbl = pyra.pyarrow_table_to_r_table(pd_tbl)
r_df = base.as_data_frame(nanoarrow.as_nanoarrow_array_stream(r_tbl))

output = rfunction(r_df)
print(output)
```

Before this PR (installing R/arrow from main) I get:

```
(.venv) dewey@ Deweys-Mac-mini 2023-05-29_rpy % python reprex-arrow.py
 [1] "a" "a" "a" "a" "a" "a" "a" "a" "a" "a"

zsh: segmentation fault  python reprex-arrow.py
```

After this PR I get:

```
(.venv) dewey@ Deweys-Mac-mini 2023-05-29_rpy % python reprex-arrow.py
 [1] "a" "a" "a" "a" "a" "a" "a" "a" "a" "a"
```

(with no segfault)

I wonder if this also will help with apache#35391 since it's also a segfault involving the Python <-> R bridge.
* Closes: apache#34897

Authored-by: Dewey Dunnington <[email protected]>
Signed-off-by: Dewey Dunnington <[email protected]>
@thisisnic
Member

I believe this has been fixed by #35812 but feel free to reopen if not
