[Python] Passing table as argument to pyarrow.Table.filter causes segfault #37650
Comments
So it seems this is arrow-12.0? In 13.0 the code below also core dumps. I don't know whether that's a bug or expected behavior.
First, it seems that the filter should be used as below:
Yes, sorry, this was arrow-12.0. And noted that the filter was used incorrectly here to begin with. I would have just expected a TypeError instead of a segfault.
From a source build I was able to trace the segfault to this line:
It looks like |
We are passing a generic Datum there as filter ( |
### Rationale for this change

Prevent a segfault upon passing a non-(chunked_)array object into `Table.filter`. See #37650.

### What changes are included in this PR?

1. Check filter Datum kind to make sure that it is an array or a chunked array
2. Test that attempting to filter a table with another table raises a not implemented error

### Are these changes tested?

In PyArrow, yes

### Are there any user-facing changes?

Raises an error if a non-array or non-chunked_array object is passed into `Table.filter`

* Closes: #37650

Lead-authored-by: Patrick Clarke <[email protected]>
Co-authored-by: Antoine Pitrou <[email protected]>
Signed-off-by: Antoine Pitrou <[email protected]>
@p-a-a-a-trick Can you post a comment here so that we can assign the issue to you?
@pitrou sure thing (I'll take this one (: )
Describe the bug, including details regarding any error messages, version, and platform.
This is not correct code, but I did actually make this mistake and was puzzled by the segfault for a while (I thought I was passing a `pyarrow.compute` expression to `filter`).
A gdb backtrace shows the following steps:
Component(s)
Python