Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Python] StructArray.sort() errors with default arguments #41464

Closed
a-reich opened this issue Apr 30, 2024 · 4 comments
Closed

[Python] StructArray.sort() errors with default arguments #41464

a-reich opened this issue Apr 30, 2024 · 4 comments

Comments

@a-reich
Copy link
Contributor

a-reich commented Apr 30, 2024

Describe the bug, including details regarding any error messages, version, and platform.

StructArray's sort method is documented to have the parameter by: str or None, default None, however a simple example that leaves the argument as None raises:

arr = pa.array([{'x': 1, 'y': True}, {'x': 3.4, 'y': False}])
arr.sort()
####### exception
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pyarrow/array.pxi", line 3971, in pyarrow.lib.StructArray.sort
  File "/home/asaf/.local/lib/python3.12/site-packages/pyarrow/compute.py", line 263, in wrapper
    return func.call(args, options, memory_pool)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "pyarrow/_compute.pyx", line 385, in pyarrow._compute.Function.call
  File "pyarrow/error.pxi", line 154, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow/error.pxi", line 91, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: Invalid sort key column: No match for FieldRef.Name() in x: double
y: bool

I am running pyarrow 16.0.0, Python 3.12, on Windows WSL.

I think the cause is a simple mistake in this line?

options=_pc().SortOptions(sort_keys=[("", order)], **kwargs)

I can send a basic PR with that fix and a test soon. But I'm not sure if I'll be able to build & test locally. Thanks!

Component(s)

Python

@a-reich a-reich changed the title StructArray.sort() errors with default arguments [Python] StructArray.sort() errors with default arguments Apr 30, 2024
@jorisvandenbossche
Copy link
Member

@a-reich Thanks for the report, a PR is certainly welcome!

@raulcd
Copy link
Member

raulcd commented May 2, 2024

As per how to build and test, If you add a test that reproduces the issue it can be tested on CI.

a-reich added a commit to a-reich/arrow that referenced this issue May 2, 2024
@a-reich
Copy link
Contributor Author

a-reich commented May 2, 2024

take

jorisvandenbossche pushed a commit that referenced this issue May 14, 2024
### Rationale for this change
Closes issue #41464. Fix `StructArray.sort` method's `by` param to work in the case of `by=None` which was documented to mean sort by all fields (the default), but would raise an exception.

### What changes are included in this PR?
* Add a unit test with by=None in `test_struct_array_sort` that fails on main
* Fix the sort method

### Are these changes tested?
yes

### Are there any user-facing changes?
yes
* GitHub Issue: #41464

Authored-by: a-reich <[email protected]>
Signed-off-by: Joris Van den Bossche <[email protected]>
@jorisvandenbossche
Copy link
Member

Issue resolved by pull request 41495
#41495

@jorisvandenbossche jorisvandenbossche added this to the 17.0.0 milestone May 14, 2024
vibhatha pushed a commit to vibhatha/arrow that referenced this issue May 25, 2024
…1495)

### Rationale for this change
Closes issue apache#41464. Fix `StructArray.sort` method's `by` param to work in the case of `by=None` which was documented to mean sort by all fields (the default), but would raise an exception.

### What changes are included in this PR?
* Add a unit test with by=None in `test_struct_array_sort` that fails on main
* Fix the sort method

### Are these changes tested?
yes

### Are there any user-facing changes?
yes
* GitHub Issue: apache#41464

Authored-by: a-reich <[email protected]>
Signed-off-by: Joris Van den Bossche <[email protected]>
JerAguilon pushed a commit to JerAguilon/arrow that referenced this issue May 29, 2024
…1495)

### Rationale for this change
Closes issue apache#41464. Fix `StructArray.sort` method's `by` param to work in the case of `by=None` which was documented to mean sort by all fields (the default), but would raise an exception.

### What changes are included in this PR?
* Add a unit test with by=None in `test_struct_array_sort` that fails on main
* Fix the sort method

### Are these changes tested?
yes

### Are there any user-facing changes?
yes
* GitHub Issue: apache#41464

Authored-by: a-reich <[email protected]>
Signed-off-by: Joris Van den Bossche <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants