[C++][Python] Unable to cast date{32,64} to date{32,64} #43183

Closed
Fokko opened this issue Jul 8, 2024 · 1 comment

Fokko commented Jul 8, 2024

Describe the bug, including details regarding any error messages, version, and platform.

It looks like I'm able to cast int and string columns:

> import pyarrow as pa

> n_legs = pa.array([2, 2, 4, 4, 5, 100])
> animals = pa.array(["Flamingo", "Parrot", "Dog", "Horse", "Brittle stars", "Centipede"])
> names = ["n_legs", "animals"]

> batch = pa.RecordBatch.from_arrays([n_legs, animals], names=names)
> batch

pyarrow.RecordBatch
n_legs: int64
animals: string
----
n_legs: [2,2,4,4,5,100]
animals: ["Flamingo","Parrot","Dog","Horse","Brittle stars","Centipede"]

> schema = pa.schema([
>     ('n_legs', pa.int64()),
>     ('animals', pa.string()),
> ])
> pa.RecordBatchReader.from_batches(
>     schema,
>     [batch]
> ).cast(schema).read_all()

pyarrow.Table
n_legs: int64
animals: string
----
n_legs: [[2,2,4,4,5,100]]
animals: [["Flamingo","Parrot","Dog","Horse","Brittle stars","Centipede"]]

But it seems to fail with a date32:

> import pyarrow as pa
> from datetime import date
> birthday = [date(1990, 3, 1)]
> names = ["Fokko"]
> batch = pa.RecordBatch.from_arrays([birthday, names], names=['birthday', 'name'])
> batch
pyarrow.RecordBatch
birthday: date32[day]
name: string
----
birthday: [1990-03-01]
name: ["Fokko"]

> schema = pa.schema([
>     ('birthday', pa.date32()),
>     ('name', pa.string()),
> ])

> pa.RecordBatchReader.from_batches(
>     schema,
>     [batch]
> ).cast(schema).read_all()

---------------------------------------------------------------------------
ArrowTypeError                            Traceback (most recent call last)
Cell In[6], line 9
      1 schema = pa.schema([
      2     ('birthday', pa.date32()),
      3     ('name', pa.string()),
      4 ])
      6 pa.RecordBatchReader.from_batches(
      7     schema,
      8     [batch]
----> 9 ).cast(schema).read_all()

File /opt/homebrew/lib/python3.10/site-packages/pyarrow/ipc.pxi:800, in pyarrow.lib.RecordBatchReader.cast()
File /opt/homebrew/lib/python3.10/site-packages/pyarrow/error.pxi:154, in pyarrow.lib.pyarrow_internal_check_status()
File /opt/homebrew/lib/python3.10/site-packages/pyarrow/error.pxi:91, in pyarrow.lib.check_status()
ArrowTypeError: Field 0 cannot be cast from date32[day] to date32[day]

Same for date64:

---------------------------------------------------------------------------
ArrowTypeError                            Traceback (most recent call last)
Cell In[42], line 15
      4 schema = pa.schema([
      5     # ('date32', pa.date32()),
      6     ('date64', pa.date64()),
      7 ])
      9 batch = pa.RecordBatch.from_arrays(data, schema=schema)
     12 table = pa.RecordBatchReader.from_batches(
     13     schema,
     14     [batch]
---> 15 ).cast(schema).read_all()
     17 assert table['date32'][0].as_py() == dt
     18 assert table['date64'][0].as_py() == dt

File /opt/homebrew/lib/python3.10/site-packages/pyarrow/ipc.pxi:800, in pyarrow.lib.RecordBatchReader.cast()
File /opt/homebrew/lib/python3.10/site-packages/pyarrow/error.pxi:154, in pyarrow.lib.pyarrow_internal_check_status()
File /opt/homebrew/lib/python3.10/site-packages/pyarrow/error.pxi:91, in pyarrow.lib.check_status()
ArrowTypeError: Field 0 cannot be cast from date64[ms] to date64[ms]

This looks like a valid cast operation to me. Please advise. I'm happy to create a PR; since I'm not familiar with the codebase, it would be very helpful if someone could point out where I should add the test :)

> pa.__version__
'16.1.0'

Component(s)

C++

@Fokko Fokko added the Type: bug label Jul 8, 2024
Fokko added a commit to Fokko/arrow that referenced this issue Jul 8, 2024
@Fokko Fokko changed the title Unable to cast date32 to date32 Unable to cast date{32,64} to date{32,64} Jul 8, 2024
pitrou pushed a commit that referenced this issue Jul 10, 2024
### Rationale for this change

This one seems to be missing, see #43183

### What changes are included in this PR?

### Are these changes tested?

I'm not sure what the best place is to test this, please advise

### Are there any user-facing changes?

* GitHub Issue: #43183

Lead-authored-by: Fokko <[email protected]>
Co-authored-by: Fokko Driesprong <[email protected]>
Co-authored-by: Hyunseok Seo <[email protected]>
Signed-off-by: Antoine Pitrou <[email protected]>
@pitrou pitrou added this to the 18.0.0 milestone Jul 10, 2024

pitrou commented Jul 10, 2024

Issue resolved by pull request 43192
#43192

@pitrou pitrou closed this as completed Jul 10, 2024
@jorisvandenbossche jorisvandenbossche changed the title Unable to cast date{32,64} to date{32,64} [C++][Python] Unable to cast date{32,64} to date{32,64} Jul 25, 2024