[C++][Python][R] PrettyPrint ignores timezone #30117
Comments
Joris Van den Bossche / @jorisvandenbossche: arrow/cpp/src/arrow/util/formatting.h Lines 431 to 473 in bf67ec7
Currently this simply does not take into account any timezone information of the type, and formats the stored UTC epoch. Personally, I find this confusing, as the current repr for the array gives no indication whatsoever of how to interpret the printed time values (and without any indication, my first expectation would be local time in the timezone of the type, which isn't the case). The formatting code already uses the date.h utilities (which we have vendored anyway), so in principle we could use date.h to first localize the epoch value. However, that makes printing dependent on finding a timezone database (e.g. not yet supported on Windows at the moment). An alternative could be to keep the printed value in UTC but add a |
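To make the mismatch concrete, here is a minimal sketch in plain Python (standard library only, not Arrow's formatting code) that compares the stored UTC epoch as it is printed with the wall-clock time in the type's timezone. The fixed +02:00 offset and the variable names are assumptions chosen for illustration; a named zone would additionally need a timezone database.

```python
from datetime import datetime, timedelta, timezone

# Value stored in the array: seconds since 1970-01-01 00:00:00 UTC.
epoch_seconds = 0
# Assumed fixed offset standing in for the timezone of the type.
type_tz = timezone(timedelta(hours=2))

# What is printed: the stored epoch read as UTC, with no marker.
as_utc = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
utc_text = as_utc.strftime("%Y-%m-%d %H:%M:%S")

# What a reader might expect: wall-clock time in the type's timezone.
local_text = as_utc.astimezone(type_tz).strftime("%Y-%m-%d %H:%M:%S")

print(utc_text)
print(local_text)
```

The two strings differ by the offset, and nothing in the first one tells the reader which interpretation applies.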
Antoine Pitrou / @pitrou: |
Joris Van den Bossche / @jorisvandenbossche: Personally I would still prefer actual localized times, but adding the UTC indication would already help a lot. |
Antoine Pitrou / @pitrou: |
Rok Mihevc / @rok: What would the final (once timezone database is always present) default be? Z or localized? |
Jonathan Keane / @jonkeane: Separately from the default, we should also indicate in some way that the type does include a (non-UTC) timezone. If I only saw a string of timestamps like below (+ Z), I would assume the array was set to UTC (and try casting it to a different TZ, and be confused/annoyed that it 'didn't seem to work'):
arr <- Array$create(ts)
arr
#> Array
#> <timestamp[us]> <------ maybe add timezone information here?
#> [
#> 2020-01-01 09:00:00.000000,
#> 2020-01-01 10:00:00.000000,
#> 2020-01-01 11:00:00.000000,
#> 2020-01-01 12:00:00.000000,
#> 2020-01-01 13:00:00.000000,
#> 2020-01-01 14:00:00.000000,
#> 2020-01-01 15:00:00.000000,
#> 2020-01-01 16:00:00.000000,
#> 2020-01-01 17:00:00.000000,
#> 2020-01-01 18:00:00.000000
#> ] |
Antoine Pitrou / @pitrou: |
Joris Van den Bossche / @jorisvandenbossche: |
Rok Mihevc / @rok: |
Rok Mihevc / @rok: |
I created a PR for the simple version of the fix, which adds a "Z" to the end of the string when a timezone is defined.
>>> import pyarrow as pa
>>> # tz defined
>>> pa.array([0], pa.timestamp('s', tz='+02:00'))
<pyarrow.lib.TimestampArray object at 0x131399fc0>
[
1970-01-01 00:00:00Z
]
>>> # tz not defined
>>> pa.array([0], pa.timestamp('s'))
<pyarrow.lib.TimestampArray object at 0x13139a020>
[
1970-01-01 00:00:00
]
I haven't added any change to the type output, as the timezone information is already included:
>>> a = pa.array([0], pa.timestamp('s', tz='+02:00'))
>>> a.type
TimestampType(timestamp[s, tz=+02:00])
Or do we want info about the timezone in the array object print? |
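The rule behind the output above can be sketched in a few lines of standard-library Python. This is only an illustration of the behaviour, not the C++ change in the PR; `format_timestamp` is a hypothetical helper, not a pyarrow function, and it handles second resolution only.

```python
from datetime import datetime, timezone
from typing import Optional


def format_timestamp(epoch_seconds: int, tz: Optional[str]) -> str:
    """Hypothetical helper mimicking the printed value; not a pyarrow API."""
    # The stored value is always formatted as UTC, whatever the timezone is.
    text = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).strftime(
        "%Y-%m-%d %H:%M:%S"
    )
    # Append "Z" only when the type carries a timezone.
    if tz is not None:
        return text + "Z"
    return text


print(format_timestamp(0, "+02:00"))
print(format_timestamp(0, None))
```

Note that the timezone string itself never influences the digits; it only decides whether the UTC marker is shown.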
Thanks for doing this @AlenkaF ! Did you consider adding an option to print in local time as per ISO8601 convention displaying local time + offset from UTC? |
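For comparison, a rough sketch of what local time plus UTC offset could look like, again using only the Python standard library. Both helpers are hypothetical and only handle fixed offsets written as "+HH:MM" or "-HH:MM"; named zones would need a timezone database, which is the dependency discussed earlier in this thread.

```python
from datetime import datetime, timedelta, timezone


def parse_fixed_offset(offset: str) -> timezone:
    """Turn a string such as '+02:00' into a fixed-offset tzinfo (hypothetical helper)."""
    sign = -1 if offset[0] == "-" else 1
    hours, minutes = offset[1:].split(":")
    return timezone(sign * timedelta(hours=int(hours), minutes=int(minutes)))


def format_local_iso(epoch_seconds: int, offset: str) -> str:
    """Format a UTC epoch as local wall-clock time followed by its UTC offset."""
    local = datetime.fromtimestamp(epoch_seconds, tz=parse_fixed_offset(offset))
    # isoformat() on an aware datetime appends the offset, e.g. +02:00.
    return local.isoformat(sep=" ")


print(format_local_iso(0, "+02:00"))
```

With this style the printed digits are the localized time, and the trailing offset still lets the reader recover the UTC instant.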
Printing in local time would definitely be good to have. What I was thinking is to have the simple fix asap and then do a follow-up with localized strings when possible. |
…when tz defined (#39272)
### What changes are included in this PR?
This PR updates the PrettyPrint for Timestamp type so that "Z" is printed at the end of the output string if the timezone has been defined. This way we add minimum information about the values being stored in UTC.
### Are these changes tested?
Yes.
### Are there any user-facing changes?
There is a change in how `TimestampArray` prints out the data. With this change "Z" would be added to the end of the string if the timezone is defined.
* Closes: #30117
Lead-authored-by: AlenkaF <[email protected]>
Co-authored-by: Alenka Frim <[email protected]>
Co-authored-by: Rok Mihevc <[email protected]>
Signed-off-by: Joris Van den Bossche <[email protected]>
When printing TimestampArray in pyarrow the timezone information is ignored by PrettyPrint (str calls to_string() in array.pxi).
Reporter: Alenka Frim / @AlenkaF
Watchers: Rok Mihevc / @rok
Note: This issue was originally created as ARROW-14567. Please see the migration documentation for further details.