BUG: Handle negative sign (-) when parsing ISO 8601 durations #37172

Closed
mgmarino opened this issue Oct 16, 2020 · 3 comments · Fixed by #39497

Comments

@mgmarino
Contributor

Related to #37159, #29773, #36204, splitting out only dealing with the behavior of the negative sign when parsing ISO 8601 Durations.

The current behavior is somewhat counterintuitive:

"P-6DT0H50M3.010010012S" parses as Timedelta(days=-6, minutes=50, seconds=3, milliseconds=10, microseconds=10, nanoseconds=12), so the sign applies only to the days component. The negative sign is only allowed right after the P designator; a negative sign in any other position raises an error.

This comment notes that the original ISO 8601 spec doesn't mention negative durations at all, but that some "extensions" (e.g. Java's Duration) do support them. I have been unable to find the detailed ISO 8601 spec.

As far as I can tell, there are a few possibilities to deal with this here:

  • explicitly drop support for the negative sign
  • only support an overall "-" e.g. "-P6DT1H" = Timedelta('-7 days +23:00:00') and/or
  • support positive/negative on each, e.g. "P7DT-1H3M" = Timedelta('6 days 23:03:00')
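The arithmetic behind the last two options can be checked with the standard library alone. This is only a sketch using `datetime.timedelta` as a stand-in for `Timedelta`; it does not parse anything and says nothing about how pandas would implement either option.

```python
from datetime import timedelta

# Option 2: a single leading "-" negates the whole duration ("-P6DT1H").
overall = -timedelta(days=6, hours=1)

# Option 3: each component carries its own sign ("P7DT-1H3M").
per_component = timedelta(days=7) + timedelta(hours=-1) + timedelta(minutes=3)

print(overall)        # -7 days, 23:00:00
print(per_component)  # 6 days, 23:03:00
```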

Originally posted by @mgmarino in #37159 (comment)

@jreback jreback added Bug Timedelta Timedelta data type labels Oct 17, 2020
@jreback jreback added this to the Contributions Welcome milestone Oct 17, 2020
@mgmarino
Contributor Author

See the relevant documentation for the Java Duration class.

They note in the docs there that negatives are not part of the ISO 8601 standard.

My suspicion, however, is that many users need to parse "ISO 8601-like" strings that include these extensions. This is my case as well. As such, I would propose supporting the negative sign the way Java Duration does, e.g.:

"PT-6H3M" -- parses as "-6 hours and +3 minutes"
"-PT6H3M" -- parses as "-6 hours and -3 minutes"
"-PT-6H+3M" -- parses as "+6 hours and -3 minutes"
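To make those sign rules concrete, here is a minimal sketch of a parser that follows them, assuming only day, hour, minute and second components. The function name `parse_signed_duration` and the regular expression are hypothetical and written for illustration; this is not the pandas implementation, and it returns a standard-library `timedelta` rather than a `Timedelta`.

```python
import re
from datetime import timedelta

# Hypothetical pattern: optional overall sign, then optionally signed components.
_DURATION = re.compile(
    r"^(?P<sign>[+-])?P"
    r"(?:(?P<days>[+-]?\d+)D)?"
    r"(?:T"
    r"(?:(?P<hours>[+-]?\d+)H)?"
    r"(?:(?P<minutes>[+-]?\d+)M)?"
    r"(?:(?P<seconds>[+-]?\d+(?:\.\d+)?)S)?"
    r")?$"
)


def parse_signed_duration(text):
    """Parse an ISO 8601-like duration with Java-style sign handling."""
    match = _DURATION.match(text)
    parts = match.groupdict() if match else {}
    sign = parts.pop("sign", None)
    # Keep only the components that were present; each keeps its own sign.
    values = {name: float(raw) for name, raw in parts.items() if raw is not None}
    if not values or text.endswith("T"):
        raise ValueError(f"invalid duration: {text!r}")
    total = timedelta(**values)
    # A leading "-" negates the sum of all components.
    return -total if sign == "-" else total


print(parse_signed_duration("PT-6H3M"))    # -6 hours and +3 minutes
print(parse_signed_duration("-PT6H3M"))    # -6 hours and -3 minutes
print(parse_signed_duration("-PT-6H+3M"))  # +6 hours and -3 minutes
```

Applying the component signs first and the overall sign last is what makes "-PT-6H+3M" come out as +6 hours and -3 minutes.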

@avinashpancham
Contributor

take

@cnygardtw

While investigating this, I found a thread on the PostgreSQL mailing list discussing the same issues, which references an extension to ISO 8601: https://www.postgresql.org/message-id/9q0ftb37dv7.fsf%40gmx.us
