incorrect date without leading zero in .get('2016-1-17') #292

bencharb · 2016-01-04T14:00:35Z

This is misleading. '2016-01-17' != '2016-1-17'

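import arrow
import dateutil.parser  # import the parser submodule explicitly

withzero = '2016-01-17'
withoutzero = '2016-1-17'
# dateutil treats both spellings as the same date...
assert dateutil.parser.parse(withzero) == dateutil.parser.parse(withoutzero)
# ...but arrow returns two different dates.
assert arrow.get(withzero) != arrow.get(withoutzero)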

mattalytics · 2016-01-05T19:41:53Z

Even more disturbingly:

arrow.get('2016/1/1').datetime == arrow.get('2016/1/10').datetime

!!!

bencharb · 2016-01-06T00:10:22Z

I'm startled that this basic parsing fails.

philiptzou · 2016-01-06T02:46:26Z

According to the documentation, arrow.get supports the format "2016-01-17" because it is an ISO-8601-formatted str [doc:ArrowFactory] (http://arrow.readthedocs.org/en/latest/#arrow.factory.ArrowFactory). According to RFC 3339, which follows ISO 8601, date-month, date-mday, time-hour, time-minute and time-second are each strictly two digits [rfc3339:sec5.6] (http://tools.ietf.org/html/rfc3339#section-5.6).

So I'm afraid this might be a wontfix, and perhaps you could use a more flexible way to parse that date string. For example:

import arrow
arrow.parser.DateTimeParser().parse('2016-1-10', 'YYYY-M-D')

Another option is to add a method to DateTimeParser that loosely parses strings that look like ISO-8601 but are not strictly valid. You can also add this to your own software easily: copy the code of DateTimeParser.parse_iso and replace every MM with M, DD with D, HH with H, mm with m, and ss with s. Don't forget the ones in DateTimeParser.MARKERS. That gives you your own parse_loose_iso function.
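For readers who want to see the shape of such a loose parser, here is a minimal sketch using only the standard library. It does not touch arrow's DateTimeParser; the regular expression is an assumption chosen for illustration, and parse_loose_iso simply borrows the name suggested above.

```python
import re
from datetime import datetime

# Hypothetical helper, not part of arrow: accepts ISO-8601-like strings whose
# month, day, hour, minute and second may each be one or two digits.
_LOOSE_ISO = re.compile(
    r"(?P<year>\d{4})-(?P<month>\d{1,2})-(?P<day>\d{1,2})"
    r"(?:[T ](?P<hour>\d{1,2}):(?P<minute>\d{1,2})(?::(?P<second>\d{1,2}))?)?"
)


def parse_loose_iso(text):
    """Parse 'YYYY-M-D' with an optional 'H:m' or 'H:m:s' time part."""
    match = _LOOSE_ISO.fullmatch(text)
    if match is None:
        raise ValueError("not an ISO-8601-like string: %r" % (text,))
    # Time fields that are absent default to 0; datetime() itself rejects
    # out-of-range values such as month 13 or September 31.
    fields = {name: int(value)
              for name, value in match.groupdict(default="0").items()}
    return datetime(**fields)
```

With this sketch, '2016-1-17' and '2016-01-17' parse to the same datetime, and '2016-1-1' and '2016-1-10' stay distinct.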

mattalytics · 2016-01-06T03:00:22Z

Hi Philip,

I see. That is not unreasonable. However, if Arrow is to strictly
enforce the 2 digit month/day standard, perhaps a 1 digit day/month (e.g.
2010-1-1) should throw an error. Quietly transforming the date is a good
way to cause bugs in users' implementations.

Personally, I like the idea of arrow being a bit more flexible. Ease of
use is why I decided to try using arrow. Just my personal perspective.

Matt

bencharb · 2016-01-06T04:03:17Z

I'm with Matt, if the date string is invalid or ambiguous it ought to raise an exception. Thanks for your attention to this.
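The behavior requested here (reject anything that is not strictly two-digit month and day rather than quietly reinterpreting it) could look roughly like the sketch below. It uses only the standard library; parse_strict_date is a hypothetical name and is not how arrow implements or plans to implement this.

```python
import re
from datetime import date

# Exactly four digits, two digits, two digits, as RFC 3339 full-date requires.
_STRICT_DATE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def parse_strict_date(text):
    """Hypothetical strict parser: raise instead of guessing."""
    match = _STRICT_DATE.fullmatch(text)
    if match is None:
        raise ValueError("expected YYYY-MM-DD, got %r" % (text,))
    year, month, day = (int(group) for group in match.groups())
    # date() raises ValueError for impossible dates such as 2016-09-31.
    return date(year, month, day)
```

Under this sketch, '2016-01-17' parses, while '2016-1-17' and '2016-09-31' both raise ValueError instead of yielding a wrong date.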

philiptzou · 2016-01-06T05:50:35Z

I also agree with raising an exception; I think it is feasible. By the way, @bencharb, I think you still need to parse the non-standard string. Arrow is great at solving the timezone headache and outputting beautiful human-readable strings, but it may not be the best fit for your needs. AFAIK parsedatetime is good at parsing human-readable datetime strings, so you may want to try that instead.

bencharb · 2016-01-06T22:36:09Z

That works, thanks, @philiptzou.

balihoo-gens · 2016-09-08T20:40:46Z

I'd like to reference issue #267 here as it points out a similar case of defaulting to 1 when parsing fails (instead of an exception). Example:

>>> arrow.get("2016-09-31")
<Arrow [2016-09-01T00:00:00+00:00]>

One would hope for an exception like the one dateutil.parser.parse raises:

>>> dateutil.parser.parse("2016-09-31")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "dateutil/parser.py", line 1164, in parse
    return DEFAULTPARSER.parse(timestr, **kwargs)
  File "dateutil/parser.py", line 577, in parse
    ret = default.replace(**repl)
ValueError: day is out of range for month

Currently, my workaround is to not have arrow.get do any parsing, using dateutil instead:
arrow.get(dateutil.parser.parse(date_string))
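
The same pattern (validate with another parser first, then hand the resulting datetime to arrow.get) also works without dateutil. The sketch below assumes the input is always a plain year-month-day string; to_checked_datetime is a hypothetical helper name.

```python
from datetime import datetime


def to_checked_datetime(date_string, fmt="%Y-%m-%d"):
    """Hypothetical helper: let strptime reject impossible dates.

    strptime accepts one- or two-digit months and days for %m and %d,
    and raises ValueError for dates that do not exist.
    """
    return datetime.strptime(date_string, fmt)
```

The returned datetime can then be passed to arrow.get in the same way as the dateutil result above.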

andrewelkins · 2016-12-31T22:56:32Z

Will be handled by #91

andrewelkins mentioned this issue Dec 31, 2016

Accurate handling of parsing errors #91

Closed

andrewelkins closed this as completed Dec 31, 2016
