-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix handling of dates and timestamps before the 20th century #4563
Conversation
nit: commit message summary to long |
presto-jdbc/src/main/java/io/prestosql/jdbc/AbstractPrestoResultSet.java
Show resolved
Hide resolved
presto-jdbc/src/main/java/io/prestosql/jdbc/AbstractPrestoResultSet.java
Outdated
Show resolved
Hide resolved
// the default date when the Gregorian calendar was instituted (October 15, 1582) | ||
private static final long GREGORIAN_CALENDAR_INTRODUCTION = new GregorianCalendar().getGregorianChange().getTime(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean by "the default" here?
The cutoff date defaults to 1582-10-15, but you are not using a constant.
Instead youre asking for locale/env specific date. Maybe
// The date when the Gregorian calendar was instituted, environment specific. Defaults to October 15, 1582.
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a constant. The actual change happened on different dates, in different countries, but GregorianCalendar
doesn't have information for specific locales. Instead, it provides a method the client can call to specify the cutover date (setGregorianChange()
). So unless that method is called, getGregorianChange()
will always return the same value.
Anyway, after adding some more tests, it turns out that Joda millisecond values are not consistent with java.sql.Date even for more recent dates (as late as the 1800's), so I decided to use Joda only starting with the 20th century.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anyway, after adding some more tests, it turns out that Joda millisecond values are not consistent with java.sql.Date even for more recent dates (as late as the 1800's), so I decided to use Joda only starting with the 20th century.
I forgot about that, but indeed. I think i remember finding a zone where java time and java.util.Calendar differed > 1900 (but before 1950).
java time and java.util.Calendar use the same tz data file, but they seem to parse it differently.
... I can only find as recent as 1890s in multiple zones (eg Europe/Warsaw or Asia/Aden), so maybe my memory is wrong. Using 1900 seems to be safe.
java time and joda seem to behave the same as long as they have the same tz data. But tz data shouldn't change for the past date/times.
This is some variation of the test code i am using: https://github.com/findepi/urandom-bits/blob/master/src/main/java/findepi/time/JodaJdkZoneDrift.java
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, that's good info.
if (millis >= GREGORIAN_CALENDAR_INTRODUCTION) { | ||
return new Date(millis); | ||
} | ||
// the chronology used by default by Joda is not appropriate for dates preceding the introduction of the Gregorian calendar, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is "appropriate" in some way. I think it would be better to indicate the actual problem
// the chronology used by default by Joda is not appropriate for dates preceding the introduction of the Gregorian calendar, | |
// The chronology used by Joda is not consistent with java.sql.Date | |
for dates preceding the introduction of the Gregorian calendar. | |
// Same millisecond value represents a different year/month/day. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The documentation says:
it is not historically accurate before 1583
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the "historically accurate" term
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
of course it's not historically accurate for >1583 for some (many) countries as well
} | ||
// the chronology used by default by Joda is not appropriate for dates preceding the introduction of the Gregorian calendar, | ||
// so for such cases we are falling back to the more expensive GregorianCalendar; note that Joda also has a chronology that | ||
// works for older dates, but it uses a slightly different algorithm, so we are sticking with the standard library |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are the end user visible symptoms of these differences? are they test covered?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The assertions I have added for dates before the introduction of the Gregorian calendar were failing using Joda's chronology.
presto-jdbc/src/main/java/io/prestosql/jdbc/AbstractPrestoResultSet.java
Outdated
Show resolved
Hide resolved
presto-jdbc/src/main/java/io/prestosql/jdbc/AbstractPrestoResultSet.java
Outdated
Show resolved
Hide resolved
235660c
to
e9fcffe
Compare
Please update the commit description to be a single line, without an asterisk:
The body should be wrapped at 72 characters and provide explanatory text. It's fine to use a bulleted list if that's the best way to explain something, but using a single bullet with nothing else looks strange. See https://chris.beams.io/posts/git-commit/ |
presto-jdbc/src/main/java/io/prestosql/jdbc/AbstractPrestoResultSet.java
Outdated
Show resolved
Hide resolved
presto-jdbc/src/main/java/io/prestosql/jdbc/AbstractPrestoResultSet.java
Outdated
Show resolved
Hide resolved
presto-jdbc/src/main/java/io/prestosql/jdbc/AbstractPrestoResultSet.java
Outdated
Show resolved
Hide resolved
presto-jdbc/src/main/java/io/prestosql/jdbc/AbstractPrestoResultSet.java
Outdated
Show resolved
Hide resolved
@@ -193,6 +225,45 @@ public void testObjectTypes() | |||
assertEquals(rs.getTimestamp(column), Timestamp.valueOf(LocalDateTime.of(2018, 2, 13, 13, 14, 15, 555_555_556))); | |||
}); | |||
|
|||
// distant past, but apparently not an uncommon value in practice | |||
checkRepresentation("TIMESTAMP '0001-01-01 00:00:00'", Types.TIMESTAMP, (rs, column) -> { | |||
assertEquals(rs.getObject(column), Timestamp.valueOf(LocalDateTime.of(1, 1, 1, 0, 0, 0))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realize the others do this, but it'd be better to factor this out
Timestamp expected = Timestamp.valueOf(LocalDateTime.of(1, 1, 1, 0, 0, 0);
That way it's clear to the reader that these are the same value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Others do it to avoid sharing a mutable objected as the expected state.
Merged, thanks! |
No description provided.