Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for zzzz (and beyond) in format_datetime #11330

Closed

Conversation

kevinwilfong
Copy link
Contributor

Summary:
This diff adds support for JODA's zzzz (or more) patterns (all equivalent) in Presto's
forma_datetime function.

This is used to format long time zone names.

Long time zone names are not available from the IANA time zone database, so we
can't use the tz library to generate these. Fortunately, unicode provides some
utilities to generate these.

Differential Revision: D64795407

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Oct 22, 2024
Copy link

netlify bot commented Oct 22, 2024

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit 08c6948
🔍 Latest deploy log https://app.netlify.com/sites/meta-velox/deploys/672051f59366710008889c79

@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D64795407

kevinwilfong pushed a commit to kevinwilfong/velox that referenced this pull request Oct 24, 2024
…or#11330)

Summary:
Pull Request resolved: facebookincubator#11330

This diff adds support for JODA's zzzz (or more) patterns (all equivalent) in Presto's
forma_datetime function.

This is used to format long time zone names.

Long time zone names are not available from the IANA time zone database, so we
can't use the tz library to generate these.  Fortunately, unicode provides some
utilities to generate these.

Differential Revision: D64795407
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D64795407

@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D64795407

kevinwilfong pushed a commit to kevinwilfong/velox that referenced this pull request Oct 24, 2024
…or#11330)

Summary:
Pull Request resolved: facebookincubator#11330

This diff adds support for JODA's zzzz (or more) patterns (all equivalent) in Presto's
forma_datetime function.

This is used to format long time zone names.

Long time zone names are not available from the IANA time zone database, so we
can't use the tz library to generate these.  Fortunately, unicode provides some
utilities to generate these.

Differential Revision: D64795407
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D64795407

kevinwilfong pushed a commit to kevinwilfong/velox that referenced this pull request Oct 24, 2024
…or#11330)

Summary:
Pull Request resolved: facebookincubator#11330

This diff adds support for JODA's zzzz (or more) patterns (all equivalent) in Presto's
forma_datetime function.

This is used to format long time zone names.

Long time zone names are not available from the IANA time zone database, so we
can't use the tz library to generate these.  Fortunately, unicode provides some
utilities to generate these.

Differential Revision: D64795407
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D64795407

kevinwilfong pushed a commit to kevinwilfong/velox that referenced this pull request Oct 24, 2024
…or#11330)

Summary:
Pull Request resolved: facebookincubator#11330

This diff adds support for JODA's zzzz (or more) patterns (all equivalent) in Presto's
forma_datetime function.

This is used to format long time zone names.

Long time zone names are not available from the IANA time zone database, so we
can't use the tz library to generate these.  Fortunately, unicode provides some
utilities to generate these.

Differential Revision: D64795407
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D64795407

kevinwilfong pushed a commit to kevinwilfong/velox that referenced this pull request Oct 24, 2024
…or#11330)

Summary:
Pull Request resolved: facebookincubator#11330

This diff adds support for JODA's zzzz (or more) patterns (all equivalent) in Presto's
forma_datetime function.

This is used to format long time zone names.

Long time zone names are not available from the IANA time zone database, so we
can't use the tz library to generate these.  Fortunately, unicode provides some
utilities to generate these.

Differential Revision: D64795407
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D64795407

kevinwilfong pushed a commit to kevinwilfong/velox that referenced this pull request Oct 25, 2024
…or#11330)

Summary:
Pull Request resolved: facebookincubator#11330

This diff adds support for JODA's zzzz (or more) patterns (all equivalent) in Presto's
forma_datetime function.

This is used to format long time zone names.

Long time zone names are not available from the IANA time zone database, so we
can't use the tz library to generate these.  Fortunately, unicode provides some
utilities to generate these.

Differential Revision: D64795407
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D64795407

kevinwilfong pushed a commit to kevinwilfong/velox that referenced this pull request Oct 25, 2024
…or#11330)

Summary:
Pull Request resolved: facebookincubator#11330

This diff adds support for JODA's zzzz (or more) patterns (all equivalent) in Presto's
forma_datetime function.

This is used to format long time zone names.

Long time zone names are not available from the IANA time zone database, so we
can't use the tz library to generate these.  Fortunately, unicode provides some
utilities to generate these.

Differential Revision: D64795407
kevinwilfong pushed a commit to kevinwilfong/velox that referenced this pull request Oct 26, 2024
…or#11330)

Summary:
Pull Request resolved: facebookincubator#11330

This diff adds support for JODA's zzzz (or more) patterns (all equivalent) in Presto's
forma_datetime function.

This is used to format long time zone names.

Long time zone names are not available from the IANA time zone database, so we
can't use the tz library to generate these.  Fortunately, unicode provides some
utilities to generate these.

Differential Revision: D64795407
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D64795407

@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D64795407

kevinwilfong pushed a commit to kevinwilfong/velox that referenced this pull request Oct 28, 2024
…or#11330)

Summary:

This diff adds support for JODA's zzzz (or more) patterns (all equivalent) in Presto's
format_datetime function.

This is used to format long time zone names.

Long time zone names are not available from the IANA time zone database, so we
can't use the tz library to generate these.  Fortunately, unicode provides some
utilities to generate these.

Reviewed By: pedroerp

Differential Revision: D64795407
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D64795407

CMakeLists.txt Outdated
@@ -473,6 +473,25 @@ add_compile_definitions(FOLLY_HAVE_INT128_T=1)
set_source(folly)
resolve_dependency(folly)

if(CMAKE_SYSTEM_NAME MATCHES "Darwin")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we should move this to before boost - since I suspect ICU_SOURCE is only set after boost build ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That fixed the Mac build, for Ubuntu, it seems like find package is not finding the installed ICU, so it builds a second one which has a different version and ends up trying to use a combination of both.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My mistake, find package can find the installed ICU, we're hard coding the source to BUNDLED which forces it to build it. Setting it to SYSTEM for ICU seems to fix it.

kevinwilfong pushed a commit to kevinwilfong/velox that referenced this pull request Oct 28, 2024
…or#11330)

Summary:

This diff adds support for JODA's zzzz (or more) patterns (all equivalent) in Presto's
format_datetime function.

This is used to format long time zone names.

Long time zone names are not available from the IANA time zone database, so we
can't use the tz library to generate these.  Fortunately, unicode provides some
utilities to generate these.

Reviewed By: pedroerp

Differential Revision: D64795407
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D64795407

kevinwilfong pushed a commit to kevinwilfong/velox that referenced this pull request Oct 29, 2024
…or#11330)

Summary:

This diff adds support for JODA's zzzz (or more) patterns (all equivalent) in Presto's
format_datetime function.

This is used to format long time zone names.

Long time zone names are not available from the IANA time zone database, so we
can't use the tz library to generate these.  Fortunately, unicode provides some
utilities to generate these.

Reviewed By: pedroerp

Differential Revision: D64795407
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D64795407

kevinwilfong pushed a commit to kevinwilfong/velox that referenced this pull request Oct 29, 2024
…or#11330)

Summary:

This diff adds support for JODA's zzzz (or more) patterns (all equivalent) in Presto's
format_datetime function.

This is used to format long time zone names.

Long time zone names are not available from the IANA time zone database, so we
can't use the tz library to generate these.  Fortunately, unicode provides some
utilities to generate these.

Reviewed By: pedroerp

Differential Revision: D64795407
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D64795407

kevinwilfong pushed a commit to kevinwilfong/velox that referenced this pull request Oct 29, 2024
…or#11330)

Summary:

This diff adds support for JODA's zzzz (or more) patterns (all equivalent) in Presto's
format_datetime function.

This is used to format long time zone names.

Long time zone names are not available from the IANA time zone database, so we
can't use the tz library to generate these.  Fortunately, unicode provides some
utilities to generate these.

Reviewed By: pedroerp

Differential Revision: D64795407
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D64795407

Kevin Wilfong added 4 commits October 28, 2024 20:09
…or#11283)

Summary:

The format_datetime currently outptus the time zone id if for 3 or fewer 'z' characters
in the format string.  However, the JODA library, which this is based on, does this for
3 or more 'Z' characters.

This diff fixes this, as well as adds support for a single 'Z' (which outputs the same
thing as 'ZZ' just without the colon).  So 'Z' is fully supported for any number of
characters.

To be more explicit:

https://www.joda.org/joda-time/apidocs/org/joda/time/format/DateTimeFormat.html

From the JODA docs:
`'Z' outputs offset without a colon (-0800), 'ZZ' outputs the offset with a 
colon(-08:00), 'ZZZ' or more outputs the zone id(America/Los_Angeles).`

And not clearly explained in the docs, but from experimentation:
`'z', 'zz', or 'zzz' outputs the abbreviation of the time zone(PDT), and 'zzzz' or more 
outputs the time zone name(Pacific Daylight Time)`

Currently DateTimeFormatter treats 'zzzz' or more like JODA treats 'ZZZ' or more.
This diff marks 'zzzz' or more as unsupported (we can implement that in a future
change), and moves that logic under 'ZZZ' or more to be consistent.

It also implements 'Z' (previously only 'ZZ' was implemented in DateTimeFormatter).

Reviewed By: bikramSingh91

Differential Revision: D64500193
Summary:

This diff adds support for JODA's ZZZ pattern in Presto's parse_datetime function.

This is used to parse time zone IDs (called "time zone names" in the tz library, but this 
means something else in JODA).

I borrowed the algorithm from JODA to ensure it matches Presto Java's behavior.  The
idea is to greedily consume the longest substring that matches a known time zone.  I
borrowed their algorithm which is to break the set of known time zones into a list of
those without a prefix (without the '/' character) and lists of suffixes for those with
prefixes.  This limits the number of strings that need to be compared.  I modified it
slightly to pre-sort these lists by size descending, so we don't have to necessarily 
compare every string, but can stop early if we find a match.

One other change is I added a get_time_zone_names function to our copy of the tz
library.  I tried calling get_tzdb() from DateTimeFormatter directly and accessing its
zones member to get the names, but for some reason after get_tzdb() returns every
time_zone in zones (except the first one) has a string name_ that has nullptr for its
data after get_tzdb() returns. I spent a good amount of time trying to figure out why,
but couldn't figure it out, so I gave up and added this helper method (for whatever
reason everything is fine as long as it's done in the tz.cpp file).  If anyone has pointers
as to what's going on I'm happy to investigate further, I'd much rather use the existing
get_tzdb function if I can.

Reviewed By: bikramSingh91

Differential Revision: D64708598
Summary:

This diff adds support for JODA's z, zz, zzz patterns (all equivalent) in Presto's
forma_datetime function.

This is used to format time zone abbreviations.

Reviewed By: pedroerp

Differential Revision: D64774281
…or#11330)

Summary:

This diff adds support for JODA's zzzz (or more) patterns (all equivalent) in Presto's
format_datetime function.

This is used to format long time zone names.

Long time zone names are not available from the IANA time zone database, so we
can't use the tz library to generate these.  Fortunately, unicode provides some
utilities to generate these.

Reviewed By: pedroerp

Differential Revision: D64795407
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D64795407

kevinwilfong pushed a commit to kevinwilfong/velox that referenced this pull request Oct 29, 2024
…or#11330)

Summary:

This diff adds support for JODA's zzzz (or more) patterns (all equivalent) in Presto's
format_datetime function.

This is used to format long time zone names.

Long time zone names are not available from the IANA time zone database, so we
can't use the tz library to generate these.  Fortunately, unicode provides some
utilities to generate these.

Reviewed By: pedroerp

Differential Revision: D64795407
@facebook-github-bot
Copy link
Contributor

This pull request has been merged in 1b5b013.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. fb-exported Merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants