Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: stop relying on time.LoadLocation for time zone data #36864

Closed
knz opened this issue Apr 16, 2019 · 9 comments
Closed

sql: stop relying on time.LoadLocation for time zone data #36864

knz opened this issue Apr 16, 2019 · 9 comments
Labels
A-sql-pgcompat Semantic compatibility with PostgreSQL B-os-windows Issues specific to the Windows OS. C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)

Comments

@knz
Copy link
Contributor

knz commented Apr 16, 2019

Summary of the problem

CockroachDB uses the Go language's idea of where time zones are defined, and this is non-standard. This breaks time zone support in some environments, most notably Windows.

Workarounds

  • run on an OS platform where Go has proper timezone support (e.g. Linux)
  • get a copy of zoneinfo.zip from the Go sources and set the env var ZONEINFO to point to it
  • install the Go sources and set GOROOT to point to them

Note: since 20.1 a node will refuse to start up (#45680) if the tzdata is not available. It is possible to forcefully start a node (and cause inconsistent time results in SQL) via the env var COCKROACH_INCONSISTENT_TIME_ZONES.

Description of the problem

CockroachDB currently uses Go's standard time.LoadLocation to load time zone data.

However Go's standard time package is very simple/dumb. It takes the time zone name given as a file name, tries to open that in the system-wide tzdata directory, and satisfies itself with what it finds there. If the file does not exist (as would happen if the given string is mis-cased), tough luck.

There are multiple problems with this design:

Solution

CockroachDB should:

  • embed its own copy of a time zone database (like postgres does? I'm not sure)
  • it should only accept time zone names that are known to be valid/exist when cockroachdb was built (to avoid trying to open a file with a special name)
  • it should provide its own custom time zone loader, which knows how to find time zone names case-insensitively.

Another approach would be to scan the system-wide timezone database upon process start-up and then normalize the name cases in an in-RAM cache.

@rolandcrosby
Copy link

@knz How big of a work item is this? If it's not huge, is it something that we could consider backporting to a 19.1 point release?

@knz
Copy link
Contributor Author

knz commented Apr 17, 2019

No I don't think a backport would be reasonable, unless there was a hefty $$$ reason to. In any case you'd need to escalate to @bdarnell (or @petermattis)

@bdarnell
Copy link
Contributor

This is a pretty large change and I don't think a backport is appropriate. But this change is also trying to solve many issues at once. Some of them might have simpler/partial fixes that are backportable (for example, supporting lowercase utc (perhaps only for UTC) for #14988 would be easy).

embed its own copy of a time zone database (like postgres does? I'm not sure)

Postgres does embed its own tzdata, with a commonly-used compile-time option to use the system tzdata instead.

@knz
Copy link
Contributor Author

knz commented Apr 17, 2019

lowercase utc is literally the only change I think is small

@awoods187 awoods187 added A-sql-pgcompat Semantic compatibility with PostgreSQL C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) labels Apr 22, 2019
@knz
Copy link
Contributor Author

knz commented Jun 17, 2019

@jordanlewis please consider this for a starter project. It's super well scoped, not too complicated, but still challenging/interesting.

knz added a commit to knz/cockroach that referenced this issue Jun 17, 2019
This patch is a crutch to support the special case of `"utc"` by
popular demand. For more details and guidance towards a better
solution, see issue cockroachdb#36864.

Release note (sql change): CockroachDB now supports the special case
`set timezone = 'utc'` as a special alias for `set timezone =
'UTC'`. The other time zone names are still case-sensitive as
previously, pending resolution of issue cockroachdb#36864.
craig bot pushed a commit that referenced this issue Jun 17, 2019
38195: sql: special case lowercase "utc" as alias to "UTC". r=knz a=knz

Informs #36864.
Informs (and perhaps closes) #14988.

This patch is a crutch to support the special case of `"utc"` by
popular demand. For more details and guidance towards a better
solution, see issue #36864.

Release note (sql change): CockroachDB now supports the special case
`set timezone = 'utc'` as a special alias for `set timezone =
'UTC'`. The other time zone names are still case-sensitive as
previously, pending resolution of issue #36864.

Co-authored-by: Raphael 'kena' Poss <[email protected]>
craig bot pushed a commit that referenced this issue Jun 17, 2019
38197: cli: warn the user if the time zone database is unavailable r=knz a=knz

Fixes #23828.
(The part in that issue about the database not being up to date cannot be fixed using the Go standard library - this does not give access to the search path and the version number. A comprehensive fix needs to wait for #36864.)

Release note (cli change): CockroachDB will now print out an error
message and an informative hint if the time zone database is unusable.

38214: storage/tscache: save 4 bytes per encoded tscache value r=ajwerner a=nvanbenschoten

`unsafe.Sizeof(hlc.Timestamp{})` was including XXX_sizecache, which was bloating
the struct size even though it didn't have an effect on the encoded size of the
timestamps, which we control in `encodeValue`.

Even once we remove `XXX_sizecache` (#37706), it looks like `unsafe.Sizeof` will
still include padding which we don't want to capture in the `encodedTsSize` constant,
so this seems like the best approach. See https://play.golang.org/p/5swJSSbP6J4.

Release note: None

Co-authored-by: Raphael 'kena' Poss <[email protected]>
Co-authored-by: Nathan VanBenschoten <[email protected]>
@kenliu kenliu added the B-os-windows Issues specific to the Windows OS. label Jul 24, 2019
@otan
Copy link
Contributor

otan commented Feb 11, 2020

Also note timezone data doesn't [seem to] exist after 2038 (whoo, int32!), causing bugs such as cockroachdb/django-cockroachdb#124 when using golang's time library.

This may be fixed in https://go-review.googlesource.com/c/go/+/161202/ though.

tbg pushed a commit to tbg/cockroach that referenced this issue Mar 4, 2020
Previously, unavailability of named time zones (issue cockroachdb#36864) was only
causing an error message to be printed on the terminal and the log
files. However, some users were not looking and would instead run
directly into silently incorrect results in SQL.

This change ensures a node doesn't even start when the TZ database is
not ready.

Example output:

```
ERROR: unable to load named timezones
HINT: See: cockroachdb#36864
--
Check that the time zone database is installed on your system, or
set the ZONEINFO environment variable to a Go time zone .zip archive.
Failed running "demo"
```

Release note (cli change): CockroachDB now refuses to start if named
time zones are not properly configured. It is possible to override
this behavior for testing purposes, with the understanding that doing
so will cause incorrect SQL results and other inconsistencies, using
the environment variable `COCKROACH_INCONSISTENT_TIME_ZONES`.
kota2and3kan added a commit to kota2and3kan/dockerfile-cockroachoss that referenced this issue May 28, 2020
Since 20.1 a cockroach process will refuse to start up if the tzdata
is not available. Please refer the following URL for details.
cockroachdb/cockroach#36864
@bobvawter
Copy link
Contributor

Go 1.15 will include an option to bake (fallback) tzdata into the binary.

https://tip.golang.org/doc/go1.15#time/tzdata

@otan
Copy link
Contributor

otan commented Oct 9, 2020

Thanks @gliptak -- let me try that out!

craig bot pushed a commit that referenced this issue Nov 16, 2020
56634: timeutil: embed Go time zone data into CockroachDB r=knz a=otan

Refs #36864

* Use go's time/tzdata to embed timezone data.
* Modify timeutil package to always use this package.
* Remove cli checks for tzdata.
* Add lint rule to use timeutil.LoadLocation

The binary size changes by ~1MB here.

Release note (general change): The timezone data is now built in to the
CockroachDB binary, which is the fallback source of time if tzdata is
not located by the default Go standard library.

56698: stopper: assert that stoppers don't leak r=tbg a=knz

Fixes #56697
Fixes #56696

Until now leaked stoppers were just reported in tests but did not
cause failures. This PR switches the check to cause errors instead.

Release note: None

Co-authored-by: Oliver Tan <[email protected]>
Co-authored-by: Raphael 'kena' Poss <[email protected]>
@otan otan closed this as completed May 4, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-sql-pgcompat Semantic compatibility with PostgreSQL B-os-windows Issues specific to the Windows OS. C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
Projects
None yet
Development

No branches or pull requests

8 participants