-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
timeutil: embed time zone data into CockroachDB #55377
Conversation
Sorry I don't get it - why did you opt for a 3rd party package instead of using the facility we get for free in go 1.15? |
I recall we're waiting for Go 1.16 before an upgrade. |
@nvanbenschoten @petermattis do you think we could consider go 1.15 for crdb v21.1 if 1.16 is not there on time? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this PR as-is is a good intermediate step even if go 1.15 makes it even simpler in the future
Also @awoods187 @bdarnell motion to backport this to 20.2? The change seems minimal.
Reviewed 14 of 14 files at r1.
Reviewable status: complete! 1 of 0 LGTMs obtained
* Use 4d63.com/tz to embed timezone data. * Modify timeutil.LoadLocation to use this data. * Remove cli checks for tzdata. * Add lint rule to use timeutil.LoadLocation The binary size changes by ~1MB here. Release note (general change): The timezone data is now built in to the CockroachDB binary, thus no longer relying on the system tzdata for locations.
Yes, this is what we had decided upon when we switched back to go 1.13. We're going to switch to go 1.15 early in the next release cycle (a week or two after v20.2 is out the door). go 1.16 won't be released until 2021, so it's not clear whether it will land in time for us to use it in v21.1. |
I think i'm still in favour of this solution as it is patchable to allow lower case time zone comparisons (#36847), as well as a good safety net in case go 1.15's partial fix does not work out and we need to go back to 1.13. (also backportable to 20.2 if we chose to) |
Yes your PR is good to go as-is. Your argument is sound. |
The go 1.15 tzdata package is used as a fallback - if external time zone data is found on this system, it will be used instead. This package uses the embedded data as the sole source. That means you lose the ability to update the tzdata package as updates occur, which is sometimes significant as governments like to change DST rules on short notice. (and it appears that this package tracks the zoneinfo files as embedded in Go, rather than the upstream sources from IANA). This feels like a significant regression to me, although maybe it's not one in practice since docker makes it impractical to manage tzdata updates in the traditional way and I've not heard of anyone who cares or even notices. Neither this change nor the go 1.15 solution completely solve #31978. With external tzdata, there is a window of inconsistency when a new version of the tzdata package is rolled out, but this is a low-risk change so it (can be) done rapidly across the fleet. Embedding tzdata in the binary means that your time zone data is inconsistent for the duration of your CRDB canary and upgrade process, which can be quite long.
But the author does not intend to keep updating it after go 1.16 is released, so this would put us on the hook for future maintenance. So I'm 👎 on backporting this to 20.2, although it would have the nice side effect of letting us more easily use |
ok so maybe this PR should not be marked to solve #31978 - there would be more work needed afterwards (store the tzdata in-db to ensure it's consistent across all nodes) |
@bdarnell regardless of whether we adopt this PR or adopt go 1.15, what do you think the future looks like? It seems to be that we are committed to running CockroachDB inside containers, and that goes with keeping the system image immutable at run-time. So the only way to "pick up the latest IANA changes" is to re-bake the container base image, while keeping the CockroachDB version constant. So do you think that's going to be our process (in theory)?
If we were to recommend this, why do you think it's better to depend on system-level tzdata, instead of an archive we'd inject ourself at step (1)? Finally, what do you recommend should be a good next step for @otan here? |
We could introduce an env variable to use a tzdata from a user compiled archive instead of our embedded version if we so choose (or even the system one). I think a user can mount this onto each docker image (or in the case of k8s use a config map) if they wanted to pursue this avenue. Alternatively, we could make using the system one the default and falling back to the embedded version in 20.2 if we so choose (of course, we should log this in cli/start or demo). (To fully solve #31978 it sounds like tzdata should be stored in the database as part of a system table, which is also gross as updating it with IANA sounds like a painful primitive and needing to use internal executors everywhere to fetch this metadata. I suppose that solution fixes this too but I'm not sure if this is viable. ) |
I think in the long-term future, in order to solve #31978 we need to support multiple versions of tzdata and switch between them in a coordinated way across all nodes, which precludes either using the system files or the go 1.5 package. But I think that's a long way off, so let's think about the nearer-term future.
If we use the system-level data, each time we produce a patch release we get the latest tzdata from redhat automatically (along with security patches for base libraries, etc). If we embed something into the binary, we are either tied to the Go upgrade cycle (they appear to only plan to update the go embedded tzdata every 6 months, although I'm sure that if a major jurisdiction has a late-breaking DST change they'll do a faster release) or we have to do extra work on our side to pick up updates.
I think I'd just wait for 21.1 and go 1.15. If we really wanted to backport embedded tzdata into 20.2.x we could add something that defaults to the system tzdata and falls back to the embedded one if it's missing, but this feels like a new feature that wouldn't qualify for a backport. |
Looks like #55467 takes care of that. I guess that just leaves the remaining question of whether this is "better" than having the tzdata behaviour depend on the system. |
alrighty, since we're going to go1.15 soon, i'll leave this alone |
The binary size changes by ~1MB here.
full credits to @gliptak for linking.
Informs #36864
Refs #31978
Release note (general change): The timezone data is now built in to the
CockroachDB binary, thus no longer relying on the system tzdata for
locations.