Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should Absolute.from() have an option to resolve offset vs. timezone conflicts? #716

Closed
justingrant opened this issue Jul 1, 2020 · 28 comments · Fixed by #1097
Closed
Assignees
Labels
documentation Additions to documentation has-consensus non-prod-polyfill THIS POLYFILL IS NOT FOR PRODUCTION USE! spec-text Specification text involved

Comments

@justingrant
Copy link
Collaborator

justingrant commented Jul 1, 2020

Should Temporal should have a way to handle offset-vs.-zone conflicts when parsing ISO strings by giving users an option in Absolute.from to choose whether IANA time zone definition rules or time zone offset should "win" if they conflict? (This only applies when parsing an Absolute from an extended ISO string with both an IANA suffix and an offset.)

As @gibson042 has raised, ISO strings that are valid when stored can end up conflicting between the offset and the time zone if the time zone definition is changed after the ISO string is stored. For example, when Western Europe permanently stops observing DST in 2021 (unless it's postponed again), far-future times that were stored before the tzData change may have a conflict between the persisted time zone offset (reflecting the old time zone rules) and the new time zone rules which might require a different offset for that stored instant.

This won't be a common case, but it will certainly happen, especially after the 2021 change in Europe which will coincidentally be around the same time that the first Temporal usage may be going into production.

To handle this potential conflict, #700 proposes a configuration option key offset for the LocalDateTime.from method. But @ptomato noted in #700 (comment) that this problem applies to Absolute too:

This was the main use case we identified for Temporal.parse(). If we adopt this flexibility in Temporal.LocalDateTime, though, then it should be used in Temporal.Absolute.from() as well.

@sffc raised the same concern in #706 (comment)

the string initializer problem isn't unique to LocalDateTime; Absolute has that problem as well.

So I'm filing this issue to close the loop.

Here's the proposed option used in LocalDateTime.from in #700:

/**
 * Time zone definitions can change. If an application stores data about events
 * in the future, then stored data about future events may become ambiguous, for
 * example if a country permanently abolishes DST. The `offset` option controls
 * this unusual case.
 *
 * - `'use'` always uses the offset (if it's provided) to calculate the absolute
 *   time. This ensures that the result will match the absolute time that was
 *   originally stored, even if local clock time is different.
 * - `'prefer'` uses the offset if it's valid for the date/time in this time
 *   zone, but if it's not valid then the time zone will be used as a fallback
 *   to calculate the absolute time.
 * -  `'ignore'` will disregard any provided offset. Instead, the time zone and
 *    date/time value are used to calculate the absolute time. This will keep
 *    local clock time unchanged but may result in a different real-world
 *    instant.
 * - `'reject'` acts like `'prefer'`, except it will throw a RangeError if the
 *   offset is not valid for the given time zone identifier and date/time value.
 *
 * If a time zone offset is not present in the input, then this option is
 * ignored.
 *
 * If the offset is not used, and if the date/time and time zone don't uniquely
 * identify a single absolute time, then the `disambiguation` option will be
 * used to choose the correct absolute time. However, if the offset is used
 * then the `disambiguation` option will be ignored.
 */
export interface TimeZoneOffsetDisambiguationOptions {
  offset: 'use' | 'prefer' | 'ignore' | 'reject';
}

I have no strong opinion about the naming of this property or of the individual option strings.

I do have a moderately-strong opinion that the offset should win by default because:

  1. This is an unusual case so not having a default will add unnecessary complexity.
  2. Defaulting to reject seems contrary to our general "don't throw by default" approach taken for handling DST ambiguity, overflows, etc.
  3. If we need a default, the consequence of changing UTC values is typically worse (e.g. scheduled tasks run an hour early or late) than the consequence of changing local times.
  4. When parsing an Absolute it seems reasonable to prioritize the Absolute value that was originally stored so that round-tripping an Absolute to a string will always result in the same value, even if its local time is now different.

Absolute.from doesn't have an options parameter today so there's no naming conflicts. That said, #607 applies here too: if we do adopt a LocalDateTime type then we'll have methods that accept multiple options properties. Naming them all disambiguation (even for different kinds of disambiguation) will make it harder to interop between LocalDateTime and other Temporal types.

@ptomato ptomato added the zoned-type Part of the effort for a type with timestamp+timezone label Jul 1, 2020
@justingrant
Copy link
Collaborator Author

@ptomato - I'm going to remove the zoned-type label from this issue because it's really focused on what we should do with the existing Absolute type, and it applies even if there's no LocalDateTime. If you disagree feel free to put the label back.

@justingrant justingrant removed the zoned-type Part of the effort for a type with timestamp+timezone label Jul 3, 2020
@thojanssens
Copy link

ISO strings that are valid when stored can end up conflicting between the offset and the time zone

ISO 8601 strings don't contain the time zone, they can only contain an offset:
https://en.wikipedia.org/wiki/ISO_8601#Combined_date_and_time_representations

Why wouldn't we reject non-ISO strings?

@ptomato
Copy link
Collaborator

ptomato commented Jul 8, 2020

We use a (common) extension to the ISO format which puts the name of the time zone in brackets after the offset.

@thojanssens
Copy link

thojanssens commented Jul 8, 2020

For example Elixir won't accept a datetime string holding the time zone in brackets, as it is not conform to ISO. I think it's originally a Java thing.

Actually there is no reason to have the time zone AND offset in the datetime string. It just helps to create errors and confusion.

  • Future dates are mostly stored as (wall) datetime string and time zone (separately) => as offset might change
  • Past dates as datetime string with offset (or as timestamp) => offset might also change by IANA for past dates but we generally don't want those changes for something that happened in the past in our application

But why support a datetime string with offset and time zone? Why not stick to ISO and reject all other strings?

It's really just some food for thought, and I raise those questions here as this issue reveals exactly the problem with such strings.

@justingrant
Copy link
Collaborator Author

justingrant commented Jul 8, 2020

These are all good questions. Some have easier answers than others.

But why support a datetime string with offset and time zone?

A datetime+timezone pair is not sufficient to uniquely identify a moment in time. Offset is required to disambiguate repeated wall-clock times, e.g. during the hour after DST ends in the Fall. Some apps may be OK to ignore this ambiguity while others may not. Therefore the default is to always persist the offset, and the default is to prefer the offset over the time zone if both are present.

That said, it's a good idea to enable apps to avoid persisting the IANA identifier or the offset if their use cases don't need them. See #741 which is the persistence-time equivalent to this parse-time issue.

Why not stick to ISO and reject all other strings?

It's helpful to have a single-string format to enable interop between apps and platforms. While the bracketed Java extension isn't part of the ISO standard, it's popular across platforms and libraries so it's a reasonable alternative to asking every developer to create their own persistence format, which would make interop harder-- not just with Java, but also between different JavaScript apps and libraries.

Actually there is no reason to have the time zone AND offset in the datetime string. It just helps to create errors and confusion.

Here's a few use cases why having all three pieces of data (the wall time, the offset, and the time zone) can be helpful:

  • Apps may want to know if the time zone definition has changed since the string was stored. For example, if there's a conflict then the app may need to ask the user which one is correct, or the app may need to run additional logic to pick a winner.
  • Apps may not be sure how they'll want to resolve conflicts when they happen (or, more realistically, developers storing data may not understand the issue). Storing a "lossless" value retains flexibility for developers to later choose how to parse the data. But if the data is lossy by default, by the time developers understand the issue it may be too late to fix because data is already stored.
  • Even if an app may not need the offset or time zone, they may need to interop later with other code that does need to know it. By storing all three pieces, developers will have future-proof data unless they explicitly opt out.

I'm not saying that all apps need this flexibility. But mistakes in persistence can be hard or impossible to fix after the fact so it's smart to make the default "lossless" and to require apps to opt in to lossy persistence.

@thojanssens
Copy link

thojanssens commented Jul 9, 2020

A datetime+timezone pair is not sufficient to uniquely identify a moment in time. Offset is required to disambiguate repeated wall-clock times, e.g. during the hour after DST ends in the Fall. Some apps may be OK to ignore this ambiguity while others may not.

Even a datetime+offset+time zone is not sufficient to identify a unique moment as datestring in time with time zone. You need to add the tzdata version to it. For example "2022-07-09T07:32:00" + "+02:00" + "Europe/Brussels" + "tzdata2020a"
Only with these 4 fields you have a unique datestring moment with time zone.

When storing the time zone separately for example in a database:
datetime: "2022-03-09T05:00:00+02:00"
time zone: "Europe/Brussels"
It's now easy to take into account the offset over the time zone or vice-versa if those data conflict.
I think the time zone has been added to the ISO string because some devs wanted to save a field.

I was not saying that devs should not store an offset and time zone together. I was saying that those should be stored separately instead of combining those into a string.

Also about #741, sending a non-ISO-conform datetime string to my server (with time zone in brackets), will make my code crash (Elixir doesn't accept non-ISO strings). To have an ISO string you are required to add an option every time? Ooch.. That comes as a surprise.

Pros/cons of supporting bracketed time zones in datetime strings ->

PROS:

  • interop

CONS:

Now if you allow only ISO strings, all this burden disappear:-D

@ptomato
Copy link
Collaborator

ptomato commented Jul 13, 2020

To add to the list of pros:

  • Encourage storage format with a precedent, instead of ad-hoc format

@thojanssens
Copy link

thojanssens commented Jul 14, 2020

@ptomato it would be interesting to dig why ISO didn't embrace that extension adding the time zone. Probably for the same reasons:) We can see all it involves in the code and the questions that it raises.

As I mentioned here #741 (comment), this is how we currently get an ISO string from a given absolute and a given time zone:

const absolute = Temporal.now.absolute()
absolute.toString(Temporal.TimeZone.from('America/Los_Angeles').getOffsetStringFor(absolute))

It requires to get the offset from a TimeZone and a given absolute, then calling toString on that same absolute.
This is not ok.

For example @justingrant proposed the toString of an absolute + given time zone to omit the time zone to stay ISO compliant, but the toString of a zoned datetime (if any) to output the time zone. Quoting from #741 (comment):

Also, if LocalDateTime is adopted, then it could make sense to make { timeZone: false } the default for Absolute.prototype.toString() and { timeZone: true } the default for LocalDateTime.prototype.toString().

That would be confusing in my opinion.

These will be the questions we will have to deal with if we work with the extended format, which we currently support and seem to favor over ISO.

@ptomato
Copy link
Collaborator

ptomato commented Jul 14, 2020

As I mentioned here #741 (comment), this is how we currently get an ISO string from a given absolute and a given time zone:

const absolute = Temporal.now.absolute()
absolute.toString(Temporal.TimeZone.from('America/Los_Angeles').getOffsetStringFor(absolute))

It requires to get the offset from a TimeZone and a given absolute, then calling toString on that same absolute.
This is not ok.

It's certainly reasonable that that use case needs to be made more convenient, but I don't see why that invalidates the whole idea of using a common extension to the ISO 8601 format. I don't think this is necessarily a bad habit.

We already have a laundry list of ways in which the format we loosely refer to as "ISO" deviates from ISO 8601. Or, I should say, the current version of ISO 8601, because there were strings (such as MonthDay strings) accepted in ISO 8601:2000 that became forbidden in ISO 8601:2004. There is also RFC 3339 in common use which is almost like ISO 8601 but not quite: stricter in some ways, and looser in others. I think the reality is that we have to make these concessions due to the very common appearance around the web of not-quite-ISO strings. And this is by necessity a subjective judgement call. Clearly we should not accept wacky strings like Tuezzzzday, Julyjuly 14/! 2020 3:42 pM like Date.parse() does, clearly we should accept 2020-07-14 15:42-0700 (not valid ISO 8601 but extremely common), but in between we just have to arbitrarily pick where the boundary lies using our good judgement.

See also #198 for discussions of some of these deviations.

On some occasions the idea has been suggested to have a separate toISOString() method that emits only strict ISO 8601 and is not concerned with round-tripping: #293 (comment)

We also have an explainer that we are hoping to build wider consensus around so that eventually these extensions can one day be standards-tracked.

@thojanssens
Copy link

I think having a toISOString() function is a good idea 👍

@pipobscure
Copy link
Collaborator

pipobscure commented Jul 16, 2020

I think the question is misguided, and here is why:

  • by default an Absolute-toString does NOT contain a timezone or offset, but rather is expressed as a UTC string (with a Z at the end).
  • by default a DateTime-toString does NOT contain a timezone or an offset, but rather nothing.
  • the only way to get that version is by asking explicitly for it by passing a timezone or offset to Absolute-toString.

Because of that the actual IANA zone is only ever present if in storing the data it was intended to be the more important element. So the offset in that case is only meant to disambiguate. However it can only disambiguate fully if it throws an exception for the case where the offset is not valid for that datetime combination in that IANA zone.

So that in turn means that in Absolute.from this invalid combination always needs to throw unless being explicitly told not to. So we have to have a disambiguation parameter that can be earlier, later, offset or reject where reject is the default.

@thojanssens
Copy link

For past dates, it is the offset that is most important.
For future dates, the time zone is most important.

Because of that the actual IANA zone is only ever present if in storing the data it was intended to be the more important element. So the offset in that case is only meant to disambiguate. However it can only disambiguate fully if it throws an exception for the case where the offset is not valid for that datetime combination in that IANA zone.

So that in turn means that in Absolute.from this invalid combination always needs to throw unless being explicitly told not to. So we have to have a disambiguation parameter that can be earlier, later, offset or reject where reject is the default.

I didn't fully understand your arguments. I think you talk specifically about DST rule changes?

But IANA changes are not limited to DST changes. Say some country, without any DST rules, had its offset to +7 somewhere in the past, but it needs to be corrected to +6. In that context, the options "earlier" and "later" are not relevant.

In any case, I think throwing an exception because some random IANA change happened is not a good idea; you would have to add error handling code every time for any parsing of the extended format including an offset and a time zone.

@ptomato

This comment has been minimized.

@justingrant
Copy link
Collaborator Author

Now that we will have LocalDateTime, should we remove parsing and emitting of IANA time zones from Absolute? Might be cleaner to have all timezone-related serialization and deserializaton in one place.

@ptomato
Copy link
Collaborator

ptomato commented Aug 4, 2020

In my opinion: Remove from emitting yes. Remove from parsing in that the IANA time zone is ignored, but it is not an error to have one in a string passed to Temporal.Absolute.from().

(Discussion about removal of emitting IANA time zones from Absolute: #741)

@justingrant
Copy link
Collaborator Author

Agreed. Let's leave this issue open until after LocalDateTime is merged, and then we can remove parsing (meaning what you said where the time zone is ignored) and emitting of IANA time zones in Absolute.

@gibson042
Copy link
Collaborator

I don't think a parsing method should ignore part of its input, especially when that can displace future events. If Absolute.from isn't going to use bracketed time zones then it should reject them; authors can be directed to parse with LocalDateTime.from and then project the results into Absolute.

@pipobscure
Copy link
Collaborator

Hard disagree: all temporal types take input that overspecifies since that allows parsing parts of a whole. The IANA bracket is overspecified from an Absolute PoV and should just be ignored

@sffc
Copy link
Collaborator

sffc commented Aug 5, 2020

I don't think a parsing method should ignore part of its input, especially when that can displace future events. If Absolute.from isn't going to use bracketed time zones then it should reject them; authors can be directed to parse with LocalDateTime.from and then project the results into Absolute.

Previous discussions:

#428
https://github.com/tc39/proposal-temporal/blob/main/docs/parse-draft.md

@justingrant
Copy link
Collaborator Author

There's two questions here:

  1. Should Absolute.from have an option to resolve offset vs. timezone conflicts?
  2. What should be the default behavior when there is an offset vs. timezone conflict?

Did we already answer (1)? Should this behavior (whichever we decide for the default) be controllable by the Absolute.from caller like it will be in LocalDateTime.from? I don't have an opinion yet about this question. I'm curious about what others think. If we do have an option then we should try to align the option choices with LocalDateTime.from.

Re: the default behavior, I think there are good arguments on both sides. @gibson042 is correct that catching conflicts here could be useful. I also agree with @pipobscure that ignoring out-of-scope data is more consistent with the rest of Temporal. So I don't have a strong opinion about what the default should be.

FWIW, LocalDateTime's default is now 'reject', per the decision in last Friday's meeting:

Consensus is that this issue only appears when IANA time zone name and offset disagree, and the default should be to throw in that case. We should have a disambiguation option in all places where this might occur so that you can specify what to do instead of throwing.

Although I'm very open to being convinced otherwise, I lean towards ignoring conflicts, because someone using Absolute.from is opting in to parsing only the epoch timestamp. Not the calendar date or clock time. Not the time zone. Just the timestamp. So IMHO it seems reasonable to ignore the IANA time zone (at least by default) because if the user cared about the DateTime or the TimeZone, then they wouldn't be using Absolute.from. Instead they'd be using DateTime.from or LocalDateTime.from.

But I'm open to other arguments too; I don't have a strong opinion here.

BTW, I also might be suggesting this because "ignore" coincidentally matches an idea I had for a PR to the docs to better explain Temporal ISO 8601 parsing, via a visual diagram that maps string parts to Temporal types. If folks like the idea, I can PR it at some point. Something like this:

image

@ryzokuken
Copy link
Member

Ignoring in Absolute feels very on-brand wrt the rest of the parsing behavior as @justingrant pointed out and if someone wants explicitly not to ignore the bracketed timezone, they can initialize a LocalDateTime and project to an Absolute anyway, right?

@sffc
Copy link
Collaborator

sffc commented Aug 6, 2020

+1 to your diagram.

FWIW, LocalDateTime's default is now 'reject', per the decision in last Friday's meeting

That sounds bad to me. Generally, the offset and IANA name will match; it is rare that they don't match. Most (basically all) developers won't think about or handle the reject case, which means that their app will crash. I think defaulting to absolute disambiguation is almost always the right choice.

@gibson042
Copy link
Collaborator

I remain opposed to ignoring time elements, especially when an inconsistency can change the results—silent failures are even worse than unexpected errors. Anyone wanting to ignore the time zone and extract fixed-offset instants from strings like "2020-08-06T08:15+09:00[Undisclosed]" should be using Temporal.parse(str).absolute, not relying on sloppiness in an otherwise strict library.

@gibson042
Copy link
Collaborator

gibson042 commented Aug 20, 2020

Update: I was convinced in the meeting that it is acceptable to treat string input as a representation of structured fields and subsequently ignore irrelevant fields as with objects, subject to syntactic conformance with a well-documented format. Basically, Temporal[$type].from(string) is analogous to Temporal[$type].from( RegExp(formats[$type]).exec(string).groups ).

@sffc
Copy link
Collaborator

sffc commented Aug 20, 2020

Conclusions from 2020-08-20:

  • Absolute.from should use offset and ignore IANA
  • Absolute.from should not have an option to change this behavior
  • LocalDateTime.from should default to "reject", but the spec should recommend that the exception text give a pointer to a document with additional information on this case

ptomato added a commit that referenced this issue Aug 25, 2020
When parsing an ISO 8601 string, Temporal.Absolute.from should always
use the offset, and ignore the bracketed IANA name, even if the two
disagree.

Closes: #716
@ptomato ptomato self-assigned this Aug 25, 2020
ptomato added a commit that referenced this issue Aug 25, 2020
When parsing an ISO 8601 string, Temporal.Absolute.from should always
use the offset, and ignore the bracketed IANA name, even if the two
disagree.

Closes: #716
@ptomato
Copy link
Collaborator

ptomato commented Aug 25, 2020

I have this in a branch at https://github.com/tc39/proposal-temporal/tree/716-741-absolute-zone-names but it should not be merged until LocalDateTime lands.

ptomato added a commit that referenced this issue Aug 26, 2020
When parsing an ISO 8601 string, Temporal.Absolute.from should always
use the offset, and ignore the bracketed IANA name, even if the two
disagree.

Closes: #716
@ptomato ptomato added this to the Stable API milestone Aug 27, 2020
ptomato added a commit that referenced this issue Sep 2, 2020
When parsing an ISO 8601 string, Temporal.Absolute.from should always
use the offset, and ignore the bracketed IANA name, even if the two
disagree.

Closes: #716
@ptomato
Copy link
Collaborator

ptomato commented Sep 12, 2020

What still needs to be discussed in this issue? Is the open question whether Temporal.Absolute.from('2020-08-05T20:06:13+09:00[Asia/Tokyo]') should throw or not? (IMO, no, it should ignore the IANA name.)

@justingrant
Copy link
Collaborator Author

justingrant commented Sep 12, 2020 via email

@ptomato ptomato added documentation Additions to documentation spec-text Specification text involved non-prod-polyfill THIS POLYFILL IS NOT FOR PRODUCTION USE! labels Sep 18, 2020
ptomato added a commit that referenced this issue Nov 2, 2020
When parsing an ISO 8601 string, Temporal.Instant.from should always
use the offset, and ignore the bracketed IANA name, even if the two
disagree.

Closes: #716
ryzokuken added a commit that referenced this issue Nov 2, 2020
When parsing an ISO 8601 string, Temporal.Instant.from should always
use the offset, and ignore the bracketed IANA name, even if the two
disagree.

Closes: #716

Co-authored-by: Ujjwal Sharma <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Additions to documentation has-consensus non-prod-polyfill THIS POLYFILL IS NOT FOR PRODUCTION USE! spec-text Specification text involved
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants