Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add regionCode attribute #690

Merged
merged 12 commits into from
May 16, 2018
Merged

Add regionCode attribute #690

merged 12 commits into from
May 16, 2018

Conversation

marcoscaceres
Copy link
Member

@marcoscaceres marcoscaceres commented Mar 1, 2018

closes w3c/contact-picker#65

The following tasks have been completed:

Implementation commitment:


Preview | Diff

@marcoscaceres marcoscaceres changed the title Add region code (Part 2) Add regionCode attribute (Part 2) Mar 1, 2018
@marcoscaceres
Copy link
Member Author

Added test and indicated commitment to support this in Gecko.

Anyone else going to support this (PaymentAddress.regionCode): @aestes, @adrianba, @zkoch, @adrianba? If so, please give a tick above.

If "maybe later", also useful, in case this needs to be also marked at risk.

@marcoscaceres
Copy link
Member Author

Added MDN Page also now part of the PaymentAddress definition.

@rsolomakhin
Copy link
Collaborator

@sebsg: I saw you working on bringing ISO region codes to libaddressinput and Chrome. Do you have a tracking bug for exposing it in PaymentRequest that we can link here?

@sebsg
Copy link

sebsg commented Mar 1, 2018

I created https://bugs.chromium.org/p/chromium/issues/detail?id=817960 to track the work

@marcoscaceres
Copy link
Member Author

@rsolomakhin, when you can, could you please let me know the bug number for Chrome for adding the attribute?

@rsolomakhin
Copy link
Collaborator

@marcoscaceres: We will use https://crbug.com/817960 for adding the attribute as well.

@marcoscaceres marcoscaceres force-pushed the add_region_code branch 3 times, most recently from 07f12e2 to 5a02e9c Compare March 8, 2018 11:47
@marcoscaceres marcoscaceres force-pushed the add_region_code branch 2 times, most recently from 1107d72 to 2d45682 Compare March 15, 2018 02:27
@marcoscaceres marcoscaceres force-pushed the add_region_code branch 2 times, most recently from 611002c to 2b49bd6 Compare March 15, 2018 04:05
@marcoscaceres marcoscaceres requested a review from domenic March 15, 2018 04:05
index.html Outdated
<var>details</var>["<a>regionCode</a>"].
</li>
<li>If <var>regionCode</var> is not a valid [[!ISO3166-2]]
subdivision code, throw a <a>RangeError</a> exception.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this reference give an algorithm for determining validity? I would check, but, I assume it's behind a paywall...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Behind a paywall (and I doubt it would have the algorithm... but you never know). I'll buy the spec and have a look.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ISO3166-2 defines the structure of subdivision codes, and could be usable (section 5.2)... but it's defined in terms that are not "web terms" (e.g., speaks about "characters" instead of code points).

Options:

  1. Just send implementers to ISO3166-2, and say "valid as per Section 5.2 of ISO3166-2".
  2. Take ISO3166-2 format, and talk about it in terms of infra.

With 1, the interop risk is low... the format is pretty simple:

  • a country-code, a hyphen, 1-3 aphanum.

For 2, we could do:

The steps to validate a region code are as follows. The algorithm takes a string regionCode as input. The algorithm returns true if valid, false if invalid.

  1. If regionCode doesn't have exactly one U+002D (-), return false.
  2. Set subCode to the result of ASCII uppercase regionCode.
  3. Split subCode on hyphen, and let country be the first item, and subdivision be the second item.
  4. If country is not ASCII upper alpha and length 2, return false.
  5. If country is not a [[!ISO3166-1]] country code, return false.
  6. If subdivision is not ASCII alphanumeric and of length between 1 and 3, return false.
  7. If subdivision is not [[!ISO3166-2]] subdivision that corresponds to country, return false.
  8. Return true.

@domenic, happy to add 2.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. I guess 1 is good. I was going to say 2 was nice because then you don't have to pay for ISO3166-2 to implement this spec, but it turns out you would still have to do so for your step 7, so let's just stick with 1.

index.html Outdated
the empty string:
<ol>
<li>Let <var>regionCode</var> be the result of <a>strip leading
and trailing ASCII whitespace</a> and <a data-cite=
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing "from" after "stripping leading and trailing ASCII whitespace"

index.html Outdated
<var>regionCode</var>.
</li>
</ol>
</li>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we consider deriving region from regionCode, so the caller only has to pass one? I'm thinking ahead to the constructor; do we want to prevent new PaymentAddress({ ..., regionCode: "PT-11", region: "Tasmania" })?

I guess for the cases where there is no corresponding regionCode, we'd still want to pass region. But if regionCode is not the empty string, deriving region from it might be good...

Copy link
Member Author

@marcoscaceres marcoscaceres Mar 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But if regionCode is not the empty string, deriving region from it might be good...

Agree. It's definitely doable. ISO3166-2 provides lookup tables, for example:

AD-07* Andorra la Vella
AD-02* Canillo
AD-03* Encamp
AD-08* Escaldes-Engordany
AD-04* La Massana
AD-05* Ordino
AD-06* Sant Julià de Lòria

Where column 2 (defined thing) is "...the country subdivision names in the administrative language of the country concerned, where relevant with diacritic signs..." (as Unicode). Note that "a country’s administrative language is a written language used by the administration of the country at the national level". So, the result won't be in English a lot of the time - but that's fine, IMO. This is not for display purposes.

So, something like:

The steps to derive a region from a country subdivision code element are as follows. The string takes a DOMString subCode as input and returns a DOMString.

  1. If validate a region code subCode return false, throw RangeError exception.
  2. ASCII uppercase subCode, and let normalizeCode be the result.
  3. Find normalizeCode in Section 8, "List of country subdivision names and their code elements" of ISO3166-2 and return the value of "column 2" for the matching country subdivision code element.

Sound good?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds great! Probably we should get implementer buy-ins though; maybe as a separate PR? Because it'd mean having to ship that table with the browser...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree... ☝️@rsolomakhin, @mnoorenberghe, something to start thinking about. Would appreciate your input.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are already using libaddressinput (and I believe Chromium does too) so it would be great if we could also use that here instead of having two versions of similar data. Does ISO3166-2 only have one name per region? What about when there are multiple official languages? Consider CA-QC: is it "Quebec" or "Québec" in ISO3166-2. libaddressinput provides both but without a clear way to know which one would match ISO3166-2.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumedly, we can just grab the "en" column 2

I'm not quite sure how this all fits together but won't this affect what gets returned in the region property and therefore may be displayed back to the user on the payment confirmation page the merchant renders? If so, it would seem like we should use the language best for the user.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can probably perform some kind of lookup, based on the document language. But then it gets into preference order for when it doesn't match the user's language (and there is no "en").

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems crazy town, but:

  1. Try to match document language.
  2. Try to match user language (like Accept-Language)?
  3. Try to match "en".
  4. Um... just give me whatever you got.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use libaddressinput in Chromium and @sebsg is helping us to add the ISO codes to that.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But if regionCode is not the empty string, deriving region from it might be good.

The regionCode field takes care of interop, so we don't necessarily have to agree on how the region field is derived from that. Having said that, the rule of thumb we have been using in Chrome is to use the browser user interface language when deciding between Quebec and Québec. If the browser user interface language does not match any of the languages supported by the country, then Chrome falls back to the first language specified in libaddressinput data for that country. For example, if the browser user interface language is Italian, then Canadian addresses will be in English. Although the default language works out to be English for Canada, that's not something that we should hardcode for all countries in the world, IMHO.

index.html Outdated
<td>
A <a>region</a> represented as a [[!ISO3166-2]] subdivision
code or the empty string, stored in its canonical uppercase
form. For example, "<code>PT-11</code>" represents the Lisbon
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"or the empty string" at the end of the sentence, please. There is no canonical uppercase form for the empty string.

@marcoscaceres marcoscaceres changed the title Add regionCode attribute (Part 2) Add regionCode attribute Mar 15, 2018
index.html Outdated
<li>Set <var>address</var>.<a>[[\regionCode]]</a> to
<var>regionCode</var>.
</li>
<li>Let <var>region</var> be the corresponding [[!ISO3166-2]]
Copy link
Member Author

@marcoscaceres marcoscaceres Mar 22, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rsolomakhin, I've tried to summarize what you said about how Chrome does the ordering, while keeping the order in which things are matched optional. However, I added matching on the document's body's language as first, to match what we say in .show(). The "Any other criteria the user agent deems suitable" if for the fallback you described, whereby you pick the first available.

WDYT? Would this work?

index.html Outdated
<li>Set <var>address</var>.<a>[[\regionCode]]</a> to
<var>regionCode</var>.
</li>
<li>Let <var>region</var> be the corresponding <a>country
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Adding comment again, so we can discuss here)

@rsolomakhin, I've tried to summarize what you said about how Chrome does the ordering, while keeping the order in which things are matched optional. However, I added matching on the document's body's language as first, to match what we say in .show(). The "Any other criteria the user agent deems suitable" if for the fallback you described, whereby you pick the first available.

WDYT? Would this work?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what the specific question is… do you want me to comment on the locale ordering or the spec change in general? For the locale ordering, as long as the UA isn't required to show the PR dialog in a language that the UA doesn't have installed then I'm fine with it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a separate note, I wonder whether we really need to derive region from the regionCode… it would be simpler to also return the ISO3166-2 code as the region.

One concern I have with the spec change is that I've never seen ISO3166-2 used with the country code prefix in the wild. As a web developer I would expect regionCode to be "CA" for California, not "US-CA" if I hadn't read documentation on the API and only saw the webidl FWIW. I'm not sure if there are exceptions to cutting off the first 3 characters in order to get the region code specific to the country. If there is then this will likely cause issues for merchants when mapping the PR's address to existing systems.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a web developer I would expect regionCode to be "CA" for California, not "US-CA"

With the information I have at the moment, I'm leaning towards just returning the ISO3166-2 format. It might save us from having to deal with special cases where there is ambiguity, but I also don't know if there are any. I'll try to do a bit more reading.

I'm not sure if there are exceptions to cutting off the first 3 characters in order to get the region code specific to the country.

I don't think there are, from my reading of the ISO3166-2 spec. But I'll take a closer look as it's been a few weeks.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Coming back to this and re-reading the ISO spec, I can't determine that there would be significant issues with chopping off country and dash. I'm still reluctant to do it tho. I'd rather just stick to the full code "just in case" and just so we are not creating new conventions. @mnoorenberghe, is that ok?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although I'm always hesitant to parse other people's strings, ISO 3166-2 is defined to use ISO 3166-1 alpha-2 code elements as the first two characters of the string, followed by the hyphen character, followed by the region itself. So I think we'd be safe in truncated the first three chars.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it does also remove a source of ambiguity/redundancy, as .country will now need to be there to derive .region. Do we still attempt to do validation on a synthetic 3166-2 code? I would guess no: we attempt to populate region iff county and regionCode are both there and they both together form a known 3366-2 code. Can work with that.

index.html Outdated
</li>
</ol>
</li>
<li>Set <var>address</var>.<a>[[\region]]</a> to
Copy link
Member Author

@marcoscaceres marcoscaceres Mar 22, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@domenic, stop reading this! get back to vacation! 🍹

But when you are back... by design, address.region is overridable by the developer. So, eventually:

const dict = {
   country: "PT",
   regionCode: "PT-11",
}
const address = new PaymentAddress(dict);
address.region; // "Lisboa", per ISO ISO3166-2

// And... then.. 
const dict = {
   country: "PT",
   regionCode: "PT-11",
   region: "Lisbon",
}
const address = new PaymentAddress(dict);
address.region; // Lisbon, per developer override.

Hmm.... now I'm wondering if we should allow some members to be nullable on the AddressInit dictionary:

const dict = {
   country: "PT",
   regionCode: "PT-11",
   region: null, // or undefined, means "derive it for me" 
}

While:

const dict = {
   country: "PT",
   regionCode: "PT-11",
   region: "", // trash it ... I'll handle it
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@domenic, if you are back from vacation, #690 (comment)

Copy link
Collaborator

@domenic domenic Apr 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think undefined ("not present" in dictionary-speak) is the right signifier. So basically only execute this step if details["region"] does not exist. Otherwise set address.[[region]] to details["region"].

Whether we want to allow details["region"] to be nullable or not, as some kind of explicit "don't derive this for me, but also make it not a real region value"---I'm not sure.

index.html Outdated
<li>Set <var>details</var>["<a>region</a>"] to the user-provided
region, or to the empty string if none was provided.
</li>
<li>If <var>details</var>["<a>region</a>"] has as a corresponding
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: "has as a"

index.html Outdated
<li>Set <var>address</var>.<a>[[\regionCode]]</a> to
<var>regionCode</var>.
</li>
<li>Let <var>region</var> be the corresponding <a>country
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what the specific question is… do you want me to comment on the locale ordering or the spec change in general? For the locale ordering, as long as the UA isn't required to show the PR dialog in a language that the UA doesn't have installed then I'm fine with it.

index.html Outdated
<li>Set <var>address</var>.<a>[[\regionCode]]</a> to
<var>regionCode</var>.
</li>
<li>Let <var>region</var> be the corresponding <a>country
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a separate note, I wonder whether we really need to derive region from the regionCode… it would be simpler to also return the ISO3166-2 code as the region.

One concern I have with the spec change is that I've never seen ISO3166-2 used with the country code prefix in the wild. As a web developer I would expect regionCode to be "CA" for California, not "US-CA" if I hadn't read documentation on the API and only saw the webidl FWIW. I'm not sure if there are exceptions to cutting off the first 3 characters in order to get the region code specific to the country. If there is then this will likely cause issues for merchants when mapping the PR's address to existing systems.

@marcoscaceres
Copy link
Member Author

Can definitely just return everything after “-“, as we have the country. A 3166-2 code is always country-subregion. However, the codes are easier to look up as complete codes so my personal preference would be to not chop them up. Happy to hear other opinions.

@marcoscaceres
Copy link
Member Author

@mnoorenberghe ok, just subdivision part is returned now. Check it!

Rules are:

  • if regionCode is present, valid country is required (or throws).
  • If region is missing, and regionCode is present and valid, browser gives you region.

Need to redo the tests and MDN docs, but would appreciate review in meantime.

@marcoscaceres
Copy link
Member Author

@domenic, this spec text should hopefully now be good to go 🤞

Copy link
Collaborator

@domenic domenic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nits with variable naming, but overall makes sense...

index.html Outdated
<li>If <var>details</var>["<a>regionCode</a>"] is present and not
the empty string:
<ol>
<li>Let <var>regionPart</var> be the result of <a>strip leading
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable naming here is super-confusing. To a web developer, a regionCode is something like "CA", right? But in this algorithm, details["regionCode"] is "CA", regionPart is "CA", and regionCode is "USA-CA".

I think the variable on this line should be regionCode, and on line 2213 should be putativeCountrySubdivisionCodeElement or similar.

index.html Outdated
</li>
<li>Let <var>regionCode</var> be the concatenation of
<var>address</var>.<a>[[\country]]</a>, a single U+002D (-) <a>
code point</a>, <var>regionPart</var>.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing "and" after last comma.

marcoscaceres added a commit to web-platform-tests/wpt that referenced this pull request May 16, 2018
With w3c/payment-request#690 the spec changed
requiring `regionCode` to just be the "code element" of an [[!ISO3166-2]]
country subdivision name (e.g., "CA" for California).
@marcoscaceres
Copy link
Member Author

Added new tests, updated MDN docs.

@marcoscaceres marcoscaceres merged commit f0018f9 into gh-pages May 16, 2018
@marcoscaceres marcoscaceres deleted the add_region_code branch May 16, 2018 05:44
@marcoscaceres
Copy link
Member Author

@ianbjacobs, could you kindly please republish the specification on TR? Thanks in advance!

marcoscaceres added a commit to web-platform-tests/wpt that referenced this pull request May 16, 2018
With w3c/payment-request#690 the spec changed
requiring `regionCode` to just be the "code element" of an [[!ISO3166-2]]
country subdivision name (e.g., "CA" for California).
@marcoscaceres
Copy link
Member Author

@domenic thanks again for your time and the great feedback! ✨

@ianbjacobs
Copy link
Collaborator

ianbjacobs commented May 16, 2018

@marcoscaceres, aiming for 17 May publication. Thanks all!

Update: 22 May due to moratorium.

moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this pull request May 20, 2018
Automatic update from web-platform-testsTest short regionCode (#11021)

With w3c/payment-request#690 the spec changed
requiring `regionCode` to just be the "code element" of an [[!ISO3166-2]]
country subdivision name (e.g., "CA" for California).
--

wpt-commits: 32a16759d48dae9332763da98f8582bd320d82ea
wpt-pr: 11021
sole pushed a commit to sole/gecko that referenced this pull request May 21, 2018
Automatic update from web-platform-testsTest short regionCode (#11021)

With w3c/payment-request#690 the spec changed
requiring `regionCode` to just be the "code element" of an [[!ISO3166-2]]
country subdivision name (e.g., "CA" for California).
--

wpt-commits: 32a16759d48dae9332763da98f8582bd320d82ea
wpt-pr: 11021
gecko-dev-updater pushed a commit to marco-c/gecko-dev-comments-removed that referenced this pull request Oct 3, 2019
Automatic update from web-platform-testsTest short regionCode (#11021)

With w3c/payment-request#690 the spec changed
requiring `regionCode` to just be the "code element" of an [[!ISO3166-2]]
country subdivision name (e.g., "CA" for California).
--

wpt-commits: 32a16759d48dae9332763da98f8582bd320d82ea
wpt-pr: 11021

UltraBlame original commit: 617ffba267cf5f8f6fa6622a96d48cefb45ec2c5
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified that referenced this pull request Oct 3, 2019
Automatic update from web-platform-testsTest short regionCode (#11021)

With w3c/payment-request#690 the spec changed
requiring `regionCode` to just be the "code element" of an [[!ISO3166-2]]
country subdivision name (e.g., "CA" for California).
--

wpt-commits: 32a16759d48dae9332763da98f8582bd320d82ea
wpt-pr: 11021

UltraBlame original commit: 617ffba267cf5f8f6fa6622a96d48cefb45ec2c5
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified-and-comments-removed that referenced this pull request Oct 3, 2019
Automatic update from web-platform-testsTest short regionCode (#11021)

With w3c/payment-request#690 the spec changed
requiring `regionCode` to just be the "code element" of an [[!ISO3166-2]]
country subdivision name (e.g., "CA" for California).
--

wpt-commits: 32a16759d48dae9332763da98f8582bd320d82ea
wpt-pr: 11021

UltraBlame original commit: 617ffba267cf5f8f6fa6622a96d48cefb45ec2c5
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ContactAddress 'region' inconsistencies
7 participants