Intl.DisplayNames API #31

Closed
zbraniecki opened this issue Sep 16, 2015 · 17 comments · Fixed by #502
Assignees
Labels
c: text Component: case mapping, collation, properties Proposal Larger change requiring a proposal s: in progress Status: the issue has an active proposal

Comments

@zbraniecki
Member

There's a lot of data related to Language names, Timezone names, Script names and Region names contained in CLDR that would allow many App Settings to be easier to develop.

Many popular applications contain some combination of a "language selector" and a "timezone selector" in their UI. We use them in Firefox OS and Firefox desktop, and most popular web apps like Gmail, Facebook or Twitter do the same.

It would be awesome to tap into CLDR resources and expose the ability to get localized versions of those tokens. We'll want to do this for Firefox OS as part of the 'mozIntl' API and I think it would make sense to standardize it.

Open questions:

  • We'll have to fall back gracefully if the localized value for the token is not available
  • We'll need to evaluate the disk space cost of that data. I can see us not wanting to keep Script/Region/Territory data, and only expose Language/Timezone. But it would be smart to design the API to allow us to expose more in the future.
  • Most use cases will involve a loop to retrieve many names (for a drop-down selector), so the function that retrieves the name for a given language has to be fast, but we probably don't want to load all strings into memory. Need to design the spec for balance.
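
A minimal sketch of the loop-with-fallback pattern raised in the open questions above. It is written against the `Intl.DisplayNames` constructor as it was later standardized in ECMA-402, not against the API shape sketched in this thread. The helper name `languageNames` and the sample language codes are assumptions made for illustration.

```javascript
// Hypothetical helper: resolve display names for many language codes at once.
// Assumes a runtime that ships the standardized Intl.DisplayNames.
function languageNames(displayLocale, codes) {
  // One object is created and reused for the whole loop.
  const names = new Intl.DisplayNames([displayLocale], {
    type: 'language',
    fallback: 'none', // of() returns undefined when no localized name is known
  });
  const result = new Map();
  for (const code of codes) {
    // Graceful fallback: show the raw code if no localized name is available.
    result.set(code, names.of(code) ?? code);
  }
  return result;
}

const menu = languageNames('en', ['fr', 'pl', 'ar']);
for (const [code, name] of menu) {
  console.log(`${code}: ${name}`);
}
```

Creating the object once and calling `of()` in the loop keeps the per-item cost low, and leaves it to the implementation to decide how much locale data to load.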
@caridy
Contributor

caridy commented Sep 17, 2015

Hmm, can you elaborate more?

@zbraniecki
Member Author

Sure, I'll start with Timezone names.

CLDR exposes a list of time zone name translations: http://www.unicode.org/cldr/charts/27/summary/ar.html#3155 - and every system that allows users to customize time/date will probably also want to allow selecting a timezone - think gmail, facebook, twitter, firefox os, firefox desktop etc.

An example API could look like this:

var obj = Intl.DisplayNames('ar', {});
obj.format('Europe/Warsaw'); // 'وارسو'

var obj = Intl.DisplayNames('pl', {});
obj.format('Europe/Warsaw'); // 'Warszawa'

Such API in the future could also handle language names, script names, calendar names, territories, date fields etc.

There are a lot of term translations that we have to store in Intl, like translated names of days (e.g. 'Monday') for date time formatting. It would be valuable to expose them so that developers working on things like calendar systems can reuse them and stay consistent with what Intl.DateTimeFormat returns.

@caridy
Contributor

caridy commented Sep 19, 2015

I want to focus more on helping developers avoid loading large chunks of CLDR data for complex operations that require algorithms; that's where we should put our energy. This proposal is a nice-to-have IMO, but let's keep it around and see if we can get any traction on it.

@srl295
Member

srl295 commented Sep 23, 2015

@zbraniecki so for TZ, take a look at the icu4j TimeZoneFormat class - what you proposed is something like the EXEMPLAR_LOCATION style supported by that API. Zones change over time, so you want to include a date when formatting, as well as to be able to format DST/Summer time.
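
To illustrate why the date matters, here is a rough sketch using the existing `Intl.DateTimeFormat` API with its `timeZone` and `timeZoneName` options. It yields the zone's standard/daylight name rather than an exemplar location like 'Warszawa', so it is not the proposed API. The helper name `timeZoneNameAt` and the sample dates are assumptions.

```javascript
// Hypothetical helper: localized name of a time zone at a specific instant.
function timeZoneNameAt(locale, timeZone, date) {
  const parts = new Intl.DateTimeFormat(locale, {
    timeZone,
    timeZoneName: 'long',
  }).formatToParts(date);
  // Pick out only the time zone name from the formatted parts.
  const part = parts.find((p) => p.type === 'timeZoneName');
  return part ? part.value : undefined;
}

// The same zone can have different names in winter and in summer.
const winter = new Date(Date.UTC(2016, 0, 15, 12));
const summer = new Date(Date.UTC(2016, 6, 15, 12));
console.log(timeZoneNameAt('en-US', 'Europe/Warsaw', winter));
console.log(timeZoneNameAt('en-US', 'Europe/Warsaw', summer));
```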

@zbraniecki
Member Author

> you want to include a date when formatting, as well as to be able to format DST/Summer time

Good point!

@caridy caridy added this to the 4th Edition milestone Feb 29, 2016
@zbraniecki
Member Author

For SpiderMonkey/Gecko I landed a non-public API mozIntl.getDisplayNames, which works like this:

mozIntl.getDisplayNames('en-US', {
  style: 'long',
  keys: [
    'dates/fields/year',
    'dates/fields/month',
    'dates/fields/week',
    'dates/fields/day',
    'dates/gregorian/months/january',
    'dates/gregorian/months/february',
    'dates/gregorian/months/march',
    'dates/gregorian/weekdays/tuesday'
  ]
}) === {
  locale: 'en-US',
  style: 'long',
  values: {
    'dates/fields/year': 'year',
    'dates/fields/month': 'month',
    'dates/fields/week': 'week',
    'dates/fields/day': 'day',
    'dates/gregorian/months/january': 'January',
    'dates/gregorian/months/february': 'February',
    'dates/gregorian/months/march': 'March',
    'dates/gregorian/weekdays/tuesday': 'Tuesday',
  }
}

So far I only added dates/gregorian/months/*, dates/gregorian/weekdays/*, dates/gregorian/dayperiods/* and dates/fields/{year|month|week|day}.
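
For readers without access to the non-public API, the key-based batch lookup can be approximated with the public `Intl.DateTimeFormat`. This is an illustrative stand-in under stated assumptions, not how mozIntl is implemented: the function below only handles the `months` and `weekdays` keys, silently skips everything else (including `dates/fields/*`), and its internals are invented for this sketch.

```javascript
// Illustrative stand-in for a key-based lookup; NOT mozIntl's implementation.
const MONTHS = ['january', 'february', 'march', 'april', 'may', 'june',
  'july', 'august', 'september', 'october', 'november', 'december'];
const WEEKDAYS = ['sunday', 'monday', 'tuesday', 'wednesday',
  'thursday', 'friday', 'saturday'];

function getDisplayNames(locale, { style = 'long', keys = [] } = {}) {
  const base = { timeZone: 'UTC', calendar: 'gregory' };
  const monthFmt = new Intl.DateTimeFormat(locale, { ...base, month: style });
  const weekdayFmt = new Intl.DateTimeFormat(locale, { ...base, weekday: style });
  const values = {};
  for (const key of keys) {
    const [group, calendar, kind, name] = key.split('/');
    // Only the Gregorian date bucket is covered by this sketch.
    if (group !== 'dates' || calendar !== 'gregorian') continue;
    if (kind === 'months' && MONTHS.includes(name)) {
      const date = new Date(Date.UTC(2024, MONTHS.indexOf(name), 1));
      values[key] = monthFmt.format(date);
    } else if (kind === 'weekdays' && WEEKDAYS.includes(name)) {
      // 2024-01-07 is a Sunday, so adding the index lands on the right weekday.
      const date = new Date(Date.UTC(2024, 0, 7 + WEEKDAYS.indexOf(name)));
      values[key] = weekdayFmt.format(date);
    }
  }
  return { locale: monthFmt.resolvedOptions().locale, style, values };
}

console.log(getDisplayNames('en-US', {
  style: 'long',
  keys: ['dates/gregorian/months/january', 'dates/gregorian/weekdays/tuesday'],
}));
```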

As with other proprietary mozIntl APIs, I hope this will help us develop an ECMA402 spec proposal and I'll be happy to transition to it once it happens.

@littledan
Member

Cool, I like the idea of exposing this data. However, it's important to keep in mind that this is the point where we fall off a cliff and resource size for fully supporting a locale becomes gigantic.

Bikeshed: I might have expected something like (new mozIntl.DisplayNames('en-US', {style: 'long'})).get('/dates/fields/year') ==> 'year', if we were going for analogy with the existing Intl APIs.

@zbraniecki
Member Author

Yeah, I was thinking about it, but there seems to be no value in creating an object (no internal data or state that can be shared) and we didn't have any scenario where we'd want a single string to be loaded.
It's always a lot of strings at once (say, all weekdays or month names).

So we settled on this, but I believe we can adjust it any way we want as we move forward. I'd like to see if there's anyone else interested in working on this API. I have a pretty full plate :)

Wrt. cliff - totally agree. Displaying a list of months in the wrong language is a fundamentally different thing from displaying a date formatted using a different skeleton (when the requested locale's data is not available).
Not sure how to resolve it, or whether it is a reason to avoid building an API like this for ECMA402.

@jungshik

Thank you for moving this forward.

For me, display names for scripts, languages and regions are more interesting. This has been briefly discussed a few times over the last few years.

> resource size for fully supporting a locale becomes gigantic.

What's required for mozIntl.getDisplayNames so far does not require any additional resources for supported locales because they're already necessary for DateTimeFormat.

As for various elements of Date (month names, day of week names, etc), CLDR has multiple values not just for different widths but also for context (display context; formatting or standalone). For Firefox OS and most users of this proposed API, 'standalone' would be suitable. And, perhaps, it's implied in 'Display' in Intl.DisplayNames or 'mozIntl.getDisplayNames()'. Anyway, it's to be discussed whether this API is only for 'standalone' or it needs to support 'the other' context. If it's only for standalone, it has to be made clear.
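
The standalone/formatting distinction can be observed indirectly with today's `Intl.DateTimeFormat`, which has no explicit context option. As a rough sketch, assuming the implementation picks the stand-alone form when only the month is requested (common, but not something this thread specifies), the helper below returns both forms. The name `monthNameForms` is hypothetical.

```javascript
// Hypothetical helper: compare a month's stand-alone and in-date ("format") forms.
function monthNameForms(locale, monthIndex) {
  const date = new Date(Date.UTC(2024, monthIndex, 1));
  // Month requested alone: implementations generally pick the stand-alone form.
  const standalone = new Intl.DateTimeFormat(locale, {
    month: 'long',
    timeZone: 'UTC',
  }).format(date);
  // Month requested together with a day: extract the month as used inside a date.
  const parts = new Intl.DateTimeFormat(locale, {
    day: 'numeric',
    month: 'long',
    timeZone: 'UTC',
  }).formatToParts(date);
  const format = parts.find((p) => p.type === 'month').value;
  return { standalone, format };
}

console.log(monthNameForms('en', 0)); // both forms are the same word in English
console.log(monthNameForms('pl', 0)); // forms may differ where locale data is available
```

In languages that inflect month names, such as Polish, the two forms typically differ (nominative 'styczeń' versus genitive 'stycznia'), which is why a selector needs the stand-alone form.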

FYI, ICU's DateFormatSymbols class API takes DTWidthType and DTContextType (e.g. getWeekDays ). Tangentially related to this is http://bugs.icu-project.org/trac/ticket/11552 .

@zbraniecki
Member Author

> For me, display names for scripts, languages and regions are more interesting.

Yeah, we didn't get to that use case yet, but I was planning for it as well.

> What's required for mozIntl.getDisplayNames so far does not require any additional resources for supported locales because they're already necessary for DateTimeFormat.

I was thinking that one way to facilitate it would be to just use this API for exposing single terms that are covered by other APIs anyway.
So for example, once we land UnitFormat, we could use unit display names, and if the implementation supports UnitFormat, then it would also have terms for units.

That doesn't solve your use case because we don't have any API planned for that, but I just envision this API having some "buckets" that implementers may choose to support or not (datetime, units, scripts, languages, cities, timezones etc.)

> As for various elements of Date (month names, day of week names, etc), CLDR has multiple values not just for different widths but also for context (display context; formatting or standalone).

Yes, I can see us consider adding a context option to handle other contexts as well.

@jungshik

jungshik commented Jan 6, 2017

> I was thinking that one way to facilitate it would be to just use this API for exposing single terms that are covered by other APIs anyway.
> So for example, once we land UnitFormat, we could use unit display names, and if the implementation supports UnitFormat, then it would also have terms for units.

The above reminded me of what I wanted to say before starting my previous comment, but forgot to mention along the way.

Perhaps a better way to "bundle" a formatter and its corresponding symbols/display names would be to either put a SymbolsGetter API under the 'formatter' or put it under an 'obviously-related' name.

That is, for DateTimeFormat symbols (month names, day names, etc), we can have either Intl.DateTimeFormat.getSymbols or Intl.DateTimeSymbols. For NumberFormat, it'd be Intl.NumberFormat.getSymbols or Intl.NumberSymbols.

Perhaps, the latter is better than the former.

Intl.DisplayNames can be used for country names, region names, and script names because they don't have any particular association with a formatter.

@jungshik

jungshik commented Jan 6, 2017

@zbraniecki
Member Author

I think I prefer Intl.DateTimeFormat.getSymbols over Intl.DateTimeSymbols, but that's a subjective opinion.

Wrt. separating formatter-related symbols from generic ones, I also don't see an obvious advantage of one approach over the other, and I'm happy to go either way.
Perhaps someone else will come up with a reason to differentiate the value of those approaches.

@annevk
Member

annevk commented Oct 27, 2017

@LeaVerou has interest in this as well to enable <input type=country> at some point down the line (enabling better UI for those widgets driven by browsers and more consistency across sites): https://twitter.com/LeaVerou/status/923312367815024640.

@zbraniecki
Member Author

The HTML approach has very different characteristics. It's more limited, but its intl is transparent from the dev perspective, which reduces the fingerprinting vector.

@littledan
Member

@zbraniecki The idea is to expose the right strings here so that it'd be easy to define a custom element which would be very similar to <input type=country>. Personally, I see this as some good motivation for continuing towards exposing more strings like this.
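
As a sketch of what such a custom element could build on, the helper below produces a locale-sorted list of country options. It assumes the `Intl.DisplayNames` constructor as later standardized (with `type: 'region'`) plus the existing `Intl.Collator`. The helper name `countryOptions` and the sample region codes are assumptions for illustration.

```javascript
// Hypothetical helper: data for a country selector, sorted for the display locale.
function countryOptions(displayLocale, regionCodes) {
  const names = new Intl.DisplayNames([displayLocale], { type: 'region' });
  const collator = new Intl.Collator(displayLocale);
  return regionCodes
    // With the default fallback, of() returns the code itself if no name is known.
    .map((code) => ({ code, name: names.of(code) }))
    // Sort by localized name using locale-aware comparison.
    .sort((a, b) => collator.compare(a.name, b.name));
}

for (const { code, name } of countryOptions('en', ['US', 'PL', 'DE'])) {
  console.log(`${code}: ${name}`);
}
```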

@jungshik

A real use case for language display names from the Chrome bug tracker:

https://bugs.chromium.org/p/chromium/issues/detail?id=493962#c3
and http://crbug.com/496804 .

@sffc sffc added s: in progress Status: the issue has an active proposal c: text Component: case mapping, collation, properties and removed enhancement labels Mar 19, 2019
@sffc sffc added the Proposal Larger change requiring a proposal label Jun 5, 2020
@sffc sffc removed this from the 4th Edition milestone Jun 5, 2020
8 participants