(i18n) What should the RS do if a language value is not well formed? #1508

Closed
iherman opened this issue Feb 10, 2021 · 4 comments
Labels
Cat-i18n: Grouping label for all internationalization related issues
Status-Declined: The issue has been reviewed and not accepted by the working group for inclusion

Comments

@iherman
Member

iherman commented Feb 10, 2021

At the moment, the spec is silent on this...

@iherman added the Cat-i18n label Feb 10, 2021
@mattgarrish
Member

We don't really have much in the way of requirements for the language, though. Beyond potentially trying to guess at the page progression direction, it's only advisory metadata.

We could always suggest assuming "und". Anything else seems complex and unreliable, like checking content documents for language tags.

@iherman
Member Author

iherman commented Feb 11, 2021

Specifying 'und' as a value for not well-formed entries may be a good approach. At this moment, there is not even a hint to RS-s that they should check the value.
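
For illustration only, the fallback discussed here might look like the sketch below. It assumes a reading system checks the language value against a simplified form of the `langtag` grammar in RFC 5646 (BCP 47) and substitutes "und" (the subtag for an undetermined language) when the value is not well formed. The function name `effective_language` is hypothetical, and the pattern deliberately leaves out the grandfathered tags, so this is not a complete BCP 47 checker.

```python
import re

# Simplified well-formedness pattern modelled on the "langtag" and
# "privateuse" productions of RFC 5646. Grandfathered tags are omitted
# on purpose (assumption made to keep the sketch short).
_LANGTAG = re.compile(
    r"""
    (?:
        (?:[a-z]{2,3}(?:-[a-z]{3}){0,3}|[a-z]{4}|[a-z]{5,8})  # language (+ extlang)
        (?:-[a-z]{4})?                                        # script
        (?:-(?:[a-z]{2}|[0-9]{3}))?                           # region
        (?:-(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3}))*              # variants
        (?:-[0-9a-wy-z](?:-[a-z0-9]{2,8})+)*                  # extensions
        (?:-x(?:-[a-z0-9]{1,8})+)?                            # private use
      |
        x(?:-[a-z0-9]{1,8})+                                  # private-use-only tag
    )
    """,
    re.IGNORECASE | re.VERBOSE,
)


def effective_language(value):
    """Return the tag if it is well formed, otherwise "und".

    Hypothetical helper: only the syntax is checked, not whether the
    subtags are actually registered.
    """
    if isinstance(value, str):
        tag = value.strip()
        if _LANGTAG.fullmatch(tag):
            return tag
    return "und"
```

With this sketch, `effective_language("en_US")` falls back to "und" because of the underscore, while `effective_language("zh-Hant-TW")` is returned unchanged.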

@dauwhe
Contributor

dauwhe commented Feb 18, 2021

Useful comment: w3c/bp-i18n-specdev#36 (comment)

@iherman
Member Author

iherman commented Feb 19, 2021

The issue was discussed in a meeting on 2021-02-18


2. Remaining i18n issues

See GitHub issues #1508 and #1509.

Wendy Reid: 2 i18n issues remain after the review

Dave Cramer: the two issues are pretty intertwined
… i18n should require valid lang tags
… there is a formal grammar which describes the formal structure of language tags
… so, well-formedness

Leonard Rosenthol: I believe that lang is ISO 3166

Matt Garrish: we enforce well formed, but nothing about validity

Dave Cramer: a valid tag is one which matches the actual languages
… so should we require lang tags to be valid?
… and what should RS do when faced with invalid lang tag?
… not want to make requirement more stringent
… hard to check validity of lang tags
… it's a list of strings that changes over time
… burden on epubcheck
… also, there could be existing epub with well formed but invalid lang tags - a change could cause those epubs to fail epubcheck
… should having a valid lang tag just be a best practice?
… so epubcheck would just flag it as an informative warning
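
To make the distinction in this exchange concrete: well-formedness is a purely syntactic check, whereas validity means looking each subtag up in a registry that changes over time. The sketch below is an assumption-laden illustration, not how epubcheck works. The three `SAMPLE_*` sets are a tiny hypothetical excerpt standing in for the IANA Language Subtag Registry, and `is_valid` is a made-up helper that only handles tags of the shape language[-script][-region].

```python
# Hypothetical, tiny excerpt standing in for the IANA Language Subtag
# Registry; a real validity check would need the full, regularly
# updated registry.
SAMPLE_LANGUAGES = {"en", "fr", "ja", "zh", "und"}
SAMPLE_SCRIPTS = {"latn", "hant", "hans", "jpan"}
SAMPLE_REGIONS = {"us", "gb", "fr", "jp", "tw"}


def is_valid(tag: str) -> bool:
    """Look up each subtag of a language[-script][-region] tag.

    Anything beyond that shape (variants, extensions, private use) is
    out of scope for this sketch and is reported as not valid.
    """
    subtags = tag.lower().split("-")
    language, rest = subtags[0], subtags[1:]
    if language not in SAMPLE_LANGUAGES:
        return False
    # Optional script subtag: exactly four letters.
    if rest and len(rest[0]) == 4 and rest[0].isalpha():
        if rest[0] not in SAMPLE_SCRIPTS:
            return False
        rest = rest[1:]
    # Optional region subtag: two letters or three digits.
    if rest and (
        (len(rest[0]) == 2 and rest[0].isalpha())
        or (len(rest[0]) == 3 and rest[0].isdigit())
    ):
        if rest[0] not in SAMPLE_REGIONS:
            return False
        rest = rest[1:]
    return not rest
```

A tag such as "xx-US" has the right shape, so it is well formed, but `is_valid("xx-US")` is false here because "xx" is not in the sample language set. Keeping a lookup like this current is the maintenance burden described above.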

Wendy Reid: leonardr:

Leonard Rosenthol: [here's the info](https://developer.mozilla.org/en-US/docs/Web/HTML/Global_attributes/lang)
… BCP-47 is the technical spec, you can validate against that if you want
… but i think we should do what HTML does, no more, no less

Matt Garrish: some of this came up in epubcheck itself, when someone complained that there was no check for lang validity
… also, the epubcheck folks seemed to say that it would be a hard thing to implement
… and unless there is some critical function, a RS is going to ignore this
… and there are currently no critical functions that rely on this very general metadata
… nothing bad comes of this lang tag not being specified properly
… in pub manifest we said that well-formedness is good enough

Wendy Reid: also, we'd run into issues with testing this if we wanted to make this stricter

Dave Cramer: matt has more or less convinced me that this isn't broken

Brady Duga: +1 to not broken

Dave Cramer: ... the costs of fixing it are higher than the purely theoretical benefits of conforming to broadly worded i18n guidelines

Dave Cramer: i propose that we close this without fixing

Matt Garrish: something like WCAG could have stricter rules about lang tags, but for us it's not a critical piece of metadata

Ben Schroeter: i like the idea of doing some sort of warning in epubcheck
… also, i don't want to tell RS what to do in general
… RS want to be as lax as possible when it comes to what they will ingest
… they don't want to keep content authors off their RS

Proposed resolution: Close issue 1509 with no action (Wendy Reid)

Marisa DeMeglio: +1

Ben Schroeter: +1

Matthew Chan: +1

Wendy Reid: +1

Brady Duga: +1

Matt Garrish: +1

Toshiaki Koike: +1

Masakazu Kitahara: +1

Shinya Takami (高見真也): +1

Resolution #2: Close issue 1509 with no action

Dave Cramer: for 1508 (i.e. question of what RS should do with poorly-formed lang tag)

Dave Cramer: there was a suggestion that the RS treat the lang as "und" (i.e. undefined)
… that to me is a satisfactory solution to this somewhat theoretical problem

Leonard Rosenthol: +1

Proposed resolution: Close issue 1508, add text to RS specification instructing reading systems to treat a poorly-formed language tag as "und" (undefined) (Wendy Reid)

Brady Duga: is this yet another untestable assertion?
… should we tell RS what to do with this at all?

Matt Garrish: i suggested the "und" thing because i thought we'd done this in pub manifest as well
… but i think we actually went back and decided to remain silent on it
… "we're not going to define what it means for the RS"

Dave Cramer: testing it would require reading the minds of the RS
… what we're actually using the lang tag for is trying to guess at page progression direction?
… how would we know if the RS is actually doing this?
… so maybe another "close, won't fix"?

Ben Schroeter: if we feel the RS wants some sort of guidance we could change the proposal to say "suggest"
… but i'm also happy to drop it

Matt Garrish: i think there's something worrisome about RS determining lang for the author
… i'd rather RS do nothing

Dave Cramer: would also add that RS don't seem to be looking for guidance

Proposed resolution: close issue 1508, won't fix (Wendy Reid)

Ben Schroeter: +1

Brady Duga: +1

Wendy Reid: +1

Matthew Chan: +1

Matt Garrish: +1

Toshiaki Koike: +1

Masakazu Kitahara: +1

Resolution #3: close issue 1508, won't fix

@iherman closed this as completed Feb 24, 2021
@mattgarrish added the Status-Declined label Sep 14, 2022