Machine-Readable navigation #9
On Tue, Jul 11, 2017 at 9:38 PM, Hadrien Gardeur ***@***.***> wrote:
If an author wishes to provide navigation in their document - they can
build it using the same tools they build all other content.
Some of it can't be built automatically, page-list being a good example.
What is a "page-list", in this context? And why does it need to be built
automatically if the author is building it themselves (instead of a UA
doing it on the fly).
I also think that for specific types of publications (textbooks for
instance) it could be very useful to provide alternate ToCs and/or
navigation that you won't be able to simply extract from content documents.
I agree - which is why I am suggesting that there is *not* any expectation
of a UA extracting anything! Whatever the publication and its author
wishes to expose as the navigation model and/or TOC is written by them and
incorporated directly into the publication. Just like web pages do today.
|
In this context, it's meant to tie page numbers from a print edition to specific fragments of a resource. Authors or authoring tools are building such a page list themselves, but we need a place to provide such info. I feel that HTML is a very poor choice for that, and that this is a good example of something that would be better suited to be contained elsewhere.
FYI, I'm not arguing against an HTML navigation document with no markup restrictions being rendered as-is. I'm all for it. |
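To illustrate what such a page list holds, here is a minimal Python sketch. The entry shape (a printed page label paired with an href plus fragment), the names `PageListEntry` and `find_page`, and the sample file names are all assumptions made for illustration; nothing in this thread or in any spec defines them.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PageListEntry:
    """One print page break mapped to a location in a resource (hypothetical shape)."""
    label: str  # page number as printed, e.g. "iv" or "12"
    href: str   # resource URL plus fragment identifier


# Hypothetical sample data; the file names and fragment ids are made up.
PAGE_LIST = [
    PageListEntry("1", "chapter_001.xhtml#page1"),
    PageListEntry("2", "chapter_001.xhtml#page2"),
    PageListEntry("3", "chapter_002.xhtml#page3"),
]


def find_page(entries: List[PageListEntry], label: str) -> Optional[str]:
    """Return the href for a print page label, or None if it is not listed."""
    for entry in entries:
        if entry.label == label:
            return entry.href
    return None
```

A "go to page" feature in a UA would only need a lookup of this kind, which is why the data does not have to be rendered as HTML to be useful.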
On Tue, Jul 11, 2017 at 10:03 PM, Hadrien Gardeur ***@***.***> wrote:
In this context, it's meant to tie page numbers from a print edition to
specific fragments of a resource.
Now that is a losing proposition if I ever heard one...Here's hoping that
we don't have to actually tackle that problem!
Authors or authoring tools are building such a page list themselves, but
we need a place to provide such info. I feel that HTML is a very poor
choice for that, and that this is a good example of something that would be
better suited to be contained elsewhere.
Sure - but since I don't see this as something that will be part of our
manifest (or associated documents) - authors can do whatever they want and
it's out of scope...
FYI, I'm not arguing against an HTML navigation document with no markup
restrictions being rendered as-is. I'm all for it.
Excellent!!
|
I'm not sure we should spend much time debating this issue now. I think the A11Y folks will, at some point, decide this for us (required, optional, machine readable or not, print page list, et al). |
On Tue, Jul 11, 2017 at 10:13 PM, GarthConboy ***@***.***> wrote:
I'm not sure we should spend much time debating this issue now. I think
the A11Y folks will, at some point, decide this for us (required, optional,
machine readable or not, print page list, et al).
What makes you think A11y has anything to do with this, @GarthConboy??
|
|
On Tue, Jul 11, 2017 at 10:50 PM, Dave Cramer ***@***.***> wrote:
What makes you think A11y has anything to do with this, @GarthConboy
<https://github.com/garthconboy>??
https://www.w3.org/TR/WCAG20-TECHS/G64.html
Unless I am missing something @dauwhe, that just says that _if_ I use a
TOC, then I should be sure that it's accessible. In no way does it mandate
that (a) I must have one or (b) what it would look/work like or even (c)
what technologies I use (as evidenced by the reference to TOCs in PDF in
Example 2).
|
No, it's the other way around. WCAG says that when documents consist of multiple resources, you should provide multiple ways of navigating through the resources, and a table of contents is one effective technique to allow navigation. |
Actually, multiple ways applies to a single document, too. Navigating a document can be a complex task for users, and not only because of vision loss. It doesn't require a navigation document to meet, but if you just stick a table of contents where you please then you become responsible for ensuring that the user can find that table of contents from every document. That's not exactly elegant. But WP complicates the issue, as if we don't require it and user agents don't support the navigation document then simply providing one is not enough, as the technology needs to be widely supported to claim conformance. In that case, you're back to including links to meet accessibility requirements. EPUB mandated a navigation document and that reading systems provide access to it. I expect this is one of the issues we'll be taking up next week on the accessibility call. |
That's not necessarily true. We could identify that HTML document in the manifest using a specific rel value ("contents" for example seems well suited for that). |
Sure, I mentioned the same somewhere else in one of these threads. I'm not yet convinced we need a navigation document in the epub sense at this level, and not necessarily a standalone one. But you still need broad support for WP and accessing that document/location for it to meet the threshold of being a supported technology. That's where making it optional to include navigation becomes problematic, in my view, as it lessens the likelihood of support. Which is not to say I don't understand the case for some small documents not having or needing such navigation. |
And again our discussions turn to the debate of declarative/machine-readable (and thus requiring a WP-aware UA) vs. simply using web technology. Obviously this a11y requirement applies to the web in general and folks have no problem complying with it on regular sites. Why do we believe we need something more? |
@lrosenthol I wouldn't say "web technologies" here (IMHO REST APIs are in the scope of web technologies). "html technologies" seem more focused. |
Why do you say that? The identification of navigation is not linked to the representation. As I've said elsewhere, WP doesn't necessarily have to take the declarative approach as it's not incompatible with epub 4 doing so. Providing the access mechanism is more critical if we want compatibility between the two. I think we can establish whether we need that and whether navigation has to be provided before moving on to what form the navigation takes.
Sure, and this boils down to reading experience expectations for publications. Does every document begin with a header and site menu? Do you want explicit linkage in every document in a publication? If an immersive reading experience is wanted, the necessary linking is intrusive. Before dismissing deterministic access to the table of contents we need to be sure of the implications and that they're acceptable. That's all. |
Not necessarily in our case. Having the history of EPUB behind us, reading system developers won't see any issue mapping a TOC to the app menu if it exists, and hiding the menu item if not. |
On Wed, Jul 12, 2017 at 9:22 AM, Matt Garrish ***@***.***> wrote:
Sure, and this boils down to reading experience expectations for
publications.
OK.
I have *zero* expectations.
AFAIAC - and I know this is controversial position - publications can
express their own native experience, should they choose. OR they can
simply provide some guidance to a WP-aware UA and let it do the work for
it. That doesn't mean that the user shouldn't have the ability to
personalize or customize that experience - but only within the context of
what the author of the content chooses to allow.
|
Well, given my previous comment, I can't believe I'm chiming in one last time! I think we eventually will want to consider four options:
This isn't a bridge we need to burn now, and I still stick with my previous comment that these decisions will be driven by our A11Y compatriots, and I believe they think said decisions can be fairly late binding. |
There's a fourth option:
That's what Readium-2 currently outputs: https://readium2.feedbooks.net/Ym9va3MvbW9ieS1kaWNrLmVwdWI=/manifest.json
The example above is not a static manifest, it's generated on the fly from an EPUB. The HTML navigation can be identified by its rel value:
    {
      "href": "https://readium2.feedbooks.net/Ym9va3MvbW9ieS1kaWNrLmVwdWI=/OPS/toc.xhtml",
      "type": "application/xhtml+xml",
      "rel": ["contents"]
    }
While the machine-readable information is available in the manifest directly, for example landmarks:
    "landmarks": [
      {
        "href": "https://readium2.feedbooks.net/Ym9va3MvbW9ieS1kaWNrLmVwdWI=/OPS/OPS/toc.xhtml#toc",
        "title": "Table of Contents"
      },
      {
        "href": "https://readium2.feedbooks.net/Ym9va3MvbW9ieS1kaWNrLmVwdWI=/OPS/chapter_001.xhtml",
        "title": "Begin Reading"
      },
      {
        "href": "https://readium2.feedbooks.net/Ym9va3MvbW9ieS1kaWNrLmVwdWI=/OPS/copyright.xhtml",
        "title": "Copyright Page"
      }
    ]
It makes it super easy for a UA to:
|
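As a rough illustration of how a UA might consume a manifest like the one quoted above, the Python sketch below looks up the HTML navigation by its rel value and lists the landmark titles. The top-level "links" array is an assumption about where the rel-tagged object lives (the comment only shows the object itself), and the helper names `find_by_rel` and `landmark_titles` are hypothetical.

```python
import json

# Trimmed manifest built from the fragments quoted in the comment above.
# The "links" key is an assumption; only "landmarks" appears as a key there.
MANIFEST_JSON = """
{
  "links": [
    {
      "href": "https://readium2.feedbooks.net/Ym9va3MvbW9ieS1kaWNrLmVwdWI=/OPS/toc.xhtml",
      "type": "application/xhtml+xml",
      "rel": ["contents"]
    }
  ],
  "landmarks": [
    {
      "href": "https://readium2.feedbooks.net/Ym9va3MvbW9ieS1kaWNrLmVwdWI=/OPS/chapter_001.xhtml",
      "title": "Begin Reading"
    }
  ]
}
"""


def find_by_rel(manifest: dict, rel: str) -> list:
    """Return the hrefs of all links whose "rel" contains the given value."""
    hrefs = []
    for link in manifest.get("links", []):
        rels = link.get("rel", [])
        if isinstance(rels, str):  # tolerate a single string as well as a list
            rels = [rels]
        if rel in rels:
            hrefs.append(link["href"])
    return hrefs


def landmark_titles(manifest: dict) -> list:
    """Return landmark titles in manifest order (empty if there are none)."""
    return [item["title"] for item in manifest.get("landmarks", [])]


manifest = json.loads(MANIFEST_JSON)
```

With this shape, a UA that finds no link with rel "contents" simply gets an empty list back and can hide its table of contents menu item.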
@HadrienGardeur – I think that's a fifth option. :-) And, yes, indeed it is. |
On Wed, Jul 12, 2017 at 4:57 PM, Hadrien Gardeur ***@***.***> wrote:
There's a fourth option:
- Navigation in HTML rendered as-is
- Plus machine readable navigation in the manifest
Redundant information gets out of sync quickly...it should always (IMO) be
avoided.
|
@GarthConboy my bad, I didn't count "doing nothing" as a valid option, that's why ;-)
@lrosenthol it's only redundant in the context of an EPUB being ingested. If the HTML lacks the kind of machine readable information currently available in a NavDoc, there's no redundancy at all. For instance you could use machine readable representation in manifest strictly for landmarks and page-list. |
Proper navigation is a requirement beyond accessibility as well, but I will focus only on accessibility in my reply. Proper navigation is a strong accessibility requirement. WCAG mentions it, but WCAG 2.0 is focused more on the single-file concept, so it talks mostly about navigation within a file. We are working with AG to improve it for digital publications with multiple files. So the question is not whether good navigation is required; it is about finding the best way to achieve it in WP. |
@avneeshsingh wrote:
Yes and no. The question, as we discussed in NYC, is whether that navigation should come from the author/publisher - just as it does today on the web - or whether it should be part of the UA (which is how EPUB RS's do it).
True, but if the web itself doesn't see this as important (in that it's not part of WCAG today and every site is navigated differently) - why should we be solving it (only?) for publications? |
It's being added to WCAG. Do we wait until 2.1 is out or get ahead of 2.1? |
On Fri, Jul 14, 2017 at 1:55 PM, Tzviya ***@***.***> wrote:
True, but if the web itself doesn't see this as important (in that it's
not part of WCAG today and every site is navigated differently) - why
should we be solving it (only?) for publications?
It's being added to WCAG. Do we wait until 2.1 is out or get ahead of 2.1?
What is "it"? Can you or Avneesh provide links to the work in progress?
And as to when to adopt, that's a good question...
|
I believe what Avneesh means is that WCAG will be clearer about application to web publications, not that WCAG is defining toc requirements (see the definition of a set of web pages). Multiple ways of navigating web content within a set of web pages is already a requirement in 2.0. I misspoke above that it's needed for single page documents, which is also why I'm not convinced that we have to mandate a table of contents. But, as I also mentioned, be prepared for the effects it will have if we don't have a broadly supported means of finding/accessing a table of contents. EPUB isn't as cut-and-dry as reading system does one and author the other. It all depends on how the reading system chooses to present the required access to the navigation document. Nothing says it has to be done in a special widget or outside the spine, only that access to the links be provided. I seem to recall old versions of readium that presented it as html. Placing rules on whether the table of contents is machine readable or not and where it has to be located aren't specifically accessibility issues. Ensuring access, and access in a form that retains the outline hierarchy, are the accessibility requirements, as I see them. Beyond that, we're arguing details of how we expect navigation to work in a UA, what kind of author control is provided, etc. These have varying degrees of importance for accessibility, too, but they have other stakeholders who may have more vested interests in the solutions. |
Indeed, I did not mention "TOC" in my reply. I talked about the requirement of proper navigation; we are working with WCAG to ensure that proper navigation is possible when a publication consists of multiple files. WCAG starts from user requirements and then finds the technical solutions for them. My reply was from the same perspective. Regarding the question of existing WCAG, it is a known fact that the current version of WCAG does not completely satisfy the accessibility requirements of publishing. http://www.w3.org/TR/dpub-accessibility/ |
What clearly emerges from this thread is that restrictions on the nav structure are a pain for html developers (and html authoring tools). I suspect that the choice of ol as list element is one of the reasons (most navs in the field use ul) but this is certainly not the only reason. What other ways would we have to express structure from html markup? RDFa and microdata. I'll try to get some samples of complex navs from EPUB authors, where the EPUB restrictions are problematic.
I asked several companies in parallel. Nord Compo was the first to answer; I'll forward other feedback when available. |
I think the content model for ol and ul are pretty similar, so I'm not sure how this is making things more difficult. I'd also note that many publications do depend on the order of the contents, and so ol is an appropriate semantic choice. This might not be true for many web sites, where there might not be an inherent ordering of the web pages. I agree that relaxing some of the restrictions in EPUB could be helpful. |
This is true but not the most current practice on the Web; I've been looking at a series of tutorials about nav to be certain. Purity vs practice ...
To be clear, I think that a minimal relaxation will NOT be sufficient for authors and authoring tools. If relaxation is maximal, automatic extraction of structured content will NOT be doable anymore. And automatic extraction of structured content is mandatory, as native UIs will not display html markup but Unicode strings, which defeats the purpose of using html to keep internationalization markup intact in a TOC exposed by native code. Therefore, like others, I regret the duplication of information but don't see how we can avoid it in many cases. There is still a narrow path, where the HTML nav could indicate (via some feature in the manifest) that it is structured enough to be machine processable, thus letting the UA create its own TOC. The UA would then face several cases: Authors would therefore have the freedom to create structured nav docs (EPUB 3 like) or free nav docs + a structured TOC (EPUB 2 like) or only a free nav doc (Web like). What do you think of this proposal? |
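One possible reading of this proposal, sketched in Python. This is only an illustration derived from the three authoring options named above (structured nav doc, free nav doc plus structured TOC, free nav doc only); it does not reconstruct the list of cases from the comment. The flag names, the `TocSource` enum, and the order of preference are all assumptions.

```python
from enum import Enum


class TocSource(Enum):
    """Where a UA could take its table of contents from (hypothetical labels)."""
    HTML_NAV_PARSED = "build a native TOC by parsing the structured HTML nav"
    STRUCTURED_TOC = "build a native TOC from the separate structured TOC"
    HTML_NAV_AS_IS = "render the free-form HTML nav as-is"
    NONE = "no table of contents available"


def choose_toc_source(has_html_nav: bool,
                      nav_flagged_structured: bool,
                      has_structured_toc: bool) -> TocSource:
    """Pick a TOC source from what the manifest declares (assumed flags)."""
    # EPUB 3 like: the nav is declared machine processable, so parse it.
    if has_html_nav and nav_flagged_structured:
        return TocSource.HTML_NAV_PARSED
    # EPUB 2 like: a separate structured TOC accompanies a free-form nav.
    if has_structured_toc:
        return TocSource.STRUCTURED_TOC
    # Web like: only a free-form nav exists, so show it untouched.
    if has_html_nav:
        return TocSource.HTML_NAV_AS_IS
    return TocSource.NONE
```

The last branch also covers the simple documents mentioned elsewhere in this thread that carry no navigation at all.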
Yes, it was pick one in epub so we went with the semantically correct list type, but is it all that complicated to dumb down the requirements and still be able to extract data? If the requirements for the table of contents are only:
Are we really changing a lot from a machine-processing perspective? Is this the kind of flexibility people want? |
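To make the machine-processing question concrete, here is a small Python sketch that pulls links out of nested lists while treating ol and ul identically. It uses only the standard library `html.parser` module; the class name `TocExtractor`, the function `extract_toc`, and the sample markup are made up for illustration and do not reflect any EPUB or WP requirement.

```python
from html.parser import HTMLParser


class TocExtractor(HTMLParser):
    """Collect (depth, title, href) for every link inside nested ol/ul lists."""

    def __init__(self):
        super().__init__()
        self.depth = 0     # current list nesting level
        self.entries = []  # (depth, title, href) tuples in document order
        self._href = None  # href of the <a> currently being read, if any
        self._text = []    # text chunks collected inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag in ("ol", "ul"):
            self.depth += 1
        elif tag == "a" and self.depth > 0:
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_endtag(self, tag):
        if tag in ("ol", "ul"):
            self.depth = max(0, self.depth - 1)
        elif tag == "a" and self._href is not None:
            # Inline markup inside the link is dropped; only text is kept.
            title = " ".join("".join(self._text).split())
            self.entries.append((self.depth, title, self._href))
            self._href = None

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)


def extract_toc(html: str) -> list:
    """Return the list of (depth, title, href) entries found in the markup."""
    parser = TocExtractor()
    parser.feed(html)
    parser.close()
    return parser.entries


# Hypothetical nav mixing ul and ol, with inline markup in one label.
SAMPLE = """
<nav>
  <ul>
    <li><a href="c1.html">Chapter 1</a>
      <ol>
        <li><a href="c1.html#s1">Section <em>1.1</em></a></li>
      </ol>
    </li>
    <li><a href="c2.html">Chapter 2</a></li>
  </ul>
</nav>
"""
```

Note that the extracted titles are plain strings: the `em` element in the sample is lost, which is the same loss of inline markup raised in the previous comment about native UIs.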
Based on the information available for Apple ibooks at https://support.apple.com/en-us/HT202972, it looks like Apple uses |
They use epub:type to distinguish the different kinds of information that the nav doc can contain: TOC, landmarks and page-list. They use the page-list to display paper page numbers or local page numbers in iBooks. When I said that in France we never put the nav in the spine for any EPUB, I forgot to mention that we also ask for page-list and landmarks in the nav doc. We understand this nav document as structured information, and not as a place for a well-rendered table of contents. We then also ask for a proper, separate HTML document for the toc in the spine. IMHO, I don't foresee that relaxing constraints in the nav doc will make it a better toc document, as we will still ask for page-list and landmarks. |
I agree with @laudrain, I don't think that we can relax the constraints in a way that will make the NavDoc well-suited for both rendering and machine-readable use cases. Expressing machine-readable info in HTML is always tricky and either requires complex restrictions (EPUB3 NavDoc) or the use of technologies such as RDFa that are clearly difficult to use. I'm in favour of keeping all machine-readable info in the manifest and removing anything that gets in the way of properly authoring/rendering an HTML TOC. A WP UA could then discover in the manifest:
It would be entirely up to the author to decide what they'd like to include in the HTML or the manifest. |
But that's a separate question about whether they belong all together in one file. If we bury these in a manifest where people probably won't have access to them except in a certain class of user agent, how are we not making this an exercise in duplication as they have to be placed also somewhere as html where anyone with any user agent can access them? |
@mattgarrish agree that we'll need to discuss whether they all belong together in one file (IMO, page-list and landmarks have nothing to do in an HTML NavDoc). Regarding duplication and visibility, once again it depends what we're talking about. If you have a 500 items long page-list, do you truly want to display that in a webview to the end user? Probably not. This is mostly used for machine-readable use cases. Duplication is mostly an issue with If we go back to @laudrain example of how Hachette relies on the NavDoc in France:
For their specific use-case, there wouldn't be any additional duplication compared to what they already output in EPUB3 and their nice-looking HTML TOC would be properly identified (not the case currently in EPUB3 where you can indicate a single NavDoc). It would then be entirely up to the UA to decide which TOC to display. A UA could even provide an option to switch between the native UI or the nice looking HTML table of contents. |
I tend to agree with @HadrienGardeur bulleted list above -- certainly the first two. For the third, I wonder if the duplication could be obviated by tagging the "nice looking" TOC with detailed roles (or some such) such that portions could be pulled out as machine processable. Maybe not, but perhaps worth pondering a little. |
You may want to use progressive enhancement techniques to provide a better interface, but it doesn't strike me as all that problematic to have such a bare-bones list if it comes to that. AT have options to bulk jump list items. What's critical, certainly for anyone who needs accessible content is that these features be available. A WP-aware user agent should be able to take that bare-bones list and enhance it, not be the only way to access it. |
@GarthConboy we could also offer both options. Let me describe how Readium-2 would handle navigation in WP (based on the current behaviour for EPUB 2/3):
|
HTML 5.2 attempts to clarify what values of rel are allowed. Note that "contents" is not a listed value. See https://www.w3.org/TR/html52/document-metadata.html#element-attrdef-link-rel and https://www.w3.org/TR/html52/links.html#allowed-keywords-and-their-meanings. See also w3c/html#160 |
@TzviyaSiegman there are multiple registries for rel values, in the case of "contents" it's covered by RFC5988 although it's true that it references HTML4. Just to clarify the message that you quoted above: I'm talking specifically about a link in the manifest, not an HTML |
Just want to put back in the mix that many WPs won't have *any* nav - they
will be simple documents (memos, white papers, etc.). So we also need to
make sure that our design allows for the non-inclusion of such things as
well.
And in those cases, also consider that all of them will be machine authored
and not hand-authored - so that using better/proper techniques for
inclusion of semantic information will be easy to specify.
|
We had a short discussion about navigation in the accessibility task force call yesterday. I would like to put forward the accessibility expectations for UAs. We have defined these in epubtest.org.
Even if the WP consists of a single content file, navigation is still important, because there is no restriction on the size of a single file. One may think that screen readers can provide a jump-to-next-heading command, but accessibility is not only for screen reader users; a person with restricted vision needs good navigation without using a screen reader. Another aspect to consider may be finding the path of least resistance that can enable navigation in browsers (although I do not remember if we have made a final decision that WP MUST work natively in browsers). An HTML document for navigation has the advantage that it can be processed by the UA and can also be displayed directly to users. |
@avneeshsingh - did the a11y task force consider these requirements from a non-book perspective? For example, considering generic documents (let's take a white paper on some topic) then in relation to your comments 2 - There are no pages. It's a born-digital document and does not have a physical (aka page-based) manifestation. 3 - Again, there are no sections (unless you count heading levels as an implicit determiner for such) and no pages. So by your definition, there is no "navigation information". |
"Of course, I would expect that the document is well structured and using proper heading levels (H1, H2, etc.) from which a UA could derive navigation aids if needed." "2 - There are no pages. It's a born-digital document and does not have a physical (aka page-based) manifestation." |
On Fri, Jul 21, 2017 at 9:30 AM, Avneesh Singh ***@***.***> wrote:
I have mentioned earlier also that accessibility requirements start from
user interface and the content specifications must enable user interface to
meet them.
I have to say that is completely backwards.
User interface varies from UA to UA, and even publication to publication.
However, rich (and proper) semantics of content is constant regardless of
how those semantics are presented. This is true not only for accessibility
but also for visual presentation since it is really the premise of CSS
(which binds styling to semantics).
"2 - There are no pages. It's a born-digital document and does not have a
physical (aka page-based) manifestation."
Content specifications must support page navigation. It is up to authors
to decide if pages are required in their content.
Again, that doesn't make any sense to me in a world of born-digital content
that may never have a physical (aka page based) manifestation. Page
navigation is a historical thing and we should (IMO) not be designing
technology for the future around the concept.
If there is some confusion regarding the epubtest.org requirements that I
mentioned, then I would clarify that the requirements are for user
interface.
And for EPUB - which is focused very clearly on books (which are more rigid
and have a physical manifestation) - that's fine.
But for WP and PWP, such requirements make no sense. And the task force
needs to remember that they are building for WP first - then PWP - and only
finally for EPUB...
|
From: Leonard Rosenthol
Sent: Friday, July 21, 2017 19:22
To: w3c/wpub
Cc: Avneesh Singh ; Mention
Subject: Re: [w3c/wpub] Machine-Readable navigation (#9)
On Fri, Jul 21, 2017 at 9:30 AM, Avneesh Singh ***@***.***> wrote:
I have mentioned earlier also that accessibility requirements start from
user interface and the content specifications must enable user interface to
meet them.
I have to say that is completely backwards.
User interface varies from UA to UA, and even publication to publication.
However, rich (and proper) semantics of content is constant regardless of
how those semantics are presented. This is true not only for accessibility
but also for visual presentation since it is really the premise of CSS
(which binds styling to semantics).
We should remember that WCAG success criteria are not dependent on technology, instead it is based on user requirements.
"2 - There are no pages. It's a born-digital document and does not have a
physical (aka page-based) manifestation."
Content specifications must support page navigation. It is up to authors
to decide if pages are required in their content.
Again, that doesn't make any sense to me in a world of born-digital content
that may never have a physical (aka page based) manifestation. Page
navigation is a historical thing and we should (IMO) not be designing
technology for the future around the concept.
Again, I do not think that this discussion is relevant. I have seen no decision that WP/PWP/EPUB 4 are specifications solely for born-digital publications, that there will be no print equivalent for these publications, or that page numbers are not relevant.
|
On Fri, Jul 21, 2017 at 10:06 AM, Avneesh Singh ***@***.***> wrote:
We should remember that WCAG success criteria are not dependent on
technology, instead it is based on user requirements.
User requirements, yes. *NOT* user interface or user experience
requirements. They are not the same things.
Again, I do not think that this discussion is relevant. I have seen no
decision that WP/PWP/EPUB 4 are the specification for only and only born
digital publications, there will be no print equivalent for these
publications and page numbers are not relevant.
You misunderstand me. I am not suggesting that we don't support pages, for
publications where they may exist. What I am saying is that pages are not
a requirement (and hopefully will become the exception rather than the
norm) and we need to ensure that our work reflects that.
|
"You misunderstand me. I am not suggesting that we don't support pages, for publications where they may exist." I was surprised that there is an argument on this matter. If the author decides to include pages, the specifications need to support it, so that pages can be navigated in an accessible way. The decision maker is the author, and the specifications should enable the author to add pages. |
Rather than ToC, we can talk about a "document outline". It is correct to talk about an outline for generic documents, even if in some cases that outline would be trimmed down to the bare minimum (say, the doc title, with no headings or sections). What we have to figure out as a WG is whether we want to mandate the presence of a declarative author-defined outline (in HTML, or in a JSON manifest, or both) or if we make it optional, or if we just let UAs compute whatever they can from the markup (e.g. using the HTML outline algorithm, or heading levels, or whatever).
Of course, some docs will not have any author-defined pages, we all agree on that. But for the docs that do, we have to decide whether the spec mandates some specific navigation facilities. |
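As a sketch of the last option (a UA computing what it can from the markup), the Python below derives a flat outline from heading levels alone. It is deliberately the simple heading-level approach, not the HTML outline algorithm, and the names `HeadingOutline` and `compute_outline` are hypothetical.

```python
from html.parser import HTMLParser

# Map heading tag names to their numeric level.
HEADING_LEVELS = {"h1": 1, "h2": 2, "h3": 3, "h4": 4, "h5": 5, "h6": 6}


class HeadingOutline(HTMLParser):
    """Collect (level, text) for each h1-h6 element in document order."""

    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None  # level of the heading currently open, if any
        self._text = []     # text chunks collected inside that heading

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_LEVELS and self._level is None:
            self._level = HEADING_LEVELS[tag]
            self._text = []

    def handle_endtag(self, tag):
        if self._level is not None and HEADING_LEVELS.get(tag) == self._level:
            text = " ".join("".join(self._text).split())
            self.outline.append((self._level, text))
            self._level = None

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)


def compute_outline(html: str) -> list:
    """Return a flat list of (level, heading text) pairs."""
    parser = HeadingOutline()
    parser.feed(html)
    parser.close()
    return parser.outline
```

A document with no headings yields an empty outline, which matches the "bare minimum" case described above where only the document title would remain.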
Propose closing: the original issue is probably still open, but it is one of those issues that led to a huge list of comments and lost a bit of focus. I would think closing it and, if necessary, opening new, more focused issues when the time comes is better. |
👍 to driving this to actionable action items...action-ably. |
@HadrienGardeur wrote in #6
I agree with you - though for different reasons. I believe (and correct me if I am wrong) that you actually want machine readable navigation - something akin to the NavDoc in EPUB. I, on the other hand, don't want it at all. If an author wishes to provide navigation in their document - they can build it using the same tools they build all other content. The UA doesn't need to know anything about it - except perhaps where it lives (done via something like dpub-aria TOC role)