Information content of the abstract manifest #6
Comments
From @GarthConboy on June 27, 2017 14:56 I'd also throw in: -- Reading order Re the #1 and #2 just above in Dave's original issue, it seems they may want to be pre-manifest -- defined before the manifest is found, or be the actual path to the manifest (or to a "first file" that can be rendered, but also somehow points to the manifest). |
From @iherman on June 27, 2017 14:56
|
From @iherman on June 27, 2017 14:56 (Wow. I just said the same thing as Garth just in other words. I swear we did not conspire...) |
From @mattgarrish on June 27, 2017 15:54 What is meant by required here? Must always be present or must be accounted for in the design? This is why I wasn't sure at the f2f if navigation constituted a top-level or lower-level consideration. A standardized means of locating the table of contents seems critical to me, even if it's optional to define and there are no epub-like rules on its construction. |
From @GarthConboy on June 28, 2017 16:2 The updated #6 in the first panel says "Locating table of contents or other navigation structure", we should also consider: -- Do we need such a Nav file (likely yes for A11Y) |
See #14
Interesting question. I know Hadrien has proposed including section titles in a JSON manifest, but I have major concerns about possible reader-facing text in JSON (especially given that there's a standard html way to do this stuff). |
From @HadrienGardeur on July 2, 2017 20:27
IMO the Navigation Document in EPUB 3 is a failed experiment. Most EPUB 3 documents that I've seen end up including at least two HTML table of contents:
Most EPUB 3 reading systems do not render these Navigation Documents either, they simply parse them, extract the info and display things using their own UI. This is a typical example of "spec purity" (the beauty of the Navigation Document) vs real world usage (no one is rendering these documents and we end up with more redundancy instead of less). Readium (1, JS and 2) ended up parsing the info in the Navigation Document and providing a JSON output instead, which is much easier for developers to work with. In the Readium Web Publication Manifest:
|
From @HadrienGardeur on July 2, 2017 20:35 To go back to the initial question, in Readium we clearly separate the abstract model from the minimal requirements for a manifest. The abstract model has three core concepts:
For each core concept, we make sure that:
The basic requirements for a manifest are then based on that model:
|
From @llemeurfr on July 3, 2017 12:43
Better, an IRI because a) may be a urn (up to the publisher to choose, the Web doesn't care) and b) i18n is important. A URL to the origin is also important but should be another property. |
From @WSchindler on July 3, 2017 14:47 I would like to add: |
From @HadrienGardeur on July 3, 2017 14:50 Language and direction (ltr vs rtl) should be two separate metadata. Agree that we need to allow more than one language. |
From @lrosenthol on July 3, 2017 22:5 If we plan to use anything other than a URL (as defined by the HTML spec - On Mon, Jul 3, 2017 at 8:43 AM, L. Le Meur [email protected] wrote:
|
From @llemeurfr on July 5, 2017 10:22 Re. URL vs IRI, after reading https://www.w3.org/International/wiki/IRIStatus, I must admit that this seems like a can of worms. Apart from trying to allow for extended i18n of publication identifiers, there is still the question of whether URNs are allowed as global identifiers. For instance, I spotted that most of @HadrienGardeur's manifest samples use ISBN URNs as identifiers. |
From @HadrienGardeur on July 5, 2017 12:47 @llemeurfr you're mixing up two different concepts regarding the Readium Web Publication Manifest. Keep in mind that we started this work in the context of BFF and that for Readium-2 we mostly ingest EPUB files. The only requirement in the draft document for the Readium WebPub Manifest is to always provide a Here's a basic example using the Readium WebPub Manifest model:
{
  "@context": "http://readium.org/webpub/default.jsonld",
  "metadata": {
    "title": "The Master and Margarita"
  },
  "links": [
    {"rel": "self", "href": "http://example.com/manifest.json", "type": "application/webpub+json"}
  ],
  "spine": [
    {"href": "http://example.com/chapter1", "type": "text/html"}
  ]
}
If the publication has an additional identifier, this can be provided in its metadata:
"metadata": {
  "title": "The Master and Margarita",
  "identifier": "urn:isbn:9780141180144"
}
That second identifier is not a requirement in the Readium model, and we can't expect all Web Publications to have such an identifier either. The reason why most of our current samples have URNs (mostly for ISBNs or UUIDs) is that we ingest EPUB files or provide samples for books where ISBNs are very common. |
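For readers following the URL vs URN distinction, here is a minimal Python sketch of how a client might read the two values from a manifest shaped like the example above. The helper names `self_link` and `additional_identifier` are hypothetical, and the handling of `rel` as either a string or a list is an assumption, not something the Readium draft is quoted as requiring in this thread.

```python
import json

# The manifest from the comment above, with the optional identifier included.
MANIFEST_TEXT = """
{
  "@context": "http://readium.org/webpub/default.jsonld",
  "metadata": {
    "title": "The Master and Margarita",
    "identifier": "urn:isbn:9780141180144"
  },
  "links": [
    {"rel": "self", "href": "http://example.com/manifest.json", "type": "application/webpub+json"}
  ],
  "spine": [
    {"href": "http://example.com/chapter1", "type": "text/html"}
  ]
}
"""


def self_link(manifest):
    """Return the href of the first link whose rel includes "self", or None."""
    for link in manifest.get("links", []):
        rel = link.get("rel")
        # Assumption: rel may be a single string or a list of strings.
        rels = [rel] if isinstance(rel, str) else (rel or [])
        if "self" in rels:
            return link.get("href")
    return None


def additional_identifier(manifest):
    """Return the optional metadata identifier (for example an ISBN URN), or None."""
    return manifest.get("metadata", {}).get("identifier")


manifest = json.loads(MANIFEST_TEXT)
print(self_link(manifest))
print(additional_identifier(manifest))
```

The point of keeping the two lookups separate is that a publication without an ISBN or UUID still has a usable `self` URL, which matches the claim that the second identifier is optional.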
My only concern is that HTML already has mechanisms for describing the language(s) of content. What happens when a user agent opens an HTML page declared with language A, finds a rel=manifest link, follows it, and sees language B declared? |
From @HadrienGardeur on July 5, 2017 13:11
The manifest declares the language for the publication, while HTML is meant to declare the language for that resource. |
From @llemeurfr on July 5, 2017 14:4
That's right. If a Web publication is copied to another website, this value will not be modified. Therefore a possible definition of the self link is "The original location of the Web Publication", which can be aligned with Requirement 8 for Web Publications: "There should be a way to uniquely identify a Web Publication." |
From @HadrienGardeur on July 5, 2017 14:10 From RFC5988:
|
From @WSchindler on July 5, 2017 15:36 It's of course true that via |
From @lrosenthol on July 5, 2017 16:15 Actually, I would expect the UA to completely ignore the language settings On Wed, Jul 5, 2017 at 9:11 AM, Hadrien Gardeur [email protected]
|
From @lrosenthol on July 5, 2017 16:16
That's not necessarily true. The new site may well change the link(s) in the On Wed, Jul 5, 2017 at 10:04 AM, L. Le Meur [email protected]
|
From @HadrienGardeur on July 5, 2017 16:21
While rendering content, sure I fully agree. But a UA can provide additional services on top of it, for example dictionaries or search. The publication metadata can be useful in that regard. |
From @mattgarrish on July 5, 2017 16:21
I agree it's informative and must not be used for rendering content (or metadata), but the same question about value has been raised in epub revisions and the case has been made that it does have uses (e.g., pre-loading tts languages, offering access to dictionaries, etc.). |
From @lrosenthol on July 5, 2017 16:23 On Wed, Jul 5, 2017 at 12:21 PM, Hadrien Gardeur [email protected]
It could indeed be useful - and whether a UA chooses to use it for that or |
From @HadrienGardeur on July 5, 2017 16:24
Defining the UA behavior is out of scope, but making sure that it has relevant info needed is definitely within scope. |
I'm only responsible for around 25,000 EPUBs, but I've never seen an EPUB with two HTML tables of contents.
|
Well, our reading system, FWIW, builds its in-UX TOC from the Nav document (or NCX, in the old days) -- only very rarely is this sufficiently "basic" as to not be the one the user expects to use for actual navigation. And, I'd suspect our A11Y community would not consider a global navigation document as a failed experiment. (and I'm certainly willing to admit that there are numerous things we invented from whole cloth in EPUB-land that would deserve the "failed experiment" moniker, but I don't think Nav docs would be one of them) |
If we have a "reading order" list, do we also need a "nav doc"? What is
the (perceived or practical) difference - in the current EPUB world?
|
Reading order is the list of files in sequence in which they're presented (the spine). Navigation document contains the table of contents (plus page list and landmarks). Even if the spine documents were titled, you'd only get a rudimentary idea of the document outline from them, as when you factor in content chunking it won't even be clear what rank the headings have (i.e., not every document has to start with a level 1 heading). |
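The difference can be illustrated with a small Python sketch. Both structures below are made-up data, not taken from EPUB or any draft: the spine is a flat list of files, while the table of contents is a tree whose entries can point into the middle of a file.

```python
# Hypothetical publication: two content files, chunked so that the
# second file starts with a chapter (level 2), not with a part (level 1).
spine = ["part1.html", "part2.html"]

toc = [
    {"title": "Part I", "href": "part1.html", "children": [
        {"title": "Chapter 1", "href": "part1.html#ch1", "children": []},
        {"title": "Chapter 2", "href": "part2.html#ch2", "children": []},
    ]},
    {"title": "Part II", "href": "part2.html#p2", "children": []},
]


def flatten(entries, depth=1):
    """Walk the table of contents depth-first, yielding (depth, title, href)."""
    for entry in entries:
        yield depth, entry["title"], entry["href"]
        yield from flatten(entry["children"], depth + 1)


outline = list(flatten(toc))
for depth, title, href in outline:
    print("  " * (depth - 1) + title, "->", href)
```

In this sketch the spine has two entries but the outline has four, and `part2.html` opens with a level 2 heading, so the spine alone cannot recover either the number of sections or their rank.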
Thanks @mattgarrish
On Tue, Jul 11, 2017 at 5:31 PM, Matt Garrish ***@***.***> wrote:
Reading order is the list of files in sequence in which they're presented
(the spine).
One way in which they can be presented, you mean. Since a user may choose
to navigate content in other orders...
Navigation document contains the table of contents (plus page list and
landmarks).
So this, to me, is actual content - and doesn't require a special place.
If your content requires such a thing - then build it (with HTML & CSS,
hopefully also with dpub-aria's TOC role) as a content element and add it
to the reading order where you think it belongs.
… |
The reading order as defined by the spine isn't dynamic, even if the reader follows a non-linear path through it. I don't see how that is easily changed, unless the UA understands the content at some deeper level. At any rate, even if the reading order were shuffled it doesn't change the limitations as a means of navigating the actual publication outline.
PDF has bookmarks. Word has the ability to view the document outline. EPUB has the navigation document. Do we want WP to be the outlier without a programmatic method of discovering? |
On Tue, Jul 11, 2017 at 6:21 PM, Matt Garrish ***@***.***> wrote:
PDF has bookmarks.
Which are a problem, for all the same reasons that EPUB's NavDoc is.
- Lack of styling
- Gets out of sync with the actual content (during editing/combining
operations)
- etc.
Word has the ability to view the document outline.
Which is dynamic based on styling (aka fake semantics) - much like the
HTML outline algo.
… EPUB has the navigation document. Do we want WP to be the outlier without
a programmatic method of discovering?
I think that WP should be aligned with the web and not special.
|
Thanks @GarthConboy for your proposal, I'll also reply point by point.
I disagree about this, for several reasons:
What would you like to identify as a WP? If we're talking about a resource from the publication, it should be identified by either the presence of a link to a manifest, or because the UA has already accessed a manifest and knows that the resource is part of a publication. For the manifest itself, it should be identified by a specific media type.
In Readium WebPub Manifest we also opted for a requirement, at least one resource must be listed in
In Readium, the only required list is the spine (which is listed in reading order). The other list (
In Readium we require a title, but @lrosenthol is right that if we extend our scope to any sort of publication this might be problematic.
I strongly lean towards optional. I think we should offer (both as options):
The machine-readable info in the manifest should contain all navigation not rendered directly to the user (either because it shows up in the UI of the UA instead, or because it's useful for internal stuff). From @lrosenthol
Unlike the This would leave a lot of wiggle room for the kind of edge case scenario that you're thinking about (gigantic Web Publications). |
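The two identification routes mentioned above (a link from a resource to the manifest, and a media type on the manifest itself) can be sketched in Python as follows. This assumes discovery uses `rel="manifest"`, as raised earlier in the thread, and reuses the `application/webpub+json` type from the Readium example; neither was agreed by the group, and the names `ManifestLinkFinder`, `find_manifest`, and `is_manifest` are hypothetical.

```python
from html.parser import HTMLParser

# Assumption: media type borrowed from the Readium example in this thread.
WEBPUB_MEDIA_TYPE = "application/webpub+json"


class ManifestLinkFinder(HTMLParser):
    """Record the href of the first <link> whose rel includes "manifest"."""

    def __init__(self):
        super().__init__()
        self.manifest_href = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.manifest_href is not None:
            return
        attributes = dict(attrs)
        # rel is a space-separated token list; a valueless attribute is None.
        rels = (attributes.get("rel") or "").lower().split()
        if "manifest" in rels:
            self.manifest_href = attributes.get("href")


def find_manifest(html_text):
    """Return the manifest href linked from an HTML resource, or None."""
    finder = ManifestLinkFinder()
    finder.feed(html_text)
    return finder.manifest_href


def is_manifest(content_type):
    """Check a Content-Type header value, ignoring parameters such as charset."""
    return content_type.split(";")[0].strip().lower() == WEBPUB_MEDIA_TYPE


PAGE = '<html><head><link rel="manifest" href="manifest.json"></head><body></body></html>'
print(find_manifest(PAGE))
print(is_manifest("application/webpub+json; charset=utf-8"))
```

A resource with no such link returns `None`, which corresponds to the second case described above: the UA only knows the resource belongs to a publication because it already holds the manifest.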
I'll reply point by point too. Though can't help but comment that this sort of design work is really hampered by use of github issues -- something (e.g., Google Docs) where you can really comment in place and have conversations would be better! :-)
I think we only partially disagree. I think it is unwise to have the identifier of the publication be the URL of the manifest, as a clueless UA would render either nothing or something it doesn't understand (depending on the format of said manifest). I do think the manifest should be pointed to either from the first markup document in the reading order, or potentially even from all of the markup documents in the reading order (yes, likely through a link as you say) -- this is the same madness you refer to -- but, doesn't seem that mad to me!
I think Dave wanted to be able to identify the "site" as a WP -- and yes, I think the presence of a link to the manifest would be a fine way of doing that -- that's one of the options I was proposing.
Agree.
Agree.
Somewhat agree. I have a list of resources as required (above), but that's almost a detail.
Close to agree.
I think this will be the root of lots of conversation, but I don't think we're too far apart. |
@GarthConboy yeah, it's not always easy to follow all threads in a discussion...
This is where I strongly disagree. I think that a clueless UA won't ever be aware of the URL of a manifest, and that even in a WP aware UA, users will never be aware of the URL of a manifest either. If they're not aware of its existence and therefore don't share it, we don't have any problem at all using the URL of the manifest as the WP identifier.
I think this should be entirely up to the author/publisher to decide where and when they include discovery. There shouldn't be any requirement IMO. Also, I'd like to have the ability to remix content on the Web. If I have zero write-access to the content that I'd like to remix within a Web Publication, there's no way I'll be able to include such a link in HTML or HTTP. About navigation
Right, but I think a lot of the arguments in favour of including all machine-readable navigation in HTML are misguided:
|
@HadrienGardeur I think we may be typing past each other on the first issue above. I'm presuming that the identifier of a WP will be a URL. And that the only two logical places for this URL to point are either the publication's manifest or the first markup document in the reading order. Do you have a different view? Or do you not view the identifier as a URL at all? |
I moved the discussion of this item over to its own issue at #8 |
I moved the discussion of this item over to its own issue at #9 |
|
Yep -- I think we found our disagreement. I think if the identifier is a URL, folks will send it around, and will expect it to "work". Thus, I disagree with your #2 and #4 above, and I still favor the identifier being the first content document. But, since I'm missing the call on Monday, you all can agree to something else, and I'll just whine for subsequent years. |
I fail to understand how you can completely disagree with my second point, let me quote precisely my previous comment:
These are real issues, how do you address them if you decide that the URL of the first content document is the identifier of the WP? The only situation that would make this acceptable is if we embed the manifest in HTML (which I find problematic for completely different reasons). |
@HadrienGardeur yep, guess I disagree with two of the three! :-)
I don't really see why. Assuming the first document in the spine points to manifest or contains the manifest, this seems an elegant solution -- a clueless UA can do something useful, and a clueful UA can chase down (or process) the manifest and do something better.
I'd think "don't do that" is a fine answer (bug for bug compatible with EPUB today). And like you said, this could lead to wanting to include rather than link to the manifest.
See #1. If the clueless UA gets the URL to the "lead" resource, it can just render the content, either ignoring an embedded manifest or not bothering to follow the link to an external one. |
Since this argument is spread across issues, I've also had to reply to that same question in a separate issue.
It only feels elegant if the manifest and the first resource are one and the same (manifest embedded in HTML). Otherwise it feels very confusing to me to use the same identifier (URL) for two different concepts (publication vs resource). |
@HadrienGardeur -- Yep saw that too. Doesn't make me a convert. But, we'll see where the group ends up. This is probably an issue we should try to resolve very soon, as it drives a number of subsequent decisions. |
Side question: what if I can't include a link to the manifest in the first resource in reading order? What happens then? |
On Tue, Jul 11, 2017 at 10:33 PM, Hadrien Gardeur ***@***.***> wrote:
Side question: what if I can't include a link to the manifest in the first
resource in reading order?
Under what circumstances would you author a publication where you couldn't
link the manifest?
|
There are plenty of publications on the Web that could benefit from such a manifest, I'll re-use the same example: http://poignant.guide/ |
On Tue, Jul 11, 2017 at 10:39 PM, Hadrien Gardeur ***@***.***> wrote:
What if the publication already exists on the Web and someone else would
like to author a manifest for it?
There are plenty of publications on the Web that could benefit from such a
manifest, I'll re-use the same example: http://poignant.guide/
I would argue that you need to put your own "front page" on it. That
would also address the identifier problem - since you would want each of
the different people who produce a publication of that work to have a
different identifier.
|
I see that you're carefully being very neutral by talking about a "front page" ;-) So, this means that I would either need to:
|
Correct!
…On Tue, Jul 11, 2017 at 10:45 PM, Hadrien Gardeur ***@***.***> wrote:
I see that you're carefully being very neutral by talking about a "front
page" ;-)
So, this means that I would either need to:
- simply publish a JSON manifest if we use the URL of the manifest as
the WP identifier
- or publish an additional HTML resource in addition to a manifest
(plus include that HTML resource in the spine) if the first resource is
used as the WP identifier
|
Sorry for late comment but none of all our EPUBs have nav doc in spine.
|
From @dauwhe on June 27, 2017 14:33
What information is required for an abstract manifest? [edited to add items from comments]
What else? I think we should distinguish required information from "nice to have" information.
Copied from original issue: w3c/publ-wg#12