The URI, URL, and URN of a web publication #11
Not to be too nitpicky, but I think we have agreed to use the term "address". We are not in the business of identifying anything ;-) |
|
Right, good point.
Why can't we? PWAs require that the app manifest is linked from the HTML (there's no spec, but it's the common documented practice, the "cow path"), why would Web Pub be any different?
For a browser, I'd say it's much more convenient to get HTML first. |
They don't require anything. They give you the possibility to offer auto-discovery. You're not required to include a link to a Web App Manifest from all resources of a Web App, you include a link whenever you want and/or need it.
Not for a browser that knows how to deal with a WP. The other browsers would never be aware of the existence of this manifest anyway. |
Right, but if you want the Web App Manifest to actually be used, you do need this link, right? Is there any implementation that uses a Web App Manifest directly? (I'm not aware of any.)
Exactly, that's why pointing to HTML first offers better graceful degradation. |
Sure, any API could rely entirely on Web App Manifest directly. For example an OPDS integration in a Readium-2 based UA.
Do you point an RSS reader to a website first and then let it figure out how to discover the right RSS feed (there could be multiple links)? Of course not. Once again, the main issue here is that @dauwhe is mixing up two concepts ("start" and "self"). As long as we have a standalone manifest (instead of an embedded one), it'll be accessible directly in a basic UA. This has nothing to do with using this URI as an identifier or not, and saying that the "self" link should be the URI of the manifest won't make the problem any worse or better. |
Since we've moved to separate issues, I'll quote a previous comment:
|
Oh, I can very well imagine how manifest-first could work. I was just commenting that it's not how web app manifests are currently used.
I would point a user to a blog URL, and their browser would discover the RSS feed. In most cases, I would also paste the blog URL straight to my RSS client, which would discover the feed. For a blog, I would posit that in practice both self and start is the URL of the blog site, not its RSS feed. |
Web Apps are not entirely dependent on auto-discovery; Microsoft, for instance, has announced integration of PWAs in the Windows 10 store. |
Well...not exactly. 😃 One of the beautiful things about "web publications" (writ small as I'm referring to things published on the Web right now) is that you can find the whole from its parts. This can also be true within a library's indexing system--where a search brings up not only a book reference, but also a specific chapter or page. In a webby publication, I could browse/find/retrieve just that page and then (at my option) get the rest of the publication of which that page is a part. That's typically how PWAs are distributed outside of stores--any user-facing content (typically HTML pages) references the manifest for the PWA it's a part of. If you land on any of them, you have (assuming modern tech is in use) the option to "install" (or "keep") the PWA of which that page is a part. Sometimes those PWAs may only have one "page." Sometimes, they may have many. Same situation is true of rel="serviceworker" fwiw. It can be referenced from any resource, but defines a Whatever we build, let's keep it webby. |
You're missing the point @BigBlueHat, I've never argued against discovery and in fact I've posted numerous examples about how this can be handled ( But @rdeltour was arguing that this is the only use case, which isn't true. The discoverable and distributed nature of it doesn't mean that it can't also be useful in a more controlled environment (a library's catalog being a very good example of such an environment for publications). |
BTW, that part specifically isn't true. If you include a link to a Web App Manifest on every single page of your Web App you get into some very weird behaviours (I've seen Chrome displaying the Web App Manifest install banner in an app that I've already installed for instance), so in general you should avoid including a link to it on every single page. I'm not saying that we should avoid including a link on every content document of a Web Publication, but forcing them is IMO a bad idea (in some situations you might not be able to provide such links). |
For the record, I was arguing that URLs to HTML documents, with a discoverable link to a manifest, was by far the most frequent usage pattern in today's Web (notably with PWAs). |
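To make the HTML-first pattern concrete, here is a minimal sketch of how a UA might discover a manifest from a content document. It assumes the manifest is advertised with `<link rel="manifest">`, as Web App Manifests are today; whether Web Publications would reuse that `rel` value is not settled in this thread. The names `ManifestLinkFinder` and `discover_manifest` are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ManifestLinkFinder(HTMLParser):
    """Collects href values of <link> elements whose rel includes 'manifest'."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel is a space-separated token list; valueless attributes parse as None.
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "manifest" in rel_tokens and attr.get("href"):
            self.hrefs.append(attr["href"])


def discover_manifest(page_url, html_text):
    """Return the absolute URL of the first advertised manifest, or None."""
    finder = ManifestLinkFinder()
    finder.feed(html_text)
    if not finder.hrefs:
        return None
    # Relative hrefs are resolved against the URL of the page that was fetched.
    return urljoin(page_url, finder.hrefs[0])
```

A page without such a link simply yields `None`, which matches the "SHOULD, not MUST" position argued above: a UA falls back to treating the page as an ordinary document.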
As a user, I find nearly everything on the web by clicking on a link or typing a URL. The links to most web applications I've seen 1. are described by a URL that makes no reference to a specific file or file extension and 2. resolve to an HTML document. And from the point of view of an end user, that link does serve as the identifier for a web app or web site. Do I want to go to google? Then I use google.com. Do I want to read the CouchDB book? I go to http://guide.couchdb.org. Look at the Firefox Platform Status PWA? I go to https://platform-status.mozilla.org. In some cases there's an index.html, in other cases there isn't; as a user, I don't care. In my mind, the identity of the web app or site is the URL that allows me to use the app. I notice that the web app manifest spec doesn't include a rel=self link or equivalent. What's different about web publications that we would need such a thing? |
We can go over the same issue again and again @dauwhe, but you're not really answering the question that I've asked several times now: why do you believe that having the URI of the manifest as a WP identifier will have any impact on people opening such URIs in a basic UA? As long as we have a standalone manifest, this will always be possible and this has nothing to do with what we use as an identifier. Are you in agreement with @frivoal and suggesting that the manifest MUST be embedded in HTML? |
I am wondering if we can postpone this issue for a while. A WP may have a manifest, the first spine item, and the navigation document. I think that the URI of a WP is either that of its manifest, first spine item, or the nav. doc, and nothing else. Isn't this good enough for now? Pros and cons of each option would be nice. Remember that our FWPD will be a good chance to provide a better big picture and invite browser vendors. |
You don't really need to update a Web App Manifest, while this will be very important for a Web Publication where caching and packaging will most likely be based on a declarative approach rather than a scripted one. This means that we'll need to check if a Web Publication Manifest has been updated fairly often, and having a self/canonical URL for that is extremely useful. Atom feeds have a self link for the exact same reason but that link will be even more important on a Web Publication where the manifest can be cached for much longer periods of time. |
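As a rough illustration of why a self/canonical URL helps with update checks, the sketch below builds a conditional GET against the manifest's self link. The manifest shape (`{"links": [{"rel": "self", "href": ...}]}`) is hypothetical, since no manifest format has been agreed on here, and the helper names are made up; `If-None-Match` and the `304 Not Modified` status are standard HTTP.

```python
from urllib.request import Request


def self_url(manifest):
    """Return the href of the first link whose rel is 'self', or None.

    The {"links": [{"rel": ..., "href": ...}]} shape is an assumption.
    """
    for link in manifest.get("links", []):
        rel = link.get("rel")
        rels = rel if isinstance(rel, list) else [rel]
        if "self" in rels:
            return link.get("href")
    return None


def build_revalidation_request(manifest, cached_etag=None):
    """Build (but do not send) a conditional GET for the manifest's self URL."""
    url = self_url(manifest)
    if url is None:
        raise ValueError("manifest has no self link to revalidate against")
    headers = {"Accept": "application/json"}
    if cached_etag:
        # The server can answer 304 if the cached copy is still current.
        headers["If-None-Match"] = cached_etag
    return Request(url, headers=headers, method="GET")


def cached_copy_is_current(status_code):
    """A 304 response means the cached manifest does not need replacing."""
    return status_code == 304
```

Without a self link, a UA holding only a cached or packaged copy of the manifest would have no reliable address to send this request to, which is the point being made about Atom feeds.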
While catching up with other threads, it occurs to me that we may not all be talking about the same things. I was considering a URL to a Web Pub as a locator or address (in other words, a user-facing concept; what would be linked to from another web page), whereas @HadrienGardeur seems to be talking about pointing at the manifest as the Web Pub identifier. Am I correct? If yes, what exactly do we mean by identifier, and what are the use case and definition? |
Indeed. Good question. And should they be (or do they need to be) different or separable concepts? |
To me, at least, a locator is just a way to find a publication. It could
be just one of many instances of one of many versions of the same
publication. It's a URL to something that a UA can process.
The identifier, on the other hand, is a permanent "unique" way to clearly
differentiate each version of each publication from another. In books, an
ISBN tends to serve this purpose. Of course, that specific implementation
won't work in the larger publication context - but the concept is good.
|
If the goal is de-duplication, isn't the web's way of solving that |
This is why I commented that we were talking about addressing, not identification. We have explicitly stated that PWG is NOT working on identification in the sense of unique for a publication. If Wiley publishes Moby Dick and Hachette publishes Moby Dick, PWG is not going to solve the problem of resolving identifiers between the 2 versions. Other organizations have worked on that. PWG WILL work on addressing those identifiers in a web-friendly manner. |
There isn't a single locator for a publication; there are as many locators as there are content documents in that publication, as has been pointed out by multiple people here, including @BigBlueHat in a previous comment. That said, one specific content document in a publication may be deemed more important than others. I don't think that's necessarily the first one in the reading order or the navigation document (if we have any), and marking that resource as the "start" could be one way to identify it. The URI of the manifest on the other hand is best suited as a WP identifier since:
In Atom and other API formats, this is often done by including a |
@TzviyaSiegman I don't think anyone attempted to address that specific problem in this issue or in another issue so far. Providing an identifier for a WP is within the mission of this WG. We haven't talked about Work-Expression-Manifestation-Item anywhere so far (and hopefully we won't). |
Let's talk purely about locators. I'm finding it helpful to look at existing web practices, especially around web apps, which I think are very closely related to web publications, being bundles of web resources viewed as a whole. So let's say I want to find out about what features Firefox is working on. A google search turns up a URL: https://platform-status.mozilla.org And indeed, this brings me to the web app. There's actually an https://platform-status.mozilla.org/index.html My browser also sees that there's a link to a manifest in that HTML page, so it knows about the manifest. As a user, I didn't have to worry about that either. In fact, the only way I can find out the URL of the manifest is to view source on the HTML page. https://platform-status.mozilla.org/manifest.json If I tried to locate the web app by using the URL of the manifest, I'd see the raw JSON, which is not a good user experience. For me, it makes sense that the URL of this web app is https://platform-status.mozilla.org/; that's all I need to know to access the content. The manifest does contain a We seem to have a pattern here. A URL leads to a web page which contains a link to a manifest. This is explicit in the web app manifest spec:
I think this pattern can work just as well for web publications. Can we just say:
|
+1 to @dauwhe's suggestion. This builds on existing functionality. Looking at the Web App Manifest
As we've discussed in the past, the spec is likely adaptable to publications, so s/application/publication. |
Sorry @dauwhe but at least for me this doesn't help to move the conversation forward. I've asked a question that IMO is the very key to that discussion several times, but you're simply avoiding it. I'll reply anyway, but this is not going anywhere...
Can we please stop talking about files and folders? This is a resource, not a file. It may be a static file on a server, but it might be as well a simple route handled by an app.
I don't think that anyone is arguing against discoverability, I've only argued against discoverability as a requirement (I strongly believe that this is a SHOULD, not a MUST).
It should also be optional for a Web Publication but for a different reason. For a Web App, it can be optional because the other option is to rely on the current URL for the resource that contains a For a Web Publication, it can be optional because as long as we have the equivalent of a I also still fail to see why we need a single locator for a WP, when obviously there are as many locators as there are content documents. |
As a start, I agree/repeat that we shouldn't talk about "the identifier of a WP"; the issues titled with that word should be renamed. Even "THE address of a WP" or "THE URL of a Web Publication" is not an interesting subject IMHO (there are several useful addresses for a given publication). By the way, I don't find a trace of an "identifier of a Web Application" in the Web App Manifest spec, but I find a start_url property. start_url is a clear equivalent of what @HadrienGardeur describes as a "start" url. But I see a difference between a Web App and a Web Publication: Web Apps are discovered by humans, whereas Web Publications will often be discovered and handled by applications (e.g. reading systems, now UAs). @dauwhe said:
Being a member of the Readium team, I'd like to add 2 complementary use cases:
If we agree with these three use cases, we can reach an agreement on a simple solution: And yes, a document of the publication can link to the manifest, offering added discoverability to UAs. But if a UA is given the direct address of the manifest (this is a web resource, it has an address), adding a self link in the manifest itself will bring added benefits to UAs (not to humans). |
I believe Hadrien's question was:
If the two instances of URI in that question both have the same value (e.g., foo.com/bar/blat.json), then I think that would not be interesting for presentation to a basic UA, and a basic UA wouldn't be able to find a renderable resource as it doesn't understand the manifest. Dave, is that your perspective? And:
I'm wondering what consensus is on this. Do we believe this to be true of a "publication"? We may be reaching a point that we should table this clearly out of hand "issue" and attempt to reach consensus on the next call. |
The question is not whether the UA can render it or not. In his latest comment, @dauwhe is still suggesting an external JSON manifest that will be accessible (it has its own URI) and therefore might be rendered in a UA that doesn't know what to do about it. Why would having a
I'm not sure which consensus you're looking for @GarthConboy, I'm simply pointing out the fact that a server can handle a URI such as |
Do you interpret the self link as being the URL of the web publication? Does web publication have a URL at all? Or do only the components of a web publication have URLs? |
On Tue, Jul 18, 2017 at 1:18 PM, Dave Cramer ***@***.***> wrote:
Do you interpret the self link as being the URL of the web publication?
Did you mean *self* or *start*?
Does web publication have a URL at all? Or do only the components of a web
publication have URLs?
The resources of a web publication have URLs - that's a given.
A web publication can have one or more locators, which are URLs that can
be used to access that publication. Depending on the server on which it is
hosted and how that server is configured, the actual URL would vary but the
result would be the same.
Consider that *http://www.foo.com/* will load a
resource on that server called *index.html* *ONLY IF* the server is
configured that way. The server could just as easily return any other
resource that it wanted. Same is true for any URL that points to a given
server.
|
Is there such a thing as a URL of a Web Publication? I'll quote a comment I've made in another issue:
I don't think we need some sort of "abstract" URI that represents a Web Publication from which a server will serve another resource or provide a redirection. As it's been pointed out by @lrosenthol before, this would make it impossible to host a Web Publication on Dropbox, Google Drive etc.
Once again, it's very unclear to me what you mean by "URL of a Web Publication". Here's what I'd like to have:
|
Acknowledging that we haven't yet decided if a manifest can be embedded in HTML, yes, there would be a URL to the manifest. Perhaps what this URL represents is out of scope.
Absolutely.
At least one?
Would this content document ("start_url") be required to have a link to the manifest? |
When I say "directly", I mean a single GET request so the statement should work for an external manifest or an embedded one.
I'd like to have the ability to entirely re-use existing content documents on the Web. This is a strong SHOULD but not a MUST IMO, not even for a single document.
I think that's an open discussion that we need to have. |
I believe @GarthConboy is suggesting that the discussions here seem to be leading, not to consensus about a particular point, but to further confusion. As a reminder, none of these issues have yet been brought to the WG. And, we do not make decisions without a call for consensus [1]. There are several assumptions in this thread. We cannot write a spec based on assumptions. I am seeing the following proposals for addressing:
I think the best way forward is to write actual proposals. This will give us something concrete to discuss at the next meeting. The purpose of these discussions on GH is to bring the discussion to the larger WG so that we can reach consensus and draft the specs. A separate point that has been made in other threads as well is that a publication might include multiple resources and/or URLs. See #10 for that discussion. I am not sure that we need WPs to work on tools like Dropbox or Google Drive. That seems like an issue for PWP. But, please open a separate issue to discuss portability. [1] https://www.w3.org/publishing/groups/publ-wg/WorkMode/#telco |
@HadrienGardeur said:
and then
I for one am confused about what you mean by "a WP identifier" in the first statement: I understood you were talking about a URL of the Web Publication, but the second statement seems to contradict that. This confusion aside, 👍 on the later statements:
I think it's reasonable to expect that some entities would deal with direct URLs to the manifest (e.g. UA and/or stores) and some other would rather deal with URLs to a starting document (e.g. user-oriented hyperlinks). |
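One way to picture a UA that accepts both kinds of address is to branch on the media type of whatever comes back from a single GET. This is only a sketch: the function name is made up, and treating any JSON media type as "a manifest" is an assumption, since no media type for a Web Publication manifest has been chosen in this discussion.

```python
def classify_address(content_type):
    """Classify a fetched resource by the value of its Content-Type header."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "application/json" or media_type.endswith("+json"):
        # Assumption: a JSON response at this address is the manifest itself.
        return "manifest"
    if media_type in ("text/html", "application/xhtml+xml"):
        # A content document; the UA may then look for a link to a manifest.
        return "content-document"
    return "unknown"
```

A store or reading system handed a manifest URL would take the first branch directly, while a user following a hyperlink would usually land in the second and rely on discovery from the page.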
By WP identifier, I mean a string that can uniquely identify a WP (two different WPs won't be using the same string). Since we're on the Web, using a URL sounds natural. We'll also always have a URL for the manifest, no matter if it's external or embedded, which is why the URL for the manifest feels like a good fit for a WP identifier. |
Trying to read through all these posts. I agree with Tzviya, this discussion seems to be increasing confusion rather than leading to clarity :( It seems we have a few requirements:
These requirements argue against a few things:
So, I have some thoughts about how to solve those uses (none amazingly great), but maybe we can try to focus explicitly on use cases for now. And not generic, everything anyone ever might want to do, but very specific things we want to make possible. Does my list (1, 2, 2a, 2b, 3) cover it? Are there others? Are mine incorrect? |
@bduga these requirements work for me. I'd like to point out BTW that 3 is not specific to WP, it's the way the Web works: any HTML document can link to other HTML documents using URLs.
That's why I believe that the URI for the manifest is better suited to identify a WP.
If these links are optional and there are other ways that a WP can be discovered, I don't think that's a problem.
Fully agree. That's a separate discussion, but I'd much rather have the manifest as an external JSON document. |
@bduga One more possible use case:
|
@dauwhe I'm not sure that technically (from a manifest perspective) these publications will have to be nested. This is how it could work with an omnibus containing three novels:
The manifests themselves do not IMO need to be nested. |
I am afraid that I do not understand the scope of this issue well. When I saw "URN" in the title of this issue, I thought that location-independent URNs (such as ISBN numbers) and resolution of such URNs to URLs (or something else) will be discussed. But this is not the case. You guys appear to discuss the relationship among manifest URIs, first-spine-item URIs, WP URIs, and so forth. |
From recent discussions in other threads, this issue should IMO better be renamed "Address of a web publication". This would make things clearer. |
addressed by #28 |
See telco discussion on closure. |
This is another issue being extracted from #5, with the hope that we will focus on the matter at hand.
How do we identify and locate a web publication? We do not seem to have an agreement yet on whether a WP should be located by the URL of the manifest, or the URL of the first content document. I have a strong preference for the latter.