Change of the ID, allowing for a URN and not only a URL #456

Merged: mattgarrish merged 8 commits into master from id-reworded on Jun 26, 2019

Conversation

iherman
Member

@iherman iherman commented Jun 4, 2019

This is on the basis of the discussion on w3c/pwpub#47, discussed on the telco on 2019-05-03.



@llemeurfr
Contributor

Only a minor issue: the new wording completes the section about URLs by a sentence stating that "As a slight abuse of notation, this value categories (?) may also be used to represent a URN ..." and later states that the canonical id is an IRI, which SHOULD be a URL but may be a URN in some cases.

Therefore it seems that the first addition (in the URL section) is not useful, even confusing.
(stating that IRI can be URL or URN is true, stating that URL may be a URN as a slight abuse is ... an abuse)

@iherman
Member Author

iherman commented Jun 4, 2019

@llemeurfr yeah... it is a bit confusing indeed. The (purely editorial) point is that the document refers to "value categories", ie, sort of, datatypes to describe the acceptable values (and this counts for canonicalization) and one of those categories is called 'URL' for holding, well, URL-s. But in the new setting an identifier is an IRI, so the old description of the category does not apply to it because the old description referred to URL-s only (if you can still follow me:-).

Maybe the best option may be to use a different term for the value category, ('address'?) to separate the meaning of the term 'URL'.

@mattgarrish, wdyt?

@laudrain

laudrain commented Jun 4, 2019

There is already an 'address' in the manifest properties:
https://pr-preview.s3.amazonaws.com/w3c/wpub/pull/456.html#address
'identifier' perhaps?

Also, following the sentence:

The specification of the canonical identifier MAY be complemented by the inclusion of additional types of identifiers for the Web Publication using the identifier property [schema.org] and/or its subtypes.

could we add a line in example 54 like:
"isbn" : "9780123456789",

@iherman
Member Author

iherman commented Jun 4, 2019

@laudrain

There is already an 'address' in the manifest properties:
https://pr-preview.s3.amazonaws.com/w3c/wpub/pull/456.html#address
'identifier' perhaps?

I do not think so. "address" is a real (http) URL, because it is a Web address. Conceptually, an identifier is there for identification that is not necessarily a Web address (although we say it SHOULD). I would think keeping these two notions clearly separated is better.

could we add a line in example 54 like:
"isbn" : "9780123456789",

Yes, and it was actually there in the previous version when that example did not have an id term. But if we allow URN-s as identifiers, this means we can use urn:isbn:9780123456789 as an id. I have the impression that it would just muddy the waters to add the separate isbn term there, too...

@mattgarrish
Member

A higher-level thought I had yesterday was that canonical identifier may no longer belong under the WP section if we're losing the requirement that it resolve to the publication. Wouldn't that make it usable in any profile, with the must/should requirement to resolve being specifically a WP requirement?

In terms of classing these, I'd create something more specific to the use case, like "Identifiers". The point I take from @llemeurfr's comment is that we're not abusing notation by allowing URNs, but starting to mangle definitions in a way that makes reading complicated. Create a new category and we're not abusing anything.

It is kind of disconcerting to see a return of IRI after we resolved to use URL, but at the same time I don't see how we can call the canon id anything but one. Hopefully, if we use a value class like "Identifier" it won't cause any additional confusion with our use of URL/address elsewhere.

@iherman
Member Author

iherman commented Jun 4, 2019

A higher-level thought I had yesterday was that canonical identifier may no longer belong under the WP section if we're losing the requirement that it resolve to the publication. Wouldn't that make it usable in any profile, with the must/should requirement to resolve being specifically a WP requirement?

Let me see if I understand. Does it mean

  • we move the 'identifier' part into the publication manifest part, referring to IRI-s as a general term
  • we have a separate 'identifier' part in the Web Publications part where identifier=URL is a... SHOULD? MUST?

To be honest: I am not sure it helps us too much (if my understanding is correct). It is good to separate PublManifest and WPUB because they are conceptually different. But I have difficulties to imagine any profile that would directly build on top of PublManifest and not on WPUB. If that happens, we would then have a profile without reading order, publication bounds, etc.

But I may not understand what you mean...

In terms of classing these, I'd create something more specific to the use case, like "Identifiers". The point I take from @llemeurfr's comment is that we're not abusing notation by allowing URNs, but starting to mangle definitions in a way that makes reading complicated. Create a new category and we're not abusing anything.

Ie, a separate category alongside 'URL'? I thought of that this morning but I shied away, because that is the only term that would use this new category. But, then again, something else may come up later...

Will you pick up the thread? (I will be out until tomorrow afternoon soon, so you can make any editing on the branch...)

It is kind of disconcerting to see a return of IRI after we resolved to use URL, but at the same time I don't see how we can call the canon id anything but one. Hopefully, if we use a value class like "Identifier" it won't cause any additional confusion with our use of URL/address elsewhere.

Yeah well... the Web world, à la browsers, only deals with dereferenceable addresses, ie, URL-s, and so far that was all we had. I think using URI (or IRI) only when there may be an explicit need to go beyond dereferenceable addresses is actually a good thing. Using IRI-s everywhere in the text, even for, say, an address where we really really want an http-type address, could have been just as disruptive.

@llemeurfr
Contributor

An alternative could be, define the value space of canonical id as union(URL, URN) without introducing IRI there.

@BigBlueHat
Member

@iherman

Using IRI-s everywhere in the text, even for, say, an address where we really really want an http-type address, could have been just as disruptive.

Agreed. The key distinction we need to make is "locator" vs. "identifier" (now that canonical identifiers are not also required to eventually locate something).

@llemeurfr

An alternative could be, define the value space of canonical id as union(URL, URN) without introducing IRI there.

I'd keep them separate because of relative expansion via <base> and @base. URNs should be required to be absolute to avoid them being expanded to absolute URLs accidentally.

I do think separating URL from URN in the value categories is necessary to avoid confusion all around.
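The relative-expansion concern above can be illustrated with a minimal sketch. It uses Python's `urllib.parse.urljoin` as a stand-in for resolution against `<base>` or `@base`; the base URL and the `resolve` helper are assumptions made for illustration, not anything defined by the specification.

```python
from urllib.parse import urljoin

# Hypothetical base URL, standing in for a <base> element or a JSON-LD @base.
BASE = "https://example.org/pub/"


def resolve(value: str) -> str:
    """Resolve a manifest value against BASE, as a relative reference would be."""
    return urljoin(BASE, value)


# An absolute URN carries its own scheme, so it is returned unchanged.
print(resolve("urn:isbn:9780123456789"))

# A bare string has no scheme, so it is expanded into an absolute URL.
print(resolve("9780123456789"))
```

The second call shows the accidental expansion: a value that was meant as an identifier silently becomes an address under the base.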

@mattgarrish
Member

But I have difficulties to imagine any profile that would directly build on top of PublManifest and not on WPUB.

Isn't audiobooks exactly the genesis of this separation?

The canonical identifier can't be provided, or can't be known, until the publication is deployed on the Web, so why drop the restriction for a resolving identifier if the only case is to make things that aren't quite web publications compatible?

If the desire is just to have a unique identifier for packaged content, then why are we playing with the canonical identifier definition at all? It's not necessary to include the property, so packaged content is perfectly fine without one.

But I'm getting confused what problem we're trying to solve now. Is the goal to be able to provide a unique identifier for packaged content? If so, then I'd agree we leave the existing definition where it is but perhaps rename it the canonical address. We could then look at a unique identifier field to allow ISBNs, UUIDs, and other things to travel with the publication.

Or what am I missing?

@BigBlueHat
Member

I have the impression that it would just muddy the waters to add the separate isbn term there, too...

The reason one might still use isbn is that SEO bots won't be (currently) parsing the URN in "id": "urn:isbn:1234234324" (sadly). It is a formally specified method for expressing an ISBN in a way that it can't be confused with other identifiers, and someday the bots will get an upgrade. 😉 In the meantime, it's OK (and probably best) to use both--one for actual identification and one for metadata.

@mattgarrish
Member

Or is the problem here that we talk too much about the publication resolution aspect of the canonical ID when it is first and foremost an identifier?

If resolution is just a nice extra feature of the canonical identifier, then I'd suggest it entirely belongs in part 1 of the specification. Perhaps what is confusing me is just this emphasis we have on locating the publication separate from the address.

@iherman
Member Author

iherman commented Jun 4, 2019

Isn't audiobooks exactly the genesis of this separation?

No. Audiobooks have reading order, possibly resources, table of contents: all are defined as part of WPUB-s.

The same issue may come up with other publications. By restricting identifiers to be http URL-s we cannot properly use things like ISBN-s which are identifiers.

@iherman
Member Author

iherman commented Jun 4, 2019

@BigBlueHat

The reason one might still use isbn is that SEO bots won't be (currently) parsing the URN in "id": "urn:isbn:1234234324" (sadly). It is a formally specified method for expressing an ISBN in a way that it can't be confused with other identifiers, and someday the bots will get an upgrade. 😉 In the meantime, it's OK (and probably best) to use both--one for actual identification and one for metadata.

that is absolutely correct. My worry is how would you explain, into the WPUB document's example, why you would use

"id" : "urn:isbn:1234234324"

and

"isbn":"1234234324"

the explanation is that (in my view) schema.org mixes up two concepts (the id in the JSON-LD sense, and a pure identifying string that happens to be an ISBN), but that would not help...
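One way to picture the "use both" option is a small sketch in which the plain `isbn` value is derived from a `urn:isbn:` id, so the two cannot drift apart. The `isbn_from_id` helper and the sample manifest are hypothetical and only illustrate the idea; nothing in the specification mandates this derivation.

```python
import json
from typing import Optional

URN_ISBN_PREFIX = "urn:isbn:"


def isbn_from_id(identifier: str) -> Optional[str]:
    """Return the ISBN carried by a urn:isbn: identifier, or None otherwise."""
    if identifier.lower().startswith(URN_ISBN_PREFIX):
        return identifier[len(URN_ISBN_PREFIX):]
    return None


# Hypothetical manifest fragment; only the id is authored by hand.
manifest = {"id": "urn:isbn:9780123456789"}

isbn = isbn_from_id(manifest["id"])
if isbn is not None:
    # Duplicate the value as plain metadata for consumers that do not parse URNs.
    manifest["isbn"] = isbn

print(json.dumps(manifest, indent=2))
```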

@BigBlueHat
Member

But I have difficulties to imagine any profile that would directly build on top of PublManifest and not on WPUB.

Isn't audiobooks exactly the genesis of this separation?

Yeah...this comment confused me too, @iherman. I think that the core data model that is expressed in PublManifest should include reading order, bounds, etc, and it has a need for identifiers that aren't necessarily resolvable (i.e. they're not also locators).

If resolution is just a nice extra feature of the canonical identifier, then I'd suggest it entirely belongs in part 1 of the specification. Perhaps what is confusing me is just this emphasis we have on locating the publication separate from the address.

Our current prose states:

A Web Publication's canonical identifier is a unique identifier that resolves to the preferred version of the Web Publication. It is expressed using the id property.

In JSON-LD, the id (or @id) is just an identifier which may or may not locate something. So, WPUB currently adds the requirement that it (eventually) locate something.

Where this has headed is that we remove that WPUB-level requirement returning the id to its "identifier which may also locate" status and look to other mechanisms for actually locating the publication (either via the url publication address or via a kindly librarian who can resolve urn:isbn:13241235324 for you).

@iherman
Member Author

iherman commented Jun 4, 2019

Yeah...this comment confused me too, @iherman. I think that the core data model that is expressed in PublManifest should include reading order, bounds, etc, and it has a need for identifiers that aren't necessarily resolvable (i.e. they're not also locators).

Maybe so. At the moment, this is not how the document is structured, though.

I would try to refrain from reorganizing the document again (even if it may not be ideal) and try to make the least possible changes...

In any case, audiobooks are Web Publications, too. We may get to something slightly different when they are packaged, but that is a different matter. Audiobooks can be served on the Web (e.g., for streaming) in which case they may have id values and all that jazz...

@mattgarrish
Member

In JSON-LD, the id (or @id) is just an identifier which may or may not locate something. So, WPUB currently adds the requirement that it (eventually) locate something.

Right, this is confusing me in terms of understanding just what we need to achieve with this identifier. It seems we've layered HTML canonical addresses onto the JSON-LD identifiers and created a new beast.

I agree with @BigBlueHat that this property is no longer specific to web publications, but has become part of the general data model for digital publications. All publications need an identifier, but only web publications can use a canonical address to achieve that.

Why can't the canonical address just be specified in the links section with rel=canonical, though? Doesn't that begin to extricate us from this problem, as then the canonical identifier can be whatever you want, including the canonical address.

@iherman
Member Author

iherman commented Jun 4, 2019

Why can't the canonical address just be specified in the links section with rel=canonical, though? Doesn't that begin to extricate us from this problem, as then the canonical identifier can be whatever you want, including the canonical address.

It is important to have the id as part of the JSON-LD. That means that, per JSON-LD semantics, all the statements we make (author, title, etc) are on that ID.

I agree with @BigBlueHat that this property is no longer specific to web publications, but has become part of the general data model for digital publications.

Which is fine with me if it can be done, editorially, easily and properly...

At this point I think we should let @mattgarrish look at the document from an editorial point of view to see how these can be smoothed into the document with the smallest possible amount of work...

@BigBlueHat
Member

In any case, audiobooks are Web Publications, too.

What makes an audiobook (or any other similar packaged publication) a "Web Publication"? Is it the JSON-LD manifest? Is it that it has a URL (i.e. can be linked to and retrieved)?

Once packaged, it wouldn't have to have an entry page (so can't load in an existing browser even if unzipped) and wouldn't have a URL to dereference and may not have an identifier (given current discussion) in order to find a copy of it elsewhere on the Web (or related content, other human readers/listeners, etc).

These aren't meant to be pedantic ontological questions (honest!). They have technical implications to the ecosystem from distribution to consumption to citation.

Audiobooks can be served on the Web (e.g., for streaming) in which case they may have id values and all that jazz...

It's knowing when "all that jazz" is important and knowing when there's a need (or lack of one) to add the necessary bits.

Restructuring the document would help explain the inheritance model from a core conceptual "publication data model" through the PublManifest expression and then the bits that make something a packaged publication or a Web Publication.

@mattgarrish
Member

Once packaged, it wouldn't have to have an entry page (so can't load in an existing browser even if unzipped) and wouldn't have a URL to dereference and may not have an identifier (given current discussion) in order to find a copy of it elsewhere on the Web (or related content, other human readers/listeners, etc).

I thought the goal of splitting the common manifest format was exactly so that the differences could be tackled at the profile level? With some coercion, any audiobook can be transformed into a web publication.

That doesn't make an audiobook a web publication, but a slightly different subset that retains cross-compatibility.

@iherman
Member Author

iherman commented Jun 5, 2019

It's knowing when "all that jazz" is important and knowing when there's a need (or lack of one) to add the necessary bits.

That doesn't make an audiobook a web publication, but a slightly different subset that retains cross-compatibility.

Are we mixing up the audiobook spec and the packaging note? The former is clearly a "profile" of WPUB, with some restrictions on the type of documents considered, with an extra type on the document itself, etc. Why is that put in question now?

Packaging creates a slightly different situation as something that may come from outside the Web, but it is also its intention that it can be "unpackaged" on the Web albeit by a process that creates, if necessary, the PEP to make it really a WPUB.

I do not really see any new problem.

@mattgarrish
Member

Why is that put in question now?

That's not what I'm questioning, but how is it that the packaged form can be invalid? If what we're packaging isn't a valid audiobook, what is it?

I was under the impression that we wanted both forms to be valid, but I don't see how that is possible if audiobook inherits from web publications instead of from the general manifest.

@iherman
Member Author

iherman commented Jun 6, 2019

Why is that put in question now?

That's not what I'm questioning, but how is it that the packaged form can be invalid? If what we're packaging isn't a valid audiobook, what is it?

I was under the impression that we wanted both forms to be valid, but I don't see how that is possible if audiobook inherits from web publications instead of from the general manifest.

There was indeed a long discussion about the validity/invalidity of the packaged form and the WG, for purely pragmatic reasons, accepted the request that the content in the package can deviate from the WPUB spec on one aspect only: that it is not required to have a PEP but, instead, the manifest alone would be enough for the package. That being said, if the package is unpacked on the Web it is supposed to be turned into a WPUB by creating, if necessary, a trivial index.html file that contains just a link to the manifest. So far we do not have any other deviation that I know of, and the reason we have this thread and PR is to avoid another possible incompatibility creeping in…

@llemeurfr I hope what I am saying is correct :-) (also, it may be a good idea to add some words into the LPF document emphasizing these facts to avoid future misunderstandings…)
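The unpacking step described above could be sketched roughly as follows. This is only an illustration under assumptions: the manifest file name `publication.json`, the `rel="publication"` link relation, and the `ensure_entry_page` helper are not confirmed anywhere in this thread.

```python
from html import escape
from pathlib import Path


def ensure_entry_page(directory: Path, manifest_name: str = "publication.json") -> Path:
    """Create a trivial index.html linking to the manifest, unless one exists.

    The default manifest name and the rel value are assumptions for this sketch.
    """
    entry = directory / "index.html"
    if not entry.exists():
        href = escape(manifest_name, quote=True)
        entry.write_text(
            "<!DOCTYPE html>\n"
            '<html lang="en">\n'
            "<head>\n"
            '<meta charset="utf-8">\n'
            "<title>Publication</title>\n"
            f'<link rel="publication" href="{href}">\n'
            "</head>\n"
            "<body></body>\n"
            "</html>\n",
            encoding="utf-8",
        )
    return entry
```

An existing entry page is left untouched, which matches the "if necessary" wording: the generated file is only a fallback for packages that ship a manifest alone.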

@rdeltour
Member

rdeltour commented Jun 6, 2019

Editorially, it kinda feels weird to have [rfc3987], [url] (and probably also [urn]) as normative references, when the URL Standard obsoletes RFC 3987.

In our spec a URL is normatively defined by the URL standard. Strings that we're used to referring to as IRIs and URNs are valid URL strings according to this spec (if I'm not mistaken).

My suggestion is to:

  • use normatively URL everywhere, and possibly refer informatively to older RFCs to disambiguate the meaning for readers more familiar with those.
  • pay special attention to the concepts and terminology we use: do we require a valid URL string in the JSON manifest? if yes what's the error handling processing model? or do we want any string and refer to the URL parsing to get a URL record out of it (as typically done in HTML)?
  • clarify the identifier vs. locator issue using these exact terms, and by explicitly specifying the processing model, rather than by referencing to "URL" vs "URN".

@mattgarrish
Member

Actually, I believe it is there between the lines, but I agree it would be a good idea to add a section on making these things more explicit.

Ya, that's problematic in itself. We should state outright the limitations relative to web publications (e.g., the naming stuff, that the entry page and manifest can't be in different directories, that there can't be any resources above the directory that holds the pep/manifest, that there can't be resources hosted on other domains, etc., etc.).

A "Packaging Limitations" section near the top would be extremely helpful so people understand that "lightweight" is relative to what can actually be handled.

remove unreferenced url and identifier definitions from terminology;
change references to urls to url or identifier types;
add canonical address relation
@mattgarrish
Member

I'm not sure if my last commit captures everything, but please have a look and let me know what you think. To recap the changes:

  • it undoes the changes to the canonical identifier section
  • it moves canonical id to part 1
  • it adds a new identifier value type and updates the canonical id section to use it
  • it restricts the url value type to http and https
  • it removes url and identifier from the terminology section as these were never explicitly referenced anywhere
  • it links references to URLs to the value type or identifier type, as appropriate
  • it adds canonical address to the relations since we're losing that aspect from the identifier (arguably, anyway)

I'm not sure about the last change, but it provides resolution to a preferred address without overloading the canonical identifier.
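The split between the two value types in the recap can be sketched as a pair of checks. This is a rough illustration, assuming values have already been resolved to absolute form; the function names are hypothetical, and Python's `urlsplit` is not the URL Standard's parser.

```python
from urllib.parse import urlsplit


def is_url_value(value: str) -> bool:
    """Sketch of the restricted url value type: http or https only."""
    parts = urlsplit(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def is_identifier_value(value: str) -> bool:
    """Sketch of the identifier value type: any absolute value with a scheme."""
    return bool(urlsplit(value).scheme)


for candidate in ("https://example.org/book", "urn:isbn:9780123456789", "9780123456789"):
    print(candidate, is_url_value(candidate), is_identifier_value(candidate))
```

Under these assumptions an http(s) address satisfies both checks, a URN satisfies only the identifier check, and a bare string satisfies neither.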

@iherman
Member Author

iherman commented Jun 9, 2019

@mattgarrish, all in all, the direction does look great, but I do have some comments. Not in priority order:

  • 2.6.2.3.3: The description of Links refer to URL in general, which may suggest that, e.g., URN-s are also acceptable. Which is probably not a good idea, and the formal specification of Linked Resource in Appendix A specifies that the value type is a manifest URL (Ie, http(s) only). But, maybe, it is better to emphasize in the narrative that the URLs are supposed to be http(s).
  • 2.6.2.5: All other value types describe the value in plural, this is singular...
  • 2.6.3.2: spelling mistake: 'digital publicaion'
  • 2.6.3.2: I think we should make it clear that the canonical id SHOULD be an http(s) scheme. (Maybe this remark should be put into 2.6.2.5, in fact, because the creator type also refers to it.)
  • 2.6.3.2: in the example with an ISBN, I think it would be better to have the id with a value of urn:isbn:9780123456789 instead of (or together with?) the explicit ISBN value. We have to show that the id can be a different URL scheme, too; this is, after all, what triggered the full change:-)
  • 2.7.2.2: I do not think we need the canonical address. The canonical ID SHOULD be dereferencable (see my comment above), ie, it SHOULD play the role of a canonical address (this is the case, e.g., in the W3C spec example with the short name). The only point is that the canonical id MAY be something else (or may not be present at all). The fact that it is optional was already the case...
  • 3.4.2.1: The definition table says that the value Type is an Array of URL-s. I presume that is a mistake.
  • 2.3.1: the WebIDL is missing the id as well as the address

@llemeurfr
Contributor

re. #456 (comment), Ivan summarized the situation very well: a Package can represent a publication we could call a "pre-WP", i.e. a publication made of a Manifest and its resources, which may be exposed as a WP after trivial modification. Every LPF Package can become a WP, and many but not all WP may be packaged as LPF (a "user guide" will be useful to illustrate that aspect).

@mattgarrish
Member

Every LPF Package can become a WP, and many but not all WP may be packaged as LPF (a "user guide" will be useful to illustrate that aspect).

The scope of what the specification can actually handle shouldn't be in a user guide.

@llemeurfr
Contributor

re. #456 (comment) from Matt:

When a WP can be packaged as LPF (again, this is not the case for all WPs), the WP address could technically be retained in the Manifest as a full URL; but as this "frozen" WP is now detached from the Web, it's really better to consider that the WP address becomes the relative URL representing the path to the PEP inside the Package (ex. "url":"index.html"). Note that the PEP already exists for a packaged WP.

This corresponds to my proposal in w3c/pwpub#49 (comment).

In summary, the discrepancy between WP and LPF isn't wide when we consider the LPF -> WP transform, and totally manageable when we consider the WP -> LPF transform, if applicable.

@mattgarrish
Member

mattgarrish commented Jun 9, 2019

  • I think we should make it clear that the canonical id SHOULD be an http(s) scheme.

If we're making this allowance so that packaged WPs are valid, why should it be a warning not to use a URL? Is there a specific reason why it needs to be a URL?

@llemeurfr
Contributor

Re #456 (comment) and #456 (comment), this is an interesting question: should the LPF specification contain the processing model (algorithms) which specify LPF to WP and WP to LPF (with the constraints on WP structure for being able to package it as WP)?

Until now I thought the consensus was NO as LPF is a file format spec (which reuses the Manifest defined for WP and is scoped by the introduction of the LPF spec), and not a pure WP packaging spec (i.e. not the final PWP the WG would like to define).

@llemeurfr
Contributor

re #456 (comment), now that Romain pointed us to the fact that the URL spec now INCLUDES URNs, I don't understand the issue.
A provider of content (e.g. audiobooks) will be happy to generate a canonical id for each publication he generates, in the form of a URL or URN (*). If he provides a URL, people will try to dereference the canonical id to find the publication on the Web: sometimes it will succeed, sometimes not (think about the XML namespace URIs, the story is well known). If he provides a URN, well people won't click on it, but other means of de-referencing the content may be provided (**).

(*) I still don't know if DOI can be considered URLs with this extended concept of URL.
(**) ISCC will get a URN form, I spoke with their developers last week. This could be an interesting way to create free identifiers and have a distributed engine to dereference them.

@mattgarrish
Member

  • The definition table says that the value Type is an Array of URL-s. I presume that is a mistake.

The definition normatively says a WP may have more than one address. What does that mean if not an array? Other schema.org properties? Other addresses in the "real world" but you're only allowed to practically specify one?

@mattgarrish
Member

mattgarrish commented Jun 9, 2019

  • the WebIDL is missing ... the address

Sure, but the address isn't a property of the PublicationManifest dictionary. Granted, I'm wondering why we bothered to split the specification at all if we're just working back to packaged audiobooks being somewhat invalid web publications.

If we're not attempting to make the packages valid, or we're just going to define WP properties such that they don't have to always be "webby", it's probably more confusion than it's worth.

@llemeurfr
Contributor

llemeurfr commented Jun 10, 2019

@mattgarrish what you just said is key. I forgot that the Publication Manifest specification does not contain the Address and Canonical id, which are defined in the Web Publication Manifest. As the Package is using a Publication Manifest, not a Web Publication Manifest, my issue with Address in Packages is solved.
I see now that the Publication Manifest (part I) is what a publisher (i.e. content creator) will define, where the Web Publication Manifest (part II) is what a distributor will expose on the Web.

There are still remaining questions related to this PR:
1/ I went through the URL Spec and didn't find how URNs can be valid URLs. @rdeltour ?
2/ The canonical id is defined in Web Publication Manifest, where I believe it should be defined in the Publication Manifest as it acts as the rdf id and should be defined by a publisher (e.g. to be included in a Package).

@iherman
Member Author

iherman commented Jun 10, 2019

The definition normatively says a WP may have more than one address. What does that mean if not an array? Other schema.org properties? Other addresses in the "real world" but you're only allowed to practically specify one?

Oops, I did not remember that. Then the Array is the good one, forget my note.

@iherman
Member Author

iherman commented Jun 10, 2019

@llemeurfr

I went through the URL Spec and didn't find how URNs can be valid URLs.

It does need some forensics...

  1. The url representation https://url.spec.whatwg.org/#url-representation defines a scheme. The table has three categories for schemes: special (i.e., http(s) and similar), file and non-specials. At that point it does not talk about what the latter category can be.
  2. In URL Writing it writes about a URL-scheme string. It refers to the IANA schemes. That table, among others, does include http(s), but also urn.

I admit it is not very explicit, and it should be.

@iherman
Member Author

iherman commented Jun 10, 2019

The canonical id is defined in Web Publication Manifest, where I believe it should be defined in the Publication Manifest as it acts as the rdf id and should be defined by a publisher (e.g. to be included in a Package).

+1

@iherman
Member Author

iherman commented Jun 10, 2019

Sure, but the address isn't a property of the PublicationManifest dictionary. Granted, I'm wondering why we bothered to split the specification at all if we're just working back to packaged audiobooks being somewhat invalid web publications.

Actually, I did not realize that the address was not part of the WebIDL before either.

The way I look at the difference between Part I and II is somewhat akin to what @llemeurfr said, but not exactly: the Web Publication Manifest is what the content creator has to provide, and Part II, ie, the Web Publications part is how the manifest is used when things are put on the Web: the PEP, how to locate the manifest, how to process it (e.g., that the PEP's <title> can complement the author's manifest if not explicitly defined), etc. In this respect (a) keeping the id in that part was probably a bug, and it is better placed in Part I and (b) address is some sort of an odd-man-out because it becomes part of the manifest but its value is conceptually finalized when things are put on the Web. In some sense, it is the "link" that binds the manifest and its presence on the Web. (Maybe it is worth calling it out this way.) At the end of the day, it is a value that MUST appear in the Manifest once it is on the Web (and it would not shock me if it appeared in the WebIDL because, at the end of the day, in practice it will be part of that JSON-LD blob).

Ie, I do not really think we have some sort of a problem, and I believe the current sectioning is fine.

If we're not attempting to make the packages valid, or we're just going to define WP properties such that they don't have to always be "webby", it's probably more confusion than it's worth.

I do not think so. Separating the pure metadata part from, say, how to obtain the manifest from the address is a good thing imho.

The issue of packaging is orthogonal, and should be called out in that document...

@rdeltour
Member

@llemeurfr

I went through the URL Spec and didn't find how URNs can be valid URLs.

@iherman’s pointers are on point.
If you need more (albeit informative) evidence, you can look at the URL tests, or comments from the editor ☺️

@mattgarrish
Member

To see if we can move this PR forward and stop it blocking other work, I've made canonical identifier again a "should" for url and dropped the relation. I've also moved address to part 1 since we seem to be saying there can't be any significant variation between adaptations of the manifest format.

I'll take a look at our "URL" terminology again in another PR, as how can you possibly get confused by strings and records? ;)

@iherman
Member Author

iherman commented Jun 26, 2019

Thanks @mattgarrish. We should indeed merge this and we can finesse this later to align it better to the URL spec's terminology...

@mattgarrish mattgarrish merged commit 0572191 into master Jun 26, 2019
@mattgarrish mattgarrish deleted the id-reworded branch June 26, 2019 10:49

6 participants