Is it acceptable to use HTML for the serialization of some infoset items, or should it all be in separate (JSON) file? #193
Trying to collect pros and cons, based also on earlier discussions. (Let us try to collect all the Pro/Con arguments in the most concise manner possible to make an informed decision...)
Note that, although we are not discussing WAM-s, the arguments in the section on the same issue in the WAM document (and the links in there) are also relevant. |
Note that editing such an infoset by hand would be equally difficult in the HTML and JSON cases. Whatever the choice between a highly specific web page and a JSON structure, an authoring tool seems mandatory, which leads to an additional Con.
|
Just a few quick notes first:
I'd also like to list an additional con: it may require additional network requests that could block the processing of the WP. If I discover a publication through one of its chapters, this means that:
Since I'll only be able to discover these additional HTML resources through the manifest, this means that these fetch requests (plus all the processing related to HTML) will have to be done sequentially and not in parallel. The majority of the pros listed by @iherman could also be challenged IMO because they're mixing up two different issues:
In the case of a single-HTML document, I don't think that using There are fewer semantic issues with JSON-LD (the metadata is not necessarily about the document that contains it) and I would argue that it's easier to author JSON-LD than RDFa. To go back to the list of pros, we could also say that JSON-LD embedded in HTML is:
I'm not really buying the redundancy arguments (we're not expressing the same information) or the more "natural bridge" one (browsers ignore the vast majority of metadata and links that we would end up using in HTML). I'd like to hear @BCWalters's opinion on this as well. Now that we have a major browser actively participating in this WG, I think there's a lot of value in what they have to say about this. |
There is a business consideration that weighs into the HTML vs JSON question: it is easier and cheaper for me to find HTML coding resources than JSON coding resources, and my team is less likely to follow a standard that is (even more) expensive to maintain. To confirm this, I reached out to our most commonly used vendors. All replied that they would need time to staff up and train JSON developers but that they had plenty of HTML developers on staff. |
Thanks @RachelComerford, this is a very important, non-technical point... |
I'd not limit it to just |
Depending on how this is modeled and "gone about", it may be that the "binding" document is indistinguishable from the publication itself. Or, alternatively:
Regardless, this is easily avoidable...so not an implicit "con."
There's no requirement that a DOM, CSSOM, Accessibility OM, etc. be set up or available when extracting metadata from HTML files. It's possible to get it directly out of the markup without those things. Additionally, when "browsed to" the browser will provide all those things, and could potentially make that data more easily extractable by the developer (or within the UI of the browser).
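To make the "directly out of the markup" point concrete, here is a minimal sketch in Python, assuming the metadata of interest is the title element plus simple meta name/content pairs. It uses the standard library's event-driven `html.parser`, which reports tags as they stream by and never builds a tree. The class and function names are hypothetical, and whether real user agents would take this route rather than reuse a DOM-building parser is an open question in this thread.

```python
from html.parser import HTMLParser


class HeadMetadataParser(HTMLParser):
    """Collects <title> text and <meta name=... content=...> pairs as tags stream by.

    Hypothetical helper for illustration only; no DOM or tree is ever built.
    """

    def __init__(self):
        super().__init__()
        self.metadata = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and "name" in attributes and "content" in attributes:
            # Assumption: only simple name/content pairs are of interest here.
            self.metadata[attributes["name"]] = attributes["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            # Title text may arrive in several chunks, so accumulate it.
            self.metadata["title"] = self.metadata.get("title", "") + data


def extract_head_metadata(markup):
    """Return a dict of metadata found in the markup string."""
    parser = HeadMetadataParser()
    parser.feed(markup)
    parser.close()
    return parser.metadata
```

Fed an entry page with a title and an author meta element, `extract_head_metadata` returns a plain dictionary keyed by `title` and `author`. It does nothing for richer structures such as RDFa or microdata, which is where the authoring and parsing costs discussed elsewhere in the thread come in.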
Couldn't agree more...but that's not a "con" of an HTML-driven approach to these problems. If, for instance, all the primary resources are referenced from an HTML-based "binding document" (perhaps through something like a latent-loading Each of the current infoset items are expressible from within an HTML document (see my last comment for a handful of options), and what's needed next is to know how to enhance their expressions as available now such that they are more useful. Moving such core concepts as the primary resources or redundantly expressing dependencies into a separate "manifest file" is duplication, will cause errors when out-of-sync, does create an over dependence on tooling, and ultimately puts the processing power out of the reach of the publisher/developer and into the hands of the "reading system" developer exclusively. Consequently, I'd not see our currently defined Ultimately, we'd go through the same process of finding homes for each of the infoset things in the HTML "binding document" (which is clearer than "entry point"), remove them from a/the JSON serialization until we find things that must be expressed in JSON. tl;dr web publications exist already (built from HTML, JS, CSS, RDFa, etc), so how do we make them better, stronger, faster, more accessible, offline-able, etc. |
I'd like to make another non-technical point: we should not be creating complex creation systems for publishers. Descriptive metadata, including navigation items, should go in as few files, and as few formats, as is technically possible. As Ivan said:
If we tell publishers "in order to create a WP, you need to put this infoset data over here in HTML, and this infoset data over there in JSON," we're raising the barrier to entry for anyone who doesn't have a WP-aware authoring tool. IMO, much better to choose an imperfect design which publishers will actually be able to use than the most perfectest beautifullest awesomest architecture which is a pain for creators. (I have no horse in the race of actual location and format, and personally I'd be happiest if all the players in this conversation came to a place where they realize that no solution is perfect and all the people disagreeing have valid points. Unfortunately a classic compromise is the worst possible solution, because we really just need to pick one. There is literally no solution on offer without cons; we still have to choose one and move on to the rest of the work.) |
I know there can be more data than just |
Yes, I agree; this is the case of a single-document PW; this is one of the "Pro" arguments.
Sorry, but I do not agree. The quoted HTML specification does not refer to a URL, it refers to the document itself, whichever path was used to get there. I believe the HTML standard is pretty clear about it. If we use the HTML headers, we should simply accept that we are willfully overstepping the bounds that the HTML standard defines (but I am not sure the rest of the community would accept it, we may face major objections).
I think we have to agree that we disagree on that point.
This is theoretically correct, but I do not think it is practically true. Any implementation will use one of the many, possibly "built-in" HTML parsers, and all those parsers build up the DOM. I do not think we can expect an implementation to have a different parser that would just look at the syntax or do some other tricks.
I am not sure I understand what you mean. Yes, of course, if the UA begins to render, display, etc, the WP, then those data are already there, because they are in the DOM. The "Con" is for the cases when, say, the Reading System or the browser builds up, say, bookshelf, for which a number of Infoset items are necessary. That being said, if we go along with the idea of finding the manifest file via a
True... except that it remains to be proven that all infoset items can be expressed easily and in a user-friendly manner via the current HTML element set. To take an example: we did say that the language tag in a content file (ie, an HTML file) is not the same as the language tag for the publication as a whole. In other words, the regular
To be honest, you lost me here; more exactly, I do not see the problem. If we say (as we seem to converge to in #104) that we simply take the browsing context as given, I just do not see the issue accessing the separate JSON file in this browsing context. That information is accessed from the entry point (in its own browsing context, as we seem to converge to in #104), then all the rest is clear: that is the context we are operating in. Let alone the fact that many elements in the infoset (title, authors, etc) are unaffected by the browsing context.
See my comment above. I am not at all sure it is that simple; more exactly, I am not sure that the resulting definitions would be clearer and simpler than doing it in JSON. Note that the experience with RDFa is not really good (alas!), meaning that relying on RDFa may not be that helpful. (Authoring RDFa can be a major challenge, and is very opaque for people who are not RDF-savvy (and is sometimes difficult even for people like me; I frequently have to run RDFa+HTML through my own distiller to see what the generated RDF is). Microdata is, maybe, even worse, because there are features that cannot even be expressed in microdata...) |
Trying to move forward: would the usage of a
What this means is that there is not necessarily a separate file to be authored; all is in the same file; would that alleviate your issues, @deborahgu and @RachelComerford ? It would not necessarily help with the issues of @llemeurfr because today's authoring tools rarely help for the authoring of embedded data. On the other hand, the semantics of the The experience shows that authoring JSON for metadata-like information is simpler than doing it in, say, RDFa, so we would gain that. Note also DanBri's comment: Schema.org also uses this JSON(-LD) wrapper to extract information. (An even more radical proposal would be to use the embedded |
Ivan, +1 to the JSON-in-script / JSON-as-file approach (although I suspect reading system developers would prefer a directly-accessible standalone JSON, as this saves parsing an HTML document and performing an additional fetch request). |
Consider the following document returned from
<!DOCTYPE html>
<html lang="en">
<head>
<title>Moby-Dick</title>
<meta name="author" content="Herman Melville">
</head>
<body>
<nav>
<ol>
<li><a href="c1.html">One</a></li>
<li><a href="c2.html">Two</a></li>
</ol>
</nav>
<iframe id="c1" name="c1" src="c1.html"></iframe>
<iframe id="c2" name="c2" src="c2.html"></iframe>
</body>
</html>
If |
How are these steps avoided? Is the idea that user agents will go through the process of obtaining and processing the manifest before the user makes any decision about whether they even want to initiate the reading experience, and stop rendering the document until a decision is made? In other words, does an external file really save anything in processing time, except perhaps in the (rare?) situation where a user says to always initiate publications and the link is available in an HTTP header? |
@iherman about
I totally agree with that statement. At allocine.com, we embedded RDFa, then microdata (preferred), in our film / star etc. pages. But it was the work of the technical team, in page templates: certainly not the work of the editorial team. And I'm pretty sure that this is how 99.9% of websites containing RDFa or microdata are constructed. |
It might be simpler to use something like dcterms/schema.org isPartOf/hasPart to associate the fragments than duplicate metadata, but I don't follow the argument that a multi-part document cannot be wholly identified by the first of its resources. |
@dauwhe (referring to #193 (comment)) great questions... I am not sure, and I do not think the HTML spec clearly says anything about this case. However, looking at the HTML spec, a document within an iframe has its own Document element (and own context), so my gut feeling is that, in your example, the author of the iframe-d content would be unknown. It is probably the same with |
@iherman... to be honest, I don't understand the solution? _Trying to move forward: would the usage of a <script> element alleviate the problems? (See also #122). Here is what this would mean:
What this means is that there is not necessarily a separate file to be authored; all is in the same file; would that alleviate your issues, @deborahgu and @RachelComerford ?_ |
Sorry to be terse, @RachelComerford. My bad. The choices discussed so far were:
Both you and @deborahgu commented that editing two separate files would be a burden on your developers; as @deborahgu said "Descriptive metadata [...] should go in as few files, and as few formats". I do have some significant problems with the 2nd approach. In essence, I believe, it is not really possible to avoid some extra "formats" (where by format I also mean microdata and/or RDFa, or a complex set of attributes on HTML elements, etc.) and, among those, I also believe JSON is still the simplest one. But at least the problem of handling several files can be alleviated. Indeed, the HTML standard allows the following to be used in the header of an HTML document:
I.e., we can use this element to encode the infoset items in JSON, while staying within the (entry point) HTML file. This is how webmasters mark up their files for schema.org, b.t.w., if they choose JSON-LD for their data. (See a random example: go to the bottom of the page to choose the right tab for some examples.) In other words, we would be in good company:-) and, in fact, the metadata part of our infoset may automatically be used, when on the Web, by schema.org (at least for schema.org terms) which is a bonus... I hope this is clearer... |
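As a rough illustration of the JSON-in-script idea, the sketch below pulls JSON out of script elements in an entry page. It assumes the `application/ld+json` type that webmasters use for schema.org JSON-LD; the sample page, the property values, and all class and function names are made up for the example and are not taken from any draft.

```python
import json
from html.parser import HTMLParser

# Hypothetical entry page: the book description and its values are illustrative only.
ENTRY_PAGE = """<!DOCTYPE html>
<html lang="en">
<head>
<title>Moby-Dick</title>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Book",
 "name": "Moby-Dick", "author": "Herman Melville"}
</script>
</head>
<body></body>
</html>"""


class JsonLdScriptParser(HTMLParser):
    """Collects the parsed content of every <script type="application/ld+json">."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._buffer = None  # a list of text chunks while inside a matching script

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._buffer is not None:
            self.blocks.append(json.loads("".join(self._buffer)))
            self._buffer = None


def extract_json_ld(markup):
    """Return a list with one parsed JSON value per matching script element."""
    parser = JsonLdScriptParser()
    parser.feed(markup)
    parser.close()
    return parser.blocks
```

With the sample page, `extract_json_ld(ENTRY_PAGE)` yields a single dictionary whose `name` is the book title. The author edits one file, while a consumer still ends up with ordinary JSON once the script content has been located.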
Handling multiple files is only a "problem" for single resource publications. With publications spread across multiple resources, going through an HTML resource to extract JSON-LD in a script element makes things more complicated than they should be. That said, I fully agree that for expressing metadata on an HTML resource, JSON-LD + schema.org is the preferred method on the Web today. |
There was no intention to be "harsh." I meant the technical term "out-of-band":
And "out-of-band data":
"different" and "currently unknown" were meant to be positioned in contrast with where Search Engines (the primary incentivize driving web publication metadata) get their information. I apologize that the sentence was so easily misread as to contain a "harsh" tone. I'll work to link to any technical terms that may have taken on a different cultural meaning. |
@iherman it was never my intention to have this HTML approach be limited in any way to just what is expressible directly on elements and attributes (i.e. RDFa and Microdata). Using JSON(-LD) (or any other format) in a Apologies for not being clearer about these specifics. Building up from HTML means we have everything HTML provides at our disposal...including JSON. 😄 |
Additionally, the infoset includes both metadata pieces (title, reading progression, etc) and request/response/hypermedia related things (primary resources, etc). The point of using HTML as a foundation is that it already has a known (and carefully crafted) set of specifications for handling the inclusion, processing, and contextualizing of Web resources. We don't have that (to my knowledge) with any other format (because SVG doesn't have iframes 😉). The metadata concerns and the "binding"/rendering/presenting concerns are different domains of use, experience, security, etc. We should model them accordingly. |
The underpinning premise is that "web publications" (lowercase on purpose) exist today (ex: http://guide.couchdb.org/ http://hpbn.co/ http://resilientwebdesign.com/ )--you can load them up and read them now. However, their metadata (mostly RDFa because ogp.me) and "binding" (which is mostly a next/prev experience) is provided in idiosyncratic ways and either expressed "inline" (next/prev links in each resource) or built via JS (as is the case with the CouchDB Guide). Consequently, the "infoset" items are currently expressed with some overlap in consistency in only a few places (mostly ToC and some ogp.me metadata), so there's not much the browser can do to provide additional or enhanced reading experiences for the publication as a whole (i.e. no publication-wide search unless via a service, linear progression experience is identical to clicking any other link in the book, etc).
The point is that the experience of the "web publication" (lowercase on purpose again) isn't blocked by anything. Obtaining and processing any additional, consistently expressed data or affordances could (in a future browser or via a polyfill in the meantime) enhance the reading experience by adding things like publication-wide search, linear progression, etc.
By conceiving of "web publications" (case sensitive one more time 😁) as extant and enhance-able, we can lay a foundation to build up from. In which case, a user choice or action to "initiate [a] publication" would look like (auto)triggering new experiences (search, linear progression, etc) either directly within the browser's UI or perhaps via some dedicated reading UI "space" (or like one of the many things we've not currently imagined 😃). The provision of those enhancements, though, should not block the Web experience already available to readers of "web publications." Readers' lives should only get better from our work. 😸 |
In reply to @deborahgu, regarding the possibility of a direct copy from a library catalog to JSON or HTML ("any of these solutions"). As you can see, the data structure of JSON is very similar to the structure of your library catalog: this is raw structured information, and translation is immediate. Moving such a library structure to HTML is less immediate (use of non-recursive meta elements, or use of content elements for embedding a metadata structure using one of the discussed solutions, RDFa, microdata ...).
About extensibility: JSON is extensible, as @iherman states, but even more interesting, JSON validation tools, i.e. JSON Schemas (a specification + well-known software tools), also allow for that type of free extension. A schema can check all required metadata and their values, check that optional metadata have a proper value, and let free external metadata be added without choking. On the HTML side, people can freely extend a metadata vocabulary, but there is no validation mechanism for required and optional metadata: somebody has to code something specific to enforce such rules (a full epubcheck).
This is not to say that ANY metadata should be embedded in JSON; only JSON-serialized metadata can. XML-serialized metadata must be expressed somewhere else and can be referenced from the JSON manifest (the nervous system of the publication). I think that the decision to externalize all metadata that is not in the infoset, a decision that was taken in the early years of EPUB, should be revised. But this should be discussed in another issue.
I also believe that the initial question of this very long thread has been answered: splitting the infoset between JSON and HTML (meta tags or content) is tedious both for authors (information to put in two places with different serializations) and for developers of user agents.
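The "required, optional, and free extension" rule described above can be sketched without any schema tooling. The following is not JSON Schema, only a hand-rolled Python illustration of the same three-way check. The property names (`name`, `readingOrder`, `author`, `inLanguage`) are placeholders chosen for the example, not infoset terms agreed anywhere in this thread.

```python
# Hypothetical property tables: names and types are placeholders for illustration.
REQUIRED = {"name": str, "readingOrder": list}
OPTIONAL = {"author": str, "inLanguage": str}


def validate_manifest(manifest):
    """Return a list of error strings; an empty list means the manifest passed.

    Required properties must be present with the expected type, optional ones
    are type-checked only when present, and unknown properties are accepted
    silently (the "free extension" case).
    """
    errors = []
    for key, expected in REQUIRED.items():
        if key not in manifest:
            errors.append(f"missing required property: {key}")
        elif not isinstance(manifest[key], expected):
            errors.append(f"wrong type for {key}: expected {expected.__name__}")
    for key, expected in OPTIONAL.items():
        if key in manifest and not isinstance(manifest[key], expected):
            errors.append(f"wrong type for {key}: expected {expected.__name__}")
    return errors
```

A manifest carrying an extra, unknown property still validates, while one that omits `readingOrder` or gives `author` as a number is reported. A real deployment would more likely express the same constraints in a JSON Schema document and run an off-the-shelf validator.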
On the other side, embedding the JSON manifest containing the full infoset in an HTML header is possible and is even interesting for publications with a single resource. |
@BigBlueHat that's exactly what we've discussed on the mailing list before and @iherman even hinted at a solution to this issue based on JSON-LD 1.1.
Yikes, I'll have nightmares with that paragraph. The reading order is the most important item in our infoset IMO and a list of (Any thoughts on this @JayPanoz?)
OK, let me get this straight, with your suggestion the list of resources:
Is that accurate? |
Well, all I can say at this point is that I keep seeing the “ease of authoring” concept/argument used in (too) many ways that are partly or completely disconnected from what I have witnessed in the EPUB trenches for 6 years, in a lot of discussions. If you really want to take graphic designers and independent publishers into account, then it means “if the authoring tools they’re using don’t implement it, they won’t use it.” A few examples:
I’m stopping there but there is so much more. The sad truth is HTML is already asking way too much from a significant number of the “spec users” anyway, and they will rely solely on authoring tools – which tend to be underrepresented in a lot of discussions related to authoring, unfortunately. Note that even if they master all of HTML, CSS and JS, a significant portion of authors are very likely to use tools that ease their lives, cf. PWABuilder by Microsoft. |
Forgive me but I am going to ask that we go back to basics here for a minute... what is the problem that these 2 proposed solutions:
are meant to solve? |
@RachelComerford, you are right, it is a good idea to pause and maybe formulate the high level choices... I think one way to formulate the two positions, taking each to its extreme, might be as follows (and this is obviously my view of things at 6 a.m....)
There are some intermediate proposals floating around which may make this less clear-cut. No. 1 above, for example, may lead to redundancy of information (which is also present in EPUB) mainly in terms of the list of resources that make a specific WP, or the ToC: there are some entries in the current draft that propagates the re-use of an HTML ToC to extract the list of resources to mitigate that. The extra proposal that I have put forward is somewhere between the two (although I admit closer to No.1), namely to use the JSON syntax as described in No 1, but incorporate it into the header of the entry page as using the What is more difficult is to get a clear set of pros and cons. I will try to get some below, knowing that I will not make full justice to any of the two sides...
I am sure I will get lots of flames now... Fire off! |
@iherman thanks, that's a good summary overall. My own personal take on this matter is that:
The current draft allows the reading order to fall back to HTML and underdefines the TOC. This should be changed IMO:
|
It's been my operating understanding that the primary audience for this specification is Web browsers. I've also tried to base my proposals on "web publications" (lowercase on purpose) as they exist on the Web today. My goal has been to find ways to enhance existing web publications by adding the least amount of tech required to add key affordances ( topic:affordances ) currently unavailable to the human reading the publication. Perhaps it's just that our objectives differ? |
@iherman your summaries in #193 (comment) are spot on. The The core sticking point seems to be around the expression and processing of the "binding"--i.e. the thing that defines the multi-resource/document experience. I'd like to shorten the distance between publication address and readable/experience-able publication. Since the Web (browser) does HTML by default, that (to me) means building up from that foundation. |
This is what I understand the problem and a potential compromise/solution to be. Interested in hearing feedback:
Problem: There is information that must be available via the infoset and this needs to be coded and housed somewhere.
Proposed Solution: Some information (what exactly is TBD) will be housed within the HTML, some within JSON. That JSON may or may not be a separate file.
Alternatives Considered:
|
@RachelComerford maybe this issue could be closed if we can extract from it the TOC issue. There were looong discussions last year about the representation of the reading order (ex-spine), the representation of a human-friendly Table of Contents and the need for machine readable navigation. #9 is an example of such loong threads. @HadrienGardeur 's comment also makes reference to the TOC issue. And this issue is far from being solved. So I suggest to keep the TOC out of scope of this issue and rephrase:
Proposed Solution: The TOC being kept out of scope for now, the infoset is housed within a JSON manifest. That JSON may be embedded in the entry page if the publication has a single resource. |
Good summary, @RachelComerford. @llemeurfr this thread covers far more than the role of the "TOC issue" and has (afaict) helped clarify that "infoset" currently encompasses inert, descriptive metadata values (i.e. descriptive properties) as well as active, hypermedia-style "binding" expressions (i.e. structural properties). Where one wants to see those things collected seems to pivot on the expected processing models of either a Web browser (+/- future publication affordances) or a more EPUB-style Reading System. @RachelComerford your proposed solution summary is a good one:
Roughing up an example now. |
I've just sent a PR with some examples (no spec changes), one of which is a minimal, |
|
@BigBlueHat good idea to have moved the example to the wpub repo (I have just merged it). Looking at https://github.com/w3c/wpub/blob/master/experiments/html-schema-org-json-ld/index.html, and comparing it with the other example, I do not see how you intend to represent the list of (other) resources that appears in the other manifest. |
I've also been building on top of existing Web Publications, and the example for "Why's poignant guide to Ruby" is a perfect illustration of that since I only added an entry page and a manifest instead of re-hosting and modifying the content:
For the primary audience, my take on this issue is that we should design something that works well for every type of UA, not just browsers. That said, I don't think that your argument really holds up @BigBlueHat. As @iherman has pointed out, your example lacks a list of resources and some of your previous comments about this infoset item were IMO hard to understand. I already posted a summary in a previous comment, could you confirm that I understood things correctly?
|
@iherman nothing is digested by schema.org, it's only the place where things are defined. But JSON-LD with schema.org terms is indeed digested by a number of search engines. It's what Google recommends and Bing recently added support for it as well. These search engines can only process JSON-LD contained in a For our use case, this might be the optimal outcome:
The second option is often used for SEO optimizations, as it provides an easy way to serve static content with dynamic metadata. As a side-note, once again I really think that we're having two separate discussions on this issue and it's making things more confusing than they should be:
|
@BigBlueHat about #193 (comment), we must now stop with generalities and prototypes and close this issue (65 comments) with a clear answer. As @HadrienGardeur said somewhere, beyond the ToC, there are other lists that are required but currently not in the infoset: page list, list of illustrations ... We have to open a new issue about these lists, add them to the infoset if agreed, and then start (again) a discussion about the location of these lists (ToC included) as JSON, HTML or both. But we can IMO end this discussion about the serialization of all other descriptive and structural properties by stating that their natural position is in the JSON manifest. |
Oh! That'll work. I'd been thinking we'd use the PR review system to discuss them, but I guess in the repo we can still reference line numbers, so that'll work. I do think it'll help to see actual experiments as code vs. prose.
Dependencies are gathered from the primary resources that depend on them. Just as when you GET an HTML file, the browser will subsequently GET all the JS and CSS (etc) that it references. This does not (by design) include a more package-focused, publication-wide list of all the things that everything in the publication depends on. The Web is built from just-in-time referencing and retrieval. This follows that design pattern. |
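The "dependencies are gathered from the primary resources" idea can be pictured with a short Python sketch that lists what a single HTML resource references. It assumes only stylesheet links, script sources and image sources count as dependencies. A real browser follows far more than this (fonts, CSS imports, media), and the names below are hypothetical.

```python
from html.parser import HTMLParser

# Assumption: only these element/attribute pairs are treated as dependencies.
REFERENCE_ATTRIBUTES = {"link": "href", "script": "src", "img": "src"}


class DependencyParser(HTMLParser):
    """Records, in document order, the resources one HTML file references."""

    def __init__(self):
        super().__init__()
        self.dependencies = []

    def handle_starttag(self, tag, attrs):
        attribute = REFERENCE_ATTRIBUTES.get(tag)
        if attribute is None:
            return
        attributes = dict(attrs)
        if tag == "link":
            # Skip links that are not stylesheets (e.g. rel="next").
            rels = (attributes.get("rel") or "").lower().split()
            if "stylesheet" not in rels:
                return
        value = attributes.get(attribute)
        if value and value not in self.dependencies:
            self.dependencies.append(value)


def collect_dependencies(markup):
    """Return the de-duplicated list of referenced resources for one document."""
    parser = DependencyParser()
    parser.feed(markup)
    parser.close()
    return parser.dependencies
```

Running this over each primary resource and merging the results would approximate a publication-wide list, but only after every primary resource has been fetched and parsed. That trade-off against a single manifest listing is what the thread is weighing.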
@HadrienGardeur good idea. I've extracted the above into #197. For...
Do you feel #122 handles that sufficiently? Or do we need something more narrow? |
much discussion has moved to #197 |
This Working Group's been using GitHub for discussion as well as issues, so while I agree that this one has gotten rather long (and am working to break out specific, actionable issues), I don't feel it's closable while there are still things to extract and/or discuss further.
The "infoset" term does currently encompass both types of properties and #197 has been created to explore those two different groups in light of the discussions here.
The dispute seems more about the use of the ToC to potentially provide the reading order and primary resource list. Additionally, there are open questions around how dependencies of primary resources are expressed:
There is much more to discuss and explore, but it doesn't need to stay on this issue, and I concur that narrower topics should be made wherever possible. More specific issues to follow. |
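For readers trying to picture the "ToC provides the reading order" option mentioned above, here is a minimal Python sketch that reads the links inside a nav element in document order. It assumes the first-level idea only: every link found inside any nav counts, nesting is flattened, and no distinction is made between a table of contents and other navigation. The names are hypothetical.

```python
from html.parser import HTMLParser


class NavLinkParser(HTMLParser):
    """Collects href values of <a> elements that appear inside <nav>."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._nav_depth = 0  # greater than zero while inside at least one <nav>

    def handle_starttag(self, tag, attrs):
        if tag == "nav":
            self._nav_depth += 1
        elif tag == "a" and self._nav_depth > 0:
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "nav" and self._nav_depth > 0:
            self._nav_depth -= 1


def nav_reading_order(markup):
    """Return the nav link targets in the order they appear in the markup."""
    parser = NavLinkParser()
    parser.feed(markup)
    parser.close()
    return parser.links
```

For an entry page whose nav lists c1.html and then c2.html, the function returns those two targets in that order and ignores links outside the nav. Resources that are not linked from the ToC would be missing from the result, which is one reason the thread treats the reading order and the ToC as separate questions.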
I think that #122 is good enough for that.
Since this is not tied to #197, I'll answer here. I think that this proposal is incredibly bad on many different levels:
If this is really something that you want to push forward as a real proposal, you should open a separate issue, because this is not related to the serialization at all. |
@HadrienGardeur good summary overall, and I'm happy to open a separate issue to focus discussion. I'd say that it is serialization related inasmuch as HTML already has defined semantics and specifications for everything in that list (afaict).
And yes. That would be the side effect of this approach. It also means less potential for out-of-sync content and manifests (i.e. one or the other having a resource that's no longer needed) and it reduces the number of changes required when adding something to the publication (i.e. the "oh, whoops, forgot to list that in the manifest..." scenario). I'll save further thoughts for that separate issue. Thanks for the feedback, @HadrienGardeur. |
The Working Group just discussed
The full IRC log of that discussion
<dauwhe> Topic: https://github.com//issues/193
<dauwhe> github: https://github.com//issues/193
<dauwhe> garth: I'm in the "put everything in one place" camp
<garth> “The infoset mostly resides within a JSON manifest (WP manifest). That JSON may optionally be embedded in the entry page rather than a standalone file referenced by <link> from the entry page. It may be supported to allow some infoset items to reside in HTML of the entry page, if information duplication issues can be sufficiently avoided.”
<dauwhe> ... the proposal that Ivan and I came up with in Berlin I'm pasting in
<dauwhe> ... it may not be consensus, but I hope it's in the "can live with it"
<dauwhe> garth: (reads proposed spec language above)
<Rachel> q+
<dauwhe> ... the only html info we've talked about is using nav to define primary reading order in HTML, so this leaves that as a possibility
<garth> ack dkaplan3
<garth> ack dkaplan
<dauwhe> Rachel: Hello!
<dauwhe> ... the Q I have is about language
<dauwhe> ... I don't understand "mostly"
<dauwhe> ... do you mean "primarily"?
<dauwhe> garth: that "mostly" was letting my religion show
<dauwhe> ... it means primarily
<dauwhe> ... there is a wp manifest, and it is a json thing, and most stuff should live there unless we find exceptions (like the nav thing)
<garth> q?
<ivan> +1 for primarily instead of mostly
<dauwhe> garth: does that anser the question?
<garth> ack Rachel
<dauwhe> Rachel: can we change mostly to primarily, and qualify that language with what you just said?
<dauwhe> garth: we can switch out the words in the resolution
<dauwhe> ... and we can assign to matt to make it clearer
<garth> q?
<garth> q?
<dauwhe> garth: this one I view as less controversial; we're not deviating from the existing draft much
<ivan> resolved: The infoset primarily resides within a JSON manifest (WP manifest). That JSON may optionally be embedded in the entry page rather than a standalone file referenced by <link> from the entry page. It may be supported to allow some infoset items to reside in HTML of the entry page, if information duplication issues can be sufficiently avoided.
<dauwhe> ... if everyone's in the 'can live with it" or "likes it" camp, I'm gonna assume consensus |
Resolved on call: "Proposed resolution: The infoset mostly resides within a JSON manifest (WP manifest). That JSON may optionally be embedded in the entry page rather than a standalone file referenced by <link> from the entry page. It may be supported to allow some infoset items to reside in HTML of the entry page, if information duplication issues can be sufficiently avoided." |
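The resolution leaves two ways for a user agent to reach the manifest from the entry page: embedded, or referenced by a link element. A minimal Python sketch of that lookup follows. The `rel` value `publication`, the `application/ld+json` script type, and the rule that an embedded manifest wins over a linked one are all assumptions made for the example; the resolution text fixes none of them.

```python
import json
from html.parser import HTMLParser
from urllib.parse import urljoin

MANIFEST_REL = "publication"            # hypothetical link relation
MANIFEST_TYPE = "application/ld+json"   # assumed script type for an embedded manifest


class ManifestLocator(HTMLParser):
    """Finds the first embedded manifest and the first manifest link, if any."""

    def __init__(self):
        super().__init__()
        self.embedded = None
        self.linked_href = None
        self._buffer = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "script" and attributes.get("type") == MANIFEST_TYPE and self.embedded is None:
            self._buffer = []
        elif tag == "link" and self.linked_href is None:
            rels = (attributes.get("rel") or "").split()
            if MANIFEST_REL in rels and attributes.get("href"):
                self.linked_href = attributes["href"]

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._buffer is not None:
            self.embedded = json.loads("".join(self._buffer))
            self._buffer = None


def locate_manifest(markup, page_url):
    """Return ("embedded", dict), ("linked", absolute_url) or ("none", None)."""
    locator = ManifestLocator()
    locator.feed(markup)
    locator.close()
    if locator.embedded is not None:   # assumption: embedded takes precedence
        return ("embedded", locator.embedded)
    if locator.linked_href is not None:
        return ("linked", urljoin(page_url, locator.linked_href))
    return ("none", None)
```

In the linked case the caller still has to fetch and parse the manifest, which is the extra request raised earlier in the thread. In the embedded case the data is available as soon as the entry page has been read.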
This discussion has permeated many of the various issues (e.g., lately, #159, #181, or #186). It would help to get this design principle settled once and for all. In practice, the issue is whether the "entry (HTML) page" could be used as containing the infoset items, or not.
Note that the answer may not be clear-cut, and may depend on the nature of the infoset items. Indeed, it is different if:
<meta> or <link> element (e.g., creation or modification date, or links to an ONIX file)
Another aspect that influences this decision is whether the WP consists of a single HTML file (which is also the entry page), with adjunct files like CSS or images. This is the typical case for, e.g., a scholarly journal article.