Minimal WPUB for a scholarly paper (of sorts)
As agreed on the call on 2018-06-04, the content of this page has been transferred to:
https://github.com/w3c/wpub/tree/master/experiments/w3c_rec
Inspired by Dave's minimal WPUB for a book, I tried to create one for what is equivalent to a scholarly paper. I wanted a real-life example; to avoid copyright issues, I took a W3C document instead: the Model for Tabular Data and Metadata on the Web. I believe that, as far as a WPUB goes, it is equivalent to a scholarly paper.
The interesting points of this publication, from our point of view:
- It is a single-document publication, i.e., the entry point and the main content are the same HTML resource.
- The publication already has a TOC (as generated for the Recommendation by ReSpec): its structure is a `section` element with a `ul`. It is therefore not a `nav` element and, of course, it does not use `doc-toc`. In a new WPUB these should be slightly updated, depending on what the final structure is.
- Because it is a single-document publication, it is o.k. to use the `title` HTML element (as the spec says) for the `Title` infoset item; it is not necessary to use the relevant `schema.org` `name` property.
- The publication refers to further HTML files that are not in the main thread of the paper but may be essential for the publication (i.e., they should be cached/offlined!), namely:
  - a diff file comparing the document to its previous incarnation
  - a separate HTML file used as the `longdesc` value for a diagram
- The publication refers to a number of CSV and Excel files, as well as images in different formats, that may be essential for the content of the paper.
In other words, the "boundaries" of the publication should include (beyond the CSS files used for rendering) references to other resources. In my view, these should be listed explicitly in the resource list of the publication. The document also refers to a number of other HTML files (e.g., in the references) which should not be part of the boundaries, i.e., they should not be cached/offlined.
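To make the notion of "boundaries" concrete, here is a minimal sketch in Python of how a user agent might decide whether a link target belongs to the publication. The function names are hypothetical, only string entries of the resource list are handled, and the rule itself (entry point plus listed resources, fragments ignored) is my reading of the paragraph above rather than anything the spec defines.

```python
from urllib.parse import urldefrag, urljoin

# Base URL of the publication, taken from the "url" entry of the manifest.
BASE = "http://www.w3.org/TR/2015/REC-tabular-data-model-20151217/"

# A few entries from the resource list, relative to BASE.
RESOURCES = ["datatypes.html", "datatypes.svg", "diff.html", "test-utf8.csv"]


def boundary_urls(base, resources):
    """Return the absolute URLs that make up the publication."""
    urls = {base}  # assumption: the entry point is always inside the boundaries
    urls.update(urljoin(base, resource) for resource in resources)
    return urls


def within_boundaries(url, base, resources):
    """True if url (ignoring any fragment) is part of the publication."""
    absolute, _fragment = urldefrag(urljoin(base, url))
    return absolute in boundary_urls(base, resources)
```

With these definitions, `within_boundaries("datatypes.html#string", BASE, RESOURCES)` is true, while a link from the references section to another W3C document is not, so it would not be cached/offlined.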
I have created two WPUB skeletons. The simple version has the strict minimum according to our spec. It relies on a number of (reasonable) defaults: the language in the manifest is en-US, the names of persons are enough, and the references to SVG, PNG, CSV, etc., files do not require a media type setting because they are all "well known" to browsers. The complex version adds a number of extra metadata entries to, say, the persons, all using the relevant `schema.org` entries, while still referring to the values listed in our infoset (`schema.org` has many, many more metadata entries that could be used, of course). B.t.w., I have added the various `@type` values although, in real practice, I am not sure they are necessary (they can be deduced from the values).
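The two skeletons list resources either as plain strings (relying on defaults) or as objects with an explicit `fileFormat`. A consumer could bring both forms to a common shape roughly as follows. This is only a sketch: the `DEFAULT_MEDIA_TYPES` table and the `normalize_resource` function are hypothetical, and the table is merely my stand-in for what "well known to browsers" might mean.

```python
import posixpath

# Hypothetical table of "well known" media types, keyed by file extension.
# This is an assumption for the sketch, not something the spec defines.
DEFAULT_MEDIA_TYPES = {
    ".html": "text/html",
    ".svg": "image/svg+xml",
    ".png": "image/png",
    ".csv": "text/csv",
    ".xls": "application/vnd.ms-excel",
}


def normalize_resource(entry):
    """Return a {"url", "fileFormat"} dict for a string or object entry."""
    if isinstance(entry, str):
        entry = {"url": entry}
    url = entry["url"]
    extension = posixpath.splitext(url)[1].lower()
    # An explicit fileFormat wins; otherwise fall back to the default table.
    media_type = entry.get("fileFormat") or DEFAULT_MEDIA_TYPES.get(extension)
    return {"url": url, "fileFormat": media_type}
```

An entry with an unknown extension and no `fileFormat` ends up with `None` as its media type, which is the case where the explicit object form becomes necessary.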
The `publ-resources` property is not a `schema.org` term; it is what we could use for the 'Resource List' infoset item. Per JSON-LD rules, this property, with the subtree underneath, will be ignored by any JSON-LD processor, because the `@context` maps it to `null`; I guess this is a feature, not a bug, at this point.
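The effect of mapping `publ-resources` and `publ-toc` to `null` in the `@context` of the skeletons can be illustrated with a few lines of Python. This is a simplified imitation of one JSON-LD rule (a term mapped to `null` is dropped), applied to top-level keys only; it is not a JSON-LD processor, and both function names are hypothetical.

```python
def null_mapped_terms(context):
    """Collect the terms that a (possibly array-valued) @context maps to null."""
    parts = context if isinstance(context, list) else [context]
    terms = set()
    for part in parts:
        # String entries (remote contexts such as "https://schema.org")
        # are not fetched in this sketch.
        if isinstance(part, dict):
            for term, definition in part.items():
                if definition is None:
                    terms.add(term)
                else:
                    terms.discard(term)  # a later definition overrides null
    return terms


def visible_to_jsonld(manifest):
    """Return the top-level entries that are not mapped to null."""
    ignored = null_mapped_terms(manifest.get("@context", []))
    return {key: value for key, value in manifest.items() if key not in ignored}
```

A WPUB-aware user agent would read the original manifest with all its keys, while a generic JSON-LD consumer would see something closer to the output of `visible_to_jsonld`.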
Simple version:

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Model for Tabular Data and Metadata on the Web</title>
<link href="#wpm" rel="publication" />
...
<script id="wpm" type="application/ld+json">
{
"@context" : [
"https://schema.org",
{
"publ-resources" : null,
"publ-toc" : null
}
],
"@id" : "http://www.w3.org/TR/tabular-data-model/",
"url" : "http://www.w3.org/TR/2015/REC-tabular-data-model-20151217/",
"creator" : [
{
"@type" : "Person",
"name" : "Jeni Tennison"
},
{
"@type" : "Person",
"name" : "Gregg Kellogg"
},
{
"@type" : "Person",
"name" : "Ivan Herman",
}
],
"datePublished" : "2015-12-17",
"publ-resources" : [
"datatypes.html",
"datatypes.svg",
"datatypes.png",
"diff.html",
"test-utf8.csv",
"test-utf8-bom.csv",
"test-utf16.csv",
"test-utf16-bom.csv",
"test.xls"
],
"publ-toc" : "#toc"
}
</script>
</head>
<body>
....
<section id="toc">
<h2 resource="#h-toc" id="h-toc" class="introductory">Table of Contents</h2>
<ul class="toc">
<li class="tocline"><a class="tocxref" href="#intro"><span class="secno">1. </span>Introduction</a></li>
...
</ul>
</section>
...
</body>
</html>
Complex version:

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Model for Tabular Data and Metadata on the Web</title>
<link href="#wpm" rel="publication" />
...
<script id="wpm" type="application/ld+json">
{
"@context" : [
"https://schema.org",
{
"publ-resources" : null,
"publ-toc" : null,
"@language" : "en-US"
}
],
"@id" : "http://www.w3.org/TR/tabular-data-model/",
"url" : "http://www.w3.org/TR/2015/REC-tabular-data-model-20151217/",
"accessMode" : ["textual", "visual"],
"accessModeSufficient" : ["textual"],
"editor" : [
{
"@type" : "Person",
"name" : "Jeni Tennison",
"givenName" : "Jeni",
"familyName" : "Tennison",
"affiliation" : {
"name" : "The Open Data Institute",
"url" : "http://theodi.org/"
}
},
{
"@type" : "Person",
"@id" : "http://greggkellogg.net/",
"name" : "Gregg Kellogg",
"givenName" : "Gregg",
"familyName" : "Kellogg",
"affiliation" : {
"name" : "Kellogg Associates",
"url" : "http://kellogg-assoc.com/"
}
}
],
"author" : [
{
"@type" : "Person",
"name" : "Jeni Tennison",
"givenName" : "Jeni",
"familyName" : "Tennison",
"affiliation" : {
"name" : "The Open Data Institute",
"url" : "http://theodi.org/"
}
},
{
"@type" : "Person",
"@id" : "http://greggkellogg.net/",
"name" : "Gregg Kellogg",
"givenName" : "Gregg",
"familyName" : "Kellogg",
"affiliation" : {
"name" : "Kellogg Associates",
"url" : "http://kellogg-assoc.com/"
}
},
{
"@type" : "Person",
"@id" : "https://www.w3.org/People/Ivan/",
"name" : "Ivan Herman",
"givenName" : "Ivan",
"familyName" : "Herman",
"affiliation" : {
"name" : "World Wide Web Consortium",
"url" : "https://www.w3.org"
}
}
],
"datePublished" : "2015-12-17",
"dateModified" : "2015-12-17",
"publ-resources" : [
"datatypes.html",
"datatypes.svg",
"datatypes.png",
"diff.html",
{
"@type" : "StructuredValue",
"url" : "test-utf8.csv",
"fileFormat" : "text/csv"
},
{
"@type" : "StructuredValue",
"url" : "test-utf8-bom.csv",
"fileFormat" : "text/csv"
},
{
"@type" : "StructuredValue",
"url" : "test-utf16.csv",
"fileFormat" : "text/csv"
},
{
"@type" : "StructuredValue",
"url" : "test-utf16-bom.csv",
"fileFormat" : "text/csv"
},
{
"@type" : "StructuredValue",
"url" : "test.xls",
"fileFormat" : "application/vnd.ms-excel"
},
{
"@type" : "StructuredValue",
"url" : "test.xlsx",
"fileFormat" : "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"
}
],
"publ-toc" : "#toc"
}
</script>
</head>
<body>
....
<section id="toc">
<h2 resource="#h-toc" id="h-toc" class="introductory">Table of Contents</h2>
<ul class="toc">
<li class="tocline"><a class="tocxref" href="#intro"><span class="secno">1. </span>Introduction</a></li>
...
</ul>
</section>
...
</body>
</html>
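Both skeletons connect the entry page to the embedded manifest through `<link rel="publication" href="#wpm">`. A minimal sketch of how a consumer might follow that link with the Python standard library is shown below. The `ManifestFinder` class and the `load_manifest` function are hypothetical names, and storing the `title` fallback under a `name` key is my own choice for the sketch, not something the spec prescribes.

```python
import json
from html.parser import HTMLParser


class ManifestFinder(HTMLParser):
    """Collect the <title>, the publication link, and JSON-LD scripts."""

    def __init__(self):
        super().__init__()
        self.manifest_id = None  # fragment id the publication link points to
        self.title = ""
        self.scripts = {}        # script id -> raw JSON text
        self._capture = None     # ("title", None) or ("script", id)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and "publication" in (attrs.get("rel") or "").split():
            self.manifest_id = (attrs.get("href") or "").lstrip("#")
        elif tag == "title":
            self._capture = ("title", None)
        elif tag == "script" and attrs.get("type") == "application/ld+json":
            self._capture = ("script", attrs.get("id"))
            self.scripts[attrs.get("id")] = ""

    def handle_data(self, data):
        if self._capture is None:
            return
        kind, key = self._capture
        if kind == "title":
            self.title += data
        else:
            self.scripts[key] += data

    def handle_endtag(self, tag):
        if self._capture is not None and self._capture[0] == tag:
            self._capture = None


def load_manifest(html):
    """Parse the embedded manifest; fall back to <title> for the title."""
    finder = ManifestFinder()
    finder.feed(html)
    finder.close()
    manifest = json.loads(finder.scripts[finder.manifest_id])
    # Single-document publication: the HTML title serves as the Title item.
    manifest.setdefault("name", finder.title.strip())
    return manifest
```

Only same-document references such as `#wpm` are handled; a manifest in a separate file would need a fetch step that is left out here.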