diff --git a/epub33/core/index.html b/epub33/core/index.html index e797628c7..62b7d13f6 100644 --- a/epub33/core/index.html +++ b/epub33/core/index.html @@ -282,6 +282,24 @@
The URL [[URL]] of the Root Directory representing the OCF Abstract Container. + It is implementation specific, but EPUB Creators must assume it has properties defined in .
++ The URL of a file or directory in the OCF Abstract Container, defined in . +
+The name of any type of file within an OCF Abstract Container, whether a directory or a file within a directory.
+The File Path of a file or directory is its full path relative to the root directory, as defined by the algorithm specified in .
+ + +For a given directory within the OCF Abstract
- Container, the string holding all directory File Name in the full path
- concatenated together with a /
(U+002F
) character separating the
- directory File Names.
For a given file within the OCF Abstract Container, the Path Name is the string holding all
- directory File Names concatenated together with a /
character separating the
- directory File Names, followed by a /
character and then the File Name of the
- file.
All [[XML]] elements defined in this section are in the http://www.idpf.org/2007/opf
namespace [[XML-NAMES]] unless otherwise specified.
+ To parse a URL string url used in the Package Document, the URL Parser [[URL]] MUST be applied to url, with the + content URL of the Package Document as base. +
+Files within the OCF Abstract Container MUST reference each other via relative-URL-with-fragment strings [[URL]].
- - - -EPUB Creators SHOULD NOT use path-absolute-URL strings [[URI]] (i.e., where the path begins with a single slash) to - reference resources in the OCF Abstract Container.
- -The base of an EPUB Publication can change from Reading System to Reading Systems depending - on how the content is served. Some Reading Systems may treat the location of the package - document as the base of the EPUB Publication, for example, while others may use the Root - Directory.
-The relevant language specification for a given file format determines the base URL [[URL]] used to parse relative-URL-with-fragment strings [[URL]]. For example, CSS defines how relative URL - references work in the context of CSS style sheets and property declarations - [[CSSSnapshot]].
- -Unlike most language specifications, the base URL [[URL]] for all files within the META-INF
directory is the
- Root Directory of the OCF Abstract Container.
For example, if META-INF/container.xml
has the following content:
-<?xml version="1.0"?> -<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container"> - <rootfiles> - <rootfile full-path="EPUB/Great_Expectations.opf" - media-type="application/oebps-package+xml" /> - </rootfiles> -</container> -- -
then the path EPUB/Great_Expectations.opf
is relative to the root directory for the
- OCF Abstract Container and not relative to the META-INF
directory.
All relative-URL-with-fragment strings [[URL]] MUST, after parsing to URL records [[URL]], - identify resources within the OCF Abstract Container (i.e., at or below the Root Directory).
-In the context of the Abstract Container, Path Names and File Names +
In the context of the Abstract Container, File Paths and File Names are case sensitive.
-In addition, the following restrictions are designed to allow Path Names and File Names to be +
In addition, the following restrictions are designed to allow File Paths and File Names to be used without modification on most operating systems:
Path and File Names MUST be UTF-8 [[Unicode]] encoded.
+File Names and Paths MUST be UTF-8 [[Unicode]] encoded.
File Names MUST NOT exceed 255 bytes.
The Path Name for any directory or file within the OCF Abstract +
The File Paths for any directory or file within the OCF Abstract Container MUST NOT exceed 65535 bytes.
To derive the File Path of a file or directory file in the OCF Abstract + Container apply the following steps (expressed using the terminology of [[INFRA]]):
+ +U+002F (/)
character.
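The derivation steps above can be sketched as follows. This is only an illustration: the `ContainerEntry` interface and the `deriveFilePath` function are hypothetical names, and the in-memory tree model is an assumption, not something the specification defines.

```typescript
// Hypothetical in-memory model of a container entry (an assumption for this sketch).
interface ContainerEntry {
  name: string;                  // File Name of the file or directory
  parent: ContainerEntry | null; // null only for the Root Directory
}

// Follows the steps above: walk up from `file` to the Root Directory,
// prepending each File Name, then join the names with U+002F (/).
function deriveFilePath(file: ContainerEntry): string {
  const path: string[] = [];
  let current: ContainerEntry = file;
  while (current.parent !== null) {
    path.unshift(current.name);
    current = current.parent;
  }
  return path.join("/");
}

const root: ContainerEntry = { name: "", parent: null };
const ops: ContainerEntry = { name: "OPS", parent: root };
const pkg: ContainerEntry = { name: "package.opf", parent: ops };

console.log(deriveFilePath(pkg));  // OPS/package.opf
console.log(deriveFilePath(root)); // (empty string)
```

Under this model the Root Directory yields the empty string, because the loop body never runs for it.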
+ The container root URL is the URL [[URL]] of the + Root Directory. It is implementation specific, but EPUB Creators MUST assume it has the following properties:
+ +/
" with the container root URL as base is the container root URL...
" with the container root URL as base is the container root URL.The content URL of a file or directory in the OCF Abstract Container is the result of parsing + the file's File Path with the container root URL as base.
+ +
  Parsing may replace some characters in the File Path with their percent-encoded equivalents. For example, A/B/C/file name.xhtml
becomes A/B/C/file%20name.xhtml
.
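As a rough illustration of the content URL definition, the sketch below uses the `URL` constructor available in browsers and Node.js, which applies the URL Standard's parser with a base. The `contentUrl` helper and the `http://localhost:49152/` container root URL are assumptions made for the example; actual container root URLs are implementation specific.

```typescript
// Hypothetical container root URL (an assumption; real values are implementation specific).
const containerRootUrl = "http://localhost:49152/";

// Hypothetical helper: parse a File Path with the container root URL as base.
function contentUrl(filePath: string, rootUrl: string): string {
  return new URL(filePath, rootUrl).href;
}

console.log(contentUrl("A/B/C/file name.xhtml", containerRootUrl));
// http://localhost:49152/A/B/C/file%20name.xhtml

console.log(contentUrl("", containerRootUrl));
// http://localhost:49152/
```

The space in the File Name is percent-encoded during parsing, and the empty File Path of the Root Directory resolves to the container root URL itself.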
+
+ In the OCF Abstract Container, when a file uses a URL string to reference another file in the container, the string MUST be a
+ path-relative-scheme-less-URL string, optionally followed by U+0023 (#)
and a URL-fragment string.
+
  The properties of the container root URL are such that, whatever the number of double-dot path segments in a URL string (for example, ../../../secret
), it will be parsed to a content URL (and not "leak" outside the container). However, for better interoperability with non-conforming or legacy Reading Systems, EPUB Creators should avoid using more double-dot path segments than needed to reach the target container file.
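The behavior described in this note can be observed with the `URL` constructor. This is a minimal sketch, assuming a hypothetical container root URL of `http://localhost:49152/`; the file names are also made up for the example.

```typescript
// Hypothetical container root URL and referencing file (assumptions for this sketch).
const rootUrl = "http://localhost:49152/";
const base = new URL("OPS/package.opf", rootUrl);

// One double-dot segment is enough to reach a sibling directory of OPS.
const minimal = new URL("../HTML/chapter.xhtml", base).href;

// Extra double-dot segments cannot climb above the root of the URL's path.
const excessive = new URL("../../../secret", base).href;

console.log(minimal);                       // http://localhost:49152/HTML/chapter.xhtml
console.log(excessive);                     // http://localhost:49152/secret
console.log(excessive.startsWith(rootUrl)); // true
```

A Reading System whose container root URL lacked these properties (for example, one with additional path segments) could resolve the second reference outside the container, which is why the note advises against surplus double-dot segments.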
+
META-INF
DirectoryTo parse a URL string url used in files located in the META-INF
directory the
+ URL Parser MUST be applied to url, with the container root URL as base.
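The difference between the two bases can be sketched as follows. The container root URL and the `xhtml/chapter1.xhtml` reference are assumptions for illustration; `EPUB/Great_Expectations.opf` is the `full-path` value from the `container.xml` example.

```typescript
// Hypothetical container root URL (an assumption for this sketch).
const containerRoot = new URL("http://localhost:49152/");

// A URL string in META-INF/container.xml is parsed with the container root URL as base.
const packageUrl = new URL("EPUB/Great_Expectations.opf", containerRoot);

// A URL string in the Package Document is parsed with the
// content URL of the Package Document as base.
const chapterUrl = new URL("xhtml/chapter1.xhtml", packageUrl);

// For contrast: what the same full-path would yield if container.xml's own
// content URL were (incorrectly) used as base.
const containerXmlUrl = new URL("META-INF/container.xml", containerRoot);
const wrong = new URL("EPUB/Great_Expectations.opf", containerXmlUrl);

console.log(packageUrl.href); // http://localhost:49152/EPUB/Great_Expectations.opf
console.log(chapterUrl.href); // http://localhost:49152/EPUB/xhtml/chapter1.xhtml
console.log(wrong.href);      // http://localhost:49152/META-INF/EPUB/Great_Expectations.opf
```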
body
ElementIdentifies an associated fragment of an EPUB Content Document.
-The value MUST be a relative-URL-with-fragment string [[URL]] with a fragment identifier.
+The value MUST be a path-relative-scheme-less-URL string, optionally followed by U+0023 (#)
and a URL-fragment string.
seq
ElementIdentifies an associated fragment of an EPUB Content Document.
-The value MUST be a relative-URL-with-fragment string [[URL]] with a fragment identifier.
+The value MUST be a path-relative-scheme-less-URL string, optionally followed by U+0023 (#)
and a URL-fragment string.
text
ElementIdentifies the associated fragment of an EPUB Content Document.
-The value MUST be a relative-URL-with-fragment string [[URL]] with a fragment identifier.
+The value MUST be a path-relative-scheme-less-URL string, optionally followed by U+0023 (#)
and a URL-fragment string.
id
@@ -9454,6 +9490,9 @@ Reading Systems MUST process the Package Document [[EPUB-33]].
-To parse relative-URL-with-fragment strings [[URL]] in the Package Document, Reading Systems MUST - use the URL of the Package Document as the base URL [[URL]].
- -When an EPUB Publication is zipped, the base URL of the Package Document is obtained from the URL of - the EPUB Container together with a fragment identifier that specifies the path to Package - Document (relative to the Root Directory). This specification does not require a specific URL - scheme for referencing the path to the Package Document within the EPUB Container.
-This specification does not mandate any particular implementation technique for the creation of - a unique origin in the absence of a reliable - Web origin (e.g., HTTP URL scheme + host + port). The necessary heuristics may include the - combination of the Publication's Unique Identifier (although, in practice, these identifiers may in fact not guarantee - globally-unique identification, that is why it is recommended to combine multiple techniques), - filesystem path, OCF Zip - Container checksum, etc.
-The unicity of the origin per EPUB Publication - instance means that if two different users acquire a copy of the same EPUB Publication, the - origins will be different for the two users on those copies even if the same Reading System is - used.
-Note that the definition of epubReadingSystem
is currently marked as "at
risk". If, in the final version of this document, the object becomes non-normative, then each "MUST"
statement in the last bullet item would become a "MAY".
Reading Systems MUST assign a URL [[URL]] to the Root Directory of the OCF Abstract Container. This URL is called the container root URL. It is implementation specific, but the implementation MUST have the following properties:
+ +/
" with the container root URL as base is the container root URL...
" with the container root URL as base is the container root URL.The unicity of the origin per EPUB Publication + instance means that if two different users acquire a copy of the same EPUB Publication, the + origins will be different for the two users on those copies even if the same Reading System is + used.
+ +The required properties of the container root URL are such that it behaves similarly to a URL defined as follows:
+ ++ URL component + | ++ Values + | +
scheme | +http or https |
+
host | +localhost |
+
port | +a dynamic port uniquely assigned to the EPUB instance | +
For relative-URL-with-fragment strings [[URL]], Reading Systems MUST determine the base URL [[URL]] according to the - relevant language specifications for the given file formats. For example, CSS defines how - relative URL references work in the context of CSS style sheets and property declarations - [[CSSSnapshot]].
+for example:
+ +Container File | +File Path | +URL | +
+ Root Directory + | ++ empty string + | +
+ http://localhost:49152/
+ |
+
+ Package Document + | +
+ OPS/package.opf
+ |
+
+ http://localhost:49152/OPS/package.opf
+ |
+
+ Content Document + | +
+ HTML/file name.xhtml
+ |
+
+ http://localhost:49152/HTML/file%20name.xhtml
+ |
+
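A Reading System implementer could sanity-check a candidate container root URL against the two parsing properties listed above. The sketch below is an assumption-laden illustration: `hasContainerRootProperties` is a hypothetical helper, it checks only the "`/`" and "`..`" properties, and it says nothing about origin uniqueness.

```typescript
// Hypothetical helper: checks only the two parsing properties, i.e. that
// "/" and ".." both parse to the candidate itself when it is used as base.
function hasContainerRootProperties(candidate: string): boolean {
  const root = new URL(candidate).href; // serialized (normalized) form of the candidate
  const slash = new URL("/", root).href;
  const doubleDot = new URL("..", root).href;
  return slash === root && doubleDot === root;
}

console.log(hasContainerRootProperties("http://localhost:49152/"));         // true
console.log(hasContainerRootProperties("http://localhost:49152/books/1/")); // false
```

The second candidate fails because "`/`" resolves to `http://localhost:49152/`, above the candidate, so references starting with a slash or containing enough double-dot segments could reach outside the container.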
Some language specifications reference Relative URLs for Referencing Other Components for content in that particular language.
Unlike most language specifications, Reading Systems MUST use the Root Directory of the
- OCF Abstract Container as the base
- URL [[URL]] for all files within the META-INF
directory.
Unlike most language specifications, Reading Systems must use the
+ container root URL as the base
+ URL [[URL]] for all files within the META-INF
directory.
+ See also the section on Parsing URLs in the META-INF
Directory in [[!EPUB-33]].
+
  We may have to say a few words on what exactly an "EPUB Publication instance" means. +
+ +The previous version contained this note:
+ ++ This specification does not mandate any particular implementation technique for the creation of + a unique origin in the absence of a reliable + Web origin (e.g., HTTP URL scheme + host + port). The necessary heuristics may include the + combination of the Publication's Unique Identifier (although, in practice, these identifiers may in fact not guarantee + globally-unique identification, that is why it is recommended to combine multiple techniques), + filesystem path, OCF Zip + Container checksum, etc. +
+ +This note may not be relevant any more...
+Although EPUB Creators are required to follow various File and Path Name + href="https://www.w3.org/TR/epub-33/#sec-container-filenames">File Name and File Path restrictions [[!EPUB-33]] for maximum interoperability, Reading Systems SHOULD attempt - to process File and Path Names that are not valid to these requirements. Invalid File and Path - Names may only be problematic on some operating systems.
+ to process File Names and Paths that are not valid to these requirements. Invalid File Names and + Paths may only be problematic on some operating systems. -This specification does not specify how a Reading System that is unable to represent OCF File and - Path Names would compensate for this incompatibility.
+This specification does not specify how a Reading System that is unable to represent OCF File Names and + Paths would compensate for this incompatibility.
If a Reading System cannot preserve the names of files during an unzipping process, it will have to compensate for any name translation that took place in the content (i.e., in any URIs that @@ -1957,7 +2038,7 @@
To limit the possible damage of untrusted scripts, this specification recommends that Reading Systems establish a unique origin [[URL]] allocated to - each EPUB Publication (see ). Adopting this approach + each EPUB Publication (see ). Adopting this approach isolates publications from each other, thereby limiting access to cookies, DOM storage, etc. Examples of Web APIs that are tied to the concept of "origin" include Web Storage [[WEBSTORAGE]] and IndexedDB [[INDEXEDDB]], which EPUB Content Documents can interact with via scripting. Reading @@ -2244,6 +2325,11 @@