-
Notifications
You must be signed in to change notification settings - Fork 405
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
fix: normalize URLs before checking if the resources exist
URLs were not normalized before performing existence checks. So percent-encoded URLs sometimes triggered `RSC-001` or `RSC-007` errors. This commit introduces a new `normalize(URL)` method in the `URLUtils` class. Normalization is now used when checking a URL. This notably applies to resource and ID existence checks. Important Note: URL normalization is not well-defined. Some percent-encoding normalization is described in RFC3986, but is not defined in the URL standard. Also, normalization (as useful for EPUBCheck) is also dependent on the URL scheme. The normalization we apply is quite naïve and might need to be improved in the future. It should however cover the majority of HTTP URL real-world scenarios. Fix #1479
- Loading branch information
Showing
9 changed files
with
90 additions
and
7 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
11 changes: 11 additions & 0 deletions
11
src/test/resources/epub3/04-ocf/files/url-percent-encoded-valid/EPUB/content&001.xhtml
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
<!DOCTYPE html> | ||
<html xmlns:epub="http://www.idpf.org/2007/ops" xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> | ||
<head> | ||
<meta charset="utf-8"/> | ||
<title>Minimal EPUB</title> | ||
</head> | ||
<body> | ||
<h1>Loomings</h1> | ||
<p>Call me Ishmael.</p> | ||
</body> | ||
</html> |
14 changes: 14 additions & 0 deletions
14
src/test/resources/epub3/04-ocf/files/url-percent-encoded-valid/EPUB/nav.xhtml
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
<!DOCTYPE html> | ||
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops" xml:lang="en" lang="en"> | ||
<head> | ||
<meta charset="utf-8"/> | ||
<title>Minimal Nav</title> | ||
</head> | ||
<body> | ||
<nav epub:type="toc"> | ||
<ol> | ||
<li><a href="content%26001.xhtml">content 001</a></li> | ||
</ol> | ||
</nav> | ||
</body> | ||
</html> |
16 changes: 16 additions & 0 deletions
16
src/test/resources/epub3/04-ocf/files/url-percent-encoded-valid/EPUB/package.opf
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
<?xml version="1.0" encoding="UTF-8"?> | ||
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" xml:lang="en" unique-identifier="q"> | ||
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/"> | ||
<dc:title id="title">Minimal EPUB 3.0</dc:title> | ||
<dc:language>en</dc:language> | ||
<dc:identifier id="q">NOID</dc:identifier> | ||
<meta property="dcterms:modified">2017-06-14T00:00:01Z</meta> | ||
</metadata> | ||
<manifest> | ||
<item id="content_001" href="content%26001.xhtml" media-type="application/xhtml+xml"/> | ||
<item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/> | ||
</manifest> | ||
<spine> | ||
<itemref idref="content_001" /> | ||
</spine> | ||
</package> |
6 changes: 6 additions & 0 deletions
6
src/test/resources/epub3/04-ocf/files/url-percent-encoded-valid/META-INF/container.xml
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
<?xml version="1.0" encoding="UTF-8" ?> | ||
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container"> | ||
<rootfiles> | ||
<rootfile full-path="EPUB/package.opf" media-type="application/oebps-package+xml"/> | ||
</rootfiles> | ||
</container> |
1 change: 1 addition & 0 deletions
1
src/test/resources/epub3/04-ocf/files/url-percent-encoded-valid/mimetype
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
application/epub+zip |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters