mimetype file content not properly extracted sometimes #2
Labels
type: bug
The issue describes a bug
type: not an issue
The issue is rejected (not an actual issue or not relevant)
From [email protected] on December 28, 2007 13:35:44
(using epubcheck-0.9.2.jar)
epub file: http://www.hxa7241.org/articles/content/EpubGuide-hxa7241.epub (including correct mimetype file)
was zipped with: http://www.info-zip.org/Zip.html but produced error:
EpubGuide-hxa7241.epub: mimetype contains wrong type (application/epub+zip expected)
The problem: In a zip file, there is an extra field between the file name
and the file content. It may be zero-length, but it does not have to be, so the content cannot be assumed to start right after the file name.
To fix:
Treating the zip file as a byte array (multi-byte header values are stored little-endian):
filename starts at [30]
content starts at [30 + filenameLength + extrafieldLength]
filenameLength is ([27] << 8) | [26]
extrafieldLength is ([29] << 8) | [28]
contentLength is ([21] << 24) | ([20] << 16) | ([19] << 8) | [18]
So, an outline of possible code, written as comments (separating
deserializing from checking), would be:
// read checkable values (filename, filecontent)
//   open file stream
// check values
//   check filename
//     make string from filename bytes
//     compare with "mimetype"
//   check filecontent
//     make string from filecontent bytes
//     compare with "application/epub+zip"
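The offsets and the outline above could be put together roughly like this. This is only a sketch, not epubcheck's actual code: the class and method names (MimetypeHeaderCheck, u16, u32, hasValidMimetype) are made up for illustration, and it assumes the mimetype entry is the first entry in the file and is stored uncompressed, so that the size field at [18..21] is the real content length.

```java
import java.nio.charset.StandardCharsets;

/** Sketch: checks the first zip entry of an EPUB, skipping over the extra field. */
final class MimetypeHeaderCheck {
    /** Size of the fixed part of the header; the file name starts at [30]. */
    static final int HEADER_SIZE = 30;

    /** Reads an unsigned little-endian 16-bit value: ([off + 1] << 8) | [off]. */
    static int u16(byte[] b, int off) {
        return ((b[off + 1] & 0xFF) << 8) | (b[off] & 0xFF);
    }

    /** Reads an unsigned little-endian 32-bit value starting at off. */
    static long u32(byte[] b, int off) {
        return ((long) u16(b, off + 2) << 16) | u16(b, off);
    }

    /**
     * Returns true if the first entry is named "mimetype" and its content is
     * "application/epub+zip". Assumes the entry is stored uncompressed.
     */
    static boolean hasValidMimetype(byte[] zip) {
        if (zip.length < HEADER_SIZE) {
            return false;
        }
        // read checkable values (filename, filecontent)
        long contentLength = u32(zip, 18);
        int filenameLength = u16(zip, 26);
        int extrafieldLength = u16(zip, 28);
        int contentStart = HEADER_SIZE + filenameLength + extrafieldLength;
        if (contentLength > zip.length - contentStart) {
            return false; // header points past the end of the data
        }
        String filename =
            new String(zip, HEADER_SIZE, filenameLength, StandardCharsets.US_ASCII);
        String content =
            new String(zip, contentStart, (int) contentLength, StandardCharsets.US_ASCII);
        // check values
        return filename.equals("mimetype") && content.equals("application/epub+zip");
    }
}
```

The only change relative to a check that assumes the content follows the file name directly is the extrafieldLength term in contentStart.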
But it would probably be better to use java.util.zip.ZipFile instead...
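A minimal sketch of that alternative, again with made-up names (MimetypeZipFileCheck, hasValidMimetype) and not taken from epubcheck itself. It relies on ZipFile to locate the entry data, so the extra field never has to be handled by hand. Note that InputStream.readAllBytes() needs Java 9 or later, and that this variant only looks the entry up by name; it does not check that mimetype is the first entry or that it is stored uncompressed.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

/** Sketch: reads the mimetype entry through java.util.zip.ZipFile. */
final class MimetypeZipFileCheck {
    /** Returns true if the archive has a "mimetype" entry containing "application/epub+zip". */
    static boolean hasValidMimetype(File epub) throws IOException {
        try (ZipFile zip = new ZipFile(epub)) {
            ZipEntry entry = zip.getEntry("mimetype");
            if (entry == null) {
                return false; // no mimetype entry at all
            }
            try (InputStream in = zip.getInputStream(entry)) {
                // readAllBytes() requires Java 9+
                String content = new String(in.readAllBytes(), StandardCharsets.US_ASCII);
                return content.equals("application/epub+zip");
            }
        }
    }
}
```

It could be called as MimetypeZipFileCheck.hasValidMimetype(new File("EpubGuide-hxa7241.epub")).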
Original issue: http://code.google.com/p/epubcheck/issues/detail?id=2