-
Notifications
You must be signed in to change notification settings - Fork 402
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
check that the manifest only lists publication resources #1452
Comments
@mattgarrish do you think that check would be useful and/or reasonable? Existing EPUBs may list things in the manifest that are not strictly speaking publication resources (e.g. unused images). As we did not check this in the past, so I'm wondering if it's opening a can of worms… |
The requirement is only there to keep linked resources and resources travelling in the container from littering the manifest, since reading systems don't need these (I guess it might keep them from being extracted for reading). It was never meant to trap former publication resources that are still listed but have stopped being referenced, so I agree this does seem like a can of worms to implement now. It is useful to know there are unused resources still kicking around in the publication from a cleanup perspective, but I don't know this is worth the headache of making errors. And I don't think there's any simple fix, either here or for the spec. We're just too far along in EPUB 3. What if we deliberately ignore the spec and make it a usage message so at least publishers will know if a resource is still used but without adding new errors? |
Yes, it looks like a good compromise, I was thinking the same.
IMO the requirement is too strong. We already say that publication resources MUST be listed; we don't need to add that others MUST NOT, especially when this cannot be checked. Also, EPUBCheck reports files present in the container but not listed in the manifest as a WARNING ( In any case, I for one always have a little difficulty understanding what is a publication resource. For instance, when the exemption section says:
if a resource is not for use by RS, is it even a publication resource, and in that case is it even relevant to create exemption rules for it? |
That's what makes wading into this so complicated. We've carved out a lot of niche cases over the years for resources that don't quite conform to the standard. But I was forgetting last night that resources travelling in the container still have to be listed in the manifest. It's just that they're exempt from fallback requirements. How do you differentiate a resource travelling in the container from one that used to be a publication resource and is no longer in use, though? That still remains the problem I don't see a way of writing around. But that particular problem aside, the only resources in the container that shouldn't be in the manifest are linked resources (the ones that aren't also publication resources!) and the stuff in META-INF. It's not clear why you don't list files in META-INF in the manifest, though -- there should probably be a statement that because they're reserved files for processing the OCF, they don't need listing. If we add that statement, then I think we could take the problematic "must not" here and turn it into a note that just mentions the resources that don't get listed (or even just remove it altogether, as at that point the manifest requirements would cover all cases). That would only leave the problem of former publication resources / resources travelling in the container, but I think that's a legit case for a usage message. |
Ok, so assuming w3c/epub-specs#2506 is merged, that leaves us with the following EPUBCheck behavior:
About the last point: EPUBCheck historically warns about those (since #58, which was released in v1.1, 12 years ago).
I'm still slightly hesitant to change this, based on @dauwhe's original comment in #58 (and the following discussion):
|
Ya, I thought it was a new warning, but if it's always been around... But, the problem is the new "allowed to travel in the container without being used" resources (data sets, etc.). We can't make one camp happy without annoying the other. We probably should have added a manifest property for the new case to make clear they've been added intentionally. What do you think @iherman? Doesn't seem like too big an ask, and it's not like there's an established base using the container to carry files around that we need to worry about yet. I just have no idea what we'd name the property. |
I had two worries:
It seems that both of my worries are moot, so I am fine with this change. I have added some additional (minor) comments to w3c/epub-specs#2507. B.t.w., if the group decides it is not a good time to make the change, we can postpone it for later. It would not create any practical problems. |
EPUBCheck used to report any resource found in the the container but not listed in the manifest as a warning (since #58 was fixed, in v1.1). But the EPUB specification does not require that. This commit downgrades the severity of `OPF-003` to a usage report. See also #1452 See also w3c/epub-specs#563
This PR introduces a new usage message (`OPF-097`) reported when the manifest includes a resource for which no reference was found in content. This is legit for instance for exempt resources (like data files included in the container). But it is likely an author error otherwise, so reporting a usage message can help to detect that. We do not report an error since: - references can be added via scripting, so the check can only be informative for scripted content - EPUBCheck historically required all container resources to be listed in the manifest (not doing so was reported as a warning, `OPF-003`), which is conflicting with a requirement to **not** list unsused resources Close #1452
The spec says:
I don't think this is covered by EPUBCheck.
For each manifest item, we could lookup the reference registry to see if a reference to the resource was found, and issue an error if not.
We would issue a mere usage message if the EPUB contains scripted content, as the item could potentially be a legit publication resource used in scripting.
The text was updated successfully, but these errors were encountered: