XML output and error handling #486

Closed
mkraetke opened this issue Nov 25, 2014 · 13 comments
Labels
type: improvement The issue suggests an improvement of an existing feature

@mkraetke
Contributor

The XML output from epubcheck is very constrained. The message elements contain only a plaintext message with the encountered error.

<message>
  OPF-003, WARN, [Item 'iTunesMetadata.plist' exists in the ePub, 
  but is not declared in the OPF manifest.], 978XXXXXXXXXX.epub
</message>

Currently, automated error handling is only possible by parsing the message string. It would be helpful to have attributes alongside the error message that provide:

  • error code
  • file (e.g. 'iTunesMetadata.plist', '978XXXXXXXXXX.epub')
  • error type (e.g. 'error', 'warning')
  • source path (optional)
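
To illustrate the string analysis mentioned above, here is a minimal Python sketch. It assumes the message text always follows the layout `CODE, SEVERITY, [text], file`, which is inferred from the single example in this issue and not from any specification. The helper name `parse_message` is hypothetical.

```python
import re

# Assumed layout of the current message text: "CODE, SEVERITY, [text], file".
# The pattern is inferred from the one example in this issue, not from a spec.
MESSAGE_RE = re.compile(
    r"^(?P<code>[A-Z]+-\d+),\s*(?P<severity>\w+),\s*\[(?P<text>.*)\],\s*(?P<file>.+)$"
)


def parse_message(raw: str) -> dict:
    """Split a plain-text epubcheck message into its parts (hypothetical helper)."""
    normalized = " ".join(raw.split())  # collapse line breaks and indentation
    match = MESSAGE_RE.match(normalized)
    if match is None:
        raise ValueError(f"unrecognized message format: {normalized!r}")
    return match.groupdict()


raw = """
  OPF-003, WARN, [Item 'iTunesMetadata.plist' exists in the ePub,
  but is not declared in the OPF manifest.], 978XXXXXXXXXX.epub
"""
parsed = parse_message(raw)
print(parsed["code"], parsed["severity"], parsed["file"])
```

A pattern like this breaks as soon as the punctuation or field order of the message changes, which is the main argument for exposing the fields as attributes.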

The XML output could look like this:

<message code="OPF-003" type="warning" sourcepath="/package/manifest[1]">
  Item 'iTunesMetadata.plist' exists in the ePub,
  but is not declared in the OPF manifest.
</message>
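
With attributes like these, a consumer would no longer need string analysis. The following Python sketch reads the proposed element with the standard library. The element is the proposal from this issue, not a format epubcheck is known to emit.

```python
import xml.etree.ElementTree as ET

# The message element proposed in this issue (not an existing epubcheck format).
proposed = """<message code="OPF-003" type="warning" sourcepath="/package/manifest[1]">
  Item 'iTunesMetadata.plist' exists in the ePub,
  but is not declared in the OPF manifest.
</message>"""

message = ET.fromstring(proposed)
code = message.get("code")
severity = message.get("type")
sourcepath = message.get("sourcepath")  # optional in the proposal, so may be None
text = " ".join(message.text.split())   # normalize whitespace in the message body

print(code, severity, sourcepath)
```
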
@tofi86 tofi86 added this to the 4.0 milestone Nov 25, 2014
@tofi86
Collaborator

tofi86 commented Nov 25, 2014

I think with 3.0.1 we used some kind of standard XML format. It seems this got mixed up, or it isn't appropriate for the new message format...

@mkraetke
Contributor Author

I've used version 4.0.0-alpha11 to generate the output above. Is there a specification available for the new message format?

@rdeltour
Member

Yes, AFAIK @mkraetke is right: the XML output is based on the JHove schema (introduced by @tledoux if I remember correctly).

I agree that it would be nice to have an alternative, more detailed format.

Note: @mkraetke in case you're interested, see this XSLT example of how EpubCheck's XML is naïvely parsed and converted to a custom format at @daisy.

@mkraetke
Contributor Author

Thank you for the example, @rdeltour! My colleague @polypunkt developed this XSLT as a workaround. A more detailed XML output would be far better, though, and not hard to develop.

@rdeltour
Member

Note that with the 4.x branch you can also produce JSON output. In addition, all errors have a distinct error code, so it should be easier to extract or manipulate the message.

@tofi86
Collaborator

tofi86 commented Nov 25, 2014

Well, the JHove schema at least has a @severity attribute, which can be used for epubcheck severities, and we could use the @subMessage attribute for the epubcheck error code.

What do you think?

Otherwise we should probably switch to a new schema....

@mkraetke
Contributor Author

XProc's error vocabulary could be an option:

<c:error
  name? = NCName
  type? = QName
  code? = QName
  href? = anyURI
  line? = integer
  column? = integer
  offset? = integer>
  (string | anyElement)*
</c:error>
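
As a rough illustration of how an epubcheck message might map onto this vocabulary, the Python sketch below builds a `c:error` element. The helper `make_error` and the choice of which epubcheck field goes into which attribute are assumptions for illustration only. The namespace is the one bound to the `c` prefix in XProc 1.0.

```python
import xml.etree.ElementTree as ET

# Namespace bound to the "c" prefix in XProc 1.0.
XPROC_STEP_NS = "http://www.w3.org/ns/xproc-step"
ET.register_namespace("c", XPROC_STEP_NS)


def make_error(code, href, text, line=None, column=None):
    """Build a c:error element (hypothetical helper, not epubcheck code)."""
    error = ET.Element(f"{{{XPROC_STEP_NS}}}error", {"code": code, "href": href})
    # line and column are optional in the vocabulary, so only set them if known
    if line is not None:
        error.set("line", str(line))
    if column is not None:
        error.set("column", str(column))
    error.text = text
    return error


error = make_error(
    code="OPF-003",
    href="978XXXXXXXXXX.epub",
    text="Item 'iTunesMetadata.plist' exists in the ePub, "
         "but is not declared in the OPF manifest.",
)
serialized = ET.tostring(error, encoding="unicode")
print(serialized)
```

Note that the vocabulary has no dedicated severity attribute, so a warning would still be serialized as a `c:error` element.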

@rdeltour
Member

Removing from the 4.0 milestone as it is not critical, a better XML output can be added in a later release.

@rdeltour rdeltour removed this from the 4.0 milestone Jun 16, 2015
@mkraetke
Contributor Author

Thank you. Currently we have a workaround.

@tofi86
Collaborator

tofi86 commented Jun 16, 2015

> Removing from the 4.0 milestone as it is not critical, a better XML output can be added in a later release.

Sure, it's not a critical issue, but changing a report schema is still a major change. Don't you think we should do this in 4.0 rather than in 4.0.2 or 4.1?
If we can agree on a better report schema and we know about your release schedule, @rdeltour, I can possibly take this issue.

> XProc's error vocabulary could be an option:

I'm not very happy with this, as the only obvious severity is "error", expressed by the tag name.
Sure, we could use the type attribute, but this doesn't feel right:

<c:error type="warning" href="...">...

Other suggestions?

@rdeltour
Member

> Sure, it's not a critical issue, but changing a report schema is still a major change. Don't you think we should do this in 4.0 rather than in 4.0.2 or 4.1?
> If we can agree on a better report schema and we know about your release schedule, @rdeltour, I can possibly take this issue.

That's precisely the problem: I do not have a strong suggestion, as I've personally never had a strong need for change (like @mkraetke, I've used workarounds when needed).

If changing the report schema is deemed too big a change for a minor version update, we can still release this change as 4.1, 4.2 or whatever when it's ready.

I'm not saying that I object to including it in 4.0; it's just that I don't have much time to find the best candidate or to contact possible users (via the mailing list) and run a quick survey, so for now I'd rather keep the status quo.
That said, if you have a suggestion and want to implement it and contact the list, go for it! In terms of timeline, the sooner the better. There are still some issues left in the 4.0 milestone, but they shouldn't take long now that the hard part has been done.

@mkraetke
Contributor Author

Schematron Validation Report Language (SVRL) might also provide declarative markup for an epubcheck XML report. The location attribute can be used to store the file path and the XML sourcepath. Messages in other languages can be provided with diagnostic-reference elements carrying the corresponding @xml:lang attributes.
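
A sketch of what such an SVRL fragment might look like, built in Python. Everything about the mapping is an assumption: the use of `id` for the error code, `role` for the severity, the placeholder `test` value, the diagnostic id `OPF-003_de`, and the German text, which is an illustrative translation and not an actual epubcheck message.

```python
import xml.etree.ElementTree as ET

SVRL_NS = "http://purl.oclc.org/dsdl/svrl"
XML_NS = "http://www.w3.org/XML/1998/namespace"
ET.register_namespace("svrl", SVRL_NS)


def svrl(tag):
    """Return the namespace-qualified name of an SVRL element."""
    return f"{{{SVRL_NS}}}{tag}"


# Hypothetical mapping of one epubcheck message onto svrl:failed-assert.
failed = ET.Element(svrl("failed-assert"), {
    "id": "OPF-003",                     # assumption: error code stored as id
    "role": "warning",                   # assumption: severity stored as role
    "location": "/package/manifest[1]",  # XML sourcepath from the proposal
    "test": "false()",                   # placeholder, epubcheck has no XPath test
})
ET.SubElement(failed, svrl("text")).text = (
    "Item 'iTunesMetadata.plist' exists in the ePub, "
    "but is not declared in the OPF manifest."
)
# Illustrative translation carried by a diagnostic-reference with xml:lang.
reference = ET.SubElement(failed, svrl("diagnostic-reference"), {
    "diagnostic": "OPF-003_de",
    f"{{{XML_NS}}}lang": "de",
})
reference.text = (
    "Das Element 'iTunesMetadata.plist' ist im EPUB vorhanden, "
    "aber nicht im OPF-Manifest deklariert."
)

serialized = ET.tostring(failed, encoding="unicode")
print(serialized)
```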

tledoux added a commit to tledoux/epubcheck that referenced this issue Feb 19, 2016
As per suggestion of @tofi86, we add the subMessage and severity
attributes to message to get the exact code.
Refactor the unit tests to check the error code instead of the message
to be more robust on locales.

Fixes w3c#486.
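
The locale argument in this commit message can be sketched as follows. The two message elements are assumptions about the resulting shape (attribute values included), and the German text is an illustrative translation, not taken from epubcheck's message bundles. The point is only that a check on `subMessage` gives the same answer regardless of the message language.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Assumed shape of a message after the commit: severity and subMessage attributes.
MESSAGE_EN = (
    '<message severity="warning" subMessage="OPF-003">'
    "Item 'iTunesMetadata.plist' exists in the ePub, "
    "but is not declared in the OPF manifest.</message>"
)
# Illustrative German rendering of the same message (hypothetical text).
MESSAGE_DE = (
    '<message severity="warning" subMessage="OPF-003">'
    "Das Element 'iTunesMetadata.plist' ist im EPUB vorhanden, "
    "aber nicht im OPF-Manifest deklariert.</message>"
)


def error_code(message_xml: str) -> Optional[str]:
    """Return the error code stored in the subMessage attribute, if any."""
    return ET.fromstring(message_xml).get("subMessage")


# Matching on the code works for both locales; matching on the text would not.
codes = [error_code(m) for m in (MESSAGE_EN, MESSAGE_DE)]
print(codes)
```
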
@tofi86 tofi86 added this to the Next milestone Oct 4, 2016
@tofi86 tofi86 added the type: improvement The issue suggests an improvement of an existing feature label Oct 4, 2016
@tofi86 tofi86 modified the milestones: Next, 4.0.2 Dec 11, 2016
@tofi86
Collaborator

tofi86 commented Nov 29, 2017

See issue #816 for further discussion of a new XML output schema.
