RFC: Message schemas #11

thewilkybarkid · 2018-06-29T12:51:17Z

Problem

Libero needs to allow services to communicate with each other (either through the API or the event bus), and these messages need a structure.

Suggestion

Use RELAX NG as a schema for the messages (both the HTTP API and the event bus)
Require the absolute minimum, provide extensions for common concepts, and allow custom extensions (including embedding other schemas, eg JATS)
Provide clear mappings between parts of other standards (eg JATS+JATS4R, TEI) and Libero definitions
Use XML namespaces instead of versioning schemas
- A breaking change requires a new namespace, the old method remaining available but deprecated (ie schemas are immutable)

Concerns

JSON is easier to work with, but doesn’t handle mixed content
RELAX NG is an opinionated choice, but other existing options/standards don’t meet our requirements (XML Schema is common, as are DTDs in publishing)
- XML Schema is more common than RELAX NG, but doesn’t have the same level of support for extensions
Non-eLife usage so far appears to be purely for JATS content (though this doesn’t meet eLife’s need)
Reinventing the wheel?
Is an immutable schema overcommitting for Libero at an alpha/beta stage?
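
On the mixed-content concern above, here is a minimal sketch, using only Python's standard library, of why inline markup maps awkwardly onto JSON. The element names (`p`, `i`) are illustrative assumptions, not Libero definitions.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment; the element names are assumptions, not Libero definitions.
fragment = "<p>The bacterium <i>E. coli</i> was cultured for 24 hours.</p>"

p = ET.fromstring(fragment)

# In mixed content, text and child elements interleave, and their order matters.
parts = [p.text]
for child in p:
    parts.append((child.tag, child.text))  # the inline element
    parts.append(child.tail)               # the text that follows it

print(parts)
```

A JSON object has no natural slot for text that sits between child elements. The usual workaround is an ordered array of strings and objects, much like `parts` here, which tends to be harder to validate and to consume than the XML original.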

thewilkybarkid · 2018-06-30T07:47:48Z

Non-eLife usage so far appears to be purely for JATS content (though this doesn’t meet eLife’s need)

Had an offline question about this: to clarify, eLife’s scholarly articles are currently sourced from JATS; everything else comes directly from a CMS. Some types (such as blog posts) could fit into JATS without too much trouble, others wouldn’t (eg collections, podcast episodes).

stephenwf · 2018-07-30T12:47:10Z

@thewilkybarkid Does this model still allow a micro-service to offer, through content negotiation, JSON documents?

Also, with versions being dropped in favour of XML namespaces, is there a mechanism over HTTP to request a particular XML document under a specific XML namespace?

Perhaps out of scope for this, but do you think the namespaces will or should match semantic versioning of the components themselves?

thewilkybarkid · 2018-07-31T00:59:22Z

Does this model still allow a micro-service to offer, through content negotiation, JSON documents?

More relevant to #8; requesting a JSON (or whatever) format can work if you choose to do that, as the API should respect HTTP. Libero itself should only recommend one format though.

Also, with versions being dropped in favour of XML namespaces, is there a mechanism over HTTP to request a particular XML document under a specific XML namespace?

Not considered that, but I guess Libero will need to define a content type (even if it's just application/xml, though something more specific would be preferable), so related to the above you could request another type. (We have wondered whether we should specify anything at all, but hard to produce reusable code without something to go on.)
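
To make "respect HTTP" concrete, here is a rough sketch of how a service might choose a representation from the `Accept` header. Everything in it is an assumption: Libero has not defined a content type, so `application/xml` stands in for whatever it ends up being, and `parse_accept` and `negotiate` are hypothetical helpers rather than part of any Libero code. It is deliberately simplified and ignores the precedence of more specific media ranges and explicit `q=0` exclusions.

```python
# Hypothetical media types: no Libero content type has been defined in this thread.
SUPPORTED = ["application/xml", "application/json"]  # first entry is the default


def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, highest q first."""
    ranges = []
    for item in header.split(","):
        media_type, *params = [part.strip() for part in item.split(";")]
        if not media_type:
            continue
        q = 1.0
        for param in params:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        ranges.append((media_type.lower(), q))
    # sorted() is stable, so ties keep the order the client sent.
    return sorted(ranges, key=lambda pair: pair[1], reverse=True)


def negotiate(header, supported=SUPPORTED):
    """Pick a supported media type, or None if nothing is acceptable (HTTP 406)."""
    if not header or not header.strip():
        return supported[0]
    for media_type, q in parse_accept(header):
        if q <= 0:
            continue
        if media_type == "*/*":
            return supported[0]
        for candidate in supported:
            if media_type == candidate:
                return candidate
            # Handle ranges such as "application/*".
            if media_type.endswith("/*") and candidate.startswith(media_type[:-1]):
                return candidate
    return None


print(negotiate("application/json;q=0.5, application/xml"))
print(negotiate("text/html"))
```

Requesting a particular namespace would need something more, such as a more specific media type or a media type parameter; that part is left open here, as it is in the discussion.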

Perhaps out of scope for this, but do you think the namespaces will or should match semantic versioning of the components themselves?

Reading the RFC I realise it's not that clear that only the breaking change would appear in a different namespace, not the whole document. So, take the following example:

<front xmlns="http://libero.pub" xmlns:libero2="http://libero.pub/2" xml:lang="en">
    <id>some-id</id>
    <title>Some title</title>
    <some-element>some-text</some-element>
    <libero2:some-element>
        <libero2:some-sub-element>some-text</libero2:some-sub-element>
        <libero2:some-sub-element>some-more-text</libero2:some-sub-element>
    </libero2:some-element>
</front>

The some-element element has changed from mixed content to element content. That's a breaking change, which requires a namespace change. The rest of the document is still under the original namespace.
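
As a rough illustration of how a consumer sees this, the sketch below parses the example with Python's standard library. The `libero` and `libero2` prefixes in the `NS` mapping are local to this script; only the namespace URIs come from the example.

```python
import xml.etree.ElementTree as ET

document = """
<front xmlns="http://libero.pub" xmlns:libero2="http://libero.pub/2" xml:lang="en">
    <id>some-id</id>
    <title>Some title</title>
    <some-element>some-text</some-element>
    <libero2:some-element>
        <libero2:some-sub-element>some-text</libero2:some-sub-element>
        <libero2:some-sub-element>some-more-text</libero2:some-sub-element>
    </libero2:some-element>
</front>
"""

# Prefixes are local to this script; only the URIs identify the namespaces.
NS = {"libero": "http://libero.pub", "libero2": "http://libero.pub/2"}

front = ET.fromstring(document)

# Same local name, different namespace: two unrelated elements to a parser.
old = front.find("libero:some-element", NS)
new = front.find("libero2:some-element", NS)

old_value = old.text
new_values = [sub.text for sub in new.findall("libero2:some-sub-element", NS)]

print(old.tag, old_value)
print(new.tag, new_values)
```

Because the parser qualifies every element name with its namespace, both forms can sit side by side in one document without ambiguity.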

Migration from the old to new is still controlled:

Enable new syntax
Duplicate data in new syntax
Switch support to new syntax
Remove data from old syntax
Disable old syntax

Which is a cookbook entry, rather than having to support explicit versioning everywhere. Relies on a few differences to eLife's setup though: the API isn't meant for public consumption (you can choose to expose it though), elements that are unknown are to be ignored, and most things should be optional. A new element is also preferable to a new namespace (as long as the new name makes sense).
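
A consumer that copes with every stage of that migration could look roughly like the sketch below. It assumes the two namespaces from the example above; `read_some_element` is a hypothetical helper, and the `unknown-thing` element exists only to show that unrecognised elements are skipped.

```python
import xml.etree.ElementTree as ET

OLD = "{http://libero.pub}some-element"
NEW = "{http://libero.pub/2}some-element"
NEW_SUB = "{http://libero.pub/2}some-sub-element"


def read_some_element(front):
    """Hypothetical helper: return the values, whichever syntax was sent."""
    new = front.find(NEW)
    if new is not None:  # prefer the new syntax when it is present
        return [sub.text for sub in new.findall(NEW_SUB)]
    old = front.find(OLD)
    if old is not None:  # fall back to the deprecated syntax
        return [old.text]
    return []  # the element is optional


# Before the migration: old syntax only.
old_only = ET.fromstring(
    '<front xmlns="http://libero.pub"><some-element>a</some-element></front>'
)

# During the migration: data duplicated in both syntaxes.
both = ET.fromstring(
    '<front xmlns="http://libero.pub" xmlns:l2="http://libero.pub/2">'
    "<some-element>a</some-element>"
    "<l2:some-element><l2:some-sub-element>a</l2:some-sub-element>"
    "<l2:some-sub-element>b</l2:some-sub-element></l2:some-element>"
    "</front>"
)

# After the migration, plus an element this consumer has never heard of.
new_only = ET.fromstring(
    '<front xmlns="http://libero.pub" xmlns:l2="http://libero.pub/2">'
    "<unknown-thing>ignored</unknown-thing>"
    "<l2:some-element><l2:some-sub-element>b</l2:some-sub-element></l2:some-element>"
    "</front>"
)

for doc in (old_only, both, new_only):
    print(read_some_element(doc))
```

The consumer never checks a version number: it looks up the names it understands and leaves everything else alone, which is what lets producers and consumers move through the steps independently.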

As to the format the namespaces should take, not sure. Other XML formats tend to use dates. Libero components won't be individually versioned, but could be grouped together even if breaking changes were made at different times to avoid namespace proliferation. That said, we really should avoid having to make any breaking changes.

thewilkybarkid added the rfc label (A request for comments) Jun 29, 2018

thewilkybarkid mentioned this issue Jan 28, 2019

RFC: JATS support #21

Open
