This repository has been archived by the owner on Apr 27, 2023. It is now read-only.

RFC: Message schemas #11

Open
thewilkybarkid opened this issue Jun 29, 2018 · 3 comments
Labels
rfc A request for comments

Comments

@thewilkybarkid
Contributor

Problem

Libero needs to allow services to communicate with each other (either through the API or the event bus), and these messages need a defined structure.

Suggestion

  • Use RELAX NG as a schema for the messages (both the HTTP API and the event bus)
  • Require the absolute minimum, and provide extensions for common concepts and allow custom extensions (including embedding other schemas, eg JATS)
  • Provide clear mappings between parts of other standards (eg JATS+JATS4R, TEI) and Libero definitions
  • Use XML namespaces instead of versioning schemas
    • A breaking change requires a new namespace, the old method remaining available but deprecated (ie schemas are immutable)

Concerns

  • JSON is easier to work with, but doesn’t handle mixed content
  • RELAX NG is an opinionated choice, but other existing options/standards don’t meet our requirements (XML Schema is common, as are DTDs in publishing)
    • XML Schema is more common than RELAX NG, but doesn’t have the same level of support for extensions
  • Non-eLife usage so far appears to be purely for JATS content (though this doesn’t meet eLife’s need)
  • Reinventing the wheel?
  • Is an immutable schema overcommitting for Libero at an alpha/beta stage?
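
To illustrate the mixed-content concern, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The title fragment and its inline i element are made up for illustration and are not taken from any Libero schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: text interleaved with an inline element (mixed content).
title = ET.fromstring("<title>Effects of <i>E. coli</i> on growth</title>")

italic = title.find("i")

# XML keeps the order of text and markup: the text before the child,
# the child's own text, and the text that follows the child.
parts = [title.text, italic.text, italic.tail]

# itertext() walks all text nodes in document order.
plain_text = "".join(title.itertext())

print(parts)
print(plain_text)
```

A JSON representation would need its own convention (for example, an array alternating strings and objects) to preserve the same ordering.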
@thewilkybarkid added the rfc (A request for comments) label on Jun 29, 2018
@thewilkybarkid
Contributor Author

> Non-eLife usage so far appears to be purely for JATS content (though this doesn’t meet eLife’s need)

Had an offline question about this: to clarify, eLife’s scholarly articles are currently sourced from JATS, everything else comes directly from a CMS. Some types (such as blog posts) could fit into JATS without too much trouble, others wouldn’t (eg collections, podcast episodes).

@stephenwf
Member

@thewilkybarkid Does this model still allow a micro-service to offer, through content negotiation, JSON documents?

Also, with versions being dropped in favour of XML namespaces, is there a mechanism over HTTP to request a particular XML document under a specific XML namespace?

Perhaps out of scope for this, but do you think the namespaces will or should match semantic versioning of the components themselves?

@thewilkybarkid
Contributor Author

> Does this model still allow a micro-service to offer, through content negotiation, JSON documents?

More relevant to #8; requesting a JSON (or whatever) format can work if you choose to do that, as the API should respect HTTP. Libero itself should only recommend one format though.

> Also, with versions being dropped in favour of XML namespaces, is there a mechanism over HTTP to request a particular XML document under a specific XML namespace?

Not considered that, but I guess Libero will need to define a content type (even if it's just application/xml, though something more specific would be preferable), so related to the above you could request another type. (We have wondered whether we should specify anything at all, but hard to produce reusable code without something to go on.)
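
As a rough sketch of what respecting HTTP content negotiation could look like, the Python below picks a representation from an Accept header. The function name choose_media_type and the idea of a service also offering application/json are assumptions for illustration only; Libero has not defined a content type here. The matching is simplified and does not implement the full precedence rules for Accept (for example, more specific ranges overriding wildcards).

```python
def choose_media_type(accept_header, supported):
    """Return the supported media type the client ranks highest, or None.

    Simplified: ranks by q-value only and ignores specificity precedence.
    """
    ranked = []
    for item in accept_header.split(","):
        media_type, _, params = item.strip().partition(";")
        media_type = media_type.strip().lower()
        if not media_type:
            continue
        quality = 1.0  # default weight when no q parameter is given
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        ranked.append((quality, media_type))

    # Stable sort: equal weights keep the order the client listed them in.
    ranked.sort(key=lambda pair: pair[0], reverse=True)

    for quality, media_type in ranked:
        if quality <= 0:
            continue  # q=0 means "not acceptable"
        for candidate in supported:
            if media_type in ("*/*", candidate):
                return candidate
            if media_type.endswith("/*") and candidate.startswith(media_type[:-1]):
                return candidate
    return None


# Hypothetical service that recommends XML but can also produce JSON.
SUPPORTED = ["application/xml", "application/json"]
print(choose_media_type("application/json;q=0.5, application/xml", SUPPORTED))
```

Listing the recommended format first in the supported list means a client sending a wildcard receives it by default.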

> Perhaps out of scope for this, but do you think the namespaces will or should match semantic versioning of the components themselves?

Rereading the RFC, I realise it isn't clear that only the breaking change would move to a different namespace, not the whole document. So, take the following example:

<front xmlns="http://libero.pub" xmlns:libero2="http://libero.pub/2" xml:lang="en">
    <id>some-id</id>
    <title>Some title</title>
    <some-element>some-text</some-element>
    <libero2:some-element>
        <libero2:some-sub-element>some-text</libero2:some-sub-element>
        <libero2:some-sub-element>some-more-text</libero2:some-sub-element>
    </libero2:some-element>
</front>

The some-element element has changed from mixed content to element content. That's a breaking change, so it requires a namespace change. The rest of the document is still under the original namespace.
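
For illustration, here is a minimal Python sketch (standard library only) of how a consumer might read both forms from the example document. The prefixes l and l2 are local to this script; only the namespace URIs from the example matter.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """\
<front xmlns="http://libero.pub" xmlns:libero2="http://libero.pub/2" xml:lang="en">
    <id>some-id</id>
    <title>Some title</title>
    <some-element>some-text</some-element>
    <libero2:some-element>
        <libero2:some-sub-element>some-text</libero2:some-sub-element>
        <libero2:some-sub-element>some-more-text</libero2:some-sub-element>
    </libero2:some-element>
</front>
"""

# Map script-local prefixes to the namespace URIs used in the document.
NS = {"l": "http://libero.pub", "l2": "http://libero.pub/2"}

front = ET.fromstring(DOCUMENT)

# Old syntax: some-element in the original namespace holds text directly.
old_value = front.find("l:some-element", NS).text

# New syntax: some-element in the second namespace holds sub-elements.
new_values = [
    sub.text
    for sub in front.findall("l2:some-element/l2:some-sub-element", NS)
]

print(old_value)
print(new_values)
```

Both elements share the local name some-element, but the parser treats them as distinct because their namespaces differ.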

Migration from the old to new is still controlled:

  1. Enable new syntax
  2. Duplicate data in new syntax
  3. Switch support to new syntax
  4. Remove data from old syntax
  5. Disable old syntax

This becomes a cookbook entry, rather than having to support explicit versioning everywhere. It relies on a few differences from eLife's setup though: the API isn't meant for public consumption (you can choose to expose it though), unknown elements are to be ignored, and most things should be optional. A new element is also preferable to a new namespace (as long as the new name makes sense).
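
A consumer that tolerates this migration could look something like the Python sketch below. The helper name read_some_element and the unknown-element in the sample data are hypothetical; the namespaces are the ones from the example above. It prefers the new syntax, falls back to the old one, treats the element as optional, and never inspects elements it doesn't know.

```python
import xml.etree.ElementTree as ET

NS = {"l": "http://libero.pub", "l2": "http://libero.pub/2"}


def read_some_element(front):
    """Return the values of some-element, preferring the newer namespace."""
    new = front.find("l2:some-element", NS)
    if new is not None:
        return [sub.text for sub in new.findall("l2:some-sub-element", NS)]
    old = front.find("l:some-element", NS)
    if old is not None:
        return [old.text]
    return []  # the element is optional, so absence is not an error


# Before migration: only the old syntax, plus an element this consumer ignores.
before = ET.fromstring(
    '<front xmlns="http://libero.pub">'
    "<some-element>some-text</some-element>"
    "<unknown-element>ignored</unknown-element>"
    "</front>"
)

# After migration: only the new syntax remains.
after = ET.fromstring(
    '<front xmlns="http://libero.pub" xmlns:libero2="http://libero.pub/2">'
    "<libero2:some-element>"
    "<libero2:some-sub-element>a</libero2:some-sub-element>"
    "<libero2:some-sub-element>b</libero2:some-sub-element>"
    "</libero2:some-element>"
    "</front>"
)

print(read_some_element(before))
print(read_some_element(after))
```

Because the reader checks the new namespace first, it keeps working through every step of the migration without a version switch.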

As to the format the namespaces should take, not sure. Other XML formats tend to use dates. Libero components won't be individually versioned, but could be grouped together even if breaking changes were made at different times to avoid namespace proliferation. That said, we really should avoid having to make any breaking changes.
