RFC: JATS support #21
Comments
Related Browser work at libero/browser#30 and schema changes at libero/schemas#14.
Another way to see this issue is: when there is a discrepancy between two services (client/server, upstream/downstream like
I'd assume the JATS used as the original input is not identical to the JATS served by the API, so there can be processing steps that substitute in URLs. The IIIF format is a strong dependency though, as it evolves more frequently than JATS (unstable should depend on stable rather than the other way around) and is not ubiquitous.
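A minimal sketch of such a processing step, assuming the substitution targets `xlink:href` on JATS `<graphic>` elements. The base URL and the function name are hypothetical and only illustrate the idea; they are not part of any Libero service.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"
HREF = f"{{{XLINK_NS}}}href"

# Hypothetical base URL for an image server; not taken from the discussion.
ASSET_BASE = "https://assets.example.org/iiif/"


def substitute_asset_urls(jats_xml: str, base: str = ASSET_BASE) -> str:
    """Return JATS with relative <graphic> hrefs rewritten to absolute URLs."""
    # Keep the conventional "xlink" prefix when serialising.
    ET.register_namespace("xlink", XLINK_NS)
    root = ET.fromstring(jats_xml)
    for graphic in root.iter("graphic"):
        href = graphic.get(HREF)
        # Only rewrite relative references; leave absolute URLs untouched.
        if href and "://" not in href:
            graphic.set(HREF, base + href)
    return ET.tostring(root, encoding="unicode")
```

Keeping this as a separate step means the JATS input itself never needs to know about IIIF; only the served copy carries the rewritten URLs.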
If we have a top-level Libero element acting as a wrapper, this should be dealt with by the same tooling that would validate Libero documents in general. That relies on integrating, for example, the RelaxNG JATS definitions into the schemas so that they fit with the Libero RelaxNG ones that are there now.
Agreed. (Not restricted to JATS either, as all assets will probably get moved around etc.)
Embedding is quite simple. But
was meant to refer to actually using the data, eg Browser being able to convert different types of XML into consistent HTML, Search being able to index different types of XML.
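A rough sketch of what "different types of XML into consistent HTML" could look like: dispatch on the root element and normalise to one output. The Libero namespace `http://libero.pub`, the `front`/`title` element names, and all function names here are assumptions for illustration, not the actual Browser implementation.

```python
import xml.etree.ElementTree as ET
from html import escape

# Assumed namespace for Libero's native schema.
LIBERO_NS = "http://libero.pub"


def _text(element) -> str:
    """Concatenate all text inside an element, or return '' if it is missing."""
    return "".join(element.itertext()).strip() if element is not None else ""


# One title extractor per supported root element (hypothetical mapping).
EXTRACTORS = {
    "article": lambda root: _text(root.find(".//article-title")),
    f"{{{LIBERO_NS}}}front": lambda root: _text(root.find(f"{{{LIBERO_NS}}}title")),
}


def render_title(xml: str) -> str:
    """Render the document title as an <h1>, whatever the input schema."""
    root = ET.fromstring(xml)
    try:
        extract = EXTRACTORS[root.tag]
    except KeyError:
        raise ValueError(f"unsupported document type: {root.tag}")
    return f"<h1>{escape(extract(root))}</h1>"
```

Each extra schema a service has to understand adds an entry to a table like `EXTRACTORS`, which is the cost being weighed here.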
Examples:
I guess some very common information that can be used in other services like
Since the listing-based services would return ids only, they wouldn't necessarily need to know about JATS.
Sorry if this is very basic, but I'm trying to understand: what is the definition of Libero's data model?
@GiancarloFusiello, essentially what's in https://github.com/libero/schemas. Rather than being one big schema, it's broken down into the core (ie the required part, which is as small as possible), then a whole load of extensions that you can enable (so the opposite of JATS, which is one massive schema that you have to cut down to the parts that you want). Currently there's only 1 extension (italic text) alongside the required parts (eg content item). The walking skeleton has this in more detail: a bunch of schemas for different publishers, each composed of a set of extensions, but with some customisations. So you get your own schema for your content, sharing where possible but not blocked from doing anything.
This is what makes me inclined to 👍 this RFC: we can build the current features now with a borrowed schema (some version of JATS) and introduce a different (Libero) schema when we know more about the complexity of putting all the services together.
Could the default be DAR? It seems DAR will need to be extended to cater for missing use cases. Or do you mean it's too strict about what the structure should look like? If it's the latter, and the JATS served by the API, could we do an up-front transformation step like Giorgio suggested, to make it DAR? There may already be existing efforts to normalise JATS. Patrice from GROBID has, for example, created Pub2TEI (the output here is obviously TEI, which we could also convert back to JATS if we wanted to, though that might make the pipeline more complicated). Using TEI altogether could be another option. It may not cater for IJM, although with a translation tool like Pub2TEI it might? *None of the above is meant to favour one standard over the other. Having the option to use an existing "standard" seems to make sense to me.
DAR is very strict and is being developed for an editing tool, so decisions are made based on getting one product ready for use. Because of the decisions being made for it, it could likely alienate 50% of publishers because their XML decisions would not work in it. One example is how authors and affiliations are linked to each other. JATS seems like a good standard to work with, as most publishers who create full text are familiar with it, and the learning curve of a new standard may be off-putting. IMO :-)
Problem
Libero's data model is planned to support schemas like JATS (see #11), but developing solely based on Libero's native schema is slow (as it doesn't exist yet). We have to model all the possibilities, which is not amenable to rushing (especially as we don't want to version it). For example, libero/publisher#5 would require a lot of schema work.
Users like eLife will need non-JATS content, but IJM only have scholarly content and have already investigated converting their archive to JATS.
Suggestion
Concerns