understanding stencila/schema architecture #62
Comments
@100ideas apologies for taking so long to respond to this. I got caught up with various other things and forgot to come back to it.
That sounds super cool, and useful! I'd be interested in finding out more.
I initially found it difficult to reconcile how these two technologies fit and complement each other. Personally, I think that schema.org is a poor choice of name, and that it adds to the confusion. Because JSON-LD is closely aligned to schema.org (although, of course, one can use another vocab), it's easy for ppl to think that JSON-LD is an alternative to JSON-Schema. It became clearer in my mind how they complemented each other when I started thinking about JSON-Schema being for data modelling and validation, JSON-LD as a mechanism for mapping between vocabularies, and schema.org as one possible vocabulary. Maybe that is more obvious at the outset for some people but it wasn't for me.
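To make that distinction concrete, here is a minimal sketch in Python using only the standard library. The `Person` record, its property names, and the `check` and `expand` helpers are hypothetical illustrations, not taken from stencila/schema. `check` covers only `required` and `type`, a tiny subset of JSON-Schema, and `expand` is a simplified stand-in for real JSON-LD expansion; an actual project would use full libraries for both.

```python
# JSON-Schema: describes the *shape* of the data and is used for validation.
# (Hypothetical schema, for illustration only.)
person_schema = {
    "type": "object",
    "required": ["name"],
    "properties": {
        "name": {"type": "string"},
        "email": {"type": "string"},
    },
}

# JSON-LD: the "@context" maps local property names onto a vocabulary,
# here schema.org. It says what the names *mean*, not what shape is valid.
person_doc = {
    "@context": {
        "name": "http://schema.org/name",
        "email": "http://schema.org/email",
    },
    "@type": "http://schema.org/Person",
    "name": "Ada",
    "email": "ada@example.org",
}

# Mapping from JSON-Schema type names to Python types (subset).
JSON_TYPES = {"string": str, "object": dict, "number": (int, float), "boolean": bool}


def check(doc, schema):
    """Return a list of validation errors (subset: 'required' and 'type')."""
    errors = []
    for key in schema.get("required", []):
        if key not in doc:
            errors.append(f"missing required property: {key}")
    for key, rule in schema.get("properties", {}).items():
        if key in doc and not isinstance(doc[key], JSON_TYPES[rule["type"]]):
            errors.append(f"{key}: expected {rule['type']}")
    return errors


def expand(doc):
    """Rewrite local property names to the IRIs given in '@context'."""
    context = doc["@context"]
    return {context.get(k, k): v for k, v in doc.items() if k != "@context"}
```

Swapping the `@context` for one that points at a different vocabulary would change the output of `expand` without touching `person_schema`, which is the sense in which the two are complementary.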
The approach evolved. It began with us having an implicit, poorly documented schema for transferring data (e.g. tabular, column-wise data) between languages. We realised that we were probably reinventing the wheel there, so we looked at other schemas such as Avro. We were also documenting our code execution API using OpenAPI (which uses JSON-Schema). We then realised that using JSON-Schema for everything in an executable document (like Jupyter Notebooks but with finer granularity e.g
We originally took a TypeScript-first approach and were defining schemas using TypeScript classes with decorators on properties and then generating JSON Schema from them. That was fine, but we decided to invert the relationship to be language agnostic, and more focused on data modelling and validation. We use a custom
The decision to use a custom
Also, we want to make it really easy for people to understand and contribute to the schemas. We have found that using
So in summary, the YAML-with-custom-extensions approach provides a lighter-weight, less intimidating way to write schemas (which ultimately get translated to JSON-Schema documents; analogous to authoring Markdown that gets translated to HTML, I suppose). Hope that is of use. Again, apologies for the slow response.
Hey y'all, I'm working on an experimental schema-autosuggest frontend interface that helps a user consume, transform, mash up, & remap data tables. I've been reviewing the source code of stencila & stencila/schema to see how I might implement something that will be as broadly useful as possible.
I am really torn about json-schema & json-ld. Ultimately I want to use both, like you are, to ensure data workflows can be serialized and reused in an unambiguous & repeatable manner. But my design intent is to allow the user to be as initially unconstrained as possible as they create and structure hierarchies of tabular data, nudging them towards standard / published schemas without requiring them to restructure their raw data before doing anything else. What I need to do - in the frontend experience - is help users explore various ways of mashing up, overriding, and fragmenting existing json-schema as they build a data processing workflow, then reconcile the resulting definitions with the preexisting ones in the most parsimonious (least redundant) way.
It seems like stencila/schema has an architecture designed to support modular, hierarchical, reusable schema definitions, and I'd like to know more about how this particular approach was developed and how the major parts of it work together.
I can roughly see that schemas are initially defined in lightweight yaml - that's nice - and then compiled into a hierarchical json-schema (and ts-definitions) at runtime. In particular, schemas can [extend](https://github.com/stencila/schema/blob/master/CONTRIBUTING.md#the-extends-keyword) other schemas, setting up a class-like inheritance mechanism. Why did you decide to design it this way, and would it have been possible to use native json-schema $refs or json hyper-schema links instead? Were those options too verbose / user-unfriendly?
(I initially posted this over in the stencila gitter room but it got a bit involved so I think this might be a better place for it)
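As a rough illustration of the trade-off asked about here, the sketch below contrasts a class-like `extends` keyword with the native JSON-Schema composition of `allOf` plus `$ref`. It is written in Python with plain dictionaries standing in for parsed YAML. The `Thing` and `Person` definitions and the `resolve_extends` helper are hypothetical; this is not a description of how stencila/schema actually implements the keyword.

```python
# Hypothetical schemas as they might look after parsing lightweight YAML.
schemas = {
    "Thing": {
        "properties": {"name": {"type": "string"}},
        "required": ["name"],
    },
    "Person": {
        "extends": "Thing",
        "properties": {"email": {"type": "string"}},
    },
}


def resolve_extends(name, schemas):
    """Flatten an 'extends' chain into one plain JSON-Schema object."""
    schema = schemas[name]
    parent_name = schema.get("extends")
    # Resolve the parent first so that grandparents are included too.
    parent = resolve_extends(parent_name, schemas) if parent_name else {}
    # Child properties override parent properties of the same name.
    properties = {**parent.get("properties", {}), **schema.get("properties", {})}
    required = sorted(set(parent.get("required", [])) | set(schema.get("required", [])))
    flat = {"type": "object", "properties": properties}
    if required:
        flat["required"] = required
    return flat


# The same relationship written with only native JSON-Schema keywords.
person_native = {
    "allOf": [
        {"$ref": "#/definitions/Thing"},
        {"properties": {"email": {"type": "string"}}},
    ]
}

person_flat = resolve_extends("Person", schemas)
```

In the flattened form the inherited `name` property appears directly on `Person`, which can be easier to read and to generate types from, whereas the `allOf` form keeps the link to the parent and leaves the merging to whatever tool consumes the schema.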