RO-Crate "license" field should be URI #456
Comments
The below tries to capture a Skype conversation on 2020-11-03 between @stain @fbacall @stuzart @alaninmcr.

This would improve consistency with the schema.org generation, which currently uses https://github.com/seek4science/seek/blob/master/lib/seek/license.rb to parse https://github.com/seek4science/seek/blob/master/public/od_licenses.json from https://licenses.opendefinition.org/licenses/groups/all.json aka https://opendefinition.org/licenses/api/

Although opendefinition.org and opensource.org use SPDX identifiers from https://spdx.org/, they carry additional information, such as whether a license is suitable for data or software, which is important in SEEK.

In Open Source, SPDX is considered the gold standard for license information, particularly the short SPDX identifiers embedded inside source code files as comments (thus allowing different files to have different licenses).
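To illustrate the per-file identifiers mentioned above, here is a minimal sketch that pulls an `SPDX-License-Identifier` tag out of a source file's comments. The helper name `spdx_identifier` and the sample file contents are made up for illustration; only the `SPDX-License-Identifier:` tag itself comes from SPDX.

```ruby
# Matches the SPDX tag inside a line comment or a block comment,
# ignoring a trailing "*/" or "-->" comment terminator.
SPDX_TAG = /SPDX-License-Identifier:\s*(.+?)\s*(?:\*\/|-->)?\s*$/

# Hypothetical helper: return the first SPDX identifier (or expression)
# declared in the given source text, or nil if there is none.
def spdx_identifier(source)
  source.each_line do |line|
    match = SPDX_TAG.match(line)
    return match[1] if match
  end
  nil
end

# Two made-up files in the same repository with different licenses.
ruby_file = "# SPDX-License-Identifier: MIT\nputs 'hello'\n"
c_file = "/* SPDX-License-Identifier: Apache-2.0 */\nint main(void) { return 0; }\n"

puts spdx_identifier(ruby_file)
puts spdx_identifier(c_file)
```

Because the tag lives in each file, a crate-level `license` may not tell the whole story for a multi-file workflow.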
For packages or repositories, SPDX also has something called SPDX documents, which are either a simple yet unwieldy long text file or an RDF document. In here we find spdx:licenseId, which we could in theory use from RO-Crate as a way to equalize across

A potential gotcha here is that this would also allow license expressions, e.g. for dual licensing or exceptions, so these don't always map straight onto URLs like https://spdx.org/licenses/MIT. Many of the licenses have their own URIs as well, so we could have many potential inconsistencies.
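The gotcha about license expressions can be sketched roughly like this: a single short identifier maps onto an `https://spdx.org/licenses/` URL, while a compound expression has no such URL. The function name `spdx_license_url` is hypothetical, and the check is purely syntactic; it does not confirm that the identifier actually exists in the SPDX license list.

```ruby
# A single SPDX short identifier: letters, digits, ".", "+" and "-" only.
# Expressions such as "MIT OR Apache-2.0" contain spaces and fail this test.
SPDX_ID = /\A[A-Za-z0-9.+-]+\z/

# Hypothetical helper: build the spdx.org URL for a single identifier,
# or return nil when the value looks like a license expression.
def spdx_license_url(value)
  return nil unless SPDX_ID.match?(value)

  "https://spdx.org/licenses/#{value}"
end

puts spdx_license_url("MIT").inspect
puts spdx_license_url("MIT OR Apache-2.0").inspect
puts spdx_license_url("GPL-2.0-only WITH Classpath-exception-2.0").inspect
```

Anything that returns `nil` here would need a different representation in the crate than a plain `{"@id": ...}` reference.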
Some licenses like BSD 3-Clause are templates that need to be completed with their own copyright. Thus it would be insufficient to say the

https://github.com/spdx/license-list-data provides more Linked Data information from SPDX, which in theory could be combined with the https://licenses.opendefinition.org/licenses/groups/all.json data using the common SPDX identifier, for instance using the "licenseId" field https://github.com/spdx/license-list-data/blob/master/jsonld/BSD-3-Clause.jsonld#L6 which we could also use in the RO-Crate in combination with an arbitrary

In short - the string

However it is not something we can lift into the main RO-Crate spec https://www.researchobject.org/ro-crate/1.1/contextual-entities.html#licensing-access-control-and-copyright as we can't refer to the

To get consistency the cleaner would need some kind of map of URIs aliased to SPDX identifiers. We should still document a list of "known" licenses by their identifier. We need something that maps the alternative URLs to the opendefinition API, which is what our license dropdown list is informed by. @stuzart says as long as we do the lookups through the
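A map of URIs aliased to SPDX identifiers, as suggested above, might look something like the following sketch. The table entries are a small illustrative sample rather than SEEK's actual data, and `LICENSE_ALIASES`, `canonical` and `spdx_id_for` are hypothetical names.

```ruby
# Illustrative alias table: several URIs for the same license resolve
# to one SPDX identifier. A real table would be far larger.
LICENSE_ALIASES = {
  "https://spdx.org/licenses/MIT" => "MIT",
  "https://opensource.org/licenses/MIT" => "MIT",
  "https://www.apache.org/licenses/LICENSE-2.0" => "Apache-2.0",
  "https://creativecommons.org/licenses/by/4.0" => "CC-BY-4.0"
}.freeze

# Smooth over trivial differences (surrounding whitespace, http vs https,
# trailing slash) before looking a URI up.
def canonical(uri)
  uri.strip.sub(/\Ahttp:/, "https:").chomp("/")
end

# Hypothetical lookup: SPDX identifier for a known URI, nil otherwise.
def spdx_id_for(uri)
  LICENSE_ALIASES[canonical(uri)]
end

puts spdx_id_for("http://creativecommons.org/licenses/by/4.0/").inspect
puts spdx_id_for("https://example.org/my-own-licence").inspect
```

The resulting identifier could then be used as the key into the Open Definition JSON that informs the license dropdown.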
In Ruby land, https://www.rubydoc.info/gems/spdx/3.0.1 can parse expressions like

https://www.rubydoc.info/gems/spdx-licenses/1.2.0 seems to be able to look up from that JSON file but does not provide any information except whether the license exists and is OSI compliant. Our own https://github.com/seek4science/seek/blob/master/lib/seek/license.rb looks up and exposes elements from the Open Definition JSON.
Perhaps in RO-Crate http://schema.org/identifier can be used to give the SPDX value. As SPDX is "industry standard" it could go straight in as a string, with SPDX being the implied scheme.
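As a rough sketch of that idea, the license contextual entity could carry the SPDX value as a plain string `identifier` next to its URI. The helper name `license_entity` is hypothetical, and whether SEEK would emit exactly these properties is an assumption.

```ruby
require "json"

# Hypothetical helper: an RO-Crate contextual entity for a license,
# with the SPDX short identifier given as a plain string.
def license_entity(spdx_id, name)
  {
    "@id" => "https://spdx.org/licenses/#{spdx_id}",
    "@type" => "CreativeWork",
    "name" => name,
    "identifier" => spdx_id
  }
end

# The workflow (or root dataset) then points at the entity by @id.
entity = license_entity("MIT", "MIT License")
reference = { "license" => { "@id" => entity["@id"] } }

puts JSON.pretty_generate(entity)
puts JSON.generate(reference)
```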
A more scoped one using PropertyValue identifiers could cover license expressions for a local file, AND say that they are
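One possible shape for the PropertyValue variant is sketched below. The `propertyID` value, the `#spdx` fragment used as the `@id`, and the helper name `scoped_license_entities` are all assumptions made for illustration; the thread does not settle on any of them.

```ruby
require "json"

# Assumed propertyID marking the value as an SPDX identifier or expression.
SPDX_PROPERTY_ID = "http://spdx.org/rdf/terms#licenseId"

# Hypothetical helper: describe a local license file plus a separate
# PropertyValue entity that holds the SPDX expression for it.
def scoped_license_entities(file_id, expression)
  identifier_id = "#{file_id}#spdx"
  [
    {
      "@id" => file_id,
      "@type" => ["File", "CreativeWork"],
      "identifier" => { "@id" => identifier_id }
    },
    {
      "@id" => identifier_id,
      "@type" => "PropertyValue",
      "propertyID" => SPDX_PROPERTY_ID,
      "value" => expression
    }
  ]
end

entities = scoped_license_entities("LICENSE.txt", "MIT OR Apache-2.0")
puts JSON.pretty_generate(entities)
```

Keeping the PropertyValue as its own entity, referenced by `@id`, follows the flattened JSON-LD style that RO-Crate metadata uses.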
If logged out, takes you directly to the GitHub issue tracker (configurable link). If logged in, takes you to a page that directs the user to GitHub by preference, but also provides the feedback form as an alternative.
As pointed out in #183 (comment), the Workflow RO-Crate license field is treated in SEEK like a text field (e.g. "MIT"); however, both https://www.researchobject.org/ro-crate/1.1/contextual-entities.html#licensing-access-control-and-copyright and https://schema.org/license say it should be a URL, meaning {"@id": "https://spdx.org/licenses/MIT"} or similar.

The Workflow RO-Crate license field should be valid and ideally also consistent with the full URIs in the generated schema.org annotations.