Create a complete shacl schema with UI annotations #2

Closed
jsheunis opened this issue Mar 21, 2024 · 6 comments
@jsheunis
Collaborator

For reference:

We could start with a schema that we have authored in LinkML, from datalad-concepts, and then remove unnecessary complexities (at first) and add all the constraint annotations and dash annotations that we feel would be necessary for form or viewer generation.

@jsheunis
Collaborator Author

jsheunis commented May 8, 2024

A blocker for this is that LinkML's shaclgen has several shortcomings in terms of propagating ranges and annotations to SHACL shapes. See e.g. #9, and linkml/linkml#1618. I'm first focusing on patching shaclgen...

@jsheunis
Collaborator Author

jsheunis commented May 8, 2024

Current status:

  • I have a fair understanding of what the shaclgen code does
  • I was able to amend the code to let nodeKind and datatype information flow through to SHACL for slots with a custom type (or any type) as range
  • I was able to amend the code to let annotation information flow through to SHACL, both for slots that carry annotations themselves and for slots whose range is a custom type that carries annotations

The current challenge is how to interpret the types of the annotation tag and value. For example if we have a custom type (or could also be a slot) with an annotation:

types:
  NameString:
    typeof: string
    uri: myschema:NameString
    pattern: "^[^\\n]$"
    description: ...
    annotations:
      dash:singleLine: true

Here the annotation tag is a CURIE and the value is xsd:boolean. But how would the shaclgen code know this? Annotations could be anything.

Other TODOs:

  • add (close/exact/broad)mappings to SHACL property shapes
  • figure out how to handle a pattern or annotations for a single slot when they originate both from the slot definition and from the definition of the range type.

@jsheunis
Collaborator Author

jsheunis commented May 9, 2024

> I was able to amend the code to let nodeKind and datatype information flow through to SHACL for slots with a custom type (or any type) as range

has been turned into a PR: linkml/linkml#2102

@jsheunis
Collaborator Author

> Here the annotation tag is a CURIE and the value is xsd:boolean. But how would the shaclgen code know this? Annotations could be anything.

I added code to shaclgen to:

  1. take a command line parameter --include-annotations so that the user can specify whether they want annotations to be part of the generated SHACL shapes
  2. grab annotations from slots, as well as from types that bubble up to the slots that have said types as the range
  3. recognise an annotation tag or value as a CURIE by the presence of a ":"
  4. otherwise, write the tag / value as an RDF Literal with an explicit XSD datatype IRI. The datatype is derived from the Python type of the value: Python bool, plus several types used in LinkML:
    from linkml_runtime.utils.yamlutils import (   
        extended_float,
        extended_int,
        extended_str,
    )
    

This works quite nicely. Example:

Schema:

id: https://example.org/test-schema
name: myschema

prefixes:
  dash: http://datashapes.org/dash#
  dlco: https://concepts.datalad.org/
  myschema: https://example.org/test-schema/
  sh: http://www.w3.org/ns/shacl#

default_prefix: myschema

imports: https://w3id.org/linkml/types

types:
  NameString:
    typeof: string
    uri: myschema:NameString
    pattern: "/^[a-z ,.'-]+$/i"
    annotations:
      dash:singleLine: true
      dash:editor: dash:TextFieldEditor

slots:
  my_attr:
    range: NameString
    annotations:
      dash:singleLine: false
      sh:group: dlco:NamePropertyGroup
      sh:order: 0
  mydate:
    range: date
    annotations:
      sh:group: dlco:NamePropertyGroup
      dash:editor: dash:DatePickerEditor
      sh:order: 1
  mydatetime:
    range: datetime
    annotations:
      dash:editor: dash:DateTimePickerEditor
      sh:group: dlco:NamePropertyGroup
      sh:order: 2

classes:
  MyClass:
    slots:
      - my_attr
      - mydate
      - mydatetime

Command to run:

gen-shacl --include-annotations myschema.yaml

Output:

@prefix dash: <http://datashapes.org/dash#> .
@prefix dlco: <https://concepts.datalad.org/> .
@prefix myschema: <https://example.org/test-schema/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

myschema:MyClass a sh:NodeShape ;
    sh:closed true ;
    sh:ignoredProperties ( rdf:type ) ;
    sh:property [ dash:editor dash:TextFieldEditor ;
            dash:singleLine false,
                true ;
            sh:datatype myschema:NameString ;
            sh:group dlco:NamePropertyGroup ;
            sh:maxCount 1 ;
            sh:nodeKind sh:Literal ;
            sh:order 0 ;
            sh:path myschema:my_attr ;
            sh:pattern "/^[a-z ,.'-]+$/i" ],
        [ dash:editor dash:DateTimePickerEditor ;
            sh:datatype <xsd:dateTime> ;
            sh:group dlco:NamePropertyGroup ;
            sh:maxCount 1 ;
            sh:nodeKind sh:Literal ;
            sh:order 2 ;
            sh:path myschema:mydatetime ],
        [ dash:editor dash:DatePickerEditor ;
            sh:datatype <xsd:date> ;
            sh:group dlco:NamePropertyGroup ;
            sh:maxCount 1 ;
            sh:nodeKind sh:Literal ;
            sh:order 1 ;
            sh:path myschema:mydate ] ;
    sh:targetClass myschema:MyClass .
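
For illustration, the tag/value handling from steps 3 and 4 above could look roughly like the following. This is a minimal standard-library sketch, not the code in the PR: `Term`, `PY_TO_XSD` and `to_term` are hypothetical names, the float-to-xsd:double mapping is an assumption, and the CURIE check is slightly stricter than "contains a colon" (it also requires a declared prefix). It also assumes that the `extended_*` types subclass the Python built-ins, so the `isinstance` checks would cover them.

```python
from typing import Any, Dict, NamedTuple, Optional

XSD = "http://www.w3.org/2001/XMLSchema#"


class Term(NamedTuple):
    """Hypothetical stand-in for an RDF term (IRI or typed literal)."""
    kind: str                       # "iri" or "literal"
    value: str                      # full IRI, or lexical form of the literal
    datatype: Optional[str] = None  # XSD datatype IRI, literals only


# Order matters: bool is a subclass of int, so it has to be checked first.
# The float -> xsd:double choice is an assumption made for this sketch.
PY_TO_XSD = [(bool, "boolean"), (int, "integer"), (float, "double"), (str, "string")]


def to_term(item: Any, prefixes: Dict[str, str]) -> Term:
    """Turn an annotation tag or value into an IRI or a typed literal."""
    # A string with a ":" whose prefix is declared in the schema is a CURIE.
    if isinstance(item, str) and ":" in item:
        prefix, _, local = item.partition(":")
        if prefix in prefixes:
            return Term("iri", prefixes[prefix] + local)
    # Everything else becomes a literal typed after its Python type.
    for py_type, xsd_name in PY_TO_XSD:
        if isinstance(item, py_type):
            lexical = str(item).lower() if py_type is bool else str(item)
            return Term("literal", lexical, XSD + xsd_name)
    raise TypeError(f"Unsupported annotation type: {type(item).__name__}")
```

With the prefixes from the schema above, `to_term("dash:singleLine", prefixes)` gives an IRI term and `to_term(True, prefixes)` gives a literal typed as xsd:boolean, which matches the `dash:singleLine false` and `sh:order 0` objects in the output.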

TODO:

  • figure out how to handle conflicts between annotations that derive from a type used as a slot range and annotations set directly on that same slot. At the moment, shaclgen adds both values as a comma-separated object list of the property shape triple (see dash:singleLine false, true ; in the output above). Should this be accepted, should one source always take priority, or should the behaviour be user-specified?
  • The main thing left to figure out is how to get the sh:PropertyGroup nodes into the SHACL output, if they indeed would first somehow be specified as part of the LinkML schema. The thing that still confuses me here is that these property groups are actually all data items, and not separate schemas per se, since they have the same structure, e.g.:
dlco:BasicPropertyGroup a sh:PropertyGroup ;
    rdfs:label "Basic" ;
    sh:order "0"^^xsd:decimal ;
    rdfs:comment "" .

dlco:DataPropertyGroup a sh:PropertyGroup ;
    rdfs:label "Data" ;
    sh:order "1"^^xsd:decimal ;
    rdfs:comment "" .
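
To make the options in the first TODO concrete, here is a rough sketch of how a user-specified conflict policy might behave. Nothing here exists in shaclgen: `merge_annotations` and the policy names "slot", "type" and "both" are hypothetical, with "both" mimicking the current behaviour of emitting both values.

```python
from typing import Any, Dict, List


def merge_annotations(
    type_ann: Dict[str, Any],
    slot_ann: Dict[str, Any],
    policy: str = "slot",
) -> Dict[str, List[Any]]:
    """Merge annotations from a range type with those set on the slot itself.

    policy "slot": the slot's value wins on conflict
    policy "type": the range type's value wins on conflict
    policy "both": keep both values (what shaclgen currently emits)
    """
    if policy not in ("slot", "type", "both"):
        raise ValueError(f"Unknown policy: {policy}")

    merged: Dict[str, List[Any]] = {}
    # Keep a stable order: tags from the type first, then slot-only tags.
    tags = list(type_ann) + [t for t in slot_ann if t not in type_ann]
    for tag in tags:
        if tag in type_ann and tag in slot_ann:
            if policy == "slot":
                merged[tag] = [slot_ann[tag]]
            elif policy == "type":
                merged[tag] = [type_ann[tag]]
            elif type_ann[tag] == slot_ann[tag]:
                merged[tag] = [type_ann[tag]]  # same value, nothing to resolve
            else:
                merged[tag] = [type_ann[tag], slot_ann[tag]]
        else:
            merged[tag] = [slot_ann[tag] if tag in slot_ann else type_ann[tag]]
    return merged
```

For the my_attr slot in the example, the "slot" policy would leave only `dash:singleLine false` on the property shape, while "both" reproduces the `false, true` pair shown in the output.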

@jsheunis
Collaborator Author

Noting the definitions of mappings for the case when we need those also as part of generated shapes: https://linkml.io/linkml-model/latest/docs/mappings/

@jsheunis jsheunis changed the title Create a complete shacl schema with DASH annotations Create a complete shacl schema with UI annotations May 28, 2024
@jsheunis
Collaborator Author

Update:

  1. With https://github.com/jsheunis/datalad-concepts/blob/c59650ce640e081723ef0e3fa088feedb291367f/src/sddui/unreleased.yaml we now have a LinkML schema with UI annotations that demonstrates the basics. It does not have all possible annotations/properties that we currently find useful and are aware of (e.g. there aren't any slots with the required property yet), but the important parts are there: sh:group and sh:order.
  2. With the PR "shaclgen: Add --include-annotations option to let annotations be part of shacl shapes" (linkml/linkml#2111), the functionality exists to get annotations into exported SHACL shapes.
  3. Then we have a few components to help get user-specified property groups into the same SHACL shapes graph exported from the LinkML schema:
  4. The single SHACL file is used as the basis for the user interface, which is rendered by a component factory being worked on in "Implement component factory for data viewing and entering" (#5).

I think this pretty much provides all the pieces necessary to address this issue.
