rdf:type vs dlthing:meta_type in the context of LinkML type designation #176

Open
jsheunis opened this issue Jul 29, 2024 · 0 comments

The issue psychoinformatics-de/shacl-vue#32 in shacl-vue brought to light that data converted from YAML to TTL format using the current state of the sdd schema (which inherits from distribution, thing, and more in dlco) does not contain the expected type designations.

Demonstrative example

With linkml 1.8.1:

The thing schema shows for meta_type and type:

meta_type:
  slot_uri: dlthing:meta_type
  designates_type: true
  description: >-
    Type designator of a metadata object for validation and schema structure
    handling purposes. This is used to indicate specialized schema classes
    for properties that accept a hierarchy of classes as their range.
  range: uriorcurie
  exact_mappings:
    - dcterms:type
type:
  slot_uri: rdf:type
  description: >-
    State that the subject is an instance of a class.
  range: uriorcurie
  exact_mappings:
    - dcterms:type

The input data contains the following for one of the authors. Note that there is no type specified, only meta_type:

- id: exthisds:#ahorst
  meta_type: dldist:Person
  name: Allison Horst
  email: [email protected]
  identifier:
    - notation: 0000-0002-6047-5564
      schema_agency: https://orcid.org
  affiliation:
    - exthisds:#UCSB
  # we can also use the ORCID as an equivalence statement, and web resource
  # to link to
  same_as:
    - https://orcid.org/0000-0002-6047-5564

We can then convert the YAML to TTL using:

>> linkml-convert -s src/sdd/unreleased.yaml -t ttl --target-class Distribution src/sdd/unreleased/examples/Distribution-penguins.yaml > distribution-penguins.ttl

The output for the same author after running linkml-convert is:

<https://example.org/ns/dataset/#ahorst> dldist:affiliation <https://example.org/ns/dataset/#UCSB> ;
    dldist:email "[email protected]"^^dldist:EmailAddress ;
    dlthing:identifier [ a dlthing:Identifier ;
            dlthing:notation "0000-0002-6047-5564" ;
            dlthing:schema_agency <https://orcid.org> ] ;
    dlthing:meta_type "dldist:Person"^^xsd:anyURI ;
    dlthing:name "Allison Horst" ;
    dlthing:same_as "https://orcid.org/0000-0002-6047-5564"^^xsd:anyURI .

Note that the output does not contain the expected

<https://example.org/ns/dataset/#ahorst> a "dldist:Person"^^xsd:anyURI ;

which is the problem.

Discussion

The dlthing:meta_type slot was implemented in order to allow validation of data according to a specialized schema (indicated by the meta_type) where the range of the property accepting the data object is actually a super-class of the specialized one. (I couldn't find a more intuitive way of stating this....)

For example, let's say a Distribution has a was_attributed_to field (aka property) with range/type dlco:Agent, while dlco:Agent has multiple subclasses such as dlco:Person or dlco:Organization. A data object for this field can then be a dlco:Person or a dlco:Organization and should still pass LinkML validation, as long as that class is specified in the object's meta_type field and is actually a subclass of the accepting slot's range class.
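To make the rule in the example concrete, here is a minimal Python sketch of the check described above. It is not how LinkML implements type designators. The `PARENTS` table and the function names `is_subclass_of` and `designated_type_ok` are hypothetical, and the class hierarchy is hard-coded from the example rather than read from the dlco schemas.

```python
# Hypothetical, hard-coded class hierarchy standing in for the schema:
# each class maps to its direct parent (None for a root class).
PARENTS = {
    "dlco:Agent": None,
    "dlco:Person": "dlco:Agent",
    "dlco:Organization": "dlco:Agent",
}


def is_subclass_of(cls, ancestor):
    """Return True if cls equals ancestor or inherits from it."""
    current = cls
    while current is not None:
        if current == ancestor:
            return True
        # Unknown classes have no recorded parent, so the walk ends here.
        current = PARENTS.get(current)
    return False


def designated_type_ok(obj, slot_range):
    """Check that the object's meta_type is acceptable for a slot's range.

    Assumption: if no meta_type is given, the range class itself applies.
    """
    designated = obj.get("meta_type", slot_range)
    return is_subclass_of(designated, slot_range)


# A Person is accepted where an Agent is expected; an unrelated class is not.
print(designated_type_ok({"meta_type": "dlco:Person"}, "dlco:Agent"))
print(designated_type_ok({"meta_type": "ex:Unrelated"}, "dlco:Agent"))
```

The point of the sketch is only that the designated class must sit at or below the slot's range in the hierarchy; everything else about validation is left out.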

However, the dlthing:meta_type specification does not really have meaning outside of the process of LinkML-based data validation. E.g. when data is exported to TTL and then used by shacl-vue, such an application is interested in the nodes and their types, such as the currently missing:

<https://example.org/ns/dataset/#ahorst> a "dldist:Person"^^xsd:anyURI ;

It is only when data generated/updated by an application such as shacl-vue needs to be validated in LinkML against the dlco-based schemas that the meta_type becomes important again.

After discussions with @mih, several points were raised:

  • try setting rdf:type as the slot_uri of dlthing:meta_type (instead of the current dlthing:meta_type)
  • from the context of importing data to LinkML (for validation) and exporting from LinkML (for use in the world) rdf:type and dlthing:meta_type are essentially two-way aliases:
    • on export, rdf:type should take the value of dlthing:meta_type, since rdf:type is the meaningful type designator in the real world
    • on import, dlthing:meta_type should take the value of rdf:type, since the specific type (not a superclass) is what validation should be based on
  • could we ditch dlthing:meta_type, and only use type (with slot_uri: rdf:type and designates_type: true)? Probably not, because LinkML slots with designates_type: true apparently can only accept LinkML classes/types as the range (i.e. a class/type defined in the validation schema). @mih please correct me here if I'm misrepresenting.
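The two-way alias idea from the list above can be sketched as a pair of key renames on plain dictionaries. This is an illustration only, not something LinkML provides: the helper names `on_export` and `on_import`, and the use of the string "rdf:type" as a dictionary key, are assumptions made for the sketch.

```python
META_TYPE_KEY = "meta_type"  # key used in records headed for LinkML validation
RDF_TYPE_KEY = "rdf:type"    # key assumed for records exported "to the world"


def _rename_key(record, old, new):
    """Return a copy of record with key `old` renamed to `new`, if present."""
    renamed = {k: v for k, v in record.items() if k != old}
    if old in record:
        renamed[new] = record[old]
    return renamed


def on_export(record):
    """Expose the designated type under rdf:type instead of meta_type."""
    return _rename_key(record, META_TYPE_KEY, RDF_TYPE_KEY)


def on_import(record):
    """Feed an incoming rdf:type back into meta_type for validation."""
    return _rename_key(record, RDF_TYPE_KEY, META_TYPE_KEY)


record = {"id": "exthisds:#ahorst", "meta_type": "dldist:Person"}
exported = on_export(record)
print(exported)
print(on_import(exported) == record)
```

If the alias really is symmetric, a round trip through export and import should return the original record, which is what the last line checks.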

Investigating rdf:type as slot_uri of dlthing:meta_type

I tried this by updating the line:

slot_uri: dlthing:meta_type

as shown in this diff:

diff --git a/src/thing/unreleased.yaml b/src/thing/unreleased.yaml
index 486fbb2..d010834 100644
--- a/src/thing/unreleased.yaml
+++ b/src/thing/unreleased.yaml
@@ -190,7 +190,7 @@ slots:
     range: string

   meta_type:
-    slot_uri: dlthing:meta_type
+    slot_uri: rdf:type
     designates_type: true
     description: >-
       Type designator of a metadata object for validation and schema structure

changing nothing in the data (i.e. the data object still specifies the meta_type field, and not the type field), and then running the conversion code again.

This was the output:

<https://example.org/ns/dataset/#ahorst> a "dldist:Person"^^xsd:anyURI ;
    dldist:affiliation <https://example.org/ns/dataset/#UCSB> ;
    dldist:email "[email protected]"^^dldist:EmailAddress ;
    dlthing:identifier [ a dlthing:Identifier ;
            dlthing:notation "0000-0002-6047-5564" ;
            dlthing:schema_agency <https://orcid.org> ] ;
    dlthing:name "Allison Horst" ;
    dlthing:same_as "https://orcid.org/0000-0002-6047-5564"^^xsd:anyURI .

The difference compared to the initial output:

  • The rdf:type (i.e. a) is now included!
  • The dlthing:meta_type "dldist:Person"^^xsd:anyURI ; is not in the output anymore (this is in fact removed completely from all output data)
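The same difference can be expressed as a set comparison over the predicates attached to the author node. The two sets below are transcribed by hand from the two TTL outputs in this issue (with `a` written out as rdf:type); nothing is parsed from the files, so this is only a sketch of the comparison.

```python
# Predicates on <https://example.org/ns/dataset/#ahorst> in the initial output.
before = {
    "dldist:affiliation",
    "dldist:email",
    "dlthing:identifier",
    "dlthing:meta_type",
    "dlthing:name",
    "dlthing:same_as",
}

# Predicates on the same node after setting slot_uri: rdf:type on meta_type.
after = {
    "rdf:type",
    "dldist:affiliation",
    "dldist:email",
    "dlthing:identifier",
    "dlthing:name",
    "dlthing:same_as",
}

added = after - before    # predicates that only appear after the change
removed = before - after  # predicates that disappeared with the change

print("added:", sorted(added))
print("removed:", sorted(removed))
```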

I also ran checks and validations locally after the change, with no unexpected errors.

Is this what we want?
