Generalize domain of dcat:distribution #1576
I am wondering why you think that semantic resources like ontologies and vocabularies cannot be described as |
Hi @makxdekkers, In our group, we are discussing the extension of MOD (https://github.com/FAIR-IMPACT/MOD) as a DCAT profile. The extension of dcat:Dataset would imply stating that all ontologies are datasets, and some of us feel like this is not the case. And I think that tools, websites and papers are other elements that are usually catalogued that do not qualify just as "data". We will probably be creating a sister class for the extension, but it would be great if we can directly extend dcat:Resource and be able to use |
Tagging @mariapoveda to the thread so she can provide more details |
To add to @makxdekkers' comment, the intentionally broad definition of In the European Data Catalogue, ontologies and controlled vocabularies are typed as
A similar approach was used for assets that may not seem “obvious” members of the dcat:Dataset class in the mapping done (@andrea-perego) from DataCite to DCAT. ADMS is a vocabulary for describing semantic assets. The latest release is available at: https://semiceu.github.io/ADMS/releases/2.00/ |
Thanks for your answers and the link to ADMS, I was not aware of that W3C note (@agbeltran maybe we can use it also for inspiration). I was reviewing dcat, and a dataset is defined as a If that is the case, is there really a difference between dcat:Resource and dcat:Dataset except for having a distribution? If a resource dcat:Resource Thanks in advance! |
Thanks @dgarijo for starting the discussion and @makxdekkers and @ODP-hil for their comments. Indeed, the discussion started when several in the FAIR-IMPACT group (and from previous discussions in FAIRsFAIR) proposed to derive However, a few people were opposed to this and would be more comfortable deriving from As the dichotomy dataset/distribution is important, and we want to re-use it for semantic artefacts, we thought that a compromise would be to derive from While I am of the view that we could use |
Is this reluctance to describe things like ontologies as
It is my worry that by doing things differently, your work is going to be in a different silo from other implementations, making it harder to achieve interoperability. But of course, interoperability with others outside your group may not be a crucial requirement in your case. |
@makxdekkers, On the other hand, there are some Dataset properties that would not be used. Yes, we can add a new type and just not use those properties, but it is not very practical to have them. As an intermediate solution I have proposed extending Resource and using distribution, which would essentially make our profile an implicit extension of Dataset without explicitly asserting it. I think that the DCAT standard should probably motivate why it is important to have Resources and Datasets and why these concepts are different by definition, for example with examples of why a dcat:Resource may not necessarily be a dcat:Dataset. I see no interoperability issues, because if at some point we want to interoperate with any of those services, we can issue a construct query adding the corresponding dctypes. But I think this point is another discussion. |
But also the scope statement says that What I do not fully understand is that @dgarijo says that you want to "make our profile an implicit extension of Dataset without explicitly asserting it". This seems to say that your resources are not enormously different from datasets but just a variation. In that case, I wonder if declaring it as a separate class makes things more confusing? |
Thanks for your answers. What I meant by making our profile an implicit extension is a bit of a hack I proposed to our group to make everyone happy: we don't extend dcat:Dataset, extending dcat:Resource instead, but we still use dcat:distribution. Then, if you infer triples, this would make our extension a type of dcat:Dataset, but only if you apply inference. Does that make it clear? |
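The inference step described in this comment can be illustrated with a small sketch. It is not taken from the thread: it models triples as plain Python tuples rather than using an RDF library, applies only the RDFS domain rule (rdfs2), and the ex: names are hypothetical placeholders.

```python
# Minimal sketch of RDFS domain inference (rule rdfs2) over plain tuples.
# The ex: identifiers are hypothetical; no RDF library is used.
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"

def infer_domain_types(triples):
    """Return the rdf:type triples entailed by rdfs:domain declarations."""
    # Collect (property, class) pairs from the domain declarations.
    domains = {(s, o) for s, p, o in triples if p == RDFS_DOMAIN}
    inferred = set()
    for s, p, o in triples:
        for prop, cls in domains:
            if p == prop:
                # Any subject using the property is typed with its domain.
                inferred.add((s, RDF_TYPE, cls))
    # Keep only triples that were not already asserted.
    return inferred - set(triples)

graph = {
    # The domain discussed in this issue: dcat:distribution -> dcat:Dataset.
    ("dcat:distribution", RDFS_DOMAIN, "dcat:Dataset"),
    # Hypothetical profile data: an ontology typed only as dcat:Resource.
    ("ex:myOntology", RDF_TYPE, "dcat:Resource"),
    ("ex:myOntology", "dcat:distribution", "ex:myOntology-ttl"),
}

inferred = infer_domain_types(graph)
```

Without a reasoner, ex:myOntology is only a dcat:Resource; once the domain rule is applied, the sketch adds the dcat:Dataset type, which is the "implicit extension" effect mentioned above.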
Sorry for arriving quite late. I also find it unnatural to classify ontologies as Dataset (compare this with classifying SKOS vocabularies as datasets, which looks fine): an ontology contains definitions rather than a set of data or facts. The point is that OWL ontologies, for example, would need to have distributions, but at the Resource level or at a level sibling to Dataset. It happens that the differences between Resource and Dataset are not so clear (e.g. overly general definitions opening the door to "other..." and the EU list of resources), and it seems that the term Dataset is being used to duplicate the Resource concept, to keep it general as "it was not intended to be used directly". |
What does 'unnatural' mean? If we use the genus/differentia approach to classification, then an ontology is 'a dataset that is composed of axioms' which could be compared with a SKOS vocabulary which is 'a dataset that is composed of concept definitions' or an image which is 'a dataset composed of pixels' or a catalog which is 'a dataset composed of metadata records'. If all the other descriptors associated with a dataset still pertain, then how is it unnatural? |
Following up on this discussion, people may disagree if a semantic artefact is, or should be represented as, a dataset or not and this will have an impact on interoperability in some cases. However, the objective of this issue was to discuss if it makes sense for DCAT to generalise the domain of So, the point is if there are entities other than datasets that may use This may not be important for those communities where "anything" may be represented as a dataset, but this becomes important for those communities in which a semantic artefact, or software, etc is not represented as a What do people think about generalising the domain of |
I would agree with the generalization of dcat:distribution to dcat:Resource so that it can be applied to entities/resources/assets that are not necessarily datasets. |
While I stand by my comment above (that no differentia have been proposed that make an
So I'm OK with the proposal to relax the domain of |
Would it be helpful to have a super-property of dcat:distribution that was more relaxed, rather than to change dcat:distribution? |
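One way to picture the super-property suggestion is the following sketch. It is only an illustration under stated assumptions: ex:hasDistribution is a hypothetical property name (not a DCAT term), triples are plain Python tuples, and only the RDFS sub-property rule (rdfs7) and domain rule (rdfs2) are applied.

```python
# Sketch of the super-property idea over plain tuples.
# ex:hasDistribution is a hypothetical name, not part of DCAT.
TYPE, DOMAIN, SUBPROP = "rdf:type", "rdfs:domain", "rdfs:subPropertyOf"

def rdfs_closure(triples):
    """Apply rules rdfs7 and rdfs2 until no new triples appear."""
    closure = set(triples)
    while True:
        new = set()
        for s, p, o in closure:
            for s2, p2, o2 in closure:
                if p2 == SUBPROP and s2 == p:
                    new.add((s, o2, o))     # rdfs7: lift to the super-property
                if p2 == DOMAIN and s2 == p:
                    new.add((s, TYPE, o2))  # rdfs2: type the subject
        if new <= closure:
            return closure
        closure |= new

graph = {
    # dcat:distribution keeps its current domain.
    ("dcat:distribution", DOMAIN, "dcat:Dataset"),
    # Hypothetical relaxed super-property with domain dcat:Resource.
    ("dcat:distribution", SUBPROP, "ex:hasDistribution"),
    ("ex:hasDistribution", DOMAIN, "dcat:Resource"),
    # Hypothetical data: one ontology, one ordinary dataset.
    ("ex:myOntology", "ex:hasDistribution", "ex:onto-ttl"),
    ("ex:myDataset", "dcat:distribution", "ex:data-csv"),
}

closure = rdfs_closure(graph)
```

Under these assumptions, the ontology is inferred to be a dcat:Resource but never a dcat:Dataset, while existing dcat:distribution statements keep their current entailments and are additionally visible through the relaxed property.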
I have labeled this issue "future work" as the data exchange working group (DXWG) has voted for the DCAT 3 Candidate Rec, and the process requires that we crystalize the features and changes included in the third release of DCAT. |
As a reflection, the WG should consider whether dcat:Resource has a semantic meaning or if it is just an alternative for rdfs:Resource. I have always read dcat:Resource as a Catalogued Resource, i.e. a resource that is in the catalogue and which is actively managed by the catalogue. Then dcat:distribution perhaps also gets a different semantics, as dcat:dataset is a subproperty of dcat:resource and thus, for coherence, dcat:distribution must also be a subproperty of dcat:resource. |
Hello all, I never realized part of our discussion in FAIR-IMPACT was moved here. Sorry. IMO we can see a mod:SemanticArtefact explicitly as a dcat:Dataset (indeed as a dataset of terms, which does not mean that everything is a dataset too...). I think this is the case not so much because an ontology matches the definition of "collection of data" but mostly because MOD adheres to the general principle and philosophy of DCAT, which is to describe datasets (as a broader term) that can be catalogued, served and distributed. To me this is what we want to do for SA, and adopting DCAT gives the path; we should not get stuck on the way with the restrictive description of dcat:Dataset as it is. If DCAT enables dcat:Distribution for resource and not dcat:Dataset, to me this will create more confusion than benefit. |
I am submitting this issue on behalf of the FAIR-Impact community, and per suggestion of (@agbeltran).
We would like to reuse DCAT for describing catalogs of semantic resources, creating a profile for semantic artefacts (ontologies and vocabularies). Our community represents the Linked Open Vocabularies and the Agrovoc catalogs, among others.
Our request is whether it is possible to generalize the domain of dcat:distribution to dcat:Resource, as not all things with distributions are necessarily datasets, for example if we want to build a catalog of ontologies, websites, software tools, or papers. All these are resources that have a distribution, but are not necessarily Datasets. We feel that extending dcat:Dataset for all these resources is like shoehorning the standard (e.g., properties like temporalResolution do not apply). Thanks in advance!