Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace owltools --expand-macros with robot expand #2597

Merged
merged 5 commits into from
Aug 16, 2022

Conversation

anitacaron
Copy link
Collaborator

@anitacaron anitacaron commented Aug 11, 2022

After adding robot expand, it detected the misuse of the present_in_taxon annotation, which expects a resource but was passed a string. The cases are in CL_import.owl.
The 15 cases are:

  • CL_0009068
  • CL_0009069
  • CL_0009070
  • CL_0009071
  • CL_0009072
  • CL_0009073
  • CL_0009074
  • CL_0009075
  • CL_0009076
  • CL_0009077
  • CL_0009078
  • CL_0009079
  • CL_0009081
  • CL_0009082
  • CL_0009083

This needs to be solved to have the QC passed here. I'll create an issue in the CL repo.

Fixes #2382

@gouttegd
Copy link
Collaborator

This works fine (once CL terms have been corrected), but I wonder about the expansion of present in taxon. It results in the creation of several distinct classes (with labels of the form http://purl.obolibrary.org/obo/UBERON_XXXXXXX in taxon http://purl.obolibrary.org/obo/NCBITaxon_YYYYY).

The comment associated with the present_in_taxon annotation in RO says:

The SPARQL expansion for this relation introduces new named classes into the ontology. For this reason it is likely that the expansion should only be performed during a QC pipeline; the expanded output should usually not be included in a published version of the ontology.

Do we really want those automatically generated classes in the released artefacts of Uberon?

@cmungall
Copy link
Member

cmungall commented Aug 11, 2022 via email

@gouttegd
Copy link
Collaborator

@cmungall No opinion about anonymous classes, but in any case if we want that it needs to be fixed in RO. Here we can only expand (or not) to whatever the SPARQL query defined in RO:0002175 does.

@anitacaron Can you update the call to robot expand to include an --no-expand-term http://purl.obolibrary.org/obo/RO_0002175 option?

@anitacaron
Copy link
Collaborator Author

But then the present_in_taxon will not be expanded at any time during the QC?

@gouttegd
Copy link
Collaborator

Do we have any step in our current QC that depends on those classes being present?

I understand that the idea behind those classes is to perform some QC checks on them, but I don’t think we are currently doing that, are we?

I guess in the future, when we do add QC checks on the expanded “present-in-taxon” classes, we will have two options:

  • expand all the annotations, perform the QC checks, then add another step to strip away the generated classes;
  • have two distinct pipelines:
    • one where we expand all the annotations and that is used to perform the QC;
    • one where we don’t expand the “only-for-QC” annotations such as RO:0002175, and from which we generate the released artifacts.

Not sure what is better… Both options seems bad to me, actually…

@uberon
Copy link

uberon commented Aug 11, 2022 via email

@balhoff
Copy link
Member

balhoff commented Aug 11, 2022

The QC check is just checking the ontology for coherency when those classes are present. Generating them as anonymous classes is not an option; anonymous classes wouldn't provide any coherency violations if the expression was unsatisfiable (but I would love to change the expansion if someone can point out a way to make this happen).

I would have a coherency check in the pipeline that runs all expansions. But I would disable 'present in taxon' expansions in targets leading to the release file.

@balhoff
Copy link
Member

balhoff commented Aug 11, 2022

I guess I meant anon individuals, but that makes debugging harder (inconsistency not unsat). So scratch my comment On Thu, Aug 11, 2022 at 7:09 AM Chris Mungall @.***> wrote:

Agree about harder debugging. Also, for reasons I don't know, anonymous individuals are excluded from OWL EL.

@gouttegd
Copy link
Collaborator

@balhoff My bad, I had not understood that the mere presence of these classes was enough.

In this case I agree that we should have somewhere a build target in which present in taxon annotations are expanded as part of the QC checks.

I’d suggest to leave that to another PR, though, and to merge this one for now. What do you think @matentzn ?

@balhoff
Copy link
Member

balhoff commented Aug 12, 2022

Generating them as anonymous classes is not an option; anonymous classes wouldn't provide any coherency violations if the expression was unsatisfiable (but I would love to change the expansion if someone can point out a way to make this happen).

Oh I just thought of a way! We use all these generated 'in taxon' expressions as fillers for existential restrictions using some dummy property, and make the restrictions superclasses of a single dummy class. Not sure if folks want this in their ontology any more than all those generated classes.

@gouttegd
Copy link
Collaborator

gouttegd commented Aug 12, 2022

@balhoff Sorry I am not sure I understand correctly.

What I understand is that, for example: if Harderian gland (UBERON:0004187) has an annotation present_in_taxon Felidae (RO:0002175 NCBITaxon:9681), this should be expanded to a general class axiom as follows:

SubClassOf(
  ObjectIntersectionOf( ObjectSomeValuesFrom(RO:0002162 NCBI:Taxon:9681) UBERON:0004187 )
  ObjectSomeValuesFrom( RO:dummy_property UBERON:dummy_class )
)

instead of being expanded, as it currently is, into a 'http://purl.obolibrary.org/obo/UBERON_0004187 in taxon http://purl.obolibrary.org/obo/NCBITaxon_9681' class (subclass of both UBERON:0004187 and part_of some NCBITaxon:9681).

Is that remotely correct?

@balhoff
Copy link
Member

balhoff commented Aug 12, 2022

@gouttegd it would be something like this:

SubClassOf(
  UBERON:dummy_class
  ObjectSomeValuesFrom(
    RO:dummy_property  
    ObjectIntersectionOf( ObjectSomeValuesFrom(RO:0002162 NCBI:Taxon:9681) UBERON:0004187 )
  ) 
)

The same dummy class can be used for every present-in-taxon statement. This would create only one named class instead of one for every assertion. It might be a bit more obscure though.

@cmungall
Copy link
Member

@balhoff you just want to blow up all ontology browsers don't you :-)

remember this: geneontology/amigo#428

@anitacaron
Copy link
Collaborator Author

So I'll merge this PR to be in the next UBERON release and create another issue to address the present_in_taxon QC.

@anitacaron anitacaron merged commit bdd829f into master Aug 16, 2022
@anitacaron anitacaron deleted the anitacaron/issue2382 branch December 9, 2022 19:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SPARQL taxon expansion does nothing
5 participants