StackOverflowError with "robot remove" #976

Closed
drseb opened this issue Mar 14, 2022 · 19 comments · Fixed by #979

@drseb

drseb commented Mar 14, 2022

I tried to remove a BFO class from PATO with:

robot remove --input pato.owl --term BFO:0000023 --select 'self descendants' --signature true --output pato_removed.owl

but this fails with:

Exception in thread "main" java.lang.StackOverflowError
	at java.base/java.util.HashMap.putVal(HashMap.java:624)
	at java.base/java.util.HashMap.put(HashMap.java:607)
	at java.base/java.util.HashSet.add(HashSet.java:220)
	at java.base/java.util.AbstractCollection.addAll(AbstractCollection.java:352)
	at com.google.common.collect.Iterables.addAll(Iterables.java:352)
	at uk.ac.manchester.cs.owl.owlapi.OWLImmutableOntologyImpl.asSet(OWLImmutableOntologyImpl.java:805)
	at uk.ac.manchester.cs.owl.owlapi.OWLImmutableOntologyImpl.getAxioms(OWLImmutableOntologyImpl.java:1325)
	at uk.ac.manchester.cs.owl.owlapi.OWLAxiomIndexImpl.getSubClassAxiomsForSuperClass(OWLAxiomIndexImpl.java:142)
	at uk.ac.manchester.cs.owl.owlapi.concurrent.ConcurrentOWLOntologyImpl.getSubClassAxiomsForSuperClass(ConcurrentOWLOntologyImpl.java:1779)
	at org.semanticweb.owlapi.search.EntitySearcher.getSubClasses(EntitySearcher.java:839)
	at org.obolibrary.robot.RelatedObjectsHelper.selectClassDescendants(RelatedObjectsHelper.java:2055)
	at org.obolibrary.robot.RelatedObjectsHelper.selectClassDescendants(RelatedObjectsHelper.java:2060)

ROBOT version 1.8.3

My assumption is that there are recursive axioms between ChEBI:50906 and BFO:0000023 (equivalence and subclass)
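
For illustration, here is a minimal sketch of why such a pair of axioms can make a recursive descendant walk run forever, and how a visited set stops it. The stack trace above shows selectClassDescendants calling itself, which is consistent with this. This is not ROBOT's actual code: the class `DescendantWalk`, the map-based graph, and the term `EX:0000001` are all hypothetical stand-ins.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class DescendantWalk {
    /**
     * Collects every asserted descendant of root.
     * subClasses maps a class to its asserted direct subclasses.
     */
    public static Set<String> descendants(Map<String, List<String>> subClasses, String root) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        todo.push(root);
        while (!todo.isEmpty()) {
            String current = todo.pop();
            for (String child : subClasses.getOrDefault(current, List.of())) {
                // Set.add returns false for a class already visited, so each
                // class is expanded at most once, even when A < B and B < A.
                if (!child.equals(root) && seen.add(child)) {
                    todo.push(child);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // Toy graph with the suspected cycle; EX:0000001 is a made-up child term.
        Map<String, List<String>> subClasses = Map.of(
            "BFO:0000023", List.of("CHEBI:50906"),
            "CHEBI:50906", List.of("BFO:0000023", "EX:0000001"));
        System.out.println(descendants(subClasses, "BFO:0000023"));
        // prints [CHEBI:50906, EX:0000001]
    }
}
```

Without the `seen` check, a plain recursive version would bounce between the two classes until the stack is exhausted.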

It works when I first remove the ChEBI role:
robot remove --input pato.owl --term CHEBI:50906 --select "self" --signature true --output pato_removed1.owl
and afterwards remove all the other things from the result of the first step:
robot remove --input pato_removed1.owl --term BFO:0000023 --select "self descendants" --signature true --output pato_removed.owl

Structure around "role" in PATO:
[Screenshot, 2022-03-14; image not included]

@matentzn
Contributor

Just to make @drseb's excellent analysis a bit shorter:

robot remove -I http://purl.obolibrary.org/obo/pato.owl --term BFO:0000023 --select 'self descendants' --output pato_removed.owl

--signature true has nothing to do with this! While this is a legitimate bug here, it reminds me again why we should not include equivalent classes in our ontology releases.. ever!

@dosumis

dosumis commented Mar 14, 2022

it reminds me again why we should not include equivalent classes in our ontology releases.. ever!

That's a very radical statement. It definitely needs discussion, given that some downstream resources may expect these to be present. Obviously this is not the right place to discuss it. Can you find a place where we can?

@jamesaoverton
Member

@beckyjackson Please see if you can reproduce, and if there's any smallish change we can make to break such a loop.

@beckyjackson
Contributor

@jamesaoverton do we expect CHEBI:50906 "role" to be removed here as well, or only if the equivalent selector is included?

If I fix the stack overflow error, ChEBI "role" gets removed as well, which might be what the user wants, but I'm not sure if it's technically the correct behaviour.

@matentzn
Contributor

Hm, both ways are justifiable, but my feeling is that the cleaner way is for descendants not to include equivalents.

@beckyjackson
Contributor

So CHEBI:50906 should remain in pato_output.owl. What about the descendants of ChEBI role? Since ChEBI role == BFO role, the descendants of ChEBI role are also descendants of BFO role even though that's not asserted.

@matentzn
Contributor

I don't want to answer this question, as I feel like an ontology traitor, but no: my guess is that since ROBOT remove is supposed to operate on assertions, I would not walk down the CHEBI tree.. but.. it's.. odd, I agree.

@beckyjackson
Contributor

@matentzn I agree. I think it makes sense to only remove those if we had run the reasoner first to assert that they are also descendants of BFO role. I'd like to get @jamesaoverton's opinion on this, as it might not be what users expect.

Is there anybody else who might have a stake in this?

@jamesaoverton
Member

What if ROBOT just fails with a warning, telling the user to merge the named equivalents or something?

@beckyjackson
Contributor

@jamesaoverton I'm worried that might confuse some users, if they don't know how to merge terms. Then they're stuck unable to use the file. Maybe I'm overthinking it though, and our non-power-users wouldn't be doing this sort of stuff anyway.

@cmungall
Contributor

This is exactly why we have defined profiles of OBO format. These profiles are useful outside the scope of OBO format:

https://owlcollab.github.io/oboformat/doc/obo-syntax.html#6.2

If the structure conforms to OBO-Basic, then naive graph walking is guaranteed to terminate. If the structure doesn't conform, then naive walking is not guaranteed to terminate. Furthermore, certain operations can be more simply defined if the input is constrained to certain profiles.

@jamesaoverton
Member

We added remove and filter in ROBOT 1.2.0, released December 6, 2018. Only experts should use them, because you need to understand a wide range of OWL concepts. This is the first report of a stack overflow with remove. The "correct" behaviour for "descendants" with named equivalents is not at all clear. I think failing with a warning and documentation is reasonable.

I'm open to better suggestions. I don't particularly want to take a performance hit for checking the graph structure every time, but maybe it's small?

@cmungall
Contributor

I have used Tarjan's algorithm in the past for cycle detection

Tarjan is O(V+E); here's a Java implementation:

https://github.com/asad/GraphMST/blob/master/src/algorithm/Tarjan.java

but this suggests a nested DFS approach may be faster:

https://math.stackexchange.com/questions/917414/tarjans-algorithm-to-determine-wheter-a-directed-graph-has-a-cycle

There are other cases where cycle detection over the full existential graph would be useful for QC, e.g. obophenotype/uberon#1829
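
As a rough idea of what such a check could cost, here is a minimal sketch of cycle detection by depth-first search with node colouring. It is a simpler technique than Tarjan's strongly-connected-components algorithm and is not taken from either link above; it also runs in O(V+E). The class `CycleCheck` and the map-based graph are hypothetical, not part of ROBOT or the OWL API.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CycleCheck {
    // GRAY = on the current DFS path, BLACK = fully explored; absent = unvisited.
    private enum Color { GRAY, BLACK }

    /** Returns true if the directed graph (node -> successors) contains a cycle. */
    public static boolean hasCycle(Map<String, List<String>> edges) {
        Map<String, Color> color = new HashMap<>();
        for (String node : edges.keySet()) {
            if (!color.containsKey(node) && visit(node, edges, color)) {
                return true;
            }
        }
        return false;
    }

    // Recursive for brevity; recursion depth is bounded by the number of nodes.
    private static boolean visit(String node, Map<String, List<String>> edges,
                                 Map<String, Color> color) {
        color.put(node, Color.GRAY);
        for (String next : edges.getOrDefault(node, List.of())) {
            Color c = color.get(next);
            if (c == Color.GRAY) {
                return true; // back edge to a node on the current path: cycle
            }
            if (c == null && visit(next, edges, color)) {
                return true;
            }
        }
        color.put(node, Color.BLACK);
        return false;
    }

    public static void main(String[] args) {
        Map<String, List<String>> cyclic = Map.of(
            "BFO:0000023", List.of("CHEBI:50906"),
            "CHEBI:50906", List.of("BFO:0000023"));
        Map<String, List<String>> acyclic = Map.of(
            "A", List.of("B", "C"),
            "B", List.of("C"));
        System.out.println(hasCycle(cyclic));  // prints true
        System.out.println(hasCycle(acyclic)); // prints false
    }
}
```

Each node is coloured GRAY once and BLACK once, and each edge is inspected once, which is where the linear bound comes from.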

@dosumis

dosumis commented Mar 15, 2022

I may be in the minority here, but I think ROBOT should follow OWL semantics and not worry about graph crawling. The default behaviour should be to remove subclasses of both. The onus should be on users to understand this behaviour - which they should if they know enough to use equivalence between named classes in their ontology. At most, the command should emit a warning.

Suggesting that users should always merge makes no sense to me. How could PATO merge BFO role and CHEBI role? But these kinds of named equivalence bridges can be very useful in reasoning. Arguments about producing release products that lack named equivalence or general OBO policy about named equivalence are separate issues, independent of ROBOT.

@drseb
Author

drseb commented Mar 15, 2022

I am not a power-user and I think it is quite normal to try to make a huge ontology such as UBERON more digestible by removing everything that is unnecessary in a particular context or for a particular task.

Regarding what to include: I wanted to get rid of everything about role and I would have included ChEBI and BFO role in the deletion.

@jamesaoverton
Member

So CHEBI:50906 should remain in pato_output.owl. What about the descendants of ChEBI role? Since ChEBI role == BFO role, the descendants of ChEBI role are also descendants of BFO role even though that's not asserted.

Looking at the pato.owl I see three relevant axioms:

  1. BFO:0000023 equivalentTo CHEBI:50906
  2. BFO:0000023 subClassOf CHEBI:50906
  3. CHEBI:50906 subClassOf BFO:0000023

In this case the user asks ROBOT to remove BFO:0000023 "self descendants". By (3), CHEBI:50906 is a descendant, and so CHEBI:50906 and all its descendants should be selected and then removed.

@beckyjackson can you make a PR with this behaviour, please? Then we can discuss the implications.

If only axiom (1) and not (2) and (3) were asserted, then I would expect "self descendants" would not select CHEBI:50906. I would expect "equivalents" to include CHEBI:50906, so "self equivalents descendants" would remove CHEBI:50906 and its descendants.

I guess (1) implies (2) and (3), but ROBOT remove/filter do not run a reasoner, they just work with the asserted axioms.
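
To make the asserted-only reading concrete, here is a minimal sketch of the two cases described above. It is a toy model under stated assumptions, not ROBOT's implementation: the `AssertedSelect` class, its `Axiom` type, and the string axiom kinds are hypothetical, and no reasoner is involved.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class AssertedSelect {
    /** Hypothetical simplified axiom; kind is "subClassOf" or "equivalentTo". */
    public static final class Axiom {
        final String left, kind, right;
        public Axiom(String left, String kind, String right) {
            this.left = left; this.kind = kind; this.right = right;
        }
    }

    /** "descendants": follows asserted subClassOf axioms only, each class once. */
    public static Set<String> descendants(List<Axiom> axioms, String root) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> todo = new ArrayDeque<>(List.of(root));
        while (!todo.isEmpty()) {
            String parent = todo.pop();
            for (Axiom a : axioms) {
                if (a.kind.equals("subClassOf") && a.right.equals(parent)
                        && !a.left.equals(root) && seen.add(a.left)) {
                    todo.push(a.left);
                }
            }
        }
        return seen;
    }

    /** "equivalents": named classes asserted equivalent to term, on either side. */
    public static Set<String> equivalents(List<Axiom> axioms, String term) {
        Set<String> out = new LinkedHashSet<>();
        for (Axiom a : axioms) {
            if (!a.kind.equals("equivalentTo")) continue;
            if (a.left.equals(term)) out.add(a.right);
            if (a.right.equals(term)) out.add(a.left);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Axiom> all = List.of(
            new Axiom("BFO:0000023", "equivalentTo", "CHEBI:50906"), // (1)
            new Axiom("BFO:0000023", "subClassOf", "CHEBI:50906"),   // (2)
            new Axiom("CHEBI:50906", "subClassOf", "BFO:0000023"));  // (3)
        List<Axiom> onlyEquivalence = all.subList(0, 1);

        System.out.println(descendants(all, "BFO:0000023"));             // prints [CHEBI:50906]
        System.out.println(descendants(onlyEquivalence, "BFO:0000023")); // prints []
        System.out.println(equivalents(onlyEquivalence, "BFO:0000023")); // prints [CHEBI:50906]
    }
}
```

With all three axioms asserted, "descendants" reaches CHEBI:50906 through (3); with only (1), it is reached only through "equivalents".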

@jamesaoverton
Member

@drseb Does #979 work for you? You can download a JAR for testing here: https://build.obolibrary.io/job/ontodev/job/robot/job/976-fix/lastSuccessfulBuild/artifact/bin/robot.jar

@matentzn
Contributor

I just tested it as well and it does what it should with PATO.

@drseb
Author

drseb commented Mar 17, 2022

@jamesaoverton it ran with no errors and I see the role-branch gone. thanks.
