1.6: make tag ordering part of normative specification, not informative #138

Open
cmungall opened this issue Mar 20, 2023 · 3 comments

@cmungall

Currently the 1.4 spec points to the 1.4 guide for advice on tag ordering. (Aside: the link to the guide is a broken Google Code link and should be replaced.)

The tag ordering should be normative. (Of course, it is still valid to emit tags in any order, but writers SHOULD follow the normative tag ordering.)

Also we should base the normative order on what the owlapi does. This doesn't strictly follow the guide. E.g. the guide has

  • property_value
  • is_a

However, the owlapi inverts this. Rather than cause churn by insisting on the guide's order, we should obsolete the ordering in the guide and create a new normative standard based on what the owlapi does.

Clause ordering also needs to be clarified. The owlapi seems to implement an odd variant of the original OE rules.

@cmungall

This is what the owlapi considers canonical ordering.

format-version: 1.2
synonymtypedef: x "test synonym type"
synonymtypedef: Y "test synonym type"
synonymtypedef: z "test synonym type"

...

id: X:1
name: synonym order test
synonym: "A" BROAD []
synonym: "A" EXACT []
synonym: "A" NARROW []
synonym: "A" RELATED []
synonym: "a" BROAD Y []
synonym: "a" BROAD x []
synonym: "a" BROAD z []
synonym: "a" BROAD []
synonym: "a" EXACT Y []
synonym: "a" EXACT []
synonym: "a" EXACT x []
synonym: "a" EXACT [PMID:1]
synonym: "a" EXACT [pmid:1]
synonym: "a" EXACT [PMID:2]
synonym: "a" EXACT [pmid:2]
synonym: "a" EXACT [pmid:1, pmid:2]
synonym: "a" EXACT [PMID:1, PMID:2]
synonym: "a" NARROW Y []
synonym: "a" NARROW []
synonym: "a" NARROW x []
synonym: "a" RELATED []
synonym: "a" RELATED Y []
synonym: "a" RELATED x []
synonym: "A " BROAD []
synonym: "A " EXACT []
synonym: "A " NARROW []
synonym: "A " RELATED []
synonym: "Ab" BROAD []
synonym: "Ab" EXACT []
synonym: "Ab" NARROW []
synonym: "Ab" RELATED []
synonym: "ab" BROAD []
synonym: "ab" EXACT []
synonym: "ab" NARROW []
synonym: "ab" RELATED []
synonym: "Ac" BROAD []
synonym: "Ac" EXACT []
synonym: "Ac" NARROW []
synonym: "Ac" RELATED []
synonym: "ac" BROAD []
synonym: "ac" EXACT []
synonym: "ac" NARROW []
synonym: "ac" RELATED []
synonym: "As" RELATED []
synonym: "as" NARROW []
synonym: "astacin activity" EXACT []
synonym: "Astacus" RELATED []
synonym: "astacus" NARROW []
synonym: "Astacus proteinase activity" RELATED []
synonym: "astacus proteinase activity" NARROW []

I can reverse engineer the rules, except for two things that baffle me.

For BROAD, a null type is listed last:

synonym: "a" BROAD Y []
synonym: "a" BROAD x []
synonym: "a" BROAD z []
synonym: "a" BROAD []

Yet for the other scopes, a null type is intermediate:

synonym: "a" EXACT Y []
synonym: "a" EXACT []
synonym: "a" EXACT x []

Also, why are clauses that differ only in their cross-references ordered like this?

synonym: "a" EXACT [PMID:1]
synonym: "a" EXACT [pmid:1]
synonym: "a" EXACT [PMID:2]
synonym: "a" EXACT [pmid:2]
synonym: "a" EXACT [pmid:1, pmid:2]
synonym: "a" EXACT [PMID:1, PMID:2]
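
For reference, the part of the listing that does follow a recoverable rule can be checked mechanically. The sketch below assumes (this is inferred from the listing, not taken from the owlapi source) that synonyms are compared first by text ignoring case, then by text with case, then by scope. `OBSERVED` is a deduplicated subset of the (text, scope) pairs above, in listing order, and `inferred_key` is a hypothetical name.

```python
# (text, scope) pairs in the order they appear in the listing above,
# with rows that differ only in type or xrefs collapsed to one entry.
OBSERVED = [
    ("A", "BROAD"), ("A", "EXACT"), ("A", "NARROW"), ("A", "RELATED"),
    ("a", "BROAD"), ("a", "EXACT"), ("a", "NARROW"), ("a", "RELATED"),
    ("A ", "BROAD"), ("A ", "EXACT"), ("A ", "NARROW"), ("A ", "RELATED"),
    ("Ab", "BROAD"), ("Ab", "RELATED"),
    ("ab", "BROAD"), ("ab", "RELATED"),
    ("Ac", "BROAD"), ("ac", "BROAD"),
    ("As", "RELATED"), ("as", "NARROW"),
    ("astacin activity", "EXACT"),
    ("Astacus", "RELATED"), ("astacus", "NARROW"),
    ("Astacus proteinase activity", "RELATED"),
    ("astacus proteinase activity", "NARROW"),
]


def inferred_key(pair):
    """Assumed rule: case-insensitive text, then exact text, then scope."""
    text, scope = pair
    return (text.lower(), text, scope)


# Sorting a reversed copy with the assumed key reproduces the listing order.
print(sorted(reversed(OBSERVED), key=inferred_key) == OBSERVED)  # prints True
```

This only covers the first two values of each clause. It says nothing about the type and cross-reference cases in question, which is where the listing stops following an obvious rule.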

@gouttegd

Regarding the ordering of tags in a stanza, it seems to be done according to priority values that are listed here: https://github.com/owlcs/owlapi/blob/0044b995936d6b51ad536d705b6c7f50a6001d1f/oboformat/src/main/java/org/obolibrary/oboformat/parser/OBOFormatConstants.java#L71, e.g.:

/**TAG_ID.   */ TAG_ID    ("id",   10000,  5,      5),
/**TAG_NAME. */ TAG_NAME  ("name", 10000,  15,     15),

The first value is the priority when inside a header frame (here set to 10,000 because those tags do not belong to a header frame), the second value is the priority when inside a term frame, and the last one is the priority when inside a typedef frame.
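
A minimal sketch of how such a priority table could drive tag ordering, written in Python rather than Java for brevity. Only the `id` and `name` rows come from the excerpt above. The names `TagPriority`, `PRIORITIES`, `UNKNOWN` and `order_tags` are hypothetical, and sending unlisted tags to the end is an assumption, not owlapi behaviour.

```python
from typing import NamedTuple


class TagPriority(NamedTuple):
    header: int
    term: int
    typedef: int


# Only these two rows are taken from the thread; a real table lists every tag.
PRIORITIES = {
    "id": TagPriority(10000, 5, 5),
    "name": TagPriority(10000, 15, 15),
}

# Assumption for this sketch: tags missing from the table sort last.
UNKNOWN = TagPriority(10000, 10000, 10000)


def order_tags(tags, frame_type):
    """Return tags sorted by their priority for 'header', 'term' or 'typedef'."""
    return sorted(
        tags,
        key=lambda tag: getattr(PRIORITIES.get(tag, UNKNOWN), frame_type),
    )


print(order_tags(["name", "synonym", "id"], "term"))
# prints ['id', 'name', 'synonym']
```

Because the ordering is pure data, a normative table in the spec could be generated from, or checked against, the same constants.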

For the ordering of clauses with the same tag, it’s done by the ClauseComparator defined here: https://github.com/owlcs/owlapi/blob/0044b995936d6b51ad536d705b6c7f50a6001d1f/oboformat/src/main/java/org/obolibrary/oboformat/writer/OBOFormatWriter.java#L800.

My understanding after a cursory look is that this comparator only compares the first two values of a clause, so that when clauses only differ by their third value (if present) or by their cross-references, the resulting order is not actually specified.
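
To illustrate why that leaves the order unspecified, here is a small Python sketch (not the owlapi code) of a sort key that looks only at the first two values of a clause. The tuple layout `(text, scope, type)` and the name `two_value_key` are assumptions made for the example.

```python
def two_value_key(clause):
    """Sort key that ignores everything after the first two values."""
    return clause[:2]


# Three clauses that differ only in their third value (the synonym type).
typed_y = ("a", "EXACT", "Y")
untyped = ("a", "EXACT", "")
typed_x = ("a", "EXACT", "x")

# All three keys are equal, so the result depends entirely on input order.
print(sorted([typed_y, untyped, typed_x], key=two_value_key))
# prints [('a', 'EXACT', 'Y'), ('a', 'EXACT', ''), ('a', 'EXACT', 'x')]
print(sorted([typed_x, untyped, typed_y], key=two_value_key))
# prints [('a', 'EXACT', 'x'), ('a', 'EXACT', ''), ('a', 'EXACT', 'Y')]
```

Python's `sorted` is stable, so ties keep their input order here. What the owlapi emits for such ties depends on the collection and sort it actually uses, which this sketch does not model.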

@cmungall

Thanks @gouttegd! The declarative piece of code for the tag ordering is useful; we can drive the normative standard from that.

The ordering of clauses within a tag is a bit troubling (and that code looks familiar; it has likely not been altered much since my initial version). It seems that ordering beyond the first two values is determined by some kind of Java internals and could in theory change at any time.
