Add all UBERONParcellations from UBERON Minimal Nervous System subset #133

Open
UlrikeS91 opened this issue May 30, 2024 · 2 comments · Fixed by #134
Labels
major update (large workload or major update needed to complete), request (any request or update for instances that are not covered by Technique, ContentType or SANDS label)

Comments

@UlrikeS91
Contributor

UlrikeS91 commented May 30, 2024

We did not add all relevant UBERONParcellations. Since there have been a few requests for adding more parcellations and these are very straightforward to add, I decided to pull the entire subset and create the openMINDS instances.

A short summary:

  • the subset contains both cell ontology (CL) terms and UBERON terms
    • I have created CellType instances for the CL terms, but I will not make a PR for those (yet); we need to resolve "Need for naming convention of cell type terms - urgent" (#94) before I make any major updates
    • I have created UBERONParcellation instances for the UBERON terms and will make several PRs since this is a list of over 2800 terms
  • the UBERONParcellation instances are structured in the following way:
    • at_id and at_type: as usual, name part of the at_id in lowerCamelCase, with an exception for terms that have a proper name (e.g., Ammon's horn remains capitalized: "AmmonsHorn")
    • definition:
      • auto-generated from ontology properties "is_a" and "relationship" (which is always "is part of") in two sentences, but not all terms have both
    • reference:
      • only "is_a": [auto-generated from 'is_a' property of the [UBERON ontology term](http://purl.obolibrary.org/obo/UBERON_<<ID>>)]
      • only "relationship": [auto-generated from 'relationship' property of the [UBERON ontology term](http://purl.obolibrary.org/obo/UBERON_<<ID>>)]
      • both: [auto-generated from properties of the [UBERON ontology term](http://purl.obolibrary.org/obo/UBERON_<<ID>>) ('is_a' and 'relationship')]
    • description:
      • minimally cleaned-up text from ontology property "def" (definition)
        • capitalized first letter if it was lower case
        • added period at the end if it was missing
        • removed some random symbols (e.g., a single [ at the end of the string)
        • added spaces after a period and capitalized first letter of the next sentence
        • I removed the citation/reference within the definition because it wasn't very useful for our instances (e.g., sometimes had abbreviation of another ontology as reference which wouldn't make sense to keep like this in openMINDS instances)
        • reference: I added [definition of the [UBERON ontology term](http://purl.obolibrary.org/obo/UBERON_<<ID>>)] instead, so that users can go there and see the original reference in its original context where it makes more sense
    • interlexIdentifier: I had a list from quite a few years back which had some terms mapped; I only added those, I don't think all UBERON terms are mapped in InterLex (@tgbugs?) anyway, but if they are it should be easy to supplement them with this later
    • knowledgeSpaceLink: with very few exceptions, all terms received this link; I tested every single one and all lead to the correct term; the exceptions are the ones where I could not find the correct UBERON ID in KS
    • name == UBERON term name
    • preferredOntologyIdentifier: always UBERON ID (as IRI: http://purl.obolibrary.org/obo/UBERON_<<ID>>)
    • synonym:
      • I only kept "EXACT" synonyms and excluded any "BROAD", "NARROW" or "RELATED" synonyms
      • additionally, I excluded "EXACT DEPRECATED" (seemed very old and/or outdated), "EXACT DUBIOUS" (seemed to be mainly exact synonyms but with typos) and "EXACT PLURAL" (we don't collect the plural as synonyms)
      • the remaining types that were kept are:
        • EXACT
        • EXACT ABBREVIATION
        • EXACT BRAIN_NAME_ABV
        • EXACT HUMAN_PREFERRED
        • EXACT LATIN
        • EXACT NON_AMNIOTE
        • EXACT SENSU
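
The at_id naming, description clean-up and synonym filtering described above could be sketched roughly as follows. This is only an illustration under assumptions, not the script that was actually used: the function names (`at_id_name`, `clean_description`, `keep_synonyms`), the `proper_name` flag, the regular-expression heuristics and the `(text, type)` tuple layout for synonyms are all hypothetical, and the example inputs are made up.

```python
import re

# Synonym types that are kept, copied from the list above; all others are dropped.
KEPT_SYNONYM_TYPES = {
    "EXACT",
    "EXACT ABBREVIATION",
    "EXACT BRAIN_NAME_ABV",
    "EXACT HUMAN_PREFERRED",
    "EXACT LATIN",
    "EXACT NON_AMNIOTE",
    "EXACT SENSU",
}


def at_id_name(term_name, proper_name=False):
    """Name part of the at_id: lowerCamelCase, or capitalized for proper names.

    Assumption: apostrophes are dropped and any other non-alphanumeric
    character (space, hyphen, ...) is treated as a word separator.
    """
    words = [w.replace("'", "") for w in re.findall(r"[A-Za-z0-9']+", term_name)]
    joined = "".join(w[0].upper() + w[1:] for w in words if w)
    if joined and not proper_name:
        joined = joined[0].lower() + joined[1:]
    return joined


def clean_description(text):
    """Minimal clean-up of the ontology 'def' text (heuristic sketch)."""
    text = text.strip()
    # Remove a stray "[" left at the end of the string.
    text = re.sub(r"\s*\[$", "", text)
    # Add a missing space after a sentence-ending period. Requiring two
    # letters before the period avoids splitting abbreviations like "e.g.".
    text = re.sub(r"(?<=[a-z]{2})\.(?=[A-Za-z])", ". ", text)
    # Capitalize the first letter of each following sentence.
    text = re.sub(r"(?<=[a-z]{2}\. )([a-z])", lambda m: m.group(1).upper(), text)
    # Capitalize the very first letter if it was lower case.
    if text:
        text = text[0].upper() + text[1:]
    # Add the final period if it was missing.
    if text and not text.endswith((".", "!", "?")):
        text += "."
    return text


def keep_synonyms(synonyms):
    """Keep only synonyms whose type is in KEPT_SYNONYM_TYPES.

    Assumption: synonyms are already parsed into (text, type) tuples.
    """
    return [text for text, kind in synonyms if kind in KEPT_SYNONYM_TYPES]


if __name__ == "__main__":
    print(at_id_name("Ammon's horn", proper_name=True))
    print(at_id_name("dorsal root ganglion"))
    print(clean_description("a nucleus of the brain.it is part of the pons ["))
    print(keep_synonyms([("cornu ammonis", "EXACT LATIN"), ("horns", "EXACT PLURAL")]))
```

Whether a term counts as a proper name is passed in by hand here; the clean-up heuristics would still need a manual check for edge cases such as abbreviations in the middle of a definition.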

Please note that it is up for discussion which of the over 2800 terms should be added as an UBERONParcellation! I don't think we should add all from this subset, but I want those discussions to happen in the repo (or offline with a conclusion in the corresponding PR).

UlrikeS91 added the "request" (any request or update for instances that are not covered by Technique, ContentType or SANDS label) and "major update" (large workload or major update needed to complete) labels on May 30, 2024
@UlrikeS91
Contributor Author

related issues with UBERON requests:
#131
#104

@UlrikeS91
Contributor Author

This will also be an update of existing terms, since the convention for the definition (and description) has changed; e.g., definitions no longer start with a repetition of the term's name.
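
To find existing instances that still follow the old convention, a check along these lines might help. It is a rough sketch under assumptions: the helper name `starts_with_term_name` is hypothetical, and the guess that an old-style definition may begin with an optional article and quote mark before the term name is not taken from the repository.

```python
import re


def starts_with_term_name(name, definition):
    """Return True if the definition begins by repeating the term name.

    Assumption: an optional leading article ("The", "A", "An") and an
    optional opening quote are allowed before the name; matching is
    case-insensitive.
    """
    pattern = (
        r"^\s*(?:(?:the|a|an)\s+)?['\"]?"
        + re.escape(name.strip())
        + r"(?!\w)"  # the name must not continue into a longer word
    )
    return re.match(pattern, definition, flags=re.IGNORECASE) is not None


if __name__ == "__main__":
    # Made-up example strings, only to illustrate the check.
    print(starts_with_term_name("abducens nerve", "The 'abducens nerve' is a cranial nerve."))
    print(starts_with_term_name("abducens nerve", "Is a cranial nerve."))
```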
