Switch to unified DOSDP template #97

Open
dosumis opened this issue May 26, 2021 · 5 comments
@dosumis
Contributor

dosumis commented May 26, 2021

The pipeline has grown overly complex. It could be simplified to run mostly through a unified DOSDP template allowing a range of variables to feed into automated definitions/label/synonyms etc.

The major dependency for this is an update to DOSDP + DOSDP_tools to allow list variables with 0-many cardinality, plus a templating system that can work with these. For discussion of possible extensions see:

INCATools/dead_simple_owl_design_patterns#71

@hkir-dev
Contributor

The first iteration of the migration to DOSDP templates is complete: https://github.com/obophenotype/brain_data_standards_ontologies/tree/dosdp_based_pipeline/src/patterns/dosdp-patterns

To keep changes minimal, I kept the TSV structures as they are. These can be refactored later to build a unified template.

Migration of the ROBOT-based creation of BDS individuals failed, since it seems that DOSDP does not support named individuals yet: INCATools/dead_simple_owl_design_patterns#64

@hkir-dev
Contributor

When a list is provided to a logical axiom, DOSDP constructs an intersectionOf over its members, for example:

<rdfs:subClassOf>
    <owl:Class>
        <owl:intersectionOf rdf:parseType="Collection">
            <owl:Restriction>
                <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/RO_0002292"/>
                <owl:someValuesFrom rdf:resource="http://identifiers.org/ensembl/ENSMUSG00000022206"/>
            </owl:Restriction>
            <owl:Restriction>
                <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/RO_0002292"/>
                <owl:someValuesFrom rdf:resource="http://identifiers.org/ensembl/ENSMUSG00000030905"/>
            </owl:Restriction>
        </owl:intersectionOf>
    </owl:Class>
</rdfs:subClassOf>

ROBOT was generating multiple direct subClassOf relations for those cases (without intersectionOf). While both forms are logically equivalent, the intersectionOf form caused a problem in neo4j2owl: it does not seem to support intersectionOf/unionOf constructs and would need significant refactoring to do so.

Ideally this should be solved by the ROBOT Expression Materializing Reasoner in the vfb_pipeline_dumps step, but with our 15 MB ontology we get an out-of-memory error.

As a workaround, subClassOf definitions that are an intersection of a set of classes are handled through SPARQL in the dumps phase.

The same should be applied to equivalent classes that are an intersection of a set of expressions (classes, existential restrictions, etc.). They should be unpacked into a set of subClassOf (not equivalentClass) definitions, although some logical expressiveness will be lost.
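
To illustrate the unpacking idea outside of SPARQL, here is a minimal Python sketch using only the standard library. It is not the query used in the dumps phase. The subject IRI `http://example.org/HYPOTHETICAL_0000001` and the function name `unpack_intersections` are made up for illustration, and the sketch only handles existential restrictions nested directly under an intersectionOf.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"

# The axiom from the comment above, wrapped in a class with a HYPOTHETICAL IRI.
SNIPPET = f"""
<owl:Class xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:owl="{OWL}"
           rdf:about="http://example.org/HYPOTHETICAL_0000001">
  <rdfs:subClassOf>
    <owl:Class>
      <owl:intersectionOf rdf:parseType="Collection">
        <owl:Restriction>
          <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/RO_0002292"/>
          <owl:someValuesFrom rdf:resource="http://identifiers.org/ensembl/ENSMUSG00000022206"/>
        </owl:Restriction>
        <owl:Restriction>
          <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/RO_0002292"/>
          <owl:someValuesFrom rdf:resource="http://identifiers.org/ensembl/ENSMUSG00000030905"/>
        </owl:Restriction>
      </owl:intersectionOf>
    </owl:Class>
  </rdfs:subClassOf>
</owl:Class>
"""


def unpack_intersections(class_xml):
    """Return one (subject, property, filler) triple per restriction,
    i.e. the flat 'subject subClassOf (property some filler)' form."""
    root = ET.fromstring(class_xml.strip())
    subject = root.get(f"{{{RDF}}}about")
    path = (
        f"{{{RDFS}}}subClassOf/{{{OWL}}}Class/"
        f"{{{OWL}}}intersectionOf/{{{OWL}}}Restriction"
    )
    axioms = []
    for restriction in root.findall(path):
        prop = restriction.find(f"{{{OWL}}}onProperty").get(f"{{{RDF}}}resource")
        filler = restriction.find(f"{{{OWL}}}someValuesFrom").get(f"{{{RDF}}}resource")
        axioms.append((subject, prop, filler))
    return axioms


if __name__ == "__main__":
    for subject, prop, filler in unpack_intersections(SNIPPET):
        print(f"<{subject}> subClassOf (<{prop}> some <{filler}>)")
```

Each returned triple corresponds to one direct subClassOf axiom, which is the shape ROBOT used to emit. Applying the same flattening to an equivalentClass intersection would keep only the necessary conditions, which is where the loss of expressiveness comes from.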

@hkir-dev
Contributor

hkir-dev commented Sep 3, 2021

Now we have 6 dosdp templates (https://github.com/obophenotype/brain_data_standards_ontologies/tree/dosdp_based_pipeline/src/patterns/dosdp-patterns):

  1. brainCellRegionMinimalMarkers.yaml
  2. taxonomy_class.yaml
  3. taxonomy_equivalent_class.yaml
  4. taxonomy_minimal_markers.yaml
  5. taxonomy_non_taxonomy_classification.yaml
  6. ensmusg.yaml

Templates 1, 2, 3 and 4 can be unified into a single big class template, which would give us a table with 27 columns. Should we merge all of them or go with a subset (such as merging only 1 and 2)?

@dosumis
Contributor Author

dosumis commented Sep 6, 2021

ensmusg.yaml will remain a separate (ROBOT) build. It is used to support imports.

We should be able to manage with (many?) fewer than 27 columns for the rest. For example, the minimal markers variable needed to generate the definition and synonyms is the same one needed for the logical axioms. This needs a comprehensive review in the context of the pipeline scripts, configs and templates.
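
As a rough sketch of the column-sharing idea (plain Python, not DOSDP syntax): a single list variable with 0-many items can drive the definition, the synonyms and the logical axioms at once. The function name `render_term`, the text patterns and the example marker names are all hypothetical and not taken from the actual templates.

```python
def render_term(label, markers):
    """Derive def, synonyms and axioms from ONE list variable.

    `markers` may hold zero or more items; every output degrades
    gracefully when the list is empty.
    """
    # One existential restriction per marker (RO_0002292 is 'expresses').
    axioms = [f"'expresses' some '{m}'" for m in markers]

    if markers:
        # Hypothetical wording, only to show the variable being reused.
        definition = f"A {label} that selectively expresses {', '.join(markers)}."
        synonyms = [f"{'|'.join(markers)}-expressing {label}"]
    else:
        definition = None
        synonyms = []

    return {"def": definition, "synonyms": synonyms, "axioms": axioms}


if __name__ == "__main__":
    print(render_term("example neuron", ["GeneA", "GeneB"]))
    print(render_term("example neuron", []))
```

With this shape, the table needs one markers column rather than separate columns for text generation and for axioms.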

hkir-dev added a commit that referenced this issue Sep 8, 2021
hkir-dev added a commit that referenced this issue Sep 8, 2021
@hkir-dev
Contributor

hkir-dev commented Sep 8, 2021

This branch contains the unification updates: https://github.com/obophenotype/brain_data_standards_ontologies/tree/single_dosdp_template

  1. ensmusg.yaml remains a ROBOT template.
  2. taxonomy_class.yaml, brainCellRegionMinimalMarkers.yaml and taxonomy_minimal_markers.yaml are merged into a single template: taxonomy_class.yaml.
  3. The same template will be used for different species. For example, taxonomy_class.yaml will be used with several data files: CCN202002013_class.tsv, CCN201912131_class.tsv and CCN201912132_class.tsv. To support this:
    a- Automatic DOSDP pattern rolling is disabled (it requires the template and the data file to share the same name); bdscratch.Makefile manages this instead.
    b- Pattern term extraction is re-implemented in bdscratch.Makefile (it previously required the same template and data file name). The current solution is not elegant and should be solved with subst.
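
The many-data-files-to-one-template mapping can be sketched in Python as below. This is not the Makefile logic itself, only an illustration of the string substitution that a `subst`-style rule would perform. It assumes data files are named `<taxonomy id>_<kind>.tsv` and templates `taxonomy_<kind>.yaml`; the function name `template_for` is hypothetical.

```python
from pathlib import Path


def template_for(data_file, prefix="taxonomy"):
    """Map a data file such as CCN202002013_class.tsv to its shared template.

    Assumes the part before the first underscore is the taxonomy id and
    the remainder names the template kind.
    """
    stem = Path(data_file).stem            # e.g. "CCN202002013_class"
    _, _, kind = stem.partition("_")       # e.g. "class"
    if not kind:
        raise ValueError(f"cannot derive a template name from {data_file!r}")
    return f"{prefix}_{kind}.yaml"


if __name__ == "__main__":
    for name in ("CCN202002013_class.tsv",
                 "CCN201912131_class.tsv",
                 "CCN201912132_class.tsv"):
        print(name, "->", template_for(name))
```

All three data files resolve to taxonomy_class.yaml, which is the relationship the Makefile has to encode once template and data file names no longer match one to one.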
