Consistency tests in FIBO continuous integration #1732

mereolog · 2021-12-13T10:22:11Z

mereolog
Dec 13, 2021
Maintainer

The question of whether we should automate consistency checking was raised in the ontology-publisher repo - see: edmcouncil/ontology-publisher#59.

Pull request edmcouncil/ontology-publisher#65 allows us to do so, but running the checks takes time - checking both PROD and DEV adds an additional 4 hours (sic!) to the process.

So presumably it does not make sense to run it for every change/commit in every branch of FIBO. For example, we could:

- run it just for the master branch
- run it just for the PRs that have a designated label, e.g., 'check consistency'.

Obviously there may be other possibilities.

Btw., the current assumption is that:

- if PROD is inconsistent, the process is terminated (so this is treated as an error)
- if DEV is inconsistent, only a warning is thrown.
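
A minimal sketch of how the gating and the error/warning policy above could be expressed. It is not the ontology-publisher implementation: the names `BuildContext`, `should_run_consistency_check` and `evaluate` are hypothetical, and combining the two options (master branch, or a labelled PR) into a single rule is an assumption made only for illustration.

```python
from dataclasses import dataclass, field

# Label name taken from the example in the post.
CONSISTENCY_LABEL = "check consistency"


@dataclass
class BuildContext:
    """Hypothetical description of the build that triggered the CI run."""
    branch: str
    is_pull_request: bool = False
    labels: set[str] = field(default_factory=set)


def should_run_consistency_check(ctx: BuildContext) -> bool:
    """Run the slow check only for master or for explicitly labelled PRs."""
    if ctx.is_pull_request:
        return CONSISTENCY_LABEL in ctx.labels
    return ctx.branch == "master"


def evaluate(prod_consistent: bool, dev_consistent: bool) -> tuple[int, list[str]]:
    """Return (exit_code, messages) following the stated assumption:
    an inconsistent PROD terminates the process, an inconsistent DEV only warns."""
    messages: list[str] = []
    if not prod_consistent:
        messages.append("ERROR: PROD is inconsistent")
        return 1, messages
    if not dev_consistent:
        messages.append("WARNING: DEV is inconsistent")
    return 0, messages
```

With this rule, an unlabelled feature-branch build skips the check entirely, so the extra hours are only paid on master and on PRs where someone has asked for the check.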

rivettp · 2022-01-26T00:59:27Z

rivettp
Jan 26, 2022
Collaborator

How much time to check only PROD (and not DEV)?
How clean is DEV right now?

BTW, for useful tests we should aim to have test individuals for each class, kept separate from the files we expect people to use for real.

mereolog · 2022-01-28T09:41:40Z

mereolog
Jan 28, 2022
Maintainer Author

We now have more data: testing PROD takes approx. 1 hour on average, and testing DEV approx. 1.5 hours.

rivettp · 2022-01-28T16:00:32Z

rivettp
Jan 28, 2022
Collaborator

Maybe another argument for segregating reference data from the ontologies. As I've proposed before, I think there are the following categories of Individual:
-1) enumerated distinguished individuals - which are part of the ontology (e.g. used in oneOf restrictions)
-2) test individuals (used to exercise all parts of the ontology to show it works, unlikely to be realistic data) (we don't have these)
-3) exemplars (small number of cases to illustrate usage - we already have ExampleIndividuals ontologies)
-4) reference datasets (complete coverage of a subject area e.g. Currencies with definitive/canonical URIs)

could readily be excluded from hygiene tests. Not sure about 3) but since we don't have any of 2) we should keep running them through hygiene.
In some cases it's not clear whether some data is intended to be 3) or 4), e.g. the set of Regulatory Authority individuals. And if people want to add their own, are they supposed to have a mix of those and the FIBO ones? Or ignore the FIBO ones?
And even when something is clearly meant to be 3), we often have more individuals than are needed to illustrate the usage. That makes it harder to use: people don't know whether they should be looking for something different in each. Maybe we should have a comment to say what aspects each individual is illustrating.

ElisaKendall · 2022-01-28T17:08:56Z

ElisaKendall
Jan 28, 2022
Collaborator

The "all" files in the various domain areas make these distinctions - we should provide about files at the top level that do the same. I'll add that to my pile of todos. Elisa

rivettp · 2022-01-28T18:41:10Z

rivettp
Jan 28, 2022
Collaborator

There are still blurry areas that we need to disentangle, and we need to document our intent better.

mereolog · 2022-02-01T08:55:53Z

mereolog
Feb 1, 2022
Maintainer Author

Excluding some or all individuals from the consistency check may hide "implicit" inconsistency, i.e., unsatisfiable classes.

The current check, and its reported performance, is the bog-standard consistency check, in which we effectively check whether owl:Thing is satisfiable. So even if owl:Thing is satisfiable, there may be some other class that is unsatisfiable and has instances; if we ignore the instances, the check will let this through.

As you can guess, checking for unsatisfiable classes takes much more time. The last time I was patient enough to run it, it took several hours for PROD.
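
To illustrate the difference between the two checks, here is a toy sketch. It is not an OWL reasoner and says nothing about FIBO itself: it models only subclass and disjointness axioms, and the class names `Left`, `Right`, `Both` and the individual `x1` are invented for the example. A class is treated as unsatisfiable when its superclasses include two disjoint classes; the ontology is treated as inconsistent only when some individual ends up in two disjoint classes.

```python
from dataclasses import dataclass, field


@dataclass
class ToyOntology:
    subclass_of: dict[str, set[str]] = field(default_factory=dict)  # class -> direct superclasses
    disjoint: set[frozenset[str]] = field(default_factory=set)      # pairs of disjoint classes
    instances: dict[str, set[str]] = field(default_factory=dict)    # individual -> asserted classes

    def ancestors(self, cls: str) -> set[str]:
        """Return cls together with all of its transitive superclasses."""
        seen: set[str] = set()
        stack = [cls]
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.subclass_of.get(current, ()))
        return seen

    def unsatisfiable_classes(self) -> set[str]:
        """Classes whose superclasses contain a disjoint pair (the expensive check)."""
        names = set(self.subclass_of) | {c for s in self.subclass_of.values() for c in s}
        return {c for c in names
                if any(pair <= self.ancestors(c) for pair in self.disjoint)}

    def is_consistent(self) -> bool:
        """Plain consistency: no individual belongs to two disjoint classes."""
        for asserted in self.instances.values():
            types: set[str] = set()
            for cls in asserted:
                types |= self.ancestors(cls)
            if any(pair <= types for pair in self.disjoint):
                return False
        return True


onto = ToyOntology(
    subclass_of={"Both": {"Left", "Right"}},
    disjoint={frozenset({"Left", "Right"})},
    instances={"x1": {"Both"}},
)
print(onto.is_consistent())          # False: x1 is in two disjoint classes
onto.instances.clear()               # "exclude the individuals from the check"
print(onto.is_consistent())          # True: the problem is now hidden
print(onto.unsatisfiable_classes())  # {'Both'}: only this check still finds it
```

In the toy model, dropping the individuals makes the plain consistency check pass even though `Both` can never have an instance, which is the "implicit" inconsistency described above.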
