Rename gist instances to use our naming convention #556

Open
rjyounes opened this issue Sep 8, 2021 · 25 comments
Labels
  • impact: minor (New, backward-compatible functionality; does not change inferences; e.g., adding a term)
  • status: deferred (Deferred to a later release for reasons other than it is a major change)
  • topic: instances
  • topic: units and measures

Comments

@rjyounes
Collaborator

rjyounes commented Sep 8, 2021

Naming convention from style guide:


  • Leading underscore
  • An infix consisting of the name of the class that is the most specific rigid class the instance belongs to
  • A single underscore
  • The name of the instance, with spaces and hyphens replaced by underscores (no camelcasing) and only alphanumeric characters and underscores allowed
  • Leave case as it is

A "rigid" class is one that the instance inherently belongs to; it is part of the essence of the object, which would not be the same object if it did not belong to this class. A non-rigid class may be temporary and/or express a role or relationship; for example, Child, Patient, Employee. The notion of rigid classes originates in OntoClean.

The "most specific rigid" class is the rigid class that the instance most directly belongs to.

For example, given the class hierarchy Living Thing > Person > Student, where the first two classes are rigid and the third is not, the name for Sir Tim Berners-Lee is _Person_Sir_Tim_Berners_Lee.
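
The convention above can be sketched in Python. This is only an illustration: the function names, the `(class_name, is_rigid)` representation of the hierarchy, the `LivingThing` spelling, and the ASCII-only reading of "alphanumeric" are assumptions, not part of the style guide.

```python
import re


def most_specific_rigid_class(lineage):
    """Return the most specific rigid class from a lineage.

    `lineage` is ordered from most general to most specific, and each
    entry is a (class_name, is_rigid) pair. This representation is an
    assumption made for the example.
    """
    for class_name, is_rigid in reversed(lineage):
        if is_rigid:
            return class_name
    raise ValueError("no rigid class in lineage")


def instance_local_name(rigid_class, label):
    """Build an instance name following the convention."""
    # Spaces and hyphens become underscores; case is left untouched.
    name = re.sub(r"[ -]", "_", label)
    # Keep only alphanumerics and underscores (ASCII assumed here).
    name = re.sub(r"[^A-Za-z0-9_]", "", name)
    # Leading underscore, class infix, single underscore, then the name.
    return f"_{rigid_class}_{name}"


# Living Thing > Person > Student; only the first two are rigid.
lineage = [("LivingThing", True), ("Person", True), ("Student", False)]
rigid = most_specific_rigid_class(lineage)
print(instance_local_name(rigid, "Sir Tim Berners-Lee"))
```

Walking the lineage from the most specific end and stopping at the first rigid class is what skips `Student` and yields `_Person_Sir_Tim_Berners_Lee`.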


None of the gist individuals for units and durations follow this convention.

We should also use the namespacing convention we use for client projects - i.e., we should use https://taxonomies.semanticarts.com/gist/ (gistx). As a result we would have, for example, gistx:_DurationUnit_minute. See issue #305.

Original terms can be deprecated to make this a minor change.
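
A rough sketch of what deprecating an original term might look like when generating Turtle. The gistx namespace comes from the comment above; the `gist:` namespace IRI, the old local name `_minute`, the helper name, and the use of `owl:deprecated` with `rdfs:seeAlso` are assumptions for illustration, not the project's confirmed deprecation pattern.

```python
GIST = "https://ontologies.semanticarts.com/gist/"   # assumed gist: namespace
GISTX = "https://taxonomies.semanticarts.com/gist/"  # gistx: namespace from the issue


def deprecation_turtle(old_local, new_local):
    """Return Turtle marking the old IRI as deprecated and pointing to the new one."""
    return "\n".join([
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        "",
        f"<{GIST}{old_local}>",
        "    owl:deprecated true ;",
        f"    rdfs:seeAlso <{GISTX}{new_local}> .",
    ])


# "_minute" is a hypothetical original local name.
print(deprecation_turtle("_minute", "_DurationUnit_minute"))
```

Keeping the old IRI resolvable while flagging it as deprecated is what would let the rename count as a backward-compatible change.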

@rjyounes rjyounes added impact: minor New, backward-compatible functionality (does not change inferences; e.g., adding a term) topic: instances labels Sep 8, 2021
@uscholdm
Contributor

uscholdm commented Sep 8, 2021

This is a good idea, though it will make all the IRIs long and clunky.

@rjyounes
Collaborator Author

rjyounes commented Sep 9, 2021

Submit one PR for #370, #526, and #556.

@rjyounes rjyounes added the status: implementation specified Implementation has been specified. A developer should be assigned. label Sep 10, 2021
@rjyounes rjyounes assigned JessSing and dylan-sa and unassigned uscholdm Sep 23, 2021
@rjyounes
Collaborator Author

Existing terms need to be added to a gistDeprecated.ttl file - which doesn't currently exist because we just removed it for the major release.

JessSing added a commit that referenced this issue Sep 23, 2021
@rjyounes rjyounes added status: under review In triage and removed status: implementation specified Implementation has been specified. A developer should be assigned. labels Sep 23, 2021
@rjyounes rjyounes assigned uscholdm and unassigned JessSing and dylan-sa Sep 24, 2021
@rjyounes
Collaborator Author

See discussion thread on PR #575 (now closed because the issue needs further review).

@rjyounes
Collaborator Author

I just noticed that in my current project I've been using the _UnitOfMeasure_ infix, so perhaps that's the way to go after all. In fact, you could argue that BaseUnit is not a rigid class: these units are base units because we've defined them as such, but in another units of measure model they might not be - e.g., minute could be the base duration unit rather than second, or euro could be the base CurrencyUnit. A second would still be a second even if it weren't a base unit.

This would still be an exception to our general 'most specific rigid class' rule, however: a second would not be a second if it were not a DurationUnit. We have to think of a way to express the exception to our convention in a non-arbitrary way. @uscholdm Thoughts?

@uscholdm
Contributor

I did not follow this fully; it's probably best to have a chat to make progress.

@rjyounes
Collaborator Author

rjyounes commented Jan 27, 2022

Defer until #305 is implemented so we don't change the names more than once.

@dylan-sa dylan-sa added status: deferred Deferred to a later release for reasons other than it is a major change status: implementation specified Implementation has been specified. A developer should be assigned. and removed status: deferred Deferred to a later release for reasons other than it is a major change labels Apr 14, 2022
@rjyounes
Collaborator Author

@dylan-sa Why does this need to be deferred? We can deprecate the existing terms, as is our usual practice.

@rjyounes rjyounes removed the status: under review In triage label May 12, 2022
@rjyounes
Collaborator Author

Deprecate existing URIs.

@rjyounes rjyounes added status: under review In triage and removed status: implementation specified Implementation has been specified. A developer should be assigned. labels May 12, 2022
@rjyounes
Collaborator Author

Re rigid class issue above: Michael proposes we could use the most specific rigid class in gist, and that clients could handle it themselves - e.g., by defining their own instances with "UnitOfMeasure" infix and owl:sameAs to the gist instances.
Current instances to be deprecated.
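
The client-side workaround described here could look roughly like the following. The client namespace, the helper name, and the renamed gist instance IRI are all hypothetical; only the "UnitOfMeasure" infix and the use of `owl:sameAs` come from the proposal.

```python
def client_alias_turtle(client_ns, gist_instance_iri, unit_name):
    """Return Turtle declaring a client instance as owl:sameAs a gist instance."""
    # The client keeps its preferred, less specific infix.
    alias = f"{client_ns}_UnitOfMeasure_{unit_name}"
    return (
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .\n\n"
        f"<{alias}> owl:sameAs <{gist_instance_iri}> .\n"
    )


print(client_alias_turtle(
    "https://taxonomies.example.com/client/",  # hypothetical client namespace
    "https://taxonomies.semanticarts.com/gist/_DurationUnit_second",  # hypothetical renamed instance
    "second",
))
```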

Open question: are these individuals categories or data? I.e., gistx vs. gistd namespace?

@uscholdm
Contributor

Open question: are these individuals categories or data? I.e., gistx vs. gistd namespace?

A question I ask is whether the information is part of the subject matter of the client business. If it is, it goes in the taxo namespace, otherwise it is data. Let's look at some client examples.

  • Schneider Electric: the subject matter for this client is electrical products. To describe electrical products, I needed to add a bunch of new units and new Magnitudes. The product data uses these units.
  • Platts: Their core business is commodities such as oil, aluminum and wheat. To define Brent crude oil I needed to create a unit for kinematic viscosity and one for sulfur concentration. I had to add a bunch of other units to describe grades of different commodities. Their product data makes use of these units.
  • MD Andersen: Their core business includes measuring things like blood glucose levels and blood pressure and the like. Again, I created new units that were used to describe patient data.

In all three cases, the units and new subclasses of gist:Magnitude (that go hand in hand) are describing the subject matter relating to the core business of the client. They are used to create client data. So to me, the units are a much better fit to be in a gistx namespace than a gistd one.

It would be different if we worked for the international standards organization that tracked all sorts of units and measures. For them, it would probably be as good a fit, or better, to regard the units as data.

@rjyounes
Collaborator Author

A question I ask is whether the information is part of the subject matter of the client business

Since gist is an upper ontology, with no defined subject matter, we have to decide whether the instances are taxonomic or data. Once we take one stance with gist, client models should follow suit. It would be odd if a unit defined in gist is data while it's taxonomic in a client model, or vice versa.

Given that UnitOfMeasure and Magnitude are not subclasses of Category, I don't see how these can be taxonomic terms - on the assumption that only categories are taxonomic terms, but I'm not sure this is a valid assumption. In addition, they serve as subjects and objects of predicates such as hasBaseUnit, hasStandardUnit, and so on, suggesting they are data - i.e., we have things to say about them other than where they fit in a hierarchy. It doesn't make sense to say that a category has a base unit.

One might say that if it's a controlled vocabulary it's taxonomic. At the same client, we have a curated set of suppliers and manufacturers (organizations). Again, Organization is not a subclass of Category, and in addition these instances do things, such as selling and manufacturing products.

@uscholdm
Contributor

This is not a clear-cut decision - your points are generally valid. It's a matter of considering tradeoffs and personal preferences. On the other hand:

It occurs to me that if we go with the International Standards Body, a unit of measure is just a Magnitude - we went through this once before and decided against it. But if we did go with this approach, the data namespace seems to make more sense.

@rjyounes
Collaborator Author

That is still on the table as part of the units and measures work: #61.

@rjyounes
Collaborator Author

rjyounes commented Sep 8, 2022

This will require more discussion due to the issue of most specific not always being appropriate.

@rjyounes
Collaborator Author

rjyounes commented May 2, 2023

Note: although all the gist instances are currently units of measure, I don't think the topic here is units and measures per se, so am removing it from that discussion.

@rjyounes
Collaborator Author

I vote to close this issue. There are only a handful of terms at stake, all units of measure, and there is considerable disagreement about whether uoms are taxonomy data or not. IMO it's not worth hashing out.

@uscholdm
Contributor

Agree to not address this for now. Is there a 'dormant' status, as opposed to dead?

@rjyounes rjyounes added status: deferred Deferred to a later release for reasons other than it is a major change and removed status: under review In triage labels Jul 27, 2023
@rjyounes
Collaborator Author

Closing and labelling as deferred.

@rjyounes rjyounes closed this as not planned Jul 27, 2023
@rjyounes rjyounes reopened this Jul 27, 2023
@rjyounes
Collaborator Author

rjyounes commented Jul 27, 2023

Deferred to the discussion of units and measures: issues #759 and #697.

@rjyounes
Collaborator Author

@uscholdm @dylan-sa @coltonglasgow What are your thoughts about implementing the naming convention in sub-gists? A couple of them have small taxonomies; should we put them in a gistx: namespace even though that does not currently exist? It seems a good opportunity to follow our best practices.

@uscholdm
Contributor

As of 12.0.1 there are only a dozen or so instances and every one is related to units and magnitudes. It can be addressed in #697.


@rjyounes
Collaborator Author

@uschold I was referring to sub-gists.

@dylan-sa
Contributor

dylan-sa commented Dec 1, 2023

I like the idea of using gistx: for the taxonomy instances, similar to how we do with client ontologies. One thing to consider: While many of the sub-gist instances would definitely go into gistx:, we have some units of measure, too. Did we reach an agreement about whether UoMs should go into gistx: or gistd:? Maybe we could leave UoMs in gist: for now but move the obvious taxo instances into gistx:?

@uscholdm
Contributor

uscholdm commented Dec 1, 2023

I think we should do whatever we normally do with clients, to the extent that we all do the same thing. If there are differences, we should look into them and make a choice.


4 participants