Semantic tagging #1093
Comments
I'd like to think about extending the existing ontology with support for WindowCoverings, GarageDoors, Locks and Fans. (Using HomeKit's terms here.) Any thoughts on including these in the ontology? For comparison with HomeKit, the characteristics of a WindowCovering are: GarageDoor supports a value of: Open, Closed, Opening, Closing, Stopped. You can only write Open or Closed. Lock supports a value of: Unsecured, Secured, Jammed, Unknown. You can only write Unsecured or Secured. Fan has: Some of these are interesting because they really represent an enumeration of possible values (e.g. Lock state). You could conceivably implement this as a series of mutually exclusive Switches, but that makes for a pretty terrible experience in the UI. They could be implemented as Strings, with the supported values limited by convention only. That might be a good starting point, but it would be nice to have something more tightly enforced by the framework. If we're moving to a paradigm where Bindings are declaring their items within the ontology, we could and should enforce values. I'd add that heating/cooling system state (Off/HeatOn/CoolOn/Auto) needs this as well. |
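To make the "enumeration with a restricted writable subset" idea concrete, here is a minimal sketch using the lock states listed above. The names `LockState`, `WRITABLE` and `validate_command` are hypothetical and are not part of ESH or HomeKit; this only illustrates what framework-side enforcement could look like.

```python
from enum import Enum


class LockState(Enum):
    # Values taken from the HomeKit list quoted above.
    UNSECURED = "Unsecured"
    SECURED = "Secured"
    JAMMED = "Jammed"
    UNKNOWN = "Unknown"


# Hypothetical: only these states may be sent as commands;
# the others can only be reported by the device.
WRITABLE = frozenset({LockState.UNSECURED, LockState.SECURED})


def validate_command(raw: str) -> LockState:
    """Parse a command string and reject anything outside the writable subset."""
    try:
        state = LockState(raw)
    except ValueError:
        raise ValueError(f"unknown lock state: {raw!r}") from None
    if state not in WRITABLE:
        raise ValueError(f"{state.value} can be reported but not commanded")
    return state
```

With a String item limited "by convention only", both of the rejected cases above would be silently accepted.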
@kaikreuzer - any thoughts? |
I'm currently travelling - please allow me until next week before I can answer (in a meaningful way ;-))! |
Thanks - enjoy your trip! |
HomeKit also supports new Sensors in iOS 10 like Smoke or Movement Sensors. This should also be supported by openHAB. |
@kaikreuzer - just reminding you about this |
Thanks for the reminder, I sense the urgency, but I am still not on it :-/ Sorry for asking for that much patience... |
Understood - it's not a trivial topic. |
@digitaldan - maybe you have some thoughts on this too? I think the Alexa add-on has the same challenge with enum values for the heating/cooling state. |
I do have this issue. Right now I have to accept strings that match "OFF, AUTO, HEAT, COOL"; I also accept the numbers "0, 1, 2, 3", which map to those values. Obviously this will not work for some/most people; having these associated with the item would be better. I also look for a "Fahrenheit" tag to try to set the user's preferred unit of measurement.
Agreed, where do we start? I read through the SAREF spec; I think it's a good starting reference, although I can't find anyone else using it. Thinking completely selfishly about what I need for the Alexa and HueEmulation bindings:
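The string/number handling described above could be sketched roughly as follows. This is not the Alexa add-on's actual code; `MODES` and `normalize_mode` are hypothetical names, and the assumption that 0..3 map to the modes in the order listed is mine.

```python
from typing import Optional, Union

# Assumed order: the numeric form 0..3 is the index into this tuple.
MODES = ("OFF", "AUTO", "HEAT", "COOL")


def normalize_mode(value: Union[str, int]) -> Optional[str]:
    """Return the canonical mode name, or None if the value is not recognised."""
    text = str(value).strip().upper()
    if text in MODES:
        return text
    if text.isdecimal() and int(text) < len(MODES):
        return MODES[int(text)]
    return None
```

A device reporting "HeatOn" would fall through to `None` here, which is the interoperability gap being discussed.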
Another issue is what I want these exposed to. In my case I have the HueEmulation and Homekit bindings plus an external Alexa REST app. I think it may be a more common problem as more and more apps start depending on the tagging ontologies. Not sure how to solve this without getting into tagging hell. |
Would it make sense to have an optional capability in bindings to translate between the semantics of a chosen ontology and the semantics of the underlying systems? Thermostats, as a class of thing, have widely differing semantics across vendors and technologies, and it seems to me that the proper place to translate would be in the binding code, since it is where the full domain knowledge resides and evolves. |
I would agree. If I use the zwave binding as an example, devices have a "Generic Device Class" property, so it would be fairly easy to map an ontology to it. Also, if nearly all items map to a chosen ontology, I would not want to burden the end user with having to do all of that manually. Given a much more complex and integrated strategy about using a formal ontology (our own or something like SAREF), does it still make sense to use the general tagging system for this, or would we create something new to hold this data? |
It does make sense for the binding to provide the mapping to the ontology type and translation to an enum state. We'd likely still need to come up with a way to let the user define these (probably via tagging and transforms?) - the truth is, the ecosystem of bindings includes many that aren't implemented natively against the ESH APIs (i.e. OH1 bindings). |
This is very much the idea of the system channels. Note that bindings CAN define such "enum" channels and provide a fixed set of possible values with them. Different bindings might define it slightly differently though, so there is no "enforcement by the framework". Once we come to terms that bindings actually define very similar channels and that it should be always the same, a system channel should be established and the bindings should be updated to adopt it. See e.g. the "signal-strength" system channel for such an enum type.
I would actually try to avoid having to define/reference/list other "chosen ontologies" as well; it can easily get messy to understand, which binding specifically supports which ontology and what works and what doesn't. That's why I rather want to establish a single ESH ontology, which will then become the common ground for bindings on the one end and applications on the other.
Right, but it should be the Z-Wave binding which maps its own device class model to the ESH ontology - then we are on the right track for real interoperability.
As mentioned above, I think it can be a combination of tags on items and the definition of system channels. Note that there is also the "category" as a taxonomy available. Additionally, there will be the ontology itself, which inherently holds a lot of information and which must be considered when implementing features that work on the tags like queries etc. - these will not be as simple as "it is a hit if tag x is there" (as mentioned in #1093 (comment)).
This is a must and that's why the "final" information is on the items and not on the things. The bindings (through things) can bring the knowledge they have about the device, but the user can augment it for their use case (standard example is an outlet that is used to switch a light, so the tag should be about a light). So where do we go from here? So more concretely: Introduce new channel categories:
Introduce new system channels:
Wdyt? |
I think you're right - the channel categories should work. We're not really solving for the complex devices that are typically represented by multiple Items. Requiring a Group for a Thermostat always seemed a little strange to me. However, we're also not really making things worse in that area with these changes.
The channel categories differ a bit from the tag names we were using: i.e. "lighting" vs "light". It makes sense to me to just switch to the channel category name for those, rather than having a different convention for tags. Agree on LOCKED over UNSECURED. "acMode" sounds strange to me. What about just "heatingCoolingMode"? |
What do we tell users that are using older bindings that don't yet conform to the new system channels? Is there a concept of a transform that works on the item value, instead of just the label? |
Aren't the capabilities something like this in the following example?
|
Well, readonly is also defined by the item itself. |
I strongly disagree with this! There are several use cases where it makes sense to have several thermostats in the same room set to different temperatures. A radiator that just supports some other heating would be such a case: the radiator thermostat would have to be set to a lower temperature then. |
Surely there's a need to formalize tags on items; thanks to @kaikreuzer for pointing me to this discussion. I'll adapt to whatever comes out of it, but allow me to share some hindsight after several months working on this, and please bear with me if I go off-topic. This HABot project had an (equally if not more complex) problem: named-entity recognition, which includes both detection of entities in a sentence and their classification. In order to simplify the machine-learning training process in that area, I deliberately limited the categories to two: object ("what") and location ("where"). The training data includes more entity types (period, value, color etc.) that are provided to "skills" to use according to their needs and purpose, but they are not used in the context of item identification. What remains is mapping the detected and classified named entities either to items directly, or maybe to an eventual ESH ontology (which would also be a step for other systems which have already performed a classification into their own ontology). This is also currently left to the user, who must tag the items with all the possible values for these entity types; in other words, there is no mapping between the natural language entity and the tags at the moment. Inheritance of tags applied on groups by their group members is imho an important feature. Users are likely to have already built hierarchies with groups, maybe not in the sense outlined above, for covering their own needs (for instance, classifying places first by floor then by room), and group inheritance allows a tag to follow these hierarchies. I also disagree with this statement:
"location:first floor" and "location:bedroom" are not mutually exclusive, and it also helps with synonyms (see below). The screenshot below showcases a complete example. I've found the above "lax" approach (and the lack of any real taxonomy) to have both strengths and shortcomings - to be fair some of them are out of scope in the context of this issue, it's not an ESH problem but a problem for the NLP system. Anyways, the strengths were:
The weaknesses were:
I'm also glad to learn about the new metadata infrastructure since I also (mis-)used |
After a longer discussion with @kaikreuzer and @triller-telekom we came up with the following suggestion documented in the wiki: https://github.com/eclipse/smarthome/wiki/Semantic-Tag-Library |
@afuechsel Worth noting that @ghys' valuable input hasn't yet been considered in what you have posted to the wiki - this should definitely be taken into account as well (many thanks @ghys, I only had a glance at it so far, but it seems to be closely in line with the proposal - I'll go through it in depth soon and provide feedback). |
@ghys Some feedback from my end:
I think this would pretty much work with what we suggest here - the "object" and "property" tags could probably simply be checked for the "what" part, while the "capability" tags could be used for a more fine-grained NER in future (often, you might want to refer to a specific function of your "object" and that's where those tags imho can be helpful).
I agree. My main point here was that on a single group item, you probably won't have both those tags (only one and the other one being inherited). Clearly, you can have many tags of the same type that are inherited - also note that any hierarchical tags should automatically include their parents (i.e. when tagging an item as "Room" it will automatically also be "Indoor").
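As a small illustration of "hierarchical tags automatically include their parents", the following sketch expands a tag into its ancestor chain. The `PARENT` table is purely illustrative (only the Room/Indoor link comes from the sentence above), and `with_ancestors` is a hypothetical helper that assumes the hierarchy is acyclic.

```python
from typing import Dict, List

# Illustrative child -> parent links; the real tag library may differ.
PARENT: Dict[str, str] = {
    "Bedroom": "Room",
    "Room": "Indoor",
    "Garden": "Outdoor",
}


def with_ancestors(tag: str) -> List[str]:
    """Return the tag followed by all of its ancestors, nearest first."""
    chain = [tag]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain
```

An item tagged "Room" would then also answer to "Indoor" without the user tagging it twice.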
I'd actually think that we need both. We should have localisations of the agreed tag library (and those should already include all kinds of derivatives (singular/plural/synonyms/etc.)), but will also need to consider the item labels for the recognition (which might be "Amy's room", while the tag is "bedroom"). I would hope that in most cases it is good enough to resort to the label and to have only one. If not, we can also add the possibility for further aliases through metadata.
Yes, this can indeed be used by HABot for UI related information. |
yes, I see now how they can be useful even outside the NER, for instance acting on different items depending on the intent: the thermostat scenario is a good example. If you consider these:
while in both cases the recognized entities are There are a number of ways to solve this - ideally the current temperature is a read only item and only the setpoint accepts an INCREASE command, so there is no ambiguity, but that's not always the case. |
@ghys Yes, these were exactly the use cases we came across when discussing the "capability" tags. I also first thought that this information could be derived from the item type, its read-only definition or its associated system channel type - but in the end, the clients would have to take many different sources into account, which can easily end up in a lot of complexity. Using special tags for this seems to be a better solution. So as a summary, I think we have the same view on the stuff, so that we can go ahead introducing the tag library? |
Definitely. If this results in items being automatically tagged with predictable tags when adding things and creating links, saving the user from doing it manually, it would be a huge improvement. Just to be clear though: the tagging isn't going to be limited to the tag library and the users would still be able to add their own? I'm concerned the library will never be exhaustive and meet everybody's needs (robot vacuum cleaners/lawnmowers? network/internet speed monitoring? etc.) |
@afuechsel & @ghys: Having contemplated a bit more over this, I'd like to suggest a small change in order to also neatly integrate the requirement of #582 into the tagging concept: We should imho differentiate between a) saying that an item represents a room and b) the item is located in a room. For the "item represents a" information, I would suggest to always use the "object:" tag. According to #582, we would thus have |
@ghys Definitely, the user is free to add new tags. I am unsure about the default tagging mechanism - if a ChannelDefinition defines some default tags, these appear automatically on the item - but the user should be able to delete/override them. @kaikreuzer I think this makes sense, I am only unsure about How will we implement the inheritance? |
@afuechsel nice to know - users often have edge use cases that wouldn't (and shouldn't) be covered by the tag library, so it's good they'll still be able to expand it locally. @kaikreuzer which types of items would you consider "represent a room"? Groups? If so (and while I agree with the motive and the general principle) I'm not thrilled by this to be honest. I think Group items are more considered by users as containers for items with common characteristics (i.e. "all these items are lights", "all these items carry temperature measurements", "all these items are in the kitchen") rather than having any semantic value on their own. That's why I like the simple, easily understood principle of group inheritance for tags (tagging a group equates to implicitly tagging all direct & indirect members of the group with that same tag) because it feels imho natural and predictable. It's an elegant way of giving semantic meaning to several items at once. If you also want to give meaning to the groups themselves (and e.g. tag a When you say:
how would that work exactly? |
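For reference, the simple group inheritance principle discussed in this comment (a tag on a group applies to all direct and indirect members) can be sketched in a few lines. The item names and the `OWN_TAGS`/`GROUPS` tables are made up for illustration; this is not how the ItemRegistry stores anything.

```python
from typing import Dict, Set

# Hypothetical model: item name -> own tags, and item name -> parent groups.
OWN_TAGS: Dict[str, Set[str]] = {
    "FirstFloor": {"location:first floor"},
    "Bedroom": {"location:bedroom"},
    "Bedroom_Light": {"object:light"},
}
GROUPS: Dict[str, Set[str]] = {
    "Bedroom": {"FirstFloor"},
    "Bedroom_Light": {"Bedroom"},
}


def effective_tags(item: str) -> Set[str]:
    """Own tags plus the tags of every direct and indirect parent group."""
    tags: Set[str] = set()
    seen: Set[str] = set()
    stack = [item]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guards against cyclic group membership
        seen.add(current)
        tags |= OWN_TAGS.get(current, set())
        stack.extend(GROUPS.get(current, set()))
    return tags
```

Note that this flat union cannot tell "represents a bedroom" apart from "is located in a bedroom", which is the objection raised against it in this thread.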
A solution might be to for instance reintroduce the |
Do not try to force users to use the functionality in a specific way. This just limits future use cases and if someone really wants to do something, the workaround will be even worse. Just try to provide a functionality and leave it up to the user if they abuse it or use it like it was intended. |
Sorry, guys, to fully disagree here. As mentioned above: We should have localisations of the agreed tag library (and those should already include all kinds of derivatives (singular/plural/synonyms/etc.)), but will also need to consider the item labels for the recognition (which might be "Amy's room", while the tag is "bedroom"). I would hope that in most cases it is good enough to resort to the label and to have only one. If not, we can also add the possibility for further aliases through metadata. What we are trying to establish here is an ontology, i.e. a vocabulary, where every tag is a word that has a clearly defined semantic meaning (we will actually have to thoroughly document each of them). This vocabulary should enable the system to automatically determine the semantics/context. Allowing users to add tags does not fulfil this requirement. I assume @afuechsel rather refers to "solutions are free to define additional tags", which is fine if they have an extended vocabulary which they want to use in a special way (meaning within the code). I think you are mixing up tags and labels (+aliases) in some way. Defining that something is a "bathroom" (tag) is something else than saying that you want to be able to refer to it as "bath", "restroom", "morning place" or whatever.
This question is independent of whether we put "location:" or "object:" in front of it. So far, you listed garden yourself under Outdoor->Garden, not under Indoor->Room->Garden.
So you live in a 4-room-flat? Kitchen, living room, bedroom, garden? No, honestly, in any language that I know, you would never refer to your garden as a room; that will make people laugh. What you probably rather mean is "place" - and yes, "place" should be the root node for the place-hierarchy, meaning: any room is a place, a garden is a place, a home is a place, etc.
Yes, this is the status quo and what people are doing - just look at the demo items or home builder - the groups clearly represent a room (and not just a group) as their label is "Bedroom" etc. I don't see any reason why this should not be done this way anymore; do you have any argument against it? I rather feel that it is fully in line with what we are trying to do here: Add semantics to the items (independent of whether they are group items or not).
Right, that's why I said "For the inheritance, we could thus use different tags (which would allow us to cleanly differentiate between both cases), i.e. a location:kitchen tag would mean "is located at" and not "represents a"". Note that those two pieces of information are pretty different - we should be able to tell this difference apart. Doing a "dumb" inheritance, we imho would be quickly in trouble.
This depends on how we define and realise the querying mechanism.
No need, because as mentioned above, a "room" is by definition a place and an "object:" that is some place cannot be anything else at the very same time. You should really translate "object:" as "represents a". So an |
Those aliases would be provided by the framework, or have to be implemented by clients for their own use? (I ask because I was contemplating the latter for HABot using item metadata)
No actually, that's probably just me being narrow-minded ;) I always saw those groups as "objects that are in the bedroom" rather than "the bedroom" itself. Now treating the rooms as objects for beacons etc. makes sense. As for inheritance, maybe I'm too fond of this "dumb" inheritance system and feel overly strongly about it, because I found it so easy to work with. I don't mind it being replaced by (I hope!) a better querying mechanism. |
I am not fully clear yet as where to put them - we will probably require those for Alexa, Homekit, Google Assistant likewise, so having them in "add-on specific" metadata means the users have to replicate them a lot. We could add "alias" namespaced metadata, but this feels to me as if we start misusing metadata right away... @SJKA, @maggu2810, @htreu Any good ideas from your end how to deal with that? |
This introduces BREAKING CHANGES!

Following the discussion in eclipse-archived/smarthome#1093 the mechanism matching the entities extracted by OpenNLP from the natural language query to ESH items is being altered in this way:

1. Tags are not the primary conduit for item identification: this change introduces the concept of "Named Attributes" which will be implicitly affixed to items by a new class/OSGi component, the `ItemNamedAttributesResolver`

2. Tags are now expected to conform to a semantic tag library: the current version is at https://github.com/eclipse/smarthome/wiki/Semantic-Tag-Library
HABot has internal translations for the most useful semantic tags in the languages it understands and will derive named attributes for items from those `tagattributes_{locale}.properties` resource bundles

3. In addition to tags, users may specify additional "monikers" for items by using metadata in the "habot" namespace:
```
Group FF_ChildsRoom { habot="Amy's room" [type="location"] }
```
Those monikers will also be added to the item's named attributes set. The "type" configuration property is optional: if left unspecified, monikers will have the "object" type.

4. Inheritance is still assumed for applied tags and monikers specified in metadata for Group items, EXCEPT if the inheritTags configuration property in the "habot" metadata namespace prevents it (for tags only, metadata monikers are always inherited), like so:
```
Group Kitchen ["object:room"] { habot="Cuisine" [ inheritTags=false ] }
```

5. "habot:" prefixed tags will move gradually to the "habot" item metadata namespace.

Signed-off-by: Yannick Schaus <[email protected]>
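The rules in points 3 and 4 of the commit message above can be sketched as follows. This is not the actual `ItemNamedAttributesResolver`; the `Item` class, its field names and `named_attributes` are hypothetical, tag translation via resource bundles is left out, and I am assuming that `inheritTags=false` only suppresses the tags of the group that sets it.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple

Attribute = Tuple[str, str]  # (type, value), e.g. ("location", "Amy's room")


@dataclass
class Item:
    name: str
    tags: List[str] = field(default_factory=list)    # e.g. "object:room"
    moniker: Optional[str] = None                    # habot="..." metadata value
    moniker_type: str = "object"                     # default if "type" is unset
    inherit_tags: bool = True                        # inheritTags in "habot" metadata
    groups: List[str] = field(default_factory=list)  # names of parent groups


def _own(item: Item, include_tags: bool) -> Set[Attribute]:
    """Attributes contributed by one item: its tags (optionally) and moniker."""
    attrs: Set[Attribute] = set()
    if include_tags:
        for tag in item.tags:
            kind, sep, value = tag.partition(":")
            if sep:  # tags without a "type:" prefix are skipped in this sketch
                attrs.add((kind, value))
    if item.moniker is not None:
        attrs.add((item.moniker_type, item.moniker))  # monikers always count
    return attrs


def named_attributes(name: str, items: Dict[str, Item]) -> Set[Attribute]:
    """Own attributes plus those inherited from direct and indirect groups."""
    item = items[name]
    attrs = _own(item, include_tags=True)
    pending = list(item.groups)
    seen: Set[str] = set()
    while pending:
        group = items[pending.pop()]
        if group.name in seen:
            continue
        seen.add(group.name)
        attrs |= _own(group, include_tags=group.inherit_tags)
        pending.extend(group.groups)
    return attrs
```

With the `Kitchen` example from point 4, a light inside the group would pick up the "Cuisine" moniker but not the `object:room` tag, while the group itself keeps both.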
Hi, I am completely new to openhab, and I am not sure, if I am of any help, but maybe the links can help: |
Thanks for this input, @fab6! I was aware of project haystack, but didn't hear about brick schema so far. Having had a first glance, this is pretty close to our discussed concepts (object<->equipment, capability<->point, place<->location), so it could be an option to adopt. Seeing that they all collaborate with BACnet now sounds great - and having the goal to become an ISO standard sounds even better. I cannot yet find any details about the ASHRAE Standard 223P, I guess this is currently worked out behind closed doors? Do you have any clue about a timeline and how much it is going to be different to the current schema from brick and the tags from haystack? |
I would like to propose an update of the tag library and introduce 2 new capabilities for the 2 system channels "signal-strength" and "battery-level". As we have defined a tag for the "lowBattery" system channel we should also have the tags: "capability:signalStrength" and "capability:batteryLevel". WDYT? |
@kaikreuzer, earlier in the post you mentioned adding two new channel types and three new system channel types. Do you mind elaborating why they didn't make it in? I read through a lot of the sub topics and still couldn't figure out why they were excluded, or were they an accidental casualty of this massive multi-topic thread? |
@kdub454 The discussion about the new system channel is here: #3756 (comment) We added all those which occurred in 5 or more bindings. So which ones are you missing in particular? And in how many bindings are they used? |
I did some further research and found @fab6's advice to look at the brick schema pretty interesting. They came up with pretty similar entity types and meanings to ours. The part that we are lacking is how to properly express how entities refer to each other (see the discussion about tag inheritance above). Brick defines a nice small set of references this way: Just translate our "Object" to "Equipment" and our "Capability" to "Point". I'd definitely want to keep our tag library and also allow those to be used as tags on items, just as discussed. But I see that it is pretty difficult for clients to manually extract all the relations from e.g. group containments, item types, categories and tags - this is very likely far too complicated for clients to implement. So my idea is to use the metadata facility (e.g. with the key "semantics") to have the framework gather all relevant information and provide it (optionally) in an ontological style. This would also allow us to have a place for synonyms (which we discussed for Alexa, Google Home, HABot, etc. already /cc @ghys). Let me briefly use the item dsl syntax for metadata, to give you a short glimpse of what I mean: We could have the group item
which expresses
which could be provided as metadata as
The "main" part of the metadata corresponds to the type. If no namespace is provided, "esh" is the default and if the tag is uniquely identified, it does not need to be fully qualified (with type and hierarchy in underscores). Every further reference can be added as a parameter to the metadata as it is done for "isPartOf". If we have some way for users to specify synonyms, they would then appear here as
Another example:
expresses
or as metadata as
We can consider the ESH Things to be equipment entities, so we can use the isPointOf reference as an information about the linked Thing - it isn't necessary to create a group item for Things for expressing this relation (for which we currently do not have any solution). I am working on a prototype of such a semantics metadata provider. It will probably be a bit tricky to get all information together and especially to keep track of changes, but I assume that it is much better done in the framework than in every client. Will provide an update soon! |
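One detail from the proposal above, the default "esh" namespace and short tag names that are accepted when they are unique, could be resolved along these lines. Everything here is an assumption for illustration: the `LIBRARY` entries, the exact underscore format and the `resolve` helper are not taken from the prototype.

```python
from typing import List

# Hypothetical fully qualified tags: type and hierarchy joined by underscores.
LIBRARY: List[str] = [
    "Location_Indoor_Room",
    "Location_Indoor_Room_Bedroom",
    "Location_Outdoor_Garden",
    "Equipment_Lightbulb",
]


def resolve(tag: str, default_namespace: str = "esh") -> str:
    """Expand a possibly short tag to 'namespace:Fully_Qualified_Name'."""
    namespace, sep, name = tag.partition(":")
    if not sep:
        # No namespace given, so fall back to the default one.
        namespace, name = default_namespace, tag
    matches = [full for full in LIBRARY
               if full == name or full.rsplit("_", 1)[-1] == name]
    if len(matches) != 1:
        raise ValueError(f"tag {tag!r} matches {len(matches)} library entries")
    return f"{namespace}:{matches[0]}"
```

A short name that occurs under two different parents would be rejected as ambiguous and would then have to be written out in full.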
ESH has a basic tagging implementation, which allows adding tags (as simple strings) to items.
The idea behind this is to assign semantics to items. So while the "category" refers to a taxonomy (e.g. this is a "temperature sensor"), the tags are supposed to refer to an ontology (this is the "current outside temperature").
So far, a few temporary tags were introduced, like "home-group" for marking group items as a kind of room, as well as "thing" to mark it as a group of items of a physical device.
Furthermore, the new openHAB integrations for HomeKit and Amazon Echo make use of tags like "homekit:DimmableLightbulb".
With more use cases coming up, I think we need to formalise the use of tags and thus we should come up with one (or many?) ontologies to use (and hence have a fixed tag library).
Since there are many efforts for ontologies in the industry, I am not fond of the idea to define one from scratch for ESH. Nonetheless, it also does not seem as if there is anything matching our needs to 100%.
Since ontologies can reference other ontologies, my suggested approach is trying to define an ESH ontology which refers as much as possible to the work of others.
A good starting point is the SAREF ontology. Its "saref:Device" and "saref:Service" can be replacements for "thing", while "saref:BuildingSpace" could replace "home-group".
SAREF itself already refers to other ontologies where it makes sense, such as
https://www.w3.org/TR/owl-guide/
https://www.w3.org/TR/owl-time/
https://www.w3.org/2003/01/geo/
http://www.wurvoc.org/vocabularies/om-1.6/Unit_of_measure
Note that hierarchies and references are an important part of ontologies. So we imho need to enhance the querying possibilities on the ItemRegistry to respect this. It should e.g. be possible to query for "saref:BuildingObject" and get all items that have tags like "saref:Window", "saref:Door", "saref:Device", etc.
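A hierarchy-aware query of the kind described above might look like the sketch below. The subclass links simply follow the example in the text and have not been checked against the SAREF specification; `ITEMS`, `is_a` and `query` are hypothetical and do not reflect the ItemRegistry API.

```python
from typing import Dict, List, Set

# Child -> parent links, following the example in the text
# (not verified against SAREF itself).
SUBCLASS_OF: Dict[str, str] = {
    "saref:Window": "saref:BuildingObject",
    "saref:Door": "saref:BuildingObject",
    "saref:Device": "saref:BuildingObject",
}

# Hypothetical registry content: item name -> tags.
ITEMS: Dict[str, Set[str]] = {
    "Kitchen_Window": {"saref:Window"},
    "Front_Door": {"saref:Door"},
    "Hallway": {"saref:BuildingSpace"},
}


def is_a(tag: str, wanted: str) -> bool:
    """True if tag equals wanted or has wanted among its ancestors."""
    current = tag
    while current != wanted:
        if current not in SUBCLASS_OF:
            return False
        current = SUBCLASS_OF[current]
    return True


def query(wanted: str) -> List[str]:
    """Names of all items carrying a tag that is wanted or a subtype of it."""
    return sorted(name for name, tags in ITEMS.items()
                  if any(is_a(tag, wanted) for tag in tags))
```

A plain "is tag x present" check would return nothing for `saref:BuildingObject` here, since no item carries that tag literally.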
One of my initial ideas was also to support multiple different ontologies on ESH and provide a possibility to automatically map between them, where feasible. I fear that this might get pretty complicated though and hence I'd suggest to go for just "the one ESH ontology" for now.