Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposed Interface Specification for data output #827

Merged
merged 7 commits into from
Mar 27, 2019

Conversation

vestom
Copy link
Contributor

@vestom vestom commented Oct 11, 2018

Purpose is to remove some ambiguity and achieve some commonality across different sensors to improve machine readability and ease integration with third party software.

This is sure to spark some discussion :-)

Purpose is to remove some ambiguity and achieve some commonality across different sensors to improve machine readability and ease integration with third party software.
@roger-
Copy link
Contributor

roger- commented Oct 12, 2018

+1!

I suggest having a device_type field to identify the type of sensor (e.g. weather, temperature, motion) to make the data more machine friendly.

Maybe also a payload field to include full the undecoded payload data, in the event that someone wants to subsequently decode a field that isn't supported by rtl_433.

@vestom
Copy link
Contributor Author

vestom commented Oct 12, 2018

I do not think it is really feasible to add a device_type field. Many of the device are some combination of sensors. E.g thermometer/rainfall, thermometer/hygrometer/barometer, thermometer/anemometer/windspeed. Basically we would have to add a device type for each sensor value returned by the device. And in that case you can just as well infer that a device that returns a temperature_C contains a thermometer, humidity contains a hygrometer etc.

@roger-
Copy link
Contributor

roger- commented Oct 12, 2018

I see your point, but you could classify the sensors you mentioned as type “weather”.

But what you say is true, so it may not be necessary.

@merbanan
Copy link
Owner

@vestom we could add a bit field with what sensors a device has. Not sure it would be worth it though. It's obvious from the parameters types. Anyway Temperature_F can be present also if the native representation is that.

@zuckschwerdt
Copy link
Collaborator

I wanted to present TPMS as a good case for type but that could easily be a model prefix, e.g. TPMS-PM107J, TPMS-Ford, … Not sure anymore what good be gained from a type output now.

@merbanan
Copy link
Owner

Maybe we should give up on typing the devices and instead just provide unit measures?

@merbanan
Copy link
Owner

But then again it would help users identify/classify things which is nice and maybe really user friendly.

@vestom
Copy link
Contributor Author

vestom commented Oct 17, 2018

IF we want to add a "type", I think it would be best to add it in a separate field and keep it orthogonal to the "model" field. The primary purpose of the "model" field is to identify/recognize the device, whereas a "type" field would supposedly be some form of "class" or "template" telling something about the data fields you can expect from the sensor and maybe how to interpret them. I have no strong feelings about it, other than I will recommend to define clearly what the different types signify and what you can expect from a given type :-)

Regarding Temperature_F, the big decision is, whether we still want to allow all kinds of units, or we want to standardize to a limited set? If we do not want to standardize, I will change the proposal to reflect the status quo.

However. The main purpose of the proposal is to define some rules for the "model" field, to make this mandatory field a bit more uniform. Hopefully we can agree upon something about this starting point and then later evolve the spec to include possible new changes or additions?

@zuckschwerdt
Copy link
Collaborator

I'd say we need good examples of "types" to narrow the idea.
The prevalent type is Weather (Outdoors?) or Environment(Indoors?) readings. Mainly Temp./Hum. but also anemometers and the likes. Some sensors even have different messages for the readings.
I'd say that's a default case and there is no obvious gain in adding some kind of type (the output is too varied but easily identified).
Then there is maybe TPMS, a distinct class/type where a tag would be neat.
Other than that I only see the security/alarm/doorbell class which also is rather varied and hard to pin down with a "type".

@roger-
Copy link
Contributor

roger- commented Oct 18, 2018

I think ultimately it would be nice if the output had enough information to be automatically consumable by things like Home Assistant.

I’ve noticed that there are certain temperature sensors that aren’t weather sensors (e.g. freezer temperature), so looking at the presence of a temperature field wouldn’t be enough to determine that.

Perhaps there should be a “subtype” field to put more specific information? So type would be more of a category.

Also should there be a separate “manufacturer” and “model” field? This might make it easier to represent decoders that work for multiple models.

@zuckschwerdt
Copy link
Collaborator

While some sensors have a clear purpose, the placement and usage is not easily discoverable. (Some of my Temp./Hum. sensors are outdoor, indoor or even in a fridge)

The protocol is not enough to determine the sensor manufacturer or usage also. There are too many clones with sometimes different purpose, sadly.
So "manufacturer" ("brand") will go away where still used, best we can do is guess the kind of transmitter/protocol.

@merbanan
Copy link
Owner

@vestom I'd like it to default to si-units. But decoders should output the native values.

Prefer native sensor data units
@rct
Copy link
Contributor

rct commented Nov 3, 2018

Regarding converting units - The problem has been rounding errors that creep in when doing multiple conversions. This is due to pretty printing going on even for what's intended to be machine readable output like JSON. At some appropriate point, I think we should stop rounding some of these values to 1 or 2 decimal places.

DATA_FORMAT.md Outdated
## Common Device Data
Various data fields, which are common across devices of different types

* **battery** (string) (Optional)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a case where if we are writing a specification we should attempt to set a standard we want everything to follow.

I propose

  • battery (with no units) should be considered deprecated.
  • battery_low or battery_ok should be added as a boolean (or an int is essentially a boolean).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a hot topic. The idea is clear, "low battery" is an "event" thus "1" anything else is default and disregarded, hence "0".
The opposite reasoning is:

  • devices that report a voltage or known steps should use battery_mV
  • other devices should report a level between 0 (empty) and 1 (full)
  • if only a flag is available this degrades to quasi-bool 0/1

@zuckschwerdt
Copy link
Collaborator

As far as I see pretty printing is only for KV and CSV. Specifically JSON disregards DATA_FORMAT.
Reasoning here is KV and CSV are primarily for human consumption, JSON (and other networked ones) are primarily for machines.

I'd say we don't need a default conversion. Users (and packages) can preset options using config files soon.

* Secondary device identification. For devices with more than one identification value
(e.g. both an internal value and a switch).

* **mic** (string) (Optional)
Copy link
Contributor

@rct rct Nov 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In terms of a data architecture, MIC as currently defined is redundant data.

  • It will never change between messages on the same device. So there is no point in sending and storing it on every message.
  • It isn't an attribute of the message., It's an attribute of the device/the decoder.

The use case of disabling devices that don't meet the specified level seems like a runtime option for rtl_433.

Alternatively rtl_433 should output (on start up) a machine readable message about what devices are enabled, and their attributes like MIC.

On a related note, rtl_433 should output some version information at startup in case consumers need to make some decisions based on protocol/messaging changes.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is that the consumer can add more trust to a message if mic is present. I get your point but in a blind scanning scenario when you pickup unknown transmitters it is much more user friendly to directly see that the protocol has an integrity check.

These fields are the primary data fields containing the most basic message data and used to identify the specific device.
For some devices these are the *only* fields contained in the message, as the message itself consitutes an event from this
particular device model.

Copy link
Contributor

@rct rct Nov 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Proposal: add an rtl_433 message type indicator to allow other message types to be added and easily differentiated from decoded device data.

0 - decoded device data
1 - rtl_433 periodic statistics
2 - rtl_433 invocation information (rtl_433 Version, start time, frequency, sdr identification (id/serial), ...
3 - rtl_433 device/protocol info.

@zuckschwerdt
Copy link
Collaborator

Two more points to discuss: Hideki and Acurite (at least) have 16 wind direction values, i.e. increments of 22.5 degrees. Should we recommend to round these or just switch the specification to double?

The current situation with rain fall measurements is:

acurite.c:            "rain"  ("%.1f mm")
alecto.c:             "rain_total" ("%.02f mm")
fineoffset.c:         "rainfall_mm" ("%.1f mm")
fineoffset.c:         "rain" ("%.01f mm")
fineoffset_wh1050.c:  "rain" ("%.01f")
fineoffset_wh1080.c:  "rain" ("%3.1f")
hideki.c:             "rain" ("%.01f mm")
inovalley-kw9015b.c:  "rain" (raw int?)
lacrossews.c:         "rainfall_mm" ("%3.2f mm")
oregon_scientific.c:  "rain_rate" ("%.02f mm/hr")
oregon_scientific.c:  "rain_rate" ("%3.1f in/hr")
oregon_scientific.c:  "rain_total" ("%3.1f in")
oregon_scientific.c:  "rain_rate" ("%3.1f in/hr")
oregon_scientific.c:  "rain_total" ("%3.1f in")
oregon_scientific.c:  "total_rain" ("%.02f mm")

Perhaps a recommendation of "rain_mm" / "rain_inch" would be better (my thinking is that "fall" doesn't add any meaning -- but I'm not a meteorologist or even native speaker, please correct me on this.)
And then "rain_rate" perhaps needs a unit. Would that be _inph, _inhr, _inh? Analogous km/h would be shortend to kmh, but kph is also well established. I can't really say if mm/h would be mmh or mmph then?

Adjust "rainfall" to "rain". Add imperial unit examples. Add "rain_rate".
Simplify "battery" status indication. Correct F for Fahrenheit.
Use mps for meters per second
DATA_FORMAT.md Outdated

* **model** (string) (Required)
* Device model. Human readable string concisely describing the device by manufacturer name
and manufacturers model designation according to the following syntax: `"<Manufacturer>, <Model>"`.
Copy link
Collaborator

@zuckschwerdt zuckschwerdt Dec 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you change this to read "<Manufacturer>-<Model>"?

Copy link
Contributor Author

@vestom vestom Dec 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding units, I suggest changing from using 'p' as division ('per') to underscore '_':
Examples:
m_s (meters per second)
mi_h (miles per hour)
in_h (inches per hour)
The reason is to avoid ambiguity like the common "mph" (miles or meters?) and "kmh" (kilometer*hour or kilometer/hour?)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How many keys would that concern? At a glance I only see acurite, and hideki, 3 keys total. Worth the change.

Change model delimiter from comma to dash
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
wip Work In Progress (for PRs only)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants