Introduce ports to handle tasks like device scan in a generic way #29
Comments
That sounds like a very good idea. I could imagine the following different ports: Downstream (device side)
Upstream (host/cloud side)
As a long-term goal I would also like to port this firmware to Zephyr, as the ESP32 support in Zephyr has become much better recently (which was not the case when we started writing this firmware). I'm wondering if Zephyr would provide some better features than ESP-IDF in order to generalize these ports (e.g. as modules). An approach with Zephyr modules would potentially allow implementing an upstream port directly in a device if it has IoT connectivity on board, reusing the same code. |
I like the idea, let's clarify some aspects for me so I could try to implement something. Should it be possible to add a port afterwards, without reflashing? For now we decided that changing pins etc is not possible via the webconfig, this would require some more work to change.
Just out of curiosity, what other device do you have in mind, this is very much tailored for the esp32? |
The concept behind it has evolved a little since I wrote that. It is going in the direction of a ThingSet device mesh or tree. The idea is to replace the active scan operation with passive monitoring. Every ThingSet device shall therefore send a periodic heartbeat statement. This way you do not have to issue a scan but can detect the devices by their heartbeat statements. This also works for devices that are attached to a CAN / RS485 / RS232 / ... bus. To make devices identifiable, each device has its own unique device id. The object path definition is extended by the device id. In a mesh or tree topology you have to route messages from one port to another. This should be done without any extra copying of data. Zephyr provides network buffers for that. In the concept, struct ts_mesh_buf buffers are in fact Zephyr network buffers. Ports shall work on these buffers. The port structure definition in the current concept:
No. I can imagine that a physical uC port (e.g. serial) may become several predefined ports, e.g. RS 485, RS 232, I2C. These ports may be activated (open()) / deactivated (close()).
It is a general concept. See ThingSet/thingset-device-library#13 |
I can totally see how that makes sense if the ESP is used simply as a gateway, but if we want to display and configure connected devices we need to "know" which devices are connected. Otherwise you constantly have to match ALL incoming heartbeats against the list of known devices, or do you know a better way? How often is this heartbeat sent? Maybe the overhead is rather low...? |
Probably a heartbeat once every second would be enough. The updated CAN interface already has a similar method to detect a new device. It listens to received publication messages, and if it receives one from a device that is not in its list, it adds it to the list. A similar thing could be done on the serial interface. And if no message is received for e.g. 5 seconds, the ESP could assume that the device is disconnected and remove it from the list. |
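The add-on-first-message / remove-after-silence behaviour described above could be sketched roughly as follows. This is only an illustration and not the firmware's actual code: `device_table`, `device_table_seen()`, `device_table_expire()`, `device_table_count()` and the table size are hypothetical names and values; only the 5 second timeout is taken from the comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DEVICES 8          /* hypothetical table size */
#define DEVICE_TIMEOUT_MS 5000 /* "e.g. 5 seconds" of silence */

struct device_entry {
    uint64_t id;           /* unique device id */
    uint32_t last_seen_ms; /* timestamp of the last received message */
    bool in_use;
};

struct device_table {
    struct device_entry entries[MAX_DEVICES];
};

/* Record a message from device `id` received at `now_ms`.
 * Returns true if the device was newly added to the table,
 * false if it was already known or the table is full. */
bool device_table_seen(struct device_table *t, uint64_t id, uint32_t now_ms)
{
    assert(t != NULL);
    struct device_entry *free_slot = NULL;

    for (size_t i = 0; i < MAX_DEVICES; i++) {
        struct device_entry *e = &t->entries[i];
        if (e->in_use && e->id == id) {
            e->last_seen_ms = now_ms; /* known device: refresh timestamp */
            return false;
        }
        if (!e->in_use && free_slot == NULL) {
            free_slot = e; /* remember the first free slot */
        }
    }
    if (free_slot == NULL) {
        return false; /* table full: the new device is ignored */
    }
    free_slot->id = id;
    free_slot->last_seen_ms = now_ms;
    free_slot->in_use = true;
    return true;
}

/* Remove all devices that have been silent for longer than the timeout.
 * Returns the number of removed devices. */
size_t device_table_expire(struct device_table *t, uint32_t now_ms)
{
    assert(t != NULL);
    size_t removed = 0;

    for (size_t i = 0; i < MAX_DEVICES; i++) {
        struct device_entry *e = &t->entries[i];
        /* unsigned subtraction stays correct across timer wrap-around */
        if (e->in_use && (uint32_t)(now_ms - e->last_seen_ms) > DEVICE_TIMEOUT_MS) {
            e->in_use = false;
            removed++;
        }
    }
    return removed;
}

/* Number of devices currently considered connected. */
size_t device_table_count(const struct device_table *t)
{
    assert(t != NULL);
    size_t count = 0;

    for (size_t i = 0; i < MAX_DEVICES; i++) {
        if (t->entries[i].in_use) {
            count++;
        }
    }
    return count;
}
```

The receive path of each interface would call `device_table_seen()` for every incoming message, and a periodic task would call `device_table_expire()`. The lookup is a linear scan, which should be acceptable for the small device counts discussed here.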
The lowest throughput I would guess is that of LoRaWAN with ~12 bytes/second. My rule of thumb would be 1% of throughput for the heartbeat. In the current concept a full-fledged heartbeat message is about 20 bytes -> one heartbeat every three minutes. Anyway, the period is configurable and there may be intelligent methods to shrink the size of a single heartbeat statement, especially on low throughput links. The period information is part of the heartbeat message. The receiving device can adjust its timeout for loss of a device to this given period. In the concept the heartbeat is only sent to direct neighbours. If the neighbour has several ports, it converts the heartbeat statement to a neighbour announce statement and passes it on to the other ports. I have to rethink that. Maybe it is better to have a port throughput specific neighbour announce period, so that the neighbour announce period is automatically adjusted to low throughput ports while higher heartbeat rates remain possible on high throughput ports. Thank you for the question.
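As a quick check of the rule of thumb above, the heartbeat period can be derived from link throughput, message size and the allowed share of throughput. The helper below is a hypothetical sketch (the name `heartbeat_period_s()` and its parameters are not part of the concept); it rounds up to whole seconds so that the budget is never exceeded.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest heartbeat period in whole seconds such that heartbeat traffic
 * uses at most `budget_percent` of the link throughput.
 *
 *   period = heartbeat_bytes / (throughput * budget_percent / 100)
 *
 * computed in integer arithmetic and rounded up. */
uint32_t heartbeat_period_s(uint32_t throughput_bytes_per_s,
                            uint32_t heartbeat_bytes,
                            uint32_t budget_percent)
{
    assert(throughput_bytes_per_s > 0);
    assert(budget_percent > 0 && budget_percent <= 100);

    uint64_t num = (uint64_t)heartbeat_bytes * 100u;
    uint64_t den = (uint64_t)throughput_bytes_per_s * budget_percent;

    return (uint32_t)((num + den - 1) / den); /* ceiling division */
}
```

With the numbers from the comment, `heartbeat_period_s(12, 20, 1)` gives 167 seconds, i.e. a bit under three minutes, which matches the estimate of one heartbeat every three minutes.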
This is the way it works. There is a local device table that holds information about the known devices. |
@martinjaeger do you know a good way to discuss the ThingSet Mesh concept? I have a w.i.p. concept description and some very rudimentary code. EDIT: Please see the actual ThingSet Mesh concept. |
Ok, lots of stuff to understand... didn't know the B.A.T.M.A.N. network before and I don't yet fully understand what it does from a quick look at the docs. Anyway, some general questions/comments from my side already:
|
Regarding good place to discuss the mesh network: Generally GH issues are probably OK. Alternatively we could use a wiki page on GitHub? Or open a dedicated repo to dump some markdown files with ideas? |
It has the concept of throughput based routing. This is what I used as a starting point. There are a lot of other nice features but they are not really applicable to the low level mesh the concept is about. The concept is also for links that cannot or do not run Ethernet.
Sorry, I did not care about ISO/OSI layers. The primary focus is to have some man/machine issuing statements that are routed to the intended sink, which is not on the same machine and may be several hops away. The connections between these hops may be of different kinds. In my case it is SPI and some proprietary bus. The number of devices in the mesh is assumed to be low (<= 100).
The translation of a device id to a CAN ID is part of the CAN type mesh port and hidden behind the port API. If you want to make messages routable in a generic way you need a universal address scheme, which is the device id in this case. There are surely other addressing schemes, but this one looks like it can easily be translated to the physical buses that are used and to IoT cloud protocols.
I think this is a dual use case. Your application is using the heartbeat statement for some safety reaction. The primary focus in the mesh is to keep topology information up to date. You may well use the same statement for different purposes. The heartbeat statement period is configurable. If your devices are running on the same CAN bus this should not be a problem. If your devices are some hops away there is currently a restriction in the concept (the rate of neighbour announcements is limited to 1% of throughput to prevent congestion). So in this case you have to create your own high frequency safety heartbeat or the concept has to be altered.
LoRaWAN was just an example of a low throughput link. I personally do not use LoRaWAN. You may well attach LoRaWAN by a virtual port that acts like a gateway if this is the appropriate solution.
Heartbeat statements provide - besides heartbeat - throughput and update period to steer the routing in a mesh topology. If you have a static configuration this is only needed once. If your device jumps from one hop to another you may want to steer the messages to the correct hop it is currently attached to. This is mostly related to wireless connections. You can and should configure the update rate according to the topology (change) needs.
Sure, do you propose one?
Thank you, I have to adapt the sequence count roll over.
It was taken as a low throughput example. Most probably all LoRaWAN devices will be mesh end nodes without routing capability.
You are right. LoRaWAN devices are a bad example for mesh routing (see above). They were just taken as the low end devices of link bandwidth.
Would you mind creating a 'mesh' branch on the thingset-device-library? This way also source code could be added and finally be tested with different applications. GH discussion issues could be linked to PRs. |
First ideas are now in https://github.com/b0661/thingset-device-library/tree/pr_mesh/src/mesh |
Sorry for the late reply, I was busy with lots of other stuff. As you may have seen, I moved the device library repo to the ThingSet account on GitHub to make it more independent of Libre Solar. I've also created a In addition to that, I updated the website with the specification. It's now available under https://thingset.io. Now regarding the mesh part of the protocol: I'm wondering if this should be part of the existing library or if it should be kept as a separate extension:
I'm wondering if we could not use an MQTT broker. Every device communicates with the broker (independent of lower layer transport), so it's possible to exchange messages between all different devices. However, it's not decentralized and not local (w/o internet access) anymore. In general: In my understanding a mesh is something where devices in a network can directly communicate with each other, potentially via multiple paths. Is that really what you are envisioning? Should simple devices like sensors also interact directly with other sensors? Or are you thinking more of a star-of-stars topology like LoRaWAN? In that case only gateways would need to store the routing table, which would make more sense for IoT applications in my opinion. BTW: Do you know DDS? It seems to go in a similar direction. |
Just discovered this quite interesting project from Eclipse Foundation: Maybe the zenoh.net layer could be leveraged for the routing of ThingSet messages... but I didn't fully understand how their line protocol actually works. I can only find documentation of higher-level APIs for different programming languages. There is also a Zephyr library already: zenoh-pico. |
A lot of questions - some ideas:
It should be part of the existing library, but be activated by a configuration switch (Kconfig in the case of Zephyr). This way the protocol and the mesh protocol extension can stay in sync more easily.
See above. ThingSet Mesh is an extension that has to be activated. The ThingSet library shall be usable without it. A simple request/ response on a single link can always be done without mesh functionality.
ThingSet Mesh sits between the application (which knows nothing about the mesh topology) and the data link layer abstracted by the mesh ports. IMHO it is not an application layer protocol. The application has to provide source/destination information, but is this really the criterion for an application layer protocol?
That is the idea. I already switched to uint64_t device ids in the concept. Maybe this is over-engineered, but a switch to uint32_t or even shorter should be easy.
Yes
As you mentioned in one of the other comments, very low-bandwidth networks may be better connected by dedicated gateway ports instead of being a direct node in the mesh. Such a gateway can provide address translation.
IPv6, 6Lowpan, ... work on ethernet frames. ThingSet Mesh directly works on the specific data link protocol like CAN, RS232, ... as used by ThingSet. So one could state it works on ThingSet frames as defined for the specific data link. I would call it re-using concepts already available.
In my use case I want to route messages from/ to devices that are within multi hop distance without an internet connection. There may be several originators of requests at the same time. This can be done by a local MQTT broker on one of the devices which creates a star topology for messaging. This may reduce the routing table size for devices but may also create a lot more hops for messages to travel. It introduces the complexity of the local MQTT broker and some problems when the connection to the local MQTT broker is broken and you have to bring up a new broker for the newly created subnet.
Yes and no. I'm expecting most of the ThingSet Mesh topologies in use to be simple, with only one path and a very limited number of hops.
This question is more about application than about mesh functionality. From the mesh side sensor devices usually have only one mesh port. These one port devices do not need to implement the full mesh routing capabilities. The routing table may even be reduced to a single default router entry - I am currently looking for a possibility to detect this automatically. Such a one port device could use a stripped down library to save memory.
I'm thinking of three categories of nodes:
Router nodes hold a routing table. One port nodes may have a simplified routing table or none at all if they only issue statements (which are broadcasts anyway).
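To illustrate how one lookup function could serve both router nodes and one-port nodes, here is a minimal sketch in which a one-port node simply has an empty table and a default router entry. All names (`routing_table`, `route_lookup()`, `PORT_NONE`) are hypothetical and not taken from the ThingSet Mesh concept.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PORT_NONE 0xFF /* hypothetical marker: no route available */

struct route_entry {
    uint64_t device_id; /* destination device id */
    uint8_t port;       /* port through which the device is reachable */
};

struct routing_table {
    const struct route_entry *entries; /* may be NULL if count is 0 */
    size_t count;
    uint8_t default_port; /* default router port, or PORT_NONE */
};

/* Return the port to forward a message for `dest_id` to.
 * Falls back to the default router entry if no explicit route exists. */
uint8_t route_lookup(const struct routing_table *rt, uint64_t dest_id)
{
    assert(rt != NULL);

    for (size_t i = 0; i < rt->count; i++) {
        if (rt->entries[i].device_id == dest_id) {
            return rt->entries[i].port;
        }
    }
    return rt->default_port;
}
```

A router node would fill `entries` from received heartbeat and neighbour announce statements, while a one-port node could use `{ NULL, 0, 0 }` so that everything is sent out of its single port.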
I do not know DDS in detail. But as far as I can see it is about data distribution, not about how to link nodes. So it may be used on top of a mesh infrastructure but does not provide it.
The net layer still seems to expect that there is a network available. I could not find the source of the zenoh-router. It seems it can work on layer 2, like for example Batman Adv. So an Ethernet network is necessary. In contrast, the ThingSet Mesh builds the network using CAN, RS232, ... data links without Ethernet. The zenoh.net router principles could be interesting, but without source it looks like one of those semi-open industrial projects. I have been burned by such projects and do not want to use more than concepts from them. |
Thanks for the explanations. I think I understand much better what you'd like to do now. And I agree it makes sense to have a simpler layer for CAN, RS232 instead of Ethernet. For the library (maybe we should move discussion over there) I think we need those three modules (each can be switched on and off as you suggested):
The port abstractions should ideally not only include the lower level protocols, but also the "other worlds" like MQTT or at least provide interfaces for them. Regarding zenoh.net: Eclipse foundation sounded quite open to me, but it's somewhat strange that the protocol itself, which is the most important part from my perspective, is not documented, but you have to look it up in the code of the provided tools/libraries. Maybe I just didn't find it, so I raised an issue and asked about further documentation. |
Currently there are special functions to scan for devices on serial and CAN ports. In my case the devices are connected by SPI. So I would have to add another special function and do some code duplication.
Instead, introduce ports that have general functions to scan for devices that are connected to the ports, and also for other functions. The ports may be modeled (similar to devices) with a generic structure that can be filled with the special functions for a specific type of port. The structure should ideally be constant so that it does not consume precious RAM in constrained devices.
The concept of a port may be applied not only to downstream links but also to upstream links.
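A generic, constant port structure along these lines could look like the sketch below. It is only an illustration of the idea, not the structure used in the firmware or in the ThingSet Mesh concept: `struct port`, `struct port_api`, `port_scan()` and the dummy SPI functions are hypothetical, and the SPI scan just pretends that one device with id 0x42 answered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct port;

/* Function table filled in by each port type (serial, CAN, SPI, ...). */
struct port_api {
    int (*open)(const struct port *port);
    int (*close)(const struct port *port);
    /* Writes up to max_ids device ids, returns the number found. */
    size_t (*scan)(const struct port *port, uint64_t *ids, size_t max_ids);
};

/* Generic port description; const instances can live in flash. */
struct port {
    const char *name;
    const struct port_api *api;
};

/* Dummy SPI port type, standing in for real bus access. */
static int spi_open(const struct port *port) { (void)port; return 0; }
static int spi_close(const struct port *port) { (void)port; return 0; }

static size_t spi_scan(const struct port *port, uint64_t *ids, size_t max_ids)
{
    (void)port;
    if (max_ids == 0) {
        return 0;
    }
    ids[0] = 0x42; /* pretend exactly one device answered */
    return 1;
}

static const struct port_api spi_api = {
    .open = spi_open,
    .close = spi_close,
    .scan = spi_scan,
};

const struct port spi_port = { .name = "spi0", .api = &spi_api };

/* Port-independent scan: the caller never needs to know the port type. */
size_t port_scan(const struct port *port, uint64_t *ids, size_t max_ids)
{
    assert(port != NULL && port->api != NULL);

    if (port->api->open(port) != 0) {
        return 0; /* port could not be activated */
    }
    size_t found = port->api->scan(port, ids, max_ids);
    port->api->close(port);
    return found;
}
```

With this layout, adding a new link type means providing one more `port_api` instance instead of another special scan function, and the same `port_scan()` call works for downstream and upstream ports alike.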