Increasing bandwidth of multi-device drivers #1499

knro · 2021-06-27T13:13:14Z

Currently some multi-device drivers (e.g. QHY) can run multiple cameras in a single instance connected to INDI server. The driver and server are connected via pipes (stdin/stdout). When a BLOB is sent from the driver to the server, a base64 encoded BLOB is sent via an INDI XML message. The server parses the BLOB and redirects it to interested clients.

A primary limitation of this method is the congestion of the pipes when multiple devices start sending BLOBs at the same time; this is especially evident with multiple large-sensor cameras begin streaming simultaneously. This is further exacerbated by a single-threaded INDI server.

There are a few possible solutions to this:

Make each driver run only a single device.
Use another method to transfer data (such as shared memory region) and make each driver handler runs in its own thread in INDI server.
Make BLOB transfers using dedicated channel between drivers and server?

knro · 2021-07-06T13:51:45Z

@k-ross @jpaana @pawel-soja Can you please share your thoughts on this?

k-ross · 2021-07-06T16:22:49Z

Are you sure the pipe is the bottleneck? What happens once all the feeds are aggregated into a single stream to send to the client connected to the indiserver? Wouldn't that be even more of a bottleneck? What kind of throughput do we actually need?

I did a quick test and measured the actual throughput of piping data through stdout, and I get approx. 2.6 Gbps, which is faster than most TCP/IP networks. This was on my Raspberry Pi, the slowest computer I own. :)

The sender:

#include <stdio.h>

int main(int argc, char*argv[])
{
	for(int i = 0; i < 10000000; i++)
	{
		printf("1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890\n");
	}
	printf("EOF\n");

	return 0;
}

The receiver:

#include <stdio.h>
#include <string.h>

int main(int argc, char*argv[])
{
	auto fp = popen("./sender", "r");

	char buf[1024];
	buf[0] = '\0';
	int bytesReceived = 0;
	
	while(!feof(fp))
	{
		fgets(buf, 1024, fp);
		bytesReceived += strlen(buf);
	}
	
	fclose(fp);
	
	printf("%d bytes received\n", bytesReceived);
	
	return 0;
}

The result, on a Raspberry Pi 4:

pi@telescope:~ $ time ./receiver
1010000008 bytes received

real	0m2.956s
user	0m3.397s
sys	0m2.181s

If we need more throughput than this, my suggestion is to offload the incoming streams to separate threads in the indiserver, since I suspect any bottleneck is probably the processing of the incoming data, and not the actual transfer of the data.

knro · 2021-07-06T16:33:00Z

Thank you Kevin! Let's take an example:

Driver QHY <-----> INDI Server
- QHY Camera 1
- QHY Camera 2
Driver SBIG <----> INDI Server

Both of these drivers are connected to INDI servers via a bidirectional pipe (stdin/stdout). For QHY driver, each camera has its own thread when sending out the data to INDI server. If the pipe is not the bottleneck, then INDI server should be since it is single threaded.

As far as client goes, in KStars, I actually open a separate client to INDI server for each BLOB, so each blob is acting as a separate client to INDI server. I believe the next step is to introduce multi-threading for that each driver is handled in its own thread, and each client is handled in its own thread as well. IIRC, @pawel-soja suggested we migrate INDI server to C++ as well which would make this even easier, and perhaps even achieve more cross-platform compatibility.

k-ross · 2021-07-06T17:22:45Z

Have you done any instrumentation to determine where the bottleneck is? Have you timed how long it takes in indiserver to read the data, vs timing how long it takes to process the data? Maybe the slowness is parsing the XML, doing base64 decoding (if you're doing that), or just moving the data around and making unnecessary copies. All could slow things down, and in a single threaded server, any one area of slowness will affect everything.

The good news with single threaded servers, they are usually a lot easier to profile, and find the bottlenecks. :)

knro · 2021-07-06T18:18:47Z

You're right, let me create a benchmark we can check against.

pludov · 2021-08-11T13:05:49Z

Hi

I have been thinking to this long time ago :

It would be natural to exchange blobs as filedescriptor over unix domain socket. The fds exhange is a special kind of message on that domain. Each fd would refernce an anonymous memory (memfd_create). So local clients can receive raw data from driver without the server or the kernel having to process them at all. Also multiple client processes would efficiently share the memory space

AFAIK this method is used by kdbus for performance

https://indilib.org/forum/development/5465-idea-for-a-zero-copy-blob-transport-protocol-in-indi.html

i would be willing to help here...

knro · 2021-08-11T13:32:59Z

That would be awesome @pludov especially if we can also implement multi-threading in INDI server along the way.

k-ross · 2021-08-12T00:34:02Z

It would be natural to exchange blobs as filedescriptor over unix domain socket. The fds exhange is a special kind of message on that domain. Each fd would refernce an anonymous memory (memfd_create). So local clients can receive raw data from driver without the server or the kernel having to process them at all. Also multiple client processes would efficiently share the memory space

I'm not opposed to this idea, but I have some questions.

Why send the FD over a Unix domain socket? Why not include the FD in the XML response?

Does each blob get a new FD? Or is one shared memory space kept open and reused? If it's reused, how do we communicate locking back and forth between client and server, so server knows when client is done and can overwrite with new blob? I guess one approach is to allocate a large chunk of shared memory, and treat it like a ring buffer.

Would this be significantly faster than sending the blob in binary form, instead of base64, over a TCP socket? Sending the blob in binary form over a TCP socket would allow non-local clients to also enjoy a performance boost.

pludov · 2021-08-13T12:50:09Z

FD is a special information that pass from process to process, like a pointer to a buffer on kernel side. It has special handling from the kernel when exchanged between processes (over the AF_UNIX communication domain). you cannot just embed a number in the existing protocol (the fd actual value on the send will make no sens on the receiver).

Each blob will get its own FD, the emitter is responsible for allocating it. The inter process passing and sharing are handled by the kernel. Basically, the kernel keeps the buffer as long as it is referenced by at least one process. To access actual buffer data, processes need to "mmap" it. In the case of the indiserver, this mmap is not required, so the cost of a blob exchange within the indiserver would not be related to the size of the blob : all messages would be almost equals in processing time on its side.

FD exchange cannot be used over TCP connection (which are presumably not on the same host, so no way to share memory). We need to keep the actual base64 protocol to keep support for remote TCP. The Unix/shared buffer protocol would still benefit the indi_server and local client (no encoding/decoding here, faster message exchange, possibly no memory copy at all).

It would require a "converter" at the TCP/remote endpoint for the support of remote connection. This converter can be a per client process/thread, that way we would get a fast indiserver, pure Unix/shared buffer, and a per connection thread for actual base64 encoding (which should always be enough, since one cpu doing base64 will certainly always be able to saturate a tcp connection).

On the implementation side, I propose to implement in the following order:

first a protocol converter (tcp <=> unix/socket), to ensure every aspect is covered. I would test plugging two converter together. It would be a transparent indi proxy with no feature loss .
then a indiserver speaking 100% unix/socket, and using the converter for client and driver
then update drivers/indi library to natively speak the new protocol (connect to unix domain, send dgram & blobs...)

I would go with an event loop library (libev) for indiserver and its "tcp connector"

knro · 2021-08-13T15:12:38Z

@pludov do you think you can work on some sort of proof of concept sample for this? I think if we can reduce the copying between drivers and indiserver on the same host that would bring tons of benefit. For remote clients & remote drivers (that are snooping on BLOBs), we should always fallback to TCP/Base64.

Now this could also be extended to LOCAL clients. Perhaps they can even be aware of this feature and can support, but base64 should also be there to maintain compatibility.

pludov · 2021-08-13T16:40:41Z

@knro : i'll create a tcp/unix message converter first because it will be required for any latter real testing. And the code will be the basis of the unix socket indiserver. Given my spare time availble, it will take some weesk...

pludov · 2021-08-15T21:23:58Z

I started working on that, starting with C++ convertion of indiserver, here: https://github.com/pludov/indi/tree/fast-blob-transfer

I'll add libev for eventloop and go on with implementing communication over unix datagram sockets

Any idea why I need to add libs/ to the lilxml include ? master is not building on my ubuntu 18 box otherwise...

#include "libs/lilxml.h"

knro · 2021-08-16T05:06:55Z

Excellent progress! Looking forward to seeing the first operational C++ INDI server. The lilxml issue is minor and probably we just need to add an include in CMakeLists.txt later on.

pludov · 2021-08-17T08:22:59Z

I now have a locally working c++ indiserver based on libev. But the CI fails because libev is not installed on the build containers... I'd like to have this working to validate libev compatibility (esp on macos...) How can i fix this ?

knro · 2021-08-17T09:18:09Z

Is it possible to edit the yaml file to apt-get install that?

pludov · 2021-08-18T00:07:49Z

The trick was to use the correct docker registry. Forks push to their owner's repo, but still uses container from the main repo for test. I fixed that.

I am almost finished with c++ convertion & libev integration. It is working for my simple test setup (no remote drv, ...) Would you consider a merge request for this part, before I go on with testing datagram implementation ?

knro · 2021-08-18T05:55:29Z

A merge request is a bit early now. It is better to demonstrate the whole spectrum. I will checkout your branch and give you feedback once you let us know it's ready for first phase testing.

pludov · 2021-08-21T09:18:56Z

Not ready for testing but I now have a working driver=>indiserver connection over unix socket with blob exchanged as shared buffers :-)

It still inefficient as base64 encoding is done in the indiserver eventloop for all clients. I'll now to move that away from indiserver eventloop and allow local clients to receive blob as shared buffer.

Optionally, drivers will then need (light) adaptation to write directly their blob to shared buffer (avoiding copy here as well)

knro · 2021-08-23T10:21:32Z

So this needs change in INDI::CCD? This shouldn't be hard to do. We also need to perhaps even include clients to gain from this, if possible. Looking forward to testing this and comparing the results.

Edit: I know INDI::CCD is just for cameras, but this is where the bulk of BLOBs transfers are happening right now. This along with INDI::StreamManager

pludov · 2021-08-23T19:53:04Z

I just pushed an implementation that can use shared buffer for blob delivery from driver to libindiclient (lots of FIXME but that path is OK).

So I did some quick benchmark :-)

I used the following "setup":

ccd_simulator in 8 bit video streaming 5120x4096 (just disabled ShowStarField to get more throughput out of it)
a simple libindiclient test program that dump the stream to a file
Ran my 4 core AMD Ryzen 5 system

indi 1.9.1:

cpu    time      process
232,2  4:27.11   indi_simulator_ccd
 95,7  0:22.29   streamer
 46,5  0:39.90   indiserver
374,4               total
FPS : 9.94 (Recorded: 9.15)

shared buffer version:

cpu    time      process
172,1   4:45.23 indi_simulator_
 14,3   0:03.48 streamer
  0,3   0:00.50 indiserver
186.7              total
FPS: 10.86 (Recorded: 10.75)

My interpretation:

the overall cpu gain for now is 50%
the win is clear on the client and server side
indi_simulator_ccd is the bottleneck in my scenario
The result FPS is not that impacted because the saturation is within indi_simulator_ccd and there are enough cores on my system to avoid perf impact on the server & client

Also that test does not measure the latency on the driver side. But with such a low cpu usage on indiserver, it will be able to keep up with a lot of drivers traffic :-)

I didn't look at streammanager or INDI::CCD in details for now. As you see there is already an advantage for drivers without more change (overall cpu is divided by 2), but there is probably something done in streammanager that consume a large amount of cpu for copy/convertion/... (provided i disabled all the star/noise rendering code... just producing horizontal bars here !)

knro · 2021-08-23T20:28:25Z

Awesome progress! The CPU utilization is definitely much better but I expected we see a significant FPS boost from this as well since we avoided a lot of copying. What if you reduce the resolution? can you see more differences in FPS? how about with real cameras?

pludov · 2021-08-23T21:25:21Z

I'll debug/profile ccd_simulator to know what takes that time (16-8 convertion ?). The signature of ccdchip should change so that drivers produce a new buffer for every frame and transfer its ownership. This would kill needs for synchro. Currently the frame needs to be copied before next acquisition can start. Then for simplest cases where no convertion is required, the buffer could be sent as is. Depending on hw (esp DMA), this would be cpu free on the driver side as well. Le 23 août 2021 22:28:35 GMT+02:00, Jasem Mutlaq ***@***.***> a écrit :

Awesome progress! The CPU utilization is definitely much better but I expected we see a significant FPS boost from this as well since we avoided a lot of copying. What if you reduce the resolution? can you see more differences in FPS? how about with real cameras? -- You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub: #1499 (comment)

-- Envoyé de mon téléphone Android avec K-9 Mail. Excusez la brièveté.

pludov · 2021-08-24T06:43:31Z

The bottleneck in my scenario is GammaLut16::apply. It is however quite simple, I'll check the optimization level of the compiler...

pludov · 2021-08-25T22:41:51Z

Since indi_simulator is not able to saturate one indiserver even before shared buffer optimization, the apparent FPS gain was limited. So i tried to optimize effectiveness of ccd_simulator for my benchmark. It appears that lots of frames are lost because of they reach deliver while the previous image delivery is still ongoing. (previewThreadPool.try_start) I changed it for start which to my understanding create a queue of max 1 item. Makes some difference

indi 1.9.1 47M images (7680x6144) with previewThreadPool.start change:

208,3  6,9    indi_simulator_                                                                                                                                
99,0  0,7   streamer                                                                                                                                       
85,4  0,2    indiserver         
 => 7.5 FPS (was 5.5 without threadpool.start change)

shared buffer version, 47M images (7680x6144) with previewThreadPool.start change::

241,9  5,6  indi_simulator_                                                                                                                                
64,1  0,1    streamer                                                                                                                                       
0,3  0,0    indiserver    
 => 15.28 FPS (was  @7.9FPS without threadpool.start change)

In indi_simulator, the GammaLut16 function is always the slowest point. And then most of the remaining time is spent doing vector copy or initialization : lots of std::vector moves actually end up doing memcpy or memset indeed. Also my current shared buffer has one more copy for shared buffer initialization, that could be optimized by preallocating buffers

knro · 2021-08-26T04:36:59Z

For that resolution, 15fps is impressive. This is also a good chance to optimize any bottlenecks or unnecessary copying in the pipeline. What's the next step?

pludov · 2021-08-26T06:48:37Z

I have the following point to address before other could seriously test:

Ensure base64 encoding on server side is outside the main loop
Ensure proper compilation on Linux CI
Fix indiapi.h (mostly ensure asc compatibility)
Properly disable the feature on Macos (I won't be able to properly dev/test that myself...)

At that point testing will be welcome. Also a good time to start a code review

The next step would be and move on to optimization of the driver / driver api side.
This will require discussions, as totally avoiding copy in this end may have impact on the exposed features/format (to propose configurations where data just fly through a blob without copy/endianess/bit convertion. It could be adding new streaming format...) We probably need to focus on some use cases to fully optimize.

But I think the indiserver should be merged before driver/driver api work occurs - as they are mostly independant

knro · 2021-08-26T20:59:48Z

Sure, we can merge INDI server. Please submit a PR and we'll test it thoroughly under multiple scenarios. I presumed you tested it with chained servers?

Regarding MacOS, I can test on it, so no need to disable there if you need feedback. I have both Intel based and M1 Mac minis.

Regarding transfer formats, in #1200 I proposed a way to decouple capture vs transfer formats. For streaming, we send either motion jpeg if natively supported by the source, or raw 8bit or 24bit bitmap or convert the source to jpeg and send that.

pludov · 2021-08-29T09:38:22Z

I'll fix the four point above and test on my side (takes some time) before.
For MacOS, it won't build as is. It will need some adjustment (which includes are required, what constant exists or not there, ...).

By the way, I've found an existing bug as well about blob handling: #1528. Would be great to have some smoke test in CI about that.

knro · 2021-09-07T06:26:27Z

@pludov Wanted to check if there has been any updates on this feature? anything to test now?

pludov · 2021-09-07T16:36:46Z

Unfortunately, lots of clear nights here kept me away from this.... Le 7 septembre 2021 08:26:37 GMT+02:00, Jasem Mutlaq ***@***.***> a écrit : ***@***.*** Wanted to check if there has been any updates on this feature? anything to test now?

-- You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub: #1499 (comment)

-- Envoyé de mon téléphone Android avec K-9 Mail. Excusez la brièveté.

knro · 2021-09-28T06:29:09Z

@pludov Any good news about this?

knro · 2021-10-20T06:50:09Z

Heya! Sorry to bother you again regarding this, but is there any update? Anything to test?

pludov · 2021-10-21T18:45:29Z

Hi ! No progress so far... but still high on my todo list ! Le 20 octobre 2021 08:50:20 GMT+02:00, Jasem Mutlaq ***@***.***> a écrit :

Heya! Sorry to bother you again regarding this, but is there any update? Anything to test? -- You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub: #1499 (comment)

-- Envoyé de mon téléphone Android avec K-9 Mail. Excusez la brièveté.

knro · 2021-11-06T12:27:11Z

Great, let me know if there is anything we can do to make this make it into 1.9.3. Can we break this into milestones that are easily trackable?

knro · 2021-11-24T07:05:27Z

Alright, so I moved this to 1.9.4 milestone. @pludov do you think we can get something ready by then?

knro · 2021-12-16T06:09:39Z

Hello @pludov I hope you're doing great. Just wanted to check in and see if there has been any progress here? What's the next step?

pludov · 2021-12-16T08:12:40Z

Hi ! I am not so active on this atm unfortunately. But Still planning to get back at it ! I started adding a feedback for blob deluvery, because without this, nothing prevents blob from beeing queued within the fifo and as soon as a reader cannot read fast enough, the oom killer get invoked. Also in that case, memory for in transit blobs is not accounted to any process so the choice of the oom killer is not clever... I will ensure that a blob emitter never queues more than one blob (or 2,3, configurable ), and wait for ack from the receiver ( only for local/unix socket, this would make no sensre on tcp) Le 16 décembre 2021 07:09:49 GMT+01:00, Jasem Mutlaq ***@***.***> a écrit :

Hello @pludov I hope you're doing great. Just wanted to check in and see if there has been any progress here? What's the next step? -- You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub: #1499 (comment)

-- Envoyé de mon téléphone Android avec K-9 Mail. Excusez la brièveté.

pludov · 2022-01-17T12:42:51Z

I did move forward on that subject (you can see here: https://github.com/pludov/indi/ ) It's working on my test system with phd2 and mobindi. Still some todos here and here, especially in the qt-client.

To ensure feeback for blob delivery I had to add something to the protocol, because it's not possible at lower level: without protocol change/addition, sender cannot get knowledge of what the reader has received, and so may fill the socket buffers with blobs and trigger oom. This comes from the fact that the actual message size in byte occupied within the kernel socket buffer does not reflect the memory used. It's easy with big blobs attached to have GB of RAM used in a small socket buffer (especially if the receiver is busy). And lowering socket buffer to extremely low value (like < 100 bytes) would badly impact performances.

The idea I implemented is that a blob sender will not send a new blob while the previous one has not been acknowledge somehow.
I've implemented for that purpose pingRequest/pingReply messages on the indi protocol. this allows the sender to know that the receiver proceeded up to the given message.

This pingRequest/Reply is used when sending blobs. Before sending a blob (no2), the sender will ensure the previous one(no1) has been acknowledged:

send the blob no1
send a ping request
... go back to normal business
wait for reply of previous ping request
send the blob no2

I think this mechanism is usefull only for using unix connection/shared blob. For existing connection it makes no difference as the blob size is largely bigger than buffering between client and server and no two blobs can be infly at the same time...

Also, I felt the need to add some automated integration tests to indiserver and baseclient. The idea here is to start one process (indiserver or indiclient) and instrument (mock) its connections to test various behaviours (at the xml/protocol level). You can see what it looks like in the /integs directory.
The pingRepquest/pingReply also is greatly usefull here, as it gives a way to ensure the exact timeline of a given test.

For example, when testing indiserver, ping helps in the following test :

we send enableBlob on the client side,
we send a blob value on the driver side,
we expect the blob to come up on the client side.
If there is no synchronization, then there is not guaranty that the enableBlob will be handled by indiserver before the actual blob value. They may enter the process at the same time. In the end, the OS scheduler will decide in which order the receiving process will actually receive the data, and not necessarily in the sending order (from the select syscal, ...) We can try to workaround that be introducing delay, but it will always be subject to time variation and available resources, especially if automated in CI (leading to false breakage).

The alternative is to insert a pingRequest/reply to ensure the state of the server after the enableBlob. The test becomes:

we send enableBlob on the client side,
we send a pingRequest on the client side
we wait a pingRepy on the client side => it's now sure the enableBlob has been processed by the server
we send a blob value on the driver side,
we expect the blob to come up on the client side.

So I propose we add this "ping" mechanism to a new version of indi protocol to go on (with proper handling of previous version), for shared blobs and for integration tests. What do you think about it ?

pludov · 2022-01-20T07:52:45Z

Hi @knro : What's your opinion on the proposal above ? Any chance it would be accepted upstream ? We can discuss live if you prefer...

knro · 2022-01-20T08:10:09Z

Thank you for the proposal, I need time to review it fully (just released INDI 1.9.4 yesterday and I should be able to look at it in depth this weekend).

knro · 2022-02-19T10:41:57Z

Sorry, been busier than usual. Will get back to this soon. I think it's going to be extremely challenging to change INDI wire protocol, but not impossible if it is kept backward compatible.

knro · 2022-05-21T07:48:35Z

@pludov Sorry it took me such a long time to respond to this, but now that INDI 1.9.6 is ready for release, I've been thinking of how we can incorporate this in 1.9.7

Let's take this scenario:

One driver
One Server
1 local client with shared memory support.
1 remote client with shared memory support.
1 local client without shared memory support

From my understanding, once the driver has the data, it can signal to the server that data is ready to be read (perhaps by a property?). The server then checks the clients. Since it has 3 clients, it checks how it sends the BLOB to them.

For first local client with shared memory support: No need to send blob, but only signal to it that it is ready (perhaps by same property mentioned above?)
For second remote client, it doesn't matter if it has shared memory support, we need to send it via TCP sockets. So the server must read the shared memory and then sends it.
Same as 2, server must read and send via TCP.

Driver cannot use shared memory yet (no queue?) until server and 1st client complete reading the shared memory. How is the driver informed about this so that it can start utilizing the memory again? Regarding the change in protocol, you propose a new message type and ? how is that exactly used? is it only between client and server, or also between server and driver?

pludov · 2022-05-22T09:00:03Z

Hi Jasem, It's been a while ! Maybe we should have a live talk about it, but here is a written response attempt : First a slight precision : There is no "shared memory" in my proposal. The driver will generate a new buffer for each frame, and the buffer reference is then passed over to the server and then the client if they support it (or copyed/base64 for tcp connections). What is exchanged is buffer reference, and this flies over using the actual indi protocol when running over unix domain socket (using filedescriptor attachment feature of the unix domain) - this replaces the base 64 enc. Using that scheme, access and synchronisation is ensured by the kernel itself: accessing the memory of a buffer by mmap, releasing a buffer once nobody uses it anymore (ref counted), ... there is no buffer reuse to care about - the kernel will reuse whatever memory -, so the driver doesn't have to implement any synchronization. Also, having data in a kernel's FD object gives access to some highly optimized function from the kernel : reading/writing content directly to a file or device without even mapping the memory in the processes (ex: DMA in the driver, using the https://www.systutorials.com/docs/linux/man/2-sendfile/ or https://man7.org/linux/man-pages/man2/copy_file_range.2.html syscalls). In theory we could have on-disk capture implemented with the existing driver/server/client architecture, with almost no CPU use on system/drivers that have efficient DMA. _Benefit to existing clients:_ Driver can benefit from shared buffers if server does so. This already avoid two copies of the data + the tcp buffering latency (transfer between server/driver). Since the buffer exchange is blazing fast, we could also simplify most driver by removing the need for dedicated "write" thread. _Why a protocol change_: One drawback I found is that in this buffer exchange scheme, when a "blob reader" is slow (either a client or the server), buffers can accumulate on its side, because the socket buffer is large compared to the size of the messages exchanged. Lots of message can fit in that buffer before beeing read by the client, and if they all have attached blob buffer, this can makes up a large ammount of memory. This situation can lead to system memory exhaustion if too many buffers / too big buffers accumulate in this situation. That's why there is a need to get some feedback after a blob has been sent over this channel, so as the sender knows how many blob/memory ammount is still in transit to the reader, and do some skipping if necessary. I added a ping/reply mechanism that is just a way to get the feedback about when a message has actually been read: the blob emitter will send a ping request after a blob, and will not send anymore blob until it received the associated reply. The ping request/reply is local to the link (between driver/server or server/client) and is not propagated. This ping request/reply mechanism also allows writing deterministic integration test. I wrote tests that trigger specific case like "a driver message arrived before a client registration" vs "a driver message arrived after a client registration". From a CI/testing perspective, without protocol synchronization it is not possile to know in advance the order of processing by the server of two messages issued on two differents channels. The best you can do is to insert sleep instruction which ultimetly make your test non reliable (producing false positives when running on loaded machine). Using this ping mechanism, you can make sure a message is processed by the server at any time and test exactly one scenario. I used that to write a non-regression test suite for the rewritten indiserver _Beeing transparent with existing client:_ The whole mechanism buffer exchange relies on using unix domain socket, which garantees the peers of a connection are running on the same system. No client today can use this kind of connection since server is purely tcp up to now. So in my proposal the new protocol message is initiated exclusively on this kind of socket, not changing anything for existing clients. For would-be client using the new connection mechanism, the ping is handled at low level and would be transparent. Ludovic Le 21/05/2022 à 09:48, Jasem Mutlaq a écrit :

…

@pludov <https://github.com/pludov> Sorry it took me such a long time to respond to this, but now that INDI 1.9.6 is ready for release, I've been thinking of how we can incorporate this in 1.9.7 Let's take this scenario: 1. One driver 2. One Server 3. 1 local client with shared memory support. 4. 1 remote client with shared memory support. 5. 1 local client without shared memory support From my understanding, once the driver has the data, it can signal to the server that data is ready to be read (perhaps by a property?). The server then checks the clients. Since it has 3 clients, it checks how it sends the BLOB to them. 1. For first local client with shared memory support: No need to send blob, but only signal to it that it is ready (perhaps by same property mentioned above?) 2. For second remote client, it doesn't matter if it has shared memory support, we need to send it via TCP sockets. So the server must read the shared memory and then sends it. 3. Same as 2, server must read and send via TCP. Driver cannot use shared memory yet (no queue?) until server and 1st client complete reading the shared memory. How is the driver informed about this so that it can start utilizing the memory again? Regarding the change in protocol, you propose a new message type and ? how is that exactly used? is it only between client and server, or also between server and driver? — Reply to this email directly, view it on GitHub <#1499 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ADMXFHSYRMCQNMG636QLNBLVLCIOBANCNFSM47MMTRUA>. You are receiving this because you were mentioned.Message ID: ***@***.***>

knro · 2022-05-25T18:39:24Z

Thank you Ludovic for the detailed response. I think we should arrange a video meeting to go over this, perhaps even with the participation of other INDI developers so we can agree on some consensus moving forward. I believe this would bring significant improvements especially if the driver/server/client are all on the same physical machine. Thanks for clarifying my misunderstandings about this and I probably have more!

How about we arrange a meeting next week? how is your availability? You can send me a direct email to discuss the set up.

knro · 2022-07-21T08:58:16Z

This is now fully implemented in INDI thanks to outstanding work by @pludov!

knro added the enhancement label Jun 27, 2021

knro added this to the 1.9.2 milestone Jul 6, 2021

knro modified the milestones: 1.9.2, v1.9.4 Nov 24, 2021

knro added the help wanted label Dec 16, 2021

knro modified the milestones: 1.9.4, 1.9.6 May 21, 2022

knro pinned this issue May 21, 2022

knro closed this as completed Jul 21, 2022

Increasing bandwidth of multi-device drivers #1499

Increasing bandwidth of multi-device drivers #1499

Comments

knro commented Jun 27, 2021

knro commented Jul 6, 2021

k-ross commented Jul 6, 2021

knro commented Jul 6, 2021

k-ross commented Jul 6, 2021

knro commented Jul 6, 2021

pludov commented Aug 11, 2021

knro commented Aug 11, 2021

k-ross commented Aug 12, 2021

pludov commented Aug 13, 2021

knro commented Aug 13, 2021

pludov commented Aug 13, 2021

pludov commented Aug 15, 2021

knro commented Aug 16, 2021

pludov commented Aug 17, 2021

knro commented Aug 17, 2021

pludov commented Aug 18, 2021

knro commented Aug 18, 2021

pludov commented Aug 21, 2021

knro commented Aug 23, 2021 • edited Loading

pludov commented Aug 23, 2021

knro commented Aug 23, 2021

pludov commented Aug 23, 2021 via email

pludov commented Aug 24, 2021

pludov commented Aug 25, 2021

knro commented Aug 26, 2021

pludov commented Aug 26, 2021

knro commented Aug 26, 2021 • edited Loading

pludov commented Aug 29, 2021

knro commented Sep 7, 2021

pludov commented Sep 7, 2021 via email

knro commented Sep 28, 2021

knro commented Oct 20, 2021

pludov commented Oct 21, 2021 via email

knro commented Nov 6, 2021

knro commented Nov 24, 2021

knro commented Dec 16, 2021

pludov commented Dec 16, 2021 via email

pludov commented Jan 17, 2022

pludov commented Jan 20, 2022

knro commented Jan 20, 2022

knro commented Feb 19, 2022

knro commented May 21, 2022

pludov commented May 22, 2022 via email

knro commented May 25, 2022

knro commented Jul 21, 2022

knro commented Aug 23, 2021 •

edited

Loading

knro commented Aug 26, 2021 •

edited

Loading