Slow subscriber might block publishes #60
Nice catch! I've actually had this issue in the back of my mind for some time now, but haven't gotten around to fixing / documenting it.
That's one option. An alternative solution is to use a
broadcast::channel sounds like a good solution. It could be used on something like a "per path" basis, which would be a good way to send messages only to the subscribers that are actually interested in them. I guess the current solution has quite some overhead if there are many subscribers to lots of different topics/paths, since each change is propagated to all of them and the subscribers have to filter them out.
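To make the "per path" idea concrete, here is a minimal sketch, not the databroker's actual code. It assumes a hypothetical `PathRegistry` that indexes subscribers by path, so a publish only touches the subscribers registered for that path. `Update`, `PathRegistry`, `subscribe` and `publish` are names invented for this illustration, and it uses bounded channels from the Rust standard library rather than the async channels the broker uses.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};

/// A change to a single path (hypothetical type for this sketch).
#[derive(Debug, Clone, PartialEq)]
pub struct Update {
    pub path: String,
    pub value: f64,
}

/// Hypothetical registry that indexes subscribers by the path they follow.
#[derive(Default)]
pub struct PathRegistry {
    by_path: HashMap<String, Vec<SyncSender<Update>>>,
}

impl PathRegistry {
    /// Register interest in `path` and return the receiving end of a bounded queue.
    pub fn subscribe(&mut self, path: &str, capacity: usize) -> Receiver<Update> {
        let (tx, rx) = sync_channel(capacity);
        self.by_path.entry(path.to_string()).or_default().push(tx);
        rx
    }

    /// Offer `update` only to the subscribers of its path.
    /// Returns how many subscribers accepted it.
    pub fn publish(&mut self, update: &Update) -> usize {
        let Some(subs) = self.by_path.get_mut(&update.path) else {
            return 0; // nobody subscribed to this path, nothing to do
        };
        let mut delivered = 0;
        subs.retain(|tx| match tx.try_send(update.clone()) {
            Ok(()) => {
                delivered += 1;
                true
            }
            // Queue full: skip this update for the slow subscriber, keep it registered.
            Err(TrySendError::Full(_)) => true,
            // Receiver dropped: remove the subscriber from the index.
            Err(TrySendError::Disconnected(_)) => false,
        });
        delivered
    }
}
```

With this layout the cost of a publish depends on the number of subscribers of that one path, not on the total number of subscribers, at the price of one HashMap lookup per publish.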
@jjj-vtm If you have time, you can join our community meeting tomorrow and we can discuss it further. Here is the link: https://eclipse.zoom.us/j/87644929505?pwd=cTRpYklVaS9xYjlhMXRtbS9IN0FCQT09 It's planned for 1 pm.
Hi, and error handling in the update_entries function:
What would be nice about the broadcast channel is that it would signal to the client that it is lagging behind. On the other hand, having a broadcast channel with a single receiver only for this semantic seems to be overkill. I thought about having a broadcast channel per ID so that only subscribers which are interested in changes for that ID are informed, but that would need some refactoring, and I would only see this as necessary if the databroker must support a high number of subscribers (> 10k). With < 1k subscriptions, checking whether a subscription is interested in the change should not cause problems, since the checks are very simple (maybe apart from the HashMap lookup).
In my playground I also added a small performance test:
Using this I made some changes to the databroker, which we could discuss:
(3) is quite controversial since a lot of ".await"s are removed and the broker becomes more synchronous, but since there is no IO involved and everything is backed by a simple in-memory datastore, this should be fine. This change increased the performance quite drastically. Using the test above with the standard (main) databroker on my M3 I get:
janjongen@tsssss3 kuksa-databroker-1 % ./target/release/examples/pub_perf
With my patches:
tsssss3 kuksa-databroker-1 % ./target/release/examples/pub_perf
Sorry for the long post. I will try to join the community meeting next time.
Awesome! I will have a closer look, but I have a couple of initial thoughts.
You're right that it's probably "overkill" to use one broadcast channel per subscriber, but the other nice semantic of the broadcast channel is that it will drop the oldest value when the subscriber is lagging.
The problem with this is that it would effectively mean that the latest value is dropped instead of the oldest. That's probably not the semantics we want, as the oldest value is probably the one of least interest if we're starting to drop things. My other thought is that I'm not sure
We would be really happy to improve the performance measuring situation, though. We've recently created kuksa-perf as the first step towards more meaningful performance tests. What it measures is the latency between a provider publishing an update and a subscriber receiving it, and we're working on making it better at simulating realistic workloads, with the ability to fine-tune the amount of traffic generated while still measuring the resulting latency. This is an area where we need to look further into how we want databroker to behave under excessive load (i.e. when and how it should start dropping stuff). Anyway, thank you for also looking into this, and I will take a further look into the other performance enhancing modifications you've made as well.
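The difference between dropping the oldest and dropping the latest value can be illustrated with a small sketch. This is not tokio's broadcast channel and not databroker code; `DropOldestQueue` is a hypothetical per-subscriber buffer, written with the standard library only, that evicts the oldest entry when full and counts how many values the reader missed (loosely comparable to the `Lagged(n)` error a lagging tokio broadcast receiver gets).

```rust
use std::collections::VecDeque;

/// Hypothetical per-subscriber queue that evicts the OLDEST entry when full.
pub struct DropOldestQueue<T> {
    buf: VecDeque<T>,
    capacity: usize,
    /// Number of values evicted since the subscriber last read.
    lagged: u64,
}

impl<T> DropOldestQueue<T> {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        Self { buf: VecDeque::with_capacity(capacity), capacity, lagged: 0 }
    }

    /// Never blocks the publisher: if the queue is full, the oldest value is discarded.
    pub fn push(&mut self, value: T) {
        if self.buf.len() == self.capacity {
            self.buf.pop_front();
            self.lagged += 1;
        }
        self.buf.push_back(value);
    }

    /// Returns how many values were missed since the last call, plus the next value if any.
    pub fn pop(&mut self) -> (u64, Option<T>) {
        let missed = std::mem::take(&mut self.lagged);
        (missed, self.buf.pop_front())
    }
}
```

A bounded channel combined with a non-blocking send behaves the other way round: once the queue is full, the value being sent is the one that gets rejected, so the subscriber keeps the stale entries and loses the newest.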
You are right. I thought about it too but could not come up with an easy way to implement this without using the broadcast channel. IMHO there are 3 options
It simply tests the throughput of publishes of the databroker and is not a realistic load test. I just wanted to measure the effects of my changes. Thanks! I must have overlooked that.
Hi,
I wrote a client which basically simulates an overloaded (not consuming) subscriber.
To trigger the bug faster, I also modified the HTTP/2 initial_connect_window and stream_window_size.
After around 15-20 publishes via the CLI
kuksa.val.v1 > publish Vehicle.Speed 123
[publish] OK
no response will come back, and publishing a different value via another CLI session
kuksa.val.v1 > publish Vehicle.Width 3
will also hang. I guess that this send
kuksa-databroker/databroker/src/broker.rs
Line 752 in ccc382c
will wait until the receiver reads the data, and since a read lock on the database is held, subsequent publishes wait on the lock. But even with a different database implementation, the issue would remain that one slow consumer might delay value propagation for other subscribers on the same path, since the send above is called in a loop over the possible subscribers.
Maybe it would be better to use try_send instead of send.
Cheers,
Jan
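The try_send suggestion above can be sketched as follows. This is an illustration only, not the code in broker.rs: it uses the standard library's bounded `sync_channel` as a stand-in for the broker's async channel, and `FanOut` and `notify_all` are hypothetical names. The point is that the loop over subscribers never waits on a full queue, so one slow subscriber cannot stall the others.

```rust
use std::sync::mpsc::{SyncSender, TrySendError};

/// Outcome of one non-blocking fan-out (hypothetical type for this sketch).
#[derive(Debug, Default, PartialEq)]
pub struct FanOut {
    pub delivered: usize,
    pub skipped_full: usize,
    pub disconnected: usize,
}

/// Offer `value` to every subscriber without ever waiting on a full queue.
pub fn notify_all<T: Clone>(subscribers: &[SyncSender<T>], value: &T) -> FanOut {
    let mut out = FanOut::default();
    for tx in subscribers {
        match tx.try_send(value.clone()) {
            Ok(()) => out.delivered += 1,
            // A blocking send would wait here until the slow subscriber reads.
            Err(TrySendError::Full(_)) => out.skipped_full += 1,
            // The receiving side was dropped; the caller may want to unsubscribe it.
            Err(TrySendError::Disconnected(_)) => out.disconnected += 1,
        }
    }
    out
}
```

With two subscribers of capacity 1 where only one of them reads, the second call to `notify_all` reports one delivery and one skipped subscriber instead of hanging. The trade-off is that the skipped subscriber silently misses that update unless the skip is reported to it in some way.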