Problem description
We are doing performance benchmarks of Pravega using the Python binding. One behavior we found that differs from the Java client is that, when reading via the Python client, we observe a lot of log messages like:
Nov 29 10:32:06 ip-10-0-0-107.ec2.internal pravega-segmentstore[13373]: 2023-11-29 10:32:06,042 3419086 [epollEventLoopGroup-9-3] INFO i.p.s.s.h.h.PravegaRequestProcessor - [requestId=689] Iterate Table Entries Delta: Segment=terasort1701253869/terasort_reader_group_kvtable/0.#epoch.0 Count=10 FromPositon=60238.
Nov 29 10:32:06 ip-10-0-0-107.ec2.internal pravega-segmentstore[13373]: 2023-11-29 10:32:06,042 3419086 [epollEventLoopGroup-9-8] INFO i.p.s.s.h.h.PravegaRequestProcessor - [requestId=1745] Iterate Table Entries Delta: Segment=terasort1701253869/terasort_reader_group_kvtable/0.#epoch.0 Count=10 FromPositon=61610.
Nov 29 10:32:06 ip-10-0-0-107.ec2.internal pravega-segmentstore[13373]: 2023-11-29 10:32:06,042 3419086 [epollEventLoopGroup-9-11] INFO i.p.s.s.h.h.PravegaRequestProcessor - [requestId=1651] Iterate Table Entries Delta: Segment=terasort1701253869/terasort_reader_group_kvtable/0.#epoch.0 Count=10 FromPositon=60238.
This looks like queries to the Reader Group segment for the reader group we instantiate to read from Pravega. In Rust, we know that the synchronization mechanism is not the same as in Java (StateSynchronizer), but is instead based on KV Tables.
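For context, here is a minimal sketch of how a KV-table-based synchronizer tends to behave. The names (TableSegmentClient, read_entries_delta, ReaderGroupStateSync) and the polling loop are assumptions for illustration, not the actual pravega-client-rust API; the point is only that each reader repeatedly pulls the delta of the reader group table from its last known position, and each pull maps to one WireCommands.ReadTableEntriesDelta request on the segment store.

```rust
/// Hypothetical abstraction over the reader group's KV table segment.
/// The real client exposes different types; this only illustrates the traffic pattern.
trait TableSegmentClient {
    /// Maps to one WireCommands.ReadTableEntriesDelta request on the segment store.
    fn read_entries_delta(&self, from_position: u64, count: usize) -> (Vec<(String, Vec<u8>)>, u64);
}

/// Each reader keeps its own position and polls the shared table segment for changes.
struct ReaderGroupStateSync<C: TableSegmentClient> {
    table: C,
    position: u64,
}

impl<C: TableSegmentClient> ReaderGroupStateSync<C> {
    /// Called around reader operations (and periodically) to refresh local state.
    /// With N readers each calling this frequently, the single segment store owning
    /// the reader group table segment receives N concurrent streams of delta reads.
    fn fetch_updates(&mut self) {
        loop {
            // One ReadTableEntriesDelta request per iteration.
            let (entries, next_pos) = self.table.read_entries_delta(self.position, 10);
            if entries.is_empty() {
                break; // caught up; no more requests until the next refresh
            }
            // apply `entries` to the local copy of the reader group state ...
            self.position = next_pos;
        }
    }
}
```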
What we have found is that these messages flood the logs of the Pravega segment stores, which prompted us to look at the metrics for the same experiments:
As you can see, the number of Iterate Entries requests (which correspond to WireCommands.ReadTableEntriesDelta messages) and, apparently as a consequence, the number of Get Info messages is as large as the number of actual "read messages" for the segment store handling the Reader Group segment. While the tests we are doing use relatively large reader groups (30-100 readers in the same reader group), the overhead apparently related to synchronizing the reader group state seems excessive in the Rust client.

cc/ @gfinol
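To put the scale in perspective (assuming, purely for illustration, that each reader issues a delta read roughly every 10 ms): a reader group of 100 readers would generate on the order of 10,000 ReadTableEntriesDelta requests per second against the single segment store owning the reader group table segment, and this load grows with the number of readers rather than with the read throughput they actually achieve.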
Problem location
Probably, the update logic around TableSynchronizer in the Rust client.
Suggestions for an improvement
For scalability reasons, it would be important to think of ways to minimize the number of "update" calls to the Table Segment that keeps the ReaderGroup state, if possible.
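One possible direction is sketched below. The ThrottledSync type, its fields, and the should_fetch entry point are illustrative and not part of the existing TableSynchronizer API; the idea is simply to rate-limit the delta reads so a reader only contacts the table segment when a minimum interval has elapsed since its last refresh (or when its local operation genuinely needs fresh state).

```rust
use std::time::{Duration, Instant};

/// Illustrative throttle around the state-refresh path: skip the remote
/// ReadTableEntriesDelta round trip if the local copy was refreshed recently.
struct ThrottledSync {
    last_fetch: Option<Instant>,
    min_interval: Duration,
}

impl ThrottledSync {
    fn new(min_interval: Duration) -> Self {
        Self { last_fetch: None, min_interval }
    }

    /// Returns true if a remote fetch should be issued now, false if the
    /// locally cached reader group state is considered fresh enough.
    fn should_fetch(&mut self) -> bool {
        let now = Instant::now();
        match self.last_fetch {
            Some(t) if now.duration_since(t) < self.min_interval => false,
            _ => {
                self.last_fetch = Some(now);
                true
            }
        }
    }
}

fn main() {
    let mut throttle = ThrottledSync::new(Duration::from_millis(500));
    // In the reader's hot path, guard every delta read with the throttle:
    if throttle.should_fetch() {
        // issue the ReadTableEntriesDelta-backed refresh here ...
    }
}
```

A fixed interval trades state-propagation latency (e.g. how quickly a reader notices a segment reassignment) for load on the table segment, so it would likely need to be configurable per reader group.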