Stream Consumption Capped at 150mps #539
-
I have a stream with around 3000 messages, each sized at around 250 bytes. My application consumes the stream messages using the RabbitMQ Stream Java Client. Even when the MessageHandler does no processing, the consumption rate is consistently around 150 messages/second, regardless of whether the offset is set to the beginning (

This rate is considerably lower than what I was expecting, but I am new to RabbitMQ, so maybe my expectations are off. RabbitMQ and the client application are both running in rootless containers on the same host, managed by podman.

Stream configuration
Stream stats
RabbitMQ Client Application
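For scale, the reported numbers can be sanity-checked with some quick arithmetic (a sketch using the figures from the question above; the class name is just for illustration):

```java
// Back-of-envelope check of the reported numbers (figures taken from the
// question above, not from a new measurement).
public class ThroughputCheck {
    public static void main(String[] args) {
        int messages = 3_000;        // messages in the stream
        int messageSizeBytes = 250;  // approximate message size
        int observedRate = 150;      // observed messages/second

        double drainSeconds = (double) messages / observedRate;
        double bytesPerSecond = (double) observedRate * messageSizeBytes;

        // ~20 s to drain the stream, at roughly 37.5 KB/s -- orders of
        // magnitude below what the network or disk can sustain, so raw
        // data volume is unlikely to be the bottleneck.
        System.out.println("drain time (s): " + drainSeconds);
        System.out.println("throughput (bytes/s): " + bytesPerSecond);
    }
}
```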
-
A single stream is usually much faster (~100K messages/second, though that depends on many parameters). You should experiment with Stream PerfTest to establish a baseline in your environment.
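A minimal way to run Stream PerfTest is via its published Docker image. The invocation below is a sketch; the URI and flag values are assumptions to mimic the workload described above, so adjust them to your setup:

```shell
# Run Stream PerfTest against a local broker: 1 producer, 1 consumer,
# 250-byte messages (host, port, and flag values are assumptions).
docker run -it --rm pivotalrabbitmq/stream-perf-test \
  --uris rabbitmq-stream://localhost:5552 \
  --producers 1 --consumers 1 --size 250
```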
-
Thanks, Arnaud, for your quick reply. I ran the performance test, and if I am interpreting the results correctly, I am seeing a potential of 400k+ messages per second.
Can you suggest where I should look to try to determine why the application's performance is so poor? I am using just the basic methods to build the
Thanks again for your help.
-
Yes -- I can create a simple project which captures what I am doing on the stream consumer side, but I am not producing messages for the stream from an application. I realize now that I probably should have mentioned that up front, because when I use the
In my current test environment, I have 3 monitoring applications which each publish (at a rate of 1 message every 10 seconds) to the RabbitMQ broker using the MQTT protocol. I have a binding in with the
-
I performed some additional testing and found that when there was zero (or close to zero) delay between publishes to the stream, or when batching was used, later consumption from those streams was considerably faster. Also, looking at the file-system storage for the streams, for streams where the consumption rate is high, the segment index file is small (I tested with a stream where max length and max segment size were equal). Is this the expected behaviour of streams, that messages need to be published in batches or continuously in order to later consume from the stream at a high rate?
-
Yes, streams are optimized for high ingress. They store messages in "chunks", which is also the unit of delivery and replication. So a client receives a whole chunk of messages, which can be made of a few messages to a few thousand. The higher the ingress, the bigger the chunk size.
We optimized the delivery code to squeeze several chunks in one TCP packet, but this is not always enough to speed up delivery with very small chunks.
The index file contains the chunk-to-file-index mapping, so the more chunks, the larger it gets.
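The effect of chunk size on delivery rate can be illustrated with some quick arithmetic (a sketch under the assumption that a roughly fixed per-chunk delivery overhead dominates; the ~150 chunks/second figure is chosen to match the observed rate, not measured):

```java
// Illustration of why chunk size matters: if each chunk delivery carries a
// roughly fixed overhead, the message rate scales with messages per chunk.
// (Assumed figure: ~150 chunk deliveries/s, which with one-message chunks
// matches the observed 150 msg/s above.)
public class ChunkRate {
    public static void main(String[] args) {
        int chunkDeliveriesPerSecond = 150; // assumed fixed per-chunk rate

        for (int messagesPerChunk : new int[] {1, 100, 1000}) {
            int messageRate = chunkDeliveriesPerSecond * messagesPerChunk;
            System.out.println(messagesPerChunk + " msg/chunk -> "
                + messageRate + " msg/s");
        }
        // With one message per chunk (each slow MQTT publish becoming its
        // own chunk), 3000 messages also means 3000 index entries, which
        // matches the large segment index observed for slow streams.
    }
}
```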
-
Thanks for the explanation, Arnaud. Are you aware of any Osiris tool/utility which could be used to repack the messages into higher-density chunks?