Memory-sensitive collector queue size #943

jpkrohling · 2018-07-19T08:56:53Z

Requirement - what kind of business use case are you trying to solve?

Avoid dropping spans on the collector side when the backing storage isn't really fast, but enough memory is available for a bigger buffer on the collector side

Problem - what in Jaeger blocks you from solving the requirement?

Under stress conditions, the default queue size of 2000 causes spans to be dropped quite frequently, even though there's enough memory to have a bigger queue size

Proposal - what do you suggest to solve the problem or improve the existing situation?

The first idea is to have a dynamic collector queue size, with the following characteristics:

Heuristics to determine the "average" span size in memory (could also be used for the in-memory storage)
Flag to specify a memory size for the queue (default: half of the host's memory? 25%?)
Adjust the queue size based on a simple division of the above (memory size for the queue divided by the average span size)
Update a metrics gauge with the current queue size
Flag to enable usage of this new collector queue mechanism

yurishkuro · 2018-10-31T19:07:33Z

This does not seem related to #355, correct? Maybe retitle to "memory-sensitive queue size" rather than "dynamic".

isaachier · 2018-10-31T19:56:18Z

So there are a bunch of algorithms that have been studied and used for network scheduling (https://en.wikipedia.org/wiki/Network_scheduler). My suggestion is either to reimplement one of them directly, or potentially leverage existing OS configurations to drop packets from spans according to one of those existing strategies.

jpkrohling · 2018-11-02T08:22:54Z

This is not related to #355. This is about automatically adjusting the queue size based on the known span sizes and available memory, to make optimal usage of the assigned resources.

jpkrohling added the enhancement label Jul 19, 2018

jpkrohling mentioned this issue Aug 14, 2018

Performance tests - Collector #989

Closed

yurishkuro changed the title ~~Dynamic collector queue size~~ Memory-sensitive collector queue size Nov 2, 2018

This was referenced Nov 28, 2019

BoundedQueue is dropping new items instead of discarding old ones #1947

Closed

Added 'resize' operation to BoundedQueue #1949

Merged

jpkrohling mentioned this issue Dec 19, 2019

Added support for dynamic queue sizes #1985

Merged

jpkrohling mentioned this issue Jan 15, 2020

Auto-scale collector instances jaegertracing/jaeger-operator#848

Closed

jpkrohling closed this as completed in #1985 Jan 23, 2020

james-bebbington mentioned this issue May 6, 2020

OTel Collector potential memory leak open-telemetry/opentelemetry-collector#802

Closed
