Fix overflowing batch size #1310
Conversation
Codecov Report
@@ Coverage Diff @@
## master #1310 +/- ##
==========================================
+ Coverage 90.01% 90.05% +0.03%
==========================================
Files 216 217 +1
Lines 15200 15244 +44
==========================================
+ Hits 13683 13728 +45
Misses 1101 1101
+ Partials 416 415 -1
Continue to review full report at Codecov.
The original idea of the batcher is to batch traces together and do it as performantly as possible. That's why it doesn't enforce any hard limit on the batch size. It might not be clearly described in the docs, but we have this:
I don't think we should add the complexity of enforcing the hard limit by default. This behavior might not be what the users want, but enforcing it can degrade performance. What do you think if we keep the existing behavior for
It looks good to me
@dmitryax thanks for the feedback. I have made the changes and made the splitting configurable.
@@ -141,6 +143,16 @@ func (bp *batchTraceProcessor) startProcessingCycle() {
	for {
		select {
		case td := <-bp.newTraceItem:
			if bp.enforceBatchSize {
What about a simpler approach: add items as normal to the batch, then do a simple split by max size. It is maybe a bit more overhead, but simpler logic I feel.
That might be even better from the perf standpoint. The buckets of max size would probably have to be split as well, because there can be other items arriving in the meantime (unless we push them directly).
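For illustration, a minimal sketch of the split-after-batching idea discussed here, under simplified assumptions: a batch is modeled as a plain slice of spans rather than the collector's trace data type, and splitBatch is a hypothetical helper name, not the PR's actual code.

```go
package main

import "fmt"

// span is a stand-in for a real span; only the count matters here.
type span struct{ name string }

// splitBatch cuts an accumulated batch into chunks of at most maxSize
// spans, so no exported batch overflows the configured limit.
func splitBatch(batch []span, maxSize int) [][]span {
	if maxSize <= 0 || len(batch) <= maxSize {
		return [][]span{batch}
	}
	var chunks [][]span
	for len(batch) > maxSize {
		chunks = append(chunks, batch[:maxSize])
		batch = batch[maxSize:]
	}
	if len(batch) > 0 {
		chunks = append(chunks, batch)
	}
	return chunks
}

func main() {
	batch := make([]span, 10)
	for _, chunk := range splitBatch(batch, 4) {
		fmt.Println("export chunk of", len(chunk), "spans") // 4, 4, 2
	}
}
```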
Force-pushed from f93cb7a to 9728c7d
@bogdandrutu I have rebased the PR and simplified it to do only a single split per consume.
Signed-off-by: Pavol Loffay <[email protected]>
Force-pushed from 650d1c8 to 1a5c564
Signed-off-by: Pavol Loffay <[email protected]>
* Update Span End method documentation

Updates to the Span after End is called result in potentially inconsistent views of the Span between the code defining it and the ultimate receiver of the Span data. This corrects the documented language of the API to prevent this from happening.

* Add changes to changelog
Signed-off-by: Pavol Loffay [email protected]
Description:
This PR adds a configuration property enforce_batch_size to the batch processor that ensures the batch size does not overflow the configured max_batch_size. The default value is false, hence it does not change the default behavior. If the incoming batch size is bigger than the available space in the cached traces, the incoming batch is split into the remaining size and the size of the maximum batch. The spans that would overflow the buffer are sent again as trace data over the channel.
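A self-contained sketch of that overflow handling, under heavy simplification: a trace item is just a slice of span names, and apart from enforceBatchSize and newTraceItem (visible in the diff above) all names are hypothetical, not the processor's actual fields.

```go
package main

import "fmt"

type traceItem []string // simplified: a batch of span names

type batchProcessor struct {
	enforceBatchSize bool
	maxBatchSize     int
	batch            traceItem
	newTraceItem     chan traceItem
}

// consume adds an incoming item to the current batch. When enforceBatchSize
// is set and the item would overflow maxBatchSize, only the spans that fit
// into the remaining space are kept and the rest is re-sent over the channel.
func (bp *batchProcessor) consume(td traceItem) {
	if bp.enforceBatchSize {
		free := bp.maxBatchSize - len(bp.batch) // remaining capacity
		if len(td) > free {
			overflow := td[free:]
			td = td[:free]
			select {
			case bp.newTraceItem <- overflow: // picked up by a later cycle
			default: // channel full; a real implementation would block or export
			}
		}
	}
	bp.batch = append(bp.batch, td...)
	if len(bp.batch) >= bp.maxBatchSize { // flushing keeps len(batch) < maxBatchSize
		fmt.Println("exporting batch of", len(bp.batch), "spans")
		bp.batch = nil
	}
}

func main() {
	bp := &batchProcessor{
		enforceBatchSize: true,
		maxBatchSize:     3,
		newTraceItem:     make(chan traceItem, 1),
	}
	bp.consume(traceItem{"a", "b", "c", "d", "e"}) // two spans overflow
	bp.consume(<-bp.newTraceItem)                  // the re-sent remainder
	fmt.Println("pending spans:", len(bp.batch))   // 2
}
```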
Link to tracking Issue:
Resolves #1140
Related to #1020
Testing: < Describe what testing was performed and which tests were added.>
Documentation: enforce_batch_size has been added to the readme.
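For reference, a hedged example of how the option might be set in a collector configuration; the key names follow the wording of this PR and the max_batch_size value is made up, so the batch processor readme remains the authority on exact names and defaults.

```yaml
processors:
  batch:
    # Best-effort target for batch size; with enforce_batch_size it
    # becomes a hard cap. enforce_batch_size defaults to false, which
    # keeps the previous behavior.
    max_batch_size: 8192
    enforce_batch_size: true
```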