Added batch span processor with test coverage #195
Conversation
Codecov Report

@@            Coverage Diff             @@
##           master      #195      +/-  ##
==========================================
+ Coverage   92.19%    94.22%    +2.03%
==========================================
  Files         117       115        -2
  Lines        4038      3882      -156
==========================================
- Hits         3723      3658       -65
+ Misses        315       224       -91
/check-cla
// If the queue gets at least half full a preemptive notification is
// sent to the worker thread to start a new export cycle.
if (static_cast<int>(buffer_->size()) >= max_queue_size_ / 2)
{
Is it possible for the notification to be missed here? Would it make sense to make this a loop, like in other places where a notification is sent?
Since this wasn't stated anywhere explicitly in the spec, I was wondering if it was fine to leave this as a hit-or-miss notify call. All it does is attempt a preemptive notification for better performance; it's not a hard requirement.
But I think when there are many producer threads, this sort of preemption might become necessary to minimize the number of dropped spans.
Please let me know what you think about this!
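To illustrate why a notification can be missed (a minimal sketch, not the exact PR code): notify_one only wakes a thread that is currently blocked in wait_for; if the worker happens to be busy exporting at that instant, the signal is simply lost and the worker carries on until its next timed wait.

// Producer side (OnEnd): fire-and-forget wake-up once the queue is half full.
if (static_cast<int>(buffer_->size()) >= max_queue_size_ / 2)
{
  cv_.notify_one();  // dropped if the worker isn't blocked in wait_for right now
}

// Worker side: a missed notify just means waiting out the rest of the delay.
std::unique_lock<std::mutex> lk(cv_m_);
cv_.wait_for(lk, schedule_delay_millis_);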
@reyang @pyohannes It'd be great to get your feedback on this too!
Ok, if it is not a hard requirement, I think it is fine to leave it as a single attempt to wake up.
It seems like a good idea to me.
I think overall this looks good!
LGTM!
/**
 * Class destructor which invokes the Shutdown() method. The Shutdown() method is supposed
 * to be invoked when the Tracer is shut down (as in other languages), but the C++ Tracer
 * only takes shared ownership of the processor, and thus doesn't call Shutdown (as the
 * processor might be shared with other Tracers).
 */
FYI @cijothomas @rajkumar-rangaraj
I think @snehilchopra is the first one doing this with the correct consideration of ref counting.
@pyohannes Can I please get a final stamp on this?
I have some more remarks. In addition to those, I'd like to propose a simplification for the Shutdown and DoBackgroundWork methods:
void BatchSpanProcessor::Shutdown(std::chrono::microseconds timeout) noexcept
{
  is_shutdown_ = true;
  cv_.notify_one();
  worker_thread_.join();
  exporter_->Shutdown();
}

void BatchSpanProcessor::DoBackgroundWork()
{
  while (true)
  {
    std::unique_lock<std::mutex> lk(cv_m_);
    cv_.wait_for(lk, schedule_delay_millis_);

    if (is_shutdown_.load() == true)
    {
      DrainQueue();
      return;
    }

    bool was_force_flush_called = is_force_flush_.load();
    if (was_force_flush_called)
    {
      is_force_flush_ = false;
    }
    else
    {
      if (buffer_.empty())
      {
        continue;
      }
    }

    Export(was_force_flush_called);
  }
}
This passes all the tests. The only drawback would be that the Shutdown call can take a bit longer, but it simplifies the solution quite a bit, and I think it actually makes it easier to implement a stable timeout mechanism on top of that (maybe in another PR).
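Purely as an illustration of that follow-up idea (hypothetical helper, not part of this PR): the blocking Shutdown could be run on a separate thread, with the caller's wait bounded by a future.

#include <chrono>
#include <future>
#include <memory>
#include <thread>

// Hypothetical sketch: bounds how long the caller waits for Shutdown, while
// the shared_ptr keeps the processor alive until the background call returns.
bool ShutdownWithTimeout(std::shared_ptr<BatchSpanProcessor> processor,
                         std::chrono::microseconds timeout)
{
  std::promise<void> done;
  auto finished = done.get_future();
  std::thread([processor, p = std::move(done)]() mutable {
    processor->Shutdown(std::chrono::microseconds::max());
    p.set_value();
  }).detach();
  return finished.wait_for(timeout) == std::future_status::ready;
}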
std::mutex cv_m_, force_flush_cv_m_;

/* The buffer/queue to which the ended spans are added */
std::unique_ptr<common::CircularBuffer<Recordable>> buffer_;
It doesn't need to be a pointer. If the batch span processor is allocated on the heap, the buffer will be too.
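A minimal sketch of that suggestion, assuming common::CircularBuffer takes its capacity in the constructor:

class BatchSpanProcessor : public SpanProcessor
{
  // ...

private:
  /* Owned directly; it lives on the heap whenever the processor does */
  common::CircularBuffer<Recordable> buffer_;
};

// Initialized in the constructor's member initializer list, e.g.:
// BatchSpanProcessor::BatchSpanProcessor(..., const size_t max_queue_size)
//     : buffer_(max_queue_size)
// {}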
{
  // If we already have max_export_batch_size_ spans in the buffer, better to
  // export them now
  if (buffer_->size() < max_export_batch_size_)
I tried to understand the effect of this statement. If lots of spans are constantly added to the buffer (and/or if max_export_batch_size_ is very small), this statement might always evaluate to false, and thus we'll never reach the line with the cv_.wait_for. So we won't honor the schedule_delay_millis_ under certain critical circumstances.
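For context, a rough reconstruction of the control flow under discussion (not the exact PR code): the timed wait is only reached while the buffer holds fewer than max_export_batch_size_ spans, so a fast enough producer keeps the loop exporting back-to-back.

while (true)
{
  std::unique_lock<std::mutex> lk(cv_m_);

  // If we already have max_export_batch_size_ spans, export immediately...
  if (buffer_->size() < max_export_batch_size_)
  {
    // ...otherwise sleep out (the remainder of) the schedule delay.
    cv_.wait_for(lk, timeout);
  }

  Export(false);  // under constant load, this runs every iteration with no wait
}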
Sure, I can change it. It was just an effort to enhance performance, and I did this using Java's processor implementation as a reference.
auto end = std::chrono::steady_clock::now();
auto duration = std::chrono::duration_cast<std::chrono::milliseconds>(end - start);

timeout = schedule_delay_millis_ - duration;
I think we can get rid of this calculation and just go with waiting for the whole schedule_delay_millis_ in each iteration. This would be more in line with the spec, which says:

"the delay interval in milliseconds between two consecutive exports"
I was under the impression that it meant the delay interval between when two exports essentially began.
I believe Python has the same logic too.
That's a good point. I had a look at Java, which doesn't have it.
I was drawn to the simpler solution, but I leave this up to you to decide.
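To make the two readings concrete, a small sketch (reusing the PR's names; neither line is the final code):

// Reading 1 (simpler, what the quoted spec text suggests): a fixed pause
// between iterations, regardless of how long the previous export took.
cv_.wait_for(lk, schedule_delay_millis_);

// Reading 2 (Python-style, what the PR currently does): subtract the export
// duration so that consecutive exports *begin* schedule_delay_millis_ apart.
auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(
    std::chrono::steady_clock::now() - start);
cv_.wait_for(lk, schedule_delay_millis_ - elapsed);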
Thank you so much for the simplification, Johannes! Around 2 weeks ago my initial implementation was along the same lines, except that we were encountering the case where the condition variable notify calls were being missed by the worker thread, causing the Shutdown() test to take much longer than expected. In order to avoid this, we wrapped the notify calls in while loops, with boolean flags as checks on whether the notify call had been received or not. If it's permissible that the Shutdown call takes a bit longer, I'm happy to go with this simplification.
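A minimal, self-contained sketch of that workaround (flag name hypothetical):

#include <atomic>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

std::mutex cv_m_;
std::condition_variable cv_;
std::atomic<bool> notified_{false};  // hypothetical acknowledgment flag

// Producer: retry the notify until the worker confirms it woke up.
void NotifyWorker()
{
  notified_ = false;
  while (!notified_.load())
  {
    cv_.notify_one();
    std::this_thread::yield();  // don't spin at full speed
  }
}

// Worker: acknowledge the wake-up so the producer stops retrying.
void WorkerLoop()
{
  std::unique_lock<std::mutex> lk(cv_m_);
  cv_.wait_for(lk, std::chrono::milliseconds(5000));
  notified_ = true;  // received (or timed out); stops the producer's loop
}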
I'm ok with that. I think invocations of Shutdown aren't time-critical, so taking a bit longer there is acceptable.
Sure. I'll change it for implementation simplicity. Just note that there are similar issues in Python's implementation, where a lot of notify calls are missed.
This PR contains the implementation of a Batch Span Processor, as mentioned in #83.
I used the implementations in Python, Java, and Go as references.
However, there are a lot of discrepancies between their implementations. For instance, Go does not have a ForceFlush method, and Python has concurrency issues. Java, on the other hand, is well synchronized: whenever the queue is manipulated, there are always locks guarding it.
I have added comments in the code to give better context for the problems I observed.
Please let me know your thoughts on them, thanks!