fix(subscriber): only send *new* tasks/resources/etc over the event channel #238
Conversation
This is currently a draft PR because there's still a decently large amount of code that's commented out. In particular, we don't currently record attribute updates, because I haven't written that part yet, and basically all of the JSON recording is broken by this change and needs to be rewritten. Also, I haven't cleaned up the code, added comments, or fixed most of the warnings yet. But, the console's basic functionality mostly works --- you can run this branch and see tasks and their poll durations and so on. I wanted to go ahead and open a PR to get some feedback from @seanmonstar and @zaharidichev so that we can agree that the overall approach here is a good idea. If you two (or anyone else) have a chance to take a look at this, I'd really appreciate it!
The plan looks good to me. It would be cool to measure the impact before and after, in case locking and atomic ops hurt the runtime, but mostly just so we are aware...
console-subscriber/src/record.rs (outdated)

```rust
at: *at,
op: *op,
},
// TODO(eliza): this more or less bricks recording, lol...put this back.
```
lol 😭
Yeah, it would definitely be good to come up with a benchmark setup for measuring the console instrumentation's performance cost...we haven't really done this yet, so it would be good to do regardless. FWIW, using the MPSC channel also involves locking and some atomic ops, so 🤷‍♀️
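For what it's worth, a crude first-order comparison could look something like the sketch below. This is just an illustration, not an agreed-on benchmark: the workload, the task count, and the use of `console_subscriber::init()` as the setup call are all assumptions.

```rust
use std::time::Instant;

// Spawn a bunch of short-lived tasks so the instrumentation's
// per-task/per-poll overhead dominates the measurement.
async fn workload() {
    let mut handles = Vec::with_capacity(10_000);
    for _ in 0..10_000 {
        handles.push(tokio::spawn(async {
            tokio::task::yield_now().await;
        }));
    }
    for handle in handles {
        let _ = handle.await;
    }
}

#[tokio::main]
async fn main() {
    // Comment this line out (or gate it behind an env var) to get the
    // uninstrumented baseline for comparison.
    console_subscriber::init();

    let start = Instant::now();
    workload().await;
    println!("elapsed: {:?}", start.elapsed());
}
```

A real benchmark setup would probably want something like criterion and more representative workloads, but even a rough before/after number would show whether the locking and atomic ops are visible at all.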
Currently, `console-subscriber` contains a bunch of machinery for rewriting non-sequential `span::Id`s from `tracing` to sequential IDs (see #75). Upon thinking about this for a bit, I don't actually understand why this has to be done on the instrumentation side. This seems like extra work that's currently done in the instrumented application, when it really doesn't have to be. Instead, the client should be responsible for rewriting `tracing` IDs to pretty, sequential user-facing IDs. This would have a few advantages:

- it moves some work out of the application, which is always good
- if data is being emitted through an implementation other than `console-subscriber`, we will *still* get nicely ordered IDs
- this also makes some of the stuff I'm working on in #238 easier

This branch removes ID rewriting from `console-subscriber` and adds it to the `console` CLI's `state` module.

Closes #240

Signed-off-by: Eliza Weisman <[email protected]>
Okay, I've marked this as "ready for review", as I believe everything should work now and it compiles without warnings. @seanmonstar and @zaharidichev, whenever you two have time to take a look, I'd appreciate a careful review, as this is a kind of large change. I think it's possible to remove some of the mutexes; the change I discussed in #254 could remove the need for mutexes in timestamps. However, in practice, I think the mutexes should basically never be contended, as the aggregator task won't try to read from them unless the dirty bit is set, and it only gets set when we're done updating the data. The main overhead of the mutexes is probably adding a heap allocation/ptr dereference, which is fixed by enabling the `parking_lot` feature.
The approach looks great to me and makes sense. I think there is a problem, however, with the lifecycle of async ops. If you run the mutex example, you will see that the lock always stays in one state, and after it runs for a bit, the async ops associated with the Mutex resource just go to zero. On `main`, however, new ops continue to appear, and the `state` attribute of the resource changes from `locked=true` to `locked=false`.
```rust
let span = match ctx.span(id) {
    Some(span) => span,
    // XXX(eliza): no span exists for a resource ID, we should maybe
```
I think we should just go ahead and print an error. This kind of error can lead to wrong data and quite a few head-scratchers, so we'd better just print. On the other hand, having random output in any app that includes the crate is a bit annoying. I wonder whether we can introduce some more elegant way of handling these logic errors. WDYT?
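One possible shape for that, sketched here as a purely hypothetical helper (not what this PR does): panic in debug builds so the logic error surfaces in tests, and print at most once per call site in release builds so an instrumented app isn't spammed.

```rust
// Hypothetical helper: loud in debug builds, quiet-ish in release.
macro_rules! logic_error {
    ($($arg:tt)*) => {{
        if cfg!(debug_assertions) {
            panic!($($arg)*);
        } else {
            use std::sync::Once;
            // One `Once` per call site, so each distinct logic error
            // is reported at most once.
            static WARNED: Once = Once::new();
            WARNED.call_once(|| eprintln!($($arg)*));
        }
    }};
}

fn on_missing_span(id: u64) {
    logic_error!("no span exists for resource ID {}; this is probably a bug", id);
}
```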
```rust
pub(crate) fn drop_task(&self, dropped_at: SystemTime) {
    if self.is_dropped.swap(true, AcqRel) {
        // The task was already dropped.
        // TODO(eliza): this could maybe panic in debug mode...
```
Would be great to elaborate
Thanks, good catch! I must have missed that because I was mostly testing with the semaphore example. I'll try to figure out what's going on.

@hawkw I think that has something to do with the fact that you are creating resources, async ops, and tasks as not dirty. I wonder whether that is correct. The way you have it now, if an instrumented app simply creates a bunch of resources like locks, they will not show up in the console until they are used in some way. Is that what we want? In any case, if you flip the … The other problem with the …
Oh, that's definitely not right. I'll fix that.
thanks @zaharidichev for noticing that!
Very nice refactor, works great. I have tested it with most of the examples and things are as expected.
EDIT: after adding some printlns, it does appear that @zaharidichev is correct about this; it looks like it just happens that, because the mutex is re-acquired very quickly in the example, we never actually publish an update where it is in the unlocked state. If I modify the example a bit so that the tasks wait a variable amount of time before locking the mutex, and change the main task to not immediately lock the mutex in a loop, we see it in both the locked and unlocked states. I think this is probably fine.
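For reference, a sketch of the kind of modification described (the real `mutex` example differs; the delays and task count here are made up):

```rust
use std::{sync::Arc, time::Duration};
use tokio::{sync::Mutex, time::sleep};

#[tokio::main]
async fn main() {
    let mutex = Arc::new(Mutex::new(0u64));

    for i in 0..5u64 {
        let mutex = mutex.clone();
        tokio::spawn(async move {
            loop {
                // Stagger the lock attempts so the mutex spends
                // observable time in the *unlocked* state.
                sleep(Duration::from_millis(100 + i * 50)).await;
                *mutex.lock().await += 1;
            }
        });
    }

    // Unlike the original tight loop, the main task doesn't immediately
    // re-acquire the lock; it just keeps the program alive so the
    // console can watch the resource.
    sleep(Duration::from_secs(60)).await;
}
```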
@hawkw Yes, I just sat down to comment that this is exactly what I did as well to ensure that the update gets propagated. In this example it is just the case that the lock is heavily contended - there is basically always an async op awaiting it, so the moment it gets released, it is acquired again. So yes, all works as expected.
<a name="0.1.1"></a>

## 0.1.1 (2022-01-18)

#### Bug Fixes

* only send *new* tasks/resources/etc over the event channel (#238) ([fdc77e2](fdc77e2))
* increased default event buffer capacity (#235) ([0cf0aee](0cf0aee))
* use saturating arithmetic for attribute updates (#234) ([fe82e17](fe82e17))

#### Changes

* moved ID rewriting from `console-subscriber` to the client (#244) ([095b1ef](095b1ef))

## 0.1.0 (2021-12-16)

- Initial release! 🎉
Self-wakes were not being detected and displayed in the console. The `burn` task in the `app` example - which deliberately has many self-wakes - was not registering any. It appears that a logic error was present in the self-wake counting in `console-subscriber` since it was added in #238. When a self wake was detected, the `wakes` count was incremented a second time (the `wakes` count is incremented for all wakes before checking for a self wake), instead of incrementing the `self_wakes` count. This PR fixes the logic so that when a self wake is detected, the `self_wakes` count is incremented.
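A sketch of the corrected accounting (the stats type and field names here are illustrative, not the crate's actual ones):

```rust
use std::sync::atomic::{AtomicU64, Ordering::Relaxed};

// Illustrative stand-in for the per-task stats; not the real type.
#[derive(Default)]
struct TaskStats {
    wakes: AtomicU64,
    self_wakes: AtomicU64,
}

// `wakes` is bumped for every wake. On a self wake, `self_wakes` is
// bumped *in addition* -- the bug was incrementing `wakes` a second
// time here instead of `self_wakes`.
fn record_wake(stats: &TaskStats, is_self_wake: bool) {
    stats.wakes.fetch_add(1, Relaxed);
    if is_self_wake {
        stats.self_wakes.fetch_add(1, Relaxed);
    }
}
```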
When entering and exiting a span, the old code was also updating the parent stats. This was causing excessive polling to be reported for the parent tasks. See issue #378 for more details. The regression was introduced by the refactor in #238. This fixes the issue by limiting updates to the current span. Closes #378. Co-authored-by: Hayden Stainsby <[email protected]> Co-authored-by: Eliza Weisman <[email protected]>
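A sketch of the shape of the fix, using the public `tracing-subscriber` `Layer` API; the `Stats` type and what gets recorded are placeholders:

```rust
use std::sync::Arc;
use tracing::{span::Id, Subscriber};
use tracing_subscriber::{layer::Context, registry::LookupSpan, Layer};

// Placeholder for the per-span stats stored in span extensions.
struct Stats { /* ... */ }

struct InstrumentLayer;

impl<S> Layer<S> for InstrumentLayer
where
    S: Subscriber + for<'a> LookupSpan<'a>,
{
    fn on_enter(&self, id: &Id, ctx: Context<'_, S>) {
        // Only touch the stats attached to *this* span. The regression
        // was also updating ancestor spans' stats on every enter/exit,
        // inflating the parents' poll counts.
        if let Some(span) = ctx.span(id) {
            if let Some(stats) = span.extensions().get::<Arc<Stats>>() {
                let _ = stats; // record the poll start on `stats` here
            }
        }
    }
}
```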
Motivation
Currently, there are some rather bad issues that occur when the event
buffer is at capacity and events are dropped.
Completely losing data due to buffer capacity is relatively okay: if
we set a bound on how much memory the console can use, and we don't
record new things that occur when we've reached that limit, this is
correct and acceptable behavior. However, the current design can result
in incorrect data when events are lost due to the buffer being at
capacity.
This is because we currently record things like starting to poll a
task/resource, ending a poll, dropping a task/resource/async op, and
waker operations, as individual events in the buffer. This means that we
can have a situation where the creation of a task was recorded, because
there was buffer capacity at the time, but then when the task ended, the
buffer was full, so we never recorded its termination. That results in
the tasks appearing to run forever and never terminate --- see issue
#230. Similarly, if we record the beginning of a poll, but drop the
end-of-poll event because we're at capacity, this means we will
(incorrectly) record a poll that goes on forever, which is obviously
incorrect. I think this may also be the cause of the false positives
with the lost waker lint (#149), if we record a waker drop but missed a
waker clone that previously occurred.
The change in #212 fixed one category of issue that occurs due to event
buffer capacity --- when a task, resource, or async op's creation
event is dropped due to buffer capacity, we skip any subsequent events
related to that task/resource/op. However, this doesn't fix issues where
the subsequent events are the ones that are dropped.
Solution
This branch proposes a solution to this whole category of event buffer
capacity related issues. Unfortunately, this requires rewriting a lot
of `console-subscriber`'s internals.

In the new approach, we now only send events over the channel when
creating a new task, resource, or async op. Those events now contain
an `Arc` holding the stats for that entity. Another clone of the `Arc`
is stored in the `tracing_subscriber::Registry`'s span extensions for
the span corresponding to that entity. When the `ConsoleLayer` records
subsequent events for a particular entity, such as starting/ending a
poll, it looks up the span by ID, and updates the stats type stored in
its extensions. The aggregator stores its clone of the `Arc` in a map
of entities, just like it does currently, but no longer handles
actually updating the stats; it just builds wire-format updates from
any tracked entities whose data was updated by the layer.
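To make that data flow concrete, here's a minimal sketch of the pattern described above. The `TaskStats` and `Event` types, and the bare `std::sync::mpsc` channel, are stand-ins for illustration only, not the crate's actual types:

```rust
use std::sync::{
    atomic::{AtomicU64, Ordering::Relaxed},
    mpsc::Sender,
    Arc,
};

// Illustrative per-task stats shared between the layer and aggregator.
#[derive(Default)]
struct TaskStats {
    polls: AtomicU64,
    wakes: AtomicU64,
}

// The only message that crosses the channel: the *creation* of a task,
// carrying a handle to its shared stats.
enum Event {
    NewTask { span_id: u64, stats: Arc<TaskStats> },
}

// Layer side: on a new task span, allocate the shared stats, send one
// clone to the aggregator, and keep the other (in the real code, in the
// span's extensions) for later updates.
fn record_new_task(span_id: u64, tx: &Sender<Event>) -> Arc<TaskStats> {
    let stats = Arc::new(TaskStats::default());
    let _ = tx.send(Event::NewTask {
        span_id,
        stats: stats.clone(),
    });
    stats
}

// Subsequent events never touch the channel; the layer just mutates the
// shared stats it finds in the span's extensions.
fn record_poll_start(stats: &TaskStats) {
    stats.polls.fetch_add(1, Relaxed);
}
```

In the real implementation, a send that fails because the channel is full would mean the entity is never tracked at all; the sketch glosses over that, but it's exactly the behavior described next.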
This should fix all issues where dropping something due to event buffer
capacity results in incorrect data. Once we have successfully recorded
the creation of a task, resource, or async op, any subsequent updates
to its stats are guaranteed to be reliable. If the channel is at
capacity and we fail to record a new resource/task/op, we never create a
stats extension for it, and we won't record anything for it at all.
Otherwise, it will always have correct data recorded.
When possible, the fields in the `Arc`ed stats are updated atomically.
In some cases, this isn't easily possible, and some fields of the
stats types are stored in a mutex. In particular, this is required for
storing timestamps. I don't really love that, but these mutices should
be contended very infrequently. Stats aren't marked as having unsent
updates until after the stats inside the mutices have been updated, so
the aggregator will not try to lock the mutex if the layer is
currently updating it; instead, it will simply be included in the next
update once the layer is no longer touching it. Mutices here will only
be contended when multiple threads are updating a task's stats at the
same time, which should occur very rarely...and in most cases, they
still won't have to contend a mutex, since access to most of the
mutices is guarded by an atomic variable for e.g. determining which
thread actually was the last to complete a concurrent poll. The
biggest performance downside of the mutices is probably not actually
contention, but the additional heap allocation required when using
`std::sync::Mutex`. However, since we have conditional `parking_lot`
support, `parking_lot` can be used to avoid requiring additional
allocations.
In the future, it's probably possible to make more of this atomic by
converting timestamps into integers and storing them in atomic
variables. I haven't done this yet because both the protobuf
timestamps and `std::time` timestamps are larger than a single 64-bit
number, and it might take a little extra work to ensure we can nicely
fit them in an `AtomicUsize`...but we can probably do that later.
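For what it's worth, a sketch of what that might look like. This is not implemented in this branch; the representation, the base-`Instant` trick, and all the names are assumptions:

```rust
use std::sync::atomic::{AtomicU64, Ordering::Relaxed};
use std::time::{Duration, Instant};

// Store nanoseconds elapsed since a per-process base `Instant` in an
// AtomicU64 instead of a full timestamp behind a mutex. Zero is
// reserved to mean "not yet recorded".
struct AtomicTimestamp {
    base: Instant,
    nanos: AtomicU64,
}

impl AtomicTimestamp {
    fn new(base: Instant) -> Self {
        Self { base, nanos: AtomicU64::new(0) }
    }

    fn record_now(&self) {
        // Clamp to at least 1ns so a genuinely-zero elapsed time isn't
        // confused with "unset".
        let nanos = (self.base.elapsed().as_nanos() as u64).max(1);
        self.nanos.store(nanos, Relaxed);
    }

    fn since_base(&self) -> Option<Duration> {
        match self.nanos.load(Relaxed) {
            0 => None,
            n => Some(Duration::from_nanos(n)),
        }
    }
}
```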