concurrency: produce tracing metadata #55583
Comments
Additional context will follow, but as a starting point, consumers of this can roughly expect to get handed a slice of (protobuf)
type ContentionEvent struct {
    Key      roachpb.Key
    Txn      roachpb.Transaction // pushee at the end of conflict resolution
    Duration time.Duration       // time spent contending on the key against that txn in that one instance
    Outcome  enum                // pushed?
}
@tbg, is the |
The other transaction. My expectation is that the "own transaction" is separately available, but if this turns out not to be the case for some unexpected reason (or the exact snapshot of the transaction at the moment of the conflict matters) it's not a problem to include it in the event as well.
56906: sql: produce mock ContentionEvents and display contention time in EXPLAIN ANALYZE r=RaduBerinde,tbg a=asubiotto Please take a look at individual commits for details Release note: None Closes #56612 @tbg could you take a look at the first commit which defines a `ContentionEvent` protobuf? It's close to what's described in #55583 (minus some fields that can be added later). Co-authored-by: Alfonso Subiotto Marques <[email protected]>
This is a (very dirty) first stab at cockroachdb#55583 that I am posting to answer the following questions:
1. Is this basic approach sane, or should contention metadata be generated elsewhere (i.e. not in `WaitOn`)? `WaitOn` does not necessarily see all updates, though I would take the point of view that a contending transaction that a request never ends up waiting on (presumably because it is busy handling other conflicts) is not one we need to observe.
2. How to best structure the code so that contention events are tracked. Completing the current approach would likely result in spaghetti code that is easy to get wrong. We seem to want to keep a bit of state around; more than we do with the current structure, which intentionally minimizes state held across iterations of the wait loop. In particular, it's unclear to me whether `state.txn` and/or `state.key` can change within a single invocation of `WaitOn`.
I think this all means that I need to understand this code (in particular that on the other side of newStateC) better. Release note: None
#58444 will address the bulk of the work here, sans the outcome. As far as I can tell (just talked this through with Nathan) the outcome is either
Leaving this issue open to add the outcome enum later.
This change attaches a protobuf payload to the current Span whenever a request conflicts with another transaction. The payload contains the contending txn (i.e. the pushee) at the time at which it was first encountered, the key on which the conflict occurred (note that this is not necessarily the key at which the pushee is anchored) and the time spent waiting on the conflict (excluding intent resolution). This enables cockroachdb#57114. Touches cockroachdb#55583. I am not closing that issue yet because we also want to potentially track the outcome of the conflict. Release note: None
58444: concurrency: emit structured contention information to trace r=irfansharif,nvanbenschoten a=tbg This change attaches a protobuf payload to the current Span whenever a request conflicts with another transaction. The payload contains the contending txn (i.e. the pushee) at the time at which it was first encountered, the key on which the conflict occurred (note that this is not necessarily the key at which the pushee is anchored) and the time spent waiting on the conflict (excluding intent resolution). This enables #57114. Touches #55583. I am not closing that issue yet because we also want to potentially track the outcome of the conflict. Release note: None Co-authored-by: Tobias Grieger <[email protected]>
We now produce metadata, and while I said earlier that I wanted to leave the issue open to track improvements, I no longer consider that valuable. We can file new issues as we ship the current contention metadata and determine steps to improve it.
Is your feature request related to a problem? Please describe.
In the context of always-on tracing, we want to emit structured metadata describing contention events. These will, among others, be consumed by #55243 for inclusion in the statements page. They will also figure prominently in SQL session tracing.
Describe the solution you'd like
Introduce a protobuf that is emitted whenever a transaction conflicts with another transaction during replica evaluation. The guiding principle here should be that given the information in the protobuf, the SQL layer can construct an explanation of the conflict instructive to the average app developer. This definitely needs to include
NB: since the metadata is emitted right when the conflict ends, the duration of the conflict is implicitly known.
The action will take place in the lock table waiter, which manages a single conflict here:
cockroach/pkg/kv/kvserver/concurrency/lock_table_waiter.go
Lines 156 to 399 in 1682c40
We'll want to create the protobuf at the top of the method, register it with the registry of inflight operations (also part of the always-on work) and emit it to the trace span (if any) when the conflict has been handled.
Note that the concurrency manager also has a second path that is taken for conflict resolution, via write intent errors. This also has a subcase that needs to emit an event, though it is not thought to be commonly hit (~only when the lock table grows too large).
cockroach/pkg/kv/kvserver/concurrency/concurrency_manager.go
Lines 279 to 287 in 4ab98a0
The existing testing in the concurrency package should lend itself very well to verifying that the correct events are emitted. The datadriven tests should be updated to automatically print out any events that are emitted, which means that the existing tests will already give us good coverage.
Describe alternatives you've considered
Additional context