kv: txn giving up on refresh span collection causes closed ts to kick it out #44645

Closed
knz opened this issue Feb 3, 2020 · 15 comments
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) S-3-ux-surprise Issue leaves users wondering whether CRDB is behaving properly. Likely to hurt reputation/adoption. T-kv KV Team

Comments

@knz
Contributor

knz commented Feb 3, 2020

Found by user:

  1. txn starts
  2. txn performs so many operations that it exceeds max_refresh_span_bytes, and refresh span collection stops (see the sketch after this list)
  3. txn lasts for more than 30s
  4. closed ts "catches up", doesn't find refresh spans, and "kicks the txn out" (pushes it, and the client receives an error)
  5. the error is not the usual retry error, since it is not caused by contention, but the error message does not clarify what is happening
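To make step 2 concrete, here is a minimal, hypothetical Go sketch of the give-up behavior being described: once the tracked spans exceed a byte budget (standing in for max_refresh_span_bytes), the tracker forgets everything and marks refreshes as impossible. All names and types here are invented for illustration; this is not CockroachDB's implementation.

package main

import "fmt"

type span struct{ key, endKey string }

type refreshTracker struct {
    spans    []span
    bytes    int64
    maxBytes int64 // stands in for the max_refresh_span_bytes budget
    gaveUp   bool
}

func (t *refreshTracker) record(s span) {
    if t.gaveUp {
        return
    }
    t.bytes += int64(len(s.key) + len(s.endKey))
    t.spans = append(t.spans, s)
    if t.bytes > t.maxBytes {
        // Too much memory would be needed to remember what was read:
        // drop everything and stop collecting from here on.
        t.spans, t.bytes, t.gaveUp = nil, 0, true
    }
}

func (t *refreshTracker) canRefresh() bool { return !t.gaveUp }

func main() {
    t := &refreshTracker{maxBytes: 32}
    for i := 0; i < 10; i++ {
        t.record(span{key: fmt.Sprintf("table/1/%04d", i)})
    }
    fmt.Println("can refresh after many reads:", t.canRefresh()) // false
}

A txn in this state cannot move its read timestamp forward later, which is what sets up step 4.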

There are three separate issues here:

  • we want a larger default for max_refresh_span_bytes so that the scenario becomes less likely. This is predicated on better memory tracking in KV, a separate work item (planned for 20.1, see the work @tbg has started in [dnm] kv: expose (and use) byte batch response size limit #44341). I think this is orthogonal and should be kept out of scope here.

  • when the scenario happens we want the error message to be clearer about what needs to happen: either decrease the duration of the txn, or decrease its number of refresh spans (fewer reads/writes), or increase max_refresh_span_bytes, or increase the closed ts delay (see the error-annotation sketch after this list)

  • or we could avoid the situation entirely? Make the closed ts lag behind the long-running txn if it has disabled refresh spans collection.
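As a rough illustration of the error-message bullet above, here is a hypothetical Go sketch of wrapping the retry error with the remedies listed there. The function, flags, and wording are invented; none of this is an existing CockroachDB API.

package main

import (
    "errors"
    "fmt"
)

var errRetry = errors.New("TransactionRetryError: retry txn")

// maybeAnnotateRetry adds a hint only for the specific case discussed here:
// the txn was pushed by the closed timestamp and had already given up on
// collecting refresh spans.
func maybeAnnotateRetry(err error, refreshSkipped, pushedByClosedTS bool) error {
    if err == nil || !refreshSkipped || !pushedByClosedTS {
        return err
    }
    return fmt.Errorf("%w; the transaction outlived the closed timestamp target and had "+
        "dropped its refresh spans: use shorter transactions, read/write less data, "+
        "raise the refresh-span byte budget, or raise the closed timestamp target", err)
}

func main() {
    fmt.Println(maybeAnnotateRetry(errRetry, true, true))
}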

cc @ajwerner @tbg for triage.

Jira issue: CRDB-5215

@knz knz added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) S-3-ux-surprise Issue leaves users wondering whether CRDB is behaving properly. Likely to hurt reputation/adoption. C-support labels Feb 3, 2020
@RoachietheSupportRoach
Collaborator

Zendesk ticket #4611 has been linked to this issue.

@ajwerner
Contributor

ajwerner commented Feb 3, 2020

txn lasts for more than 30s

To be more precise, we should say that the txn lasts more than kv.closed_timestamp.target_interval, which defaulted to 30s in 19.1 and 19.2 and will default to 3s in 20.1.

closed ts "Catches up", doesn't find refresh spans and "kicks the txn out" (pushes it and client receives an error)

The closed timestamp subsystem doesn't know anything about refresh spans. The closed timestamp subsystem periodically attempts to make history below some timestamp immutable (more specifically, no new intents may be laid down before the closed timestamp, though intents which were already written can still be resolved at that timestamp). When a write fails due to the closed timestamp, the transaction will be forced to refresh. The mechanism is identical to a read in the timestamp cache leading to a push (in fact, the closed timestamp value for a write is used in Replica.applyTimestampCache):

// minReadTS is used as a per-request low water mark for the value returned from
// the timestamp cache. That is, if the timestamp cache returns a value below
// minReadTS, minReadTS (without an associated txn id) will be used instead to
// adjust the batch's timestamp.
func (r *Replica) applyTimestampCache(

When a transaction coordinator detects that it has been pushed, it will perform a refresh (the exact details of when this refresh occurs relative to other operations are not currently paged into my head). If there are too many spans to refresh (as defined by the kv.transaction.max_refresh_span_bytes cluster setting), then we won't even try.
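A minimal sketch of the two steps just described, with invented names and types (this is neither the body of applyTimestampCache nor the real coordinator code): the timestamp-cache result is floored at the closed timestamp, and a pushed transaction only attempts a refresh if it still has its refresh spans.

package main

import "fmt"

type timestamp int64 // wall-clock nanos, greatly simplified

// clampToClosedTS mirrors the low-water-mark idea from the quoted comment:
// never use a timestamp-cache result below minReadTS when deciding how far
// to push a write.
func clampToClosedTS(tsCacheResult, minReadTS timestamp) timestamp {
    if tsCacheResult < minReadTS {
        return minReadTS
    }
    return tsCacheResult
}

// maybeRefresh models the coordinator's choice after being pushed: with its
// refresh spans dropped (byte budget exceeded), it does not even try.
func maybeRefresh(pushedTo timestamp, haveRefreshSpans bool) bool {
    if !haveRefreshSpans {
        return false // a retry error is surfaced to the client instead
    }
    _ = pushedTo // ... re-validate the tracked read spans at pushedTo ...
    return true
}

func main() {
    pushTo := clampToClosedTS(5, 10) // the closed timestamp wins
    fmt.Println("pushed to:", pushTo, "refreshed:", maybeRefresh(pushTo, false))
}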

the error is not the usual retry error because it is not caused by contention, but the error message does not clarify what is happening

Is the error not a usual retry error?

we want a larger default for max_refresh_span_bytes so that the scenario becomes less likely. This is predicated on better memory tracking in KV, a separate work item (planned for 20.1, see the work @tbg has started on #44341 ). I think this is orthogonal and should be kept out of scope here.

+1

when the scenario happens we want the error message to be clearer about what needs to happen: either decrease the duration of the txn, or decrease its number of refresh spans (fewer reads/writes), or increase max_refresh_span_bytes, or increase the closed ts delay

I think this is a good idea. We could wrap the error in another layer to indicate that we didn't even try to refresh due to the refresh span byte limit.

or we could avoid the situation entirely? Make the closed ts lag behind the long-running txn if it has disabled refresh spans collection.

I don't think we've ever discussed a mechanism which always avoids pushing long-running txns. We have talked about detecting the pushes and then backing off the closed timestamp (#36478). The problem with this is that it makes an already best-effort mechanism much less predictable. There is also talk about reducing the closed timestamp interval even more dramatically. Generally I'm opposed to ideas which prevent pushing of transactions. The refresh mechanism has become something of a cornerstone of our transaction protocol.

@knz
Contributor Author

knz commented Feb 3, 2020

Is the error not a usual retry error?

It's a retry error, but not the usual one. We've been very vocal in docs, support, etc. that retry errors are an artifact of contention. That's the usual case.

This one here happens without any contention whatsoever. It's misleading to bin it in the same conceptual category as our usual retry errors.

I'm not sure I would suggest changing the type of the error object, but we should absolutely clarify that it's not the run-of-the-mill retry error.

@ajwerner
Contributor

ajwerner commented Feb 3, 2020

Is the error not a usual retry error?

It's a retry error, but not the usual one. We've been very vocal in docs, support, etc. that retry errors are an artifact of contention. That's the usual case.

This one here happens without any contention whatsoever. It's misleading to bin it in the same conceptual category as our usual retry errors.

I'm not sure I would suggest changing the type of the error object, but we should absolutely clarify that it's not the run-of-the-mill retry error.

In terms of the cause of the error, I agree that it differs from contention caused by a read. In terms of implementation and impact it is identical. It is absolutely the case that, in general, it should be treated as a run-of-the-mill retry when the refresh fails. Retry errors which occur for long-running transactions that cannot refresh due to having too many refresh spans should probably be handled differently from retry errors which simply failed to refresh due to contention. From there we should further break down whether the push that led to the refresh was caused by contention or by a closed timestamp.

I think all I'm saying is that there are two different interactions which are both worth differentiating.

Don't want to hijack this thread, but does it mean that CDC is "lagged" by ~30 seconds, compared to some fictional global real time event (commit) happens?

CDC emits rows with a timestamp field of exactly the hlc timestamp at which the transaction which performed the write commits. The latency from write to emit is currently lower-bounded by waiting for the schema to be proven for the given timestamp, which is implemented as polling. To reduce that latency from write to emit you can lower changefeed.experimental_poll_interval, though it will have some cost. The default there is 1s, so each write will generally experience a uniform emit delay between 0-1s. Another feature of changefeeds is the resolved timestamp (see the option in the docs here). A resolved timestamp informs the client that all rows with timestamps older than the resolved timestamp have been seen (no rows with older timestamps will ever be emitted afterwards). The client configures how frequently these timestamps should be emitted. The actual timestamp which is emitted will lag the present by at least kv.closed_timestamp.target_interval, as we cannot resolve timestamps which have not been closed.
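To illustrate the resolved-timestamp contract from the consumer side, here is a hypothetical Go sketch; the event shape is invented and is not the changefeed wire format. The point is only that a resolved timestamp lets the consumer checkpoint, because no older rows will arrive afterwards.

package main

import "fmt"

type event struct {
    resolved bool
    ts       int64 // hlc wall nanos, simplified
    row      string
}

func main() {
    var checkpoint int64
    for _, ev := range []event{
        {row: "a", ts: 101},
        {row: "b", ts: 103},
        {resolved: true, ts: 105}, // nothing older than 105 will appear after this
        {row: "c", ts: 107},
    } {
        if ev.resolved {
            checkpoint = ev.ts
            fmt.Println("safe to checkpoint through", checkpoint)
            continue
        }
        fmt.Println("row", ev.row, "at", ev.ts)
    }
}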

@andreimatei
Contributor

I've extracted being smarter about the refresh span tracking into #46095, if y'all don't mind.
This issue can remain for better error reporting in situations where all else fails.

@ajwerner
Contributor

ajwerner commented Apr 8, 2020

To clarify for a future reader: the remaining work item on this issue, AFAICT, is to propagate a different, clearer error when a query is forced to retry due to the closed timestamp rather than the timestamp cache, and then to provide documentation to help the user better understand the source of the restart and the available remedies.
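A hypothetical sketch of what a "different, clearer error" could look like internally: classify why the txn was pushed and pick the client-facing hint accordingly. The names are invented for illustration and are not CockroachDB code.

package main

import "fmt"

type pushReason int

const (
    pushReasonTimestampCache  pushReason = iota // conflicting reads: ordinary contention
    pushReasonClosedTimestamp                   // the txn simply outlived the closed ts target
)

func retryHint(r pushReason) string {
    switch r {
    case pushReasonClosedTimestamp:
        return "transaction ran longer than the closed timestamp target; " +
            "shorten the transaction or raise the target"
    default:
        return "transaction conflicted with concurrent access; retrying may succeed"
    }
}

func main() {
    fmt.Println(retryHint(pushReasonClosedTimestamp))
}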

@irfansharif
Contributor

irfansharif commented Apr 30, 2020

One thing I'm a bit confused about, reading through zendesk#4611 and this thread.

The idea is that the closed ts infrastructure [...] is able to avoid conflicting with the txn [...] by using precise refresh span information.

The closed timestamp subsystem doesn't know anything about refresh spans.

Which one is it? I'm assuming the latter? Which brings me to my next question: now that in 20.1 kv.closed_timestamp.target_duration defaults to 3s, are we then forcing all txns taking over 3s to refresh?

@nvanbenschoten
Member

Which one is it? I'm assuming the latter?

Yes, the closed timestamp subsystem doesn't have any understanding of refresh spans.

are we then forcing all txns taking over 3s to refresh?

Yes, all read-write transactions taking over 3s will be forced to refresh.

@knz
Contributor Author

knz commented Apr 30, 2020

That feels super aggressive though. If there was any client interaction, that means there will be a retry error?

@ajwerner
Contributor

ajwerner commented Apr 30, 2020

That feels super aggressive though. If there was any client interaction, that means there will be a retry error?

Only if new writes are issued and then the reads cannot be refreshed successfully. If the refresh fails, it generally indicates that there was some contention.

@ajwerner
Contributor

#46095 will likely mitigate a large number of refresh failures due to refresh span size.

@knz
Contributor Author

knz commented May 1, 2020

Only if new writes are issued and then the reads cannot be refreshed successfully. If the refresh fails, it generally indicates that there was some contention.

I read this as "the behavior under contention will produce more retry errors than it used to in the past". This sounds like a regression from a UX perspective.

@github-actions

We have marked this issue as stale because it has been inactive for
18 months. If this issue is still relevant, removing the stale label
or adding a comment will keep it active. Otherwise, we'll close it in
10 days to keep the issue queue tidy. Thank you for your contribution
to CockroachDB!

@knz
Contributor Author

knz commented Sep 19, 2023

@nvanbenschoten I think we fixed this, right?

@nvanbenschoten
Member

Yes, we addressed the parts of this that seem actionable. I'll close for now, but we can open more specific issues if this kind of behavior remains problematic.
