sql: users surprised by the non-atomicity of TRUNCATE #27953
Comments
Discussed with @vivekmenezes: We're unlikely to change the atomicity of TRUNCATE, but at least the documentation should be updated to clarify this. Also, the very long observed delay is abnormal and still needs to be investigated.
cc @rmloveland this will need to be documented as a known limitation.
@vivekmenezes also asks:
@knz
@vivekmenezes the user has clarified the delay. Do we have enough information to analyze this from our side?
@vivekmenezes Can I get a quick blurb describing this known limitation w/r/t the impact to user experience? Ideally, we need it by Friday 10/26 for the 2.1 Known Limitations page. Posting it on this issue and/or pinging me would be great.
@sploiselle I see the limitations for schema changes discussed at https://www.cockroachlabs.com/docs/v2.1/online-schema-changes.html. I think what needs to be done is to make it explicit on the TRUNCATE page.
It seems reasonable for the TRUNCATE statement/txn to have a post-commit that waits for the old table descriptor to have no active leases. That would make it effectively "transactional".
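To make the suggestion above concrete, here is a minimal sketch of such a post-commit wait. It is a toy in-memory model, not CockroachDB's lease manager: `LeaseTracker`, `wait_for_no_leases`, and `truncate` are hypothetical names, and the descriptor "versions" are plain integers.

```python
import threading


class LeaseTracker:
    """Toy stand-in for per-descriptor-version lease counts (hypothetical)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._leases = {}  # descriptor version -> number of active leases

    def acquire(self, version):
        with self._cond:
            self._leases[version] = self._leases.get(version, 0) + 1

    def release(self, version):
        with self._cond:
            self._leases[version] -= 1
            self._cond.notify_all()  # wake anyone waiting for a drain

    def wait_for_no_leases(self, version, timeout=None):
        """Block until `version` has no active leases; return False on timeout."""
        with self._cond:
            return self._cond.wait_for(
                lambda: self._leases.get(version, 0) == 0, timeout
            )


def truncate(tracker, old_version, timeout=None):
    """Swap to a new descriptor version, then wait for the old one to drain."""
    new_version = old_version + 1
    # Assumption: the descriptor swap would be committed at this point.
    drained = tracker.wait_for_no_leases(old_version, timeout)
    return new_version, drained
```

In this model `truncate` only returns once every holder of the old version has released its lease (or the timeout fires), so a caller that sees it return with `drained` set to `True` knows no reader is still using the old descriptor.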
Related: #42061
This seems to have been fixed some time ago. I'm not quite sure when. At least in 19.2 and the current release we wait for the leases on the old table to be released. I assumed we didn't when I saw this issue. I tried to reproduce this in v2.1 and earlier and failed. We do seem to wait for the old table's leases to drain. Perhaps there's some deeper bug here but it's not clear what it is. If anybody can produce a repro of this behavior, I'll happily work to understand it.
@ajwerner, since this issue is closed, does that mean we can resolve/remove this known limitation from the 19.2 and/or 20.1 docs? https://www.cockroachlabs.com/docs/dev/known-limitations.html#truncate-does-not-behave-like-delete
That comment is vague enough that it might still apply. In particular, in a 3 node cluster, it is possible for 3 connections to observe the following anomaly which is not possible with deletes:
This is weird and violates our consistency model in ways that relate to online schema changes and thus make that note reasonable. What this issue seemed to be about was a failure to observe the truncate after the truncate statement returned (after
OK. Thanks, @ajwerner. I'll leave the limitation as-is for now then.
User observes this "strange" behavior:
There are two problems at hand here:
1. It takes an abnormally long time for the truncate to propagate (1 minute according to the user). We'd expect the other nodes to start using the new descriptor nearly as soon as the truncate completes (subject to the table lease expiration delay, which is a few seconds).
2. The truncate is not atomic. We set the expectation that all statements in CRDB are transactional, so this is an exception that we should either clean up or call out very clearly in the docs.
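The propagation delay described in the first point can be illustrated with a small toy model. This is only a sketch under simplifying assumptions, not how CockroachDB stores data: `Cluster` and `Node` are hypothetical classes, each node caches a descriptor version until its lease expires, and time is passed in explicitly as `now` so the behavior is deterministic.

```python
class Cluster:
    """Toy model: TRUNCATE points the table at a new, empty descriptor version."""

    def __init__(self):
        self.current_version = 1
        self.rows = {1: ["row-a", "row-b"]}  # descriptor version -> rows

    def truncate(self):
        self.current_version += 1
        self.rows[self.current_version] = []


class Node:
    """A node that keeps using its cached descriptor until the lease expires."""

    def __init__(self, cluster, lease_duration):
        self.cluster = cluster
        self.lease_duration = lease_duration
        self.cached_version = None
        self.lease_expiry = float("-inf")  # forces a refresh on first read

    def read(self, now):
        if now >= self.lease_expiry:
            # Lease expired: pick up whatever descriptor is current.
            self.cached_version = self.cluster.current_version
            self.lease_expiry = now + self.lease_duration
        return list(self.cluster.rows[self.cached_version])
```

If one node reads at time 0 with a 5-second lease and the table is truncated at time 1, a second node with no cached descriptor sees an empty table immediately, while the first node keeps returning the old rows until its lease expires at time 5. That window is the non-atomic behavior the issue describes.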