Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: users surprised by the non-atomicity of TRUNCATE #27953

Closed
tim-o opened this issue Jul 26, 2018 · 13 comments
Closed

sql: users surprised by the non-atomicity of TRUNCATE #27953

tim-o opened this issue Jul 26, 2018 · 13 comments
Labels
A-docs A-schema-changes C-investigation Further steps needed to qualify. C-label will change. docs-done docs-known-limitation S-3-ux-surprise Issue leaves users wondering whether CRDB is behaving properly. Likely to hurt reputation/adoption.

Comments

@tim-o
Copy link
Contributor

tim-o commented Jul 26, 2018

User observes this "strange" behavior:

43268153-e0f0418c-90bd-11e8-8560-e1bd1d271204-edit

There are two problems at hand here:

  • it takes an abnormally long time for the truncate to propagate (1 minute according to the user). We'd expect the other nodes to start using the new descriptor nearly as soon as truncate completes (subject to the table lease expiration delay, which is a few seconds).

  • the truncate is not atomic. We set expectations that all statements in CRDB are transactions, so this is an example we should clean up, or make it very clear in docs where the exceptions lie.

@tim-o tim-o added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-sql-executor SQL txn logic A-schema-changes labels Jul 26, 2018
@knz knz changed the title sql-schema: make truncate statements atomic sql: users surprised by the non-atomicity of TRUNCATE Jul 26, 2018
@knz knz added the S-3-ux-surprise Issue leaves users wondering whether CRDB is behaving properly. Likely to hurt reputation/adoption. label Jul 26, 2018
@knz
Copy link
Contributor

knz commented Jul 26, 2018

Discussed with @vivekmenezes:

We're unlikely to change the atomicity of TRUNCATE, but at least the documentation should be updated to clarify this.

Also the very long observed delay is abnormal and still needs to be investigated.

@knz knz added C-investigation Further steps needed to qualify. C-label will change. docs-known-limitation A-docs docs-todo and removed C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) labels Jul 26, 2018
@knz
Copy link
Contributor

knz commented Jul 26, 2018

cc @rmloveland this will need to be documented as known limitation

@knz
Copy link
Contributor

knz commented Jul 26, 2018

@vivekmenezes also asks:

we need to ask the user how he timed it, because he is saying T0 + 10s , but the previous select took 17s

@xarg
Copy link

xarg commented Jul 27, 2018

@knz t0 is when the first query finished the first execution, not when it started. It should have been: t0 + 17s + 10s.

@knz
Copy link
Contributor

knz commented Jul 30, 2018

@vivekmenezes the user has clarified the delay. Do we have enough information to analyze this from our side?

@sploiselle
Copy link
Contributor

@vivekmenezes Can I get a quick blurb describing this known limitation w/r/t the impact to user experience? Ideally, we need it by Friday 10/26 for the 2.1 Known Limitations page. Posting it on this issue and/or pinging me would be great.

@vivekmenezes
Copy link
Contributor

@sploiselle I see the limitations for schema changes discussed at https://www.cockroachlabs.com/docs/v2.1/online-schema-changes.html
in the limitations section.

I think what need to be do is make it explicit on the TRUNCATE page
https://www.cockroachlabs.com/docs/stable/truncate.html
that TRUNCATE is a schema change and thus suffers from all the limitations of schema changes.
I'd also suggest maintaining a separate link to the limitations section of online schema changes.

@ajwerner
Copy link
Contributor

ajwerner commented Feb 5, 2020

It seems reasonable for the TRUNCATE statement/txn to have a post-commit that waits for the old table descriptor to have no active leases. That would make it effectively "transactional".

@knz
Copy link
Contributor

knz commented Feb 7, 2020

Related: #42061

@ajwerner
Copy link
Contributor

This seems to have been fixed some time ago. I'm not quite sure when. At least in 19.2 and the current release we wait for the leases on the old table to be released. I assumed we didn't when I saw this issue.

I tried to reproduce this in v2.1 and earlier and failed. We do seem to wait for the old table's leases to drain. Perhaps there's some deeper bug here but it's not clear what it is. If anybody can produce a repro of this behavior, I'll happily work to understand it.

@jseldess
Copy link
Contributor

@ajwerner, since this issue is closed, does that mean we can resolve/remove this known limitation from the 19.2 and/or 20.1 docs? https://www.cockroachlabs.com/docs/dev/known-limitations.html#truncate-does-not-behave-like-delete

@ajwerner
Copy link
Contributor

That comment is vague enough that it might still apply. In particular, in a 3 node cluster, it is possible for 3 connections to observe the following anomaly which is not possible with deletes:

n1@t1: TRUNACE TABLE foo -- blocks until t4
n2@t2: SELECT * FROM foo; INSERT INTO bar VALUES ... -- observes truncate, writes rows
n3@t3: SELECT * FROM foo; SELECT * FROM bar -- does not observe truncate, does observe n2's writes.
n1@t4: Truncate concludes

This is weird and violates our consistency model in ways that relate to online schema changes and thus make that note reasonable. What this issue seemed to be about was a failure to observe the truncate after the truncate statement returned (after t4 in the above example).

@jseldess
Copy link
Contributor

OK. Thanks, @ajwerner. I'll leave the limitation as-is for now then.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-docs A-schema-changes C-investigation Further steps needed to qualify. C-label will change. docs-done docs-known-limitation S-3-ux-surprise Issue leaves users wondering whether CRDB is behaving properly. Likely to hurt reputation/adoption.
Projects
None yet
Development

No branches or pull requests

7 participants