Nexus should retry all transactions CRDB tells it to #3814

jmpesp · 2023-08-02T16:29:18Z

#3754 adds a retry_transaction function on TransactionError that returns true if CRDB is telling us to, based on the instructions at https://www.cockroachlabs.com/docs/v23.1/transaction-retry-error-reference#client-side-retry-handling.

Part of the review of that PR asked: why not to this for all transactions? If CRDB is telling us that we should be retrying a transaction, then we should. This issue tracks that work. Retrying in general seems like a safe thing to do but part of this issue should be convincing ourselves that we're not creating unbounded work for CRDB by retrying every transaction until it succeeds. Note this may also be impossible

The text was updated successfully, but these errors were encountered:

davepacheco · 2023-08-16T23:43:50Z

I think this is related to #973, which talks about having a way to put a bunch of statements into a non-interactive transaction. CockroachDB calls this batched statements. This allows CockroachDB to retry such errors automatically. This would have all the benefits mentioned in RFD 192 about avoiding interactive transactions, plus I'd expect it to minimize the window for transaction conflicts (since it avoids a round-trip to Nexus, plus all the time associated with doing the client-side retry).

davepacheco · 2023-10-04T00:00:05Z

@smklein raised a good question about doing client-side retries in the meantime, since it doesn't seem like we'll have the foundation we need for batched transactions like this that soon. That'd be a useful stopgap but I'm not sure how that's possible. (I don't know it's impossible either, but I don't see how to do it.) transaction()/transaction_async() let you can run arbitrary Rust code between statements and that code can decide based on the results of a statement what statements it wants to issue next. They can also reference state outside the closure. And they can nest, too (you can start a transaction inside a transaction, which creates a savepoint). So I'm not sure which Rust code needs to be re-run. I feel like it's possible you have to rerun everything from the top-level BEGIN; (transaction/transaction_async()) in the stack. So I think you need to propagate the conflict error all the way to the top-level transaction call, and the caller of that maybe can re-do whatever it did the first time? Maybe we could create a transaction_async_retry() that accepts a Fn rather than FnOnce and convert stuff over to that? Then the question will be: how possible will that be -- i.e., do these closures in practice wind up depending on the ability to do things that a Fn can't necessarily do? My intuition is that they do but I'm not sure.

smklein · 2023-10-04T00:26:19Z

Following up on @davepacheco 's comment:

First and foremost, any spot we use retries + interactive transactions is basically defying the advice of RFD 192, so it's a spot we need to revisit. That being said, it's by far the most flexible + easiest option to write, so I say this more from a "tech debt + future work" perspective rather than with a "this is wrong" lens.
- I give this caveat because I empathize with our developers, but anything we do here is a band-aid. I'm also working on RFD 436 to give better options in the limit.
I agree with the sentiment of a transaction_async_retry being possible. I prototyped this briefly in async-bb8-diesel, and I think it's possible, with a few notes:
- It basically requires a copy of transaction_async, but with Fn instead of FnOnce, and I believe it needed an additional Sync bound
- For many of the callsites in Omicron, it also requires explicitly cloning arguments so that they can be used from a Fn instead of FnOnce. This works for a lot of cases, but not all our queries implement Clone, which makes this trickier.
- It's not well documented, but transaction managers do expose a transaction_manager_status_mut message that helps us identify "are we in a transaction already?": https://docs.diesel.rs/master/src/diesel/connection/transaction_manager.rs.html#36-44 . This method may be useful for enforcing that the "retryable transactions" can only be called when not already within a nested transaction. I expect this to be a detail observed by the implementation to help prevent callers from misusing the API.

smklein · 2023-10-06T21:08:05Z

https://www.cockroachlabs.com/docs/v23.1/advanced-client-side-transaction-retries seems a bit opinionated about how transactions should work, in a way that's slightly divergent from Diesel.

CRDB recommends the following:

Start transactions with BEGIN; SAVEPOINT cockroach_restart;
Issuing the operations of the transaction, which must end with RELEASE SAVEPOINT cockroach_restart.
If at any point during (2) a 40001 error is returned, issue ROLLBACK TO SAVEPOINT. The caller can choose to try step (2) again, or simply ROLLBACK the whole transaction.
If the execution completes, and the savepoint is released successfully, the transaction can be finished with COMMIT.

Diesel implements the AnsiTransactionManager: https://docs.diesel.rs/master/diesel/connection/struct.AnsiTransactionManager.html , which provides the "transaction-with-closure" method that we know. async-bb8-diesel calls into the TransationManager traits to implement something similar from an asynchronous context: https://github.com/oxidecomputer/async-bb8-diesel/blob/1446f7e0c1f05f33a0581abd51fa873c7652ab61/src/async_traits.rs#L94-L147

This TransactionManager exposes APIs to begin_transaction, commit_transaction, and rollback_transaction, but the ability to customize savepoints seems pretty much non-existent.

https://docs.diesel.rs/master/diesel/connection/trait.TransactionManager.html is a trait, admittedly, so it does seem possible for us to "roll our own" version of it? This could support existing transactions, but also add a "retryable cockroachdb transaction" method that is Fn instead of FnOnce and which can be re-issued if necessary.

This would have an advantage over the "hand-rolled-retry", e.g. in

omicron/nexus/db-queries/src/db/datastore/region.rs

Lines 169 to 173 in d300fb8

    
           // Retry this transaction until it succeeds. It's a little heavy in that 
        
           // there's a for loop inside that iterates over the datasets the 
        
           // argument regions belong to, and it often encounters the "retry 
        
           // transaction" error. 
        
           let transaction = {

Pros of using CRDB savepoints / retries:

CRDB docs indicate that using the cockroach_restart savepoint would "hold your place in line" between attempts, which seems like it would help transactions in contention make forward progress.
The inject_retry_errors_enabled session variable can reliably trigger transaction contention (3 times in a row) if the cockroach_restart SAVEPOINT is being used. This should make testing easier, as documented in https://www.cockroachlabs.com/docs/v23.1/transaction-retry-error-example#test-transaction-retry-logic
Having an out-of-the-box solution seems like it would make it much easier to write transactions that can "auto-retry" somewhat safely, and standardize retry for our existing transactions.

Cons of using CRDB savepoints / retries:

It's non-trivial? It also requires breaking away from the Diesel TransactionManager, which means re-implementing that transaction manager logic, and also requires threading through our connection libraries.

Integrates automatic transaction retry into Nexus for most transactions. Additionally, this PR provides a "RetryHelper" object to help standardize how transaction retry is performed. Currently, after a short randomized wait (up to an upper bound), we retry unconditionally, emitting each attempt to Oximeter for further analysis. - [x] Depends on oxidecomputer/async-bb8-diesel#58 - [x] As noted in oxidecomputer/async-bb8-diesel#58, this will require customizing CRDB session variables to work correctly. (Edit: this is done on each transaction) Part of oxidecomputer/customer-support#46 Part of #3814

askfongjojo · 2023-12-10T01:20:13Z

On rack2 which is on omicron commit 8fa550c2c95b5fd3443af7e2e5837c2178fdb8c8, I run into this error a number of times when deleting disks in a loop or with Terraform in bulk:

08:14:34.789Z INFO 65a11c18-7f59-41ac-b9e7-680627f996e7 (dropshot_external): request completed
    error_message_external = Internal Server Error
    error_message_internal = saga ACTION error at node "no_result1": unexpected database error: value is too small for a byte count
    file = /home/build/.cargo/git/checkouts/dropshot-a4a923d29dccc492/ff87a01/dropshot/src/server.rs:841
    latency_us = 137909
    local_addr = 172.30.2.5:443
    method = DELETE
    remote_addr = 172.20.17.42:51614
    req_id = 28391b79-0560-4531-8683-b72766ce23de
    response_code = 500
    uri = https://oxide.sys.rack2.eng.oxide.computer/v1/disks/78870e8e-110c-4b3c-aebc-839a53fda970

There isn't much more information to go about in the saga events:

root@[fd00:1122:3344:105::3]:32221/omicron> select * from saga_node_event where saga_id = 'dfb03344-12c4-481f-9d7b-6948edaa4d2d';
                saga_id                | node_id |  event_type   |                                                                                                                                                                                                                                                                                                                                     data                                                                                                                                                                                                                                                                                                                                     |          event_time           |               creator
---------------------------------------+---------+---------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------+---------------------------------------
  dfb03344-12c4-481f-9d7b-6948edaa4d2d |       0 | started       | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | 2023-12-10 00:57:04.269961+00 | 65a11c18-7f59-41ac-b9e7-680627f996e7
  dfb03344-12c4-481f-9d7b-6948edaa4d2d |       0 | succeeded     | {"block_size": "Traditional", "create_image_id": "e0ff30f2-e283-4d36-ad59-17d9a0b9b3c0", "create_snapshot_id": null, "identity": {"description": "app boot disk", "id": "48e80edd-1cda-4fd8-b917-1a59dab1d2ea", "name": "app-os-9", "time_created": "2023-12-10T00:49:31.595310Z", "time_deleted": null, "time_modified": "2023-12-10T00:49:31.595310Z"}, "pantry_address": null, "project_id": "4a2ff557-5234-41ea-81ac-6d01ac645594", "rcgen": 1, "runtime_state": {"attach_instance_id": null, "disk_state": "faulted", "gen": 1, "time_updated": "2023-12-10T00:49:31.595310Z"}, "size": 10737418240, "slot": null, "volume_id": "2609596c-1a5c-4ce6-832b-0c7aa729416b"} | 2023-12-10 00:57:04.278252+00 | 65a11c18-7f59-41ac-b9e7-680627f996e7
  dfb03344-12c4-481f-9d7b-6948edaa4d2d |       0 | undo_finished | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | 2023-12-10 00:57:04.304956+00 | 65a11c18-7f59-41ac-b9e7-680627f996e7
  dfb03344-12c4-481f-9d7b-6948edaa4d2d |       0 | undo_started  | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | 2023-12-10 00:57:04.297054+00 | 65a11c18-7f59-41ac-b9e7-680627f996e7
  dfb03344-12c4-481f-9d7b-6948edaa4d2d |       1 | failed        | {"ActionFailed": {"source_error": {"InternalError": {"internal_message": "unexpected database error: value is too small for a byte count"}}}}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                | 2023-12-10 00:57:04.289942+00 | 65a11c18-7f59-41ac-b9e7-680627f996e7
  dfb03344-12c4-481f-9d7b-6948edaa4d2d |       1 | started       | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | 2023-12-10 00:57:04.280445+00 | 65a11c18-7f59-41ac-b9e7-680627f996e7
  dfb03344-12c4-481f-9d7b-6948edaa4d2d |      12 | started       | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | 2023-12-10 00:57:04.265746+00 | 65a11c18-7f59-41ac-b9e7-680627f996e7
  dfb03344-12c4-481f-9d7b-6948edaa4d2d |      12 | succeeded     | null                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | 2023-12-10 00:57:04.267928+00 | 65a11c18-7f59-41ac-b9e7-680627f996e7
  dfb03344-12c4-481f-9d7b-6948edaa4d2d |      12 | undo_finished | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | 2023-12-10 00:57:04.308832+00 | 65a11c18-7f59-41ac-b9e7-680627f996e7
  dfb03344-12c4-481f-9d7b-6948edaa4d2d |      12 | undo_started  | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | 2023-12-10 00:57:04.307026+00 | 65a11c18-7f59-41ac-b9e7-680627f996e7
(10 rows)

askfongjojo · 2023-12-10T02:57:15Z

The error has more to it. I filed #4662 to provide more details.

smklein · 2023-12-11T16:51:46Z

@askfongjojo : Closing the loop, that does not look like a transaction retry error, though it does look like a bug. I'll follow-up on #4662

askfongjojo · 2023-12-11T17:34:46Z

@askfongjojo : Closing the loop, that does not look like a transaction retry error, though it does look like a bug. I'll follow-up on #4662

Correct. I tested concurrent disk provisioning/deprovisioning further over the weekend and has not run into any problem so far.

morlandi7 · 2024-01-04T16:14:03Z

This has tested successfully, will open new tickets if we hit new or edge cases

jmpesp mentioned this issue Aug 2, 2023

Wait for Crucible agent to do requested work #3754

Merged

smklein self-assigned this Oct 2, 2023

smklein mentioned this issue Oct 3, 2023

Tracking Issue for Database Improvements #4189

Open

13 tasks

morlandi7 added this to the 5 milestone Nov 16, 2023

smklein mentioned this issue Nov 16, 2023

[nexus] Make most transactions automatically retry #4487

Merged

2 tasks

askfongjojo mentioned this issue Dec 18, 2023

CRDB returned "RETRY_SERIALIZABLE - failed preemptive refresh due to a conflict" when creating 3 instances concurrently #3304

Open

morlandi7 closed this as completed Jan 4, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Nexus should retry all transactions CRDB tells it to #3814

Nexus should retry all transactions CRDB tells it to #3814

jmpesp commented Aug 2, 2023

davepacheco commented Aug 16, 2023

davepacheco commented Oct 4, 2023 •

edited

Loading

smklein commented Oct 4, 2023

smklein commented Oct 6, 2023

askfongjojo commented Dec 10, 2023

askfongjojo commented Dec 10, 2023

smklein commented Dec 11, 2023

askfongjojo commented Dec 11, 2023

morlandi7 commented Jan 4, 2024

Nexus should retry all transactions CRDB tells it to #3814

Nexus should retry all transactions CRDB tells it to #3814

Comments

jmpesp commented Aug 2, 2023

davepacheco commented Aug 16, 2023

davepacheco commented Oct 4, 2023 • edited Loading

smklein commented Oct 4, 2023

smklein commented Oct 6, 2023

askfongjojo commented Dec 10, 2023

askfongjojo commented Dec 10, 2023

smklein commented Dec 11, 2023

askfongjojo commented Dec 11, 2023

morlandi7 commented Jan 4, 2024

davepacheco commented Oct 4, 2023 •

edited

Loading