Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nexus should retry all transactions CRDB tells it to #3814

Closed
Tracked by #4189
jmpesp opened this issue Aug 2, 2023 · 9 comments
Closed
Tracked by #4189

Nexus should retry all transactions CRDB tells it to #3814

jmpesp opened this issue Aug 2, 2023 · 9 comments
Assignees
Milestone

Comments

@jmpesp
Copy link
Contributor

jmpesp commented Aug 2, 2023

#3754 adds a retry_transaction function on TransactionError that returns true if CRDB is telling us to, based on the instructions at https://www.cockroachlabs.com/docs/v23.1/transaction-retry-error-reference#client-side-retry-handling.

Part of the review of that PR asked: why not to this for all transactions? If CRDB is telling us that we should be retrying a transaction, then we should. This issue tracks that work. Retrying in general seems like a safe thing to do but part of this issue should be convincing ourselves that we're not creating unbounded work for CRDB by retrying every transaction until it succeeds. Note this may also be impossible

@davepacheco
Copy link
Collaborator

I think this is related to #973, which talks about having a way to put a bunch of statements into a non-interactive transaction. CockroachDB calls this batched statements. This allows CockroachDB to retry such errors automatically. This would have all the benefits mentioned in RFD 192 about avoiding interactive transactions, plus I'd expect it to minimize the window for transaction conflicts (since it avoids a round-trip to Nexus, plus all the time associated with doing the client-side retry).

@davepacheco
Copy link
Collaborator

davepacheco commented Oct 4, 2023

@smklein raised a good question about doing client-side retries in the meantime, since it doesn't seem like we'll have the foundation we need for batched transactions like this that soon. That'd be a useful stopgap but I'm not sure how that's possible. (I don't know it's impossible either, but I don't see how to do it.) transaction()/transaction_async() let you can run arbitrary Rust code between statements and that code can decide based on the results of a statement what statements it wants to issue next. They can also reference state outside the closure. And they can nest, too (you can start a transaction inside a transaction, which creates a savepoint). So I'm not sure which Rust code needs to be re-run. I feel like it's possible you have to rerun everything from the top-level BEGIN; (transaction/transaction_async()) in the stack. So I think you need to propagate the conflict error all the way to the top-level transaction call, and the caller of that maybe can re-do whatever it did the first time? Maybe we could create a transaction_async_retry() that accepts a Fn rather than FnOnce and convert stuff over to that? Then the question will be: how possible will that be -- i.e., do these closures in practice wind up depending on the ability to do things that a Fn can't necessarily do? My intuition is that they do but I'm not sure.

@smklein
Copy link
Collaborator

smklein commented Oct 4, 2023

Following up on @davepacheco 's comment:

  • First and foremost, any spot we use retries + interactive transactions is basically defying the advice of RFD 192, so it's a spot we need to revisit. That being said, it's by far the most flexible + easiest option to write, so I say this more from a "tech debt + future work" perspective rather than with a "this is wrong" lens.
    • I give this caveat because I empathize with our developers, but anything we do here is a band-aid. I'm also working on RFD 436 to give better options in the limit.
  • I agree with the sentiment of a transaction_async_retry being possible. I prototyped this briefly in async-bb8-diesel, and I think it's possible, with a few notes:
    • It basically requires a copy of transaction_async, but with Fn instead of FnOnce, and I believe it needed an additional Sync bound
    • For many of the callsites in Omicron, it also requires explicitly cloning arguments so that they can be used from a Fn instead of FnOnce. This works for a lot of cases, but not all our queries implement Clone, which makes this trickier.
    • It's not well documented, but transaction managers do expose a transaction_manager_status_mut message that helps us identify "are we in a transaction already?": https://docs.diesel.rs/master/src/diesel/connection/transaction_manager.rs.html#36-44 . This method may be useful for enforcing that the "retryable transactions" can only be called when not already within a nested transaction. I expect this to be a detail observed by the implementation to help prevent callers from misusing the API.

@smklein
Copy link
Collaborator

smklein commented Oct 6, 2023

https://www.cockroachlabs.com/docs/v23.1/advanced-client-side-transaction-retries seems a bit opinionated about how transactions should work, in a way that's slightly divergent from Diesel.

CRDB recommends the following:

  1. Start transactions with BEGIN; SAVEPOINT cockroach_restart;
  2. Issuing the operations of the transaction, which must end with RELEASE SAVEPOINT cockroach_restart.
  3. If at any point during (2) a 40001 error is returned, issue ROLLBACK TO SAVEPOINT. The caller can choose to try step (2) again, or simply ROLLBACK the whole transaction.
  4. If the execution completes, and the savepoint is released successfully, the transaction can be finished with COMMIT.

Diesel implements the AnsiTransactionManager: https://docs.diesel.rs/master/diesel/connection/struct.AnsiTransactionManager.html , which provides the "transaction-with-closure" method that we know. async-bb8-diesel calls into the TransationManager traits to implement something similar from an asynchronous context: https://github.com/oxidecomputer/async-bb8-diesel/blob/1446f7e0c1f05f33a0581abd51fa873c7652ab61/src/async_traits.rs#L94-L147

This TransactionManager exposes APIs to begin_transaction, commit_transaction, and rollback_transaction, but the ability to customize savepoints seems pretty much non-existent.

https://docs.diesel.rs/master/diesel/connection/trait.TransactionManager.html is a trait, admittedly, so it does seem possible for us to "roll our own" version of it? This could support existing transactions, but also add a "retryable cockroachdb transaction" method that is Fn instead of FnOnce and which can be re-issued if necessary.

This would have an advantage over the "hand-rolled-retry", e.g. in

// Retry this transaction until it succeeds. It's a little heavy in that
// there's a for loop inside that iterates over the datasets the
// argument regions belong to, and it often encounters the "retry
// transaction" error.
let transaction = {

Pros of using CRDB savepoints / retries:

  • CRDB docs indicate that using the cockroach_restart savepoint would "hold your place in line" between attempts, which seems like it would help transactions in contention make forward progress.
  • The inject_retry_errors_enabled session variable can reliably trigger transaction contention (3 times in a row) if the cockroach_restart SAVEPOINT is being used. This should make testing easier, as documented in https://www.cockroachlabs.com/docs/v23.1/transaction-retry-error-example#test-transaction-retry-logic
  • Having an out-of-the-box solution seems like it would make it much easier to write transactions that can "auto-retry" somewhat safely, and standardize retry for our existing transactions.

Cons of using CRDB savepoints / retries:

  • It's non-trivial? It also requires breaking away from the Diesel TransactionManager, which means re-implementing that transaction manager logic, and also requires threading through our connection libraries.

@morlandi7 morlandi7 added this to the 5 milestone Nov 16, 2023
smklein added a commit that referenced this issue Dec 6, 2023
Integrates automatic transaction retry into Nexus for most transactions.

Additionally, this PR provides a "RetryHelper" object to help
standardize how transaction retry is performed.
Currently, after a short randomized wait (up to an upper bound), we
retry unconditionally, emitting each
attempt to Oximeter for further analysis.

- [x] Depends on
oxidecomputer/async-bb8-diesel#58
- [x] As noted in
oxidecomputer/async-bb8-diesel#58, this will
require customizing CRDB session variables to work correctly. (Edit:
this is done on each transaction)


Part of oxidecomputer/customer-support#46 
Part of #3814
@askfongjojo
Copy link

On rack2 which is on omicron commit 8fa550c2c95b5fd3443af7e2e5837c2178fdb8c8, I run into this error a number of times when deleting disks in a loop or with Terraform in bulk:

08:14:34.789Z INFO 65a11c18-7f59-41ac-b9e7-680627f996e7 (dropshot_external): request completed
    error_message_external = Internal Server Error
    error_message_internal = saga ACTION error at node "no_result1": unexpected database error: value is too small for a byte count
    file = /home/build/.cargo/git/checkouts/dropshot-a4a923d29dccc492/ff87a01/dropshot/src/server.rs:841
    latency_us = 137909
    local_addr = 172.30.2.5:443
    method = DELETE
    remote_addr = 172.20.17.42:51614
    req_id = 28391b79-0560-4531-8683-b72766ce23de
    response_code = 500
    uri = https://oxide.sys.rack2.eng.oxide.computer/v1/disks/78870e8e-110c-4b3c-aebc-839a53fda970

There isn't much more information to go about in the saga events:

root@[fd00:1122:3344:105::3]:32221/omicron> select * from saga_node_event where saga_id = 'dfb03344-12c4-481f-9d7b-6948edaa4d2d';
                saga_id                | node_id |  event_type   |                                                                                                                                                                                                                                                                                                                                     data                                                                                                                                                                                                                                                                                                                                     |          event_time           |               creator
---------------------------------------+---------+---------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------+---------------------------------------
  dfb03344-12c4-481f-9d7b-6948edaa4d2d |       0 | started       | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | 2023-12-10 00:57:04.269961+00 | 65a11c18-7f59-41ac-b9e7-680627f996e7
  dfb03344-12c4-481f-9d7b-6948edaa4d2d |       0 | succeeded     | {"block_size": "Traditional", "create_image_id": "e0ff30f2-e283-4d36-ad59-17d9a0b9b3c0", "create_snapshot_id": null, "identity": {"description": "app boot disk", "id": "48e80edd-1cda-4fd8-b917-1a59dab1d2ea", "name": "app-os-9", "time_created": "2023-12-10T00:49:31.595310Z", "time_deleted": null, "time_modified": "2023-12-10T00:49:31.595310Z"}, "pantry_address": null, "project_id": "4a2ff557-5234-41ea-81ac-6d01ac645594", "rcgen": 1, "runtime_state": {"attach_instance_id": null, "disk_state": "faulted", "gen": 1, "time_updated": "2023-12-10T00:49:31.595310Z"}, "size": 10737418240, "slot": null, "volume_id": "2609596c-1a5c-4ce6-832b-0c7aa729416b"} | 2023-12-10 00:57:04.278252+00 | 65a11c18-7f59-41ac-b9e7-680627f996e7
  dfb03344-12c4-481f-9d7b-6948edaa4d2d |       0 | undo_finished | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | 2023-12-10 00:57:04.304956+00 | 65a11c18-7f59-41ac-b9e7-680627f996e7
  dfb03344-12c4-481f-9d7b-6948edaa4d2d |       0 | undo_started  | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | 2023-12-10 00:57:04.297054+00 | 65a11c18-7f59-41ac-b9e7-680627f996e7
  dfb03344-12c4-481f-9d7b-6948edaa4d2d |       1 | failed        | {"ActionFailed": {"source_error": {"InternalError": {"internal_message": "unexpected database error: value is too small for a byte count"}}}}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                | 2023-12-10 00:57:04.289942+00 | 65a11c18-7f59-41ac-b9e7-680627f996e7
  dfb03344-12c4-481f-9d7b-6948edaa4d2d |       1 | started       | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | 2023-12-10 00:57:04.280445+00 | 65a11c18-7f59-41ac-b9e7-680627f996e7
  dfb03344-12c4-481f-9d7b-6948edaa4d2d |      12 | started       | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | 2023-12-10 00:57:04.265746+00 | 65a11c18-7f59-41ac-b9e7-680627f996e7
  dfb03344-12c4-481f-9d7b-6948edaa4d2d |      12 | succeeded     | null                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | 2023-12-10 00:57:04.267928+00 | 65a11c18-7f59-41ac-b9e7-680627f996e7
  dfb03344-12c4-481f-9d7b-6948edaa4d2d |      12 | undo_finished | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | 2023-12-10 00:57:04.308832+00 | 65a11c18-7f59-41ac-b9e7-680627f996e7
  dfb03344-12c4-481f-9d7b-6948edaa4d2d |      12 | undo_started  | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | 2023-12-10 00:57:04.307026+00 | 65a11c18-7f59-41ac-b9e7-680627f996e7
(10 rows)

@askfongjojo
Copy link

The error has more to it. I filed #4662 to provide more details.

@smklein
Copy link
Collaborator

smklein commented Dec 11, 2023

@askfongjojo : Closing the loop, that does not look like a transaction retry error, though it does look like a bug. I'll follow-up on #4662

@askfongjojo
Copy link

@askfongjojo : Closing the loop, that does not look like a transaction retry error, though it does look like a bug. I'll follow-up on #4662

Correct. I tested concurrent disk provisioning/deprovisioning further over the weekend and has not run into any problem so far.

@morlandi7
Copy link

This has tested successfully, will open new tickets if we hit new or edge cases

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants