
Offload write encryption #1066

Merged (3 commits, Jan 4, 2024)

Conversation

@mkeeter (Contributor) commented Dec 19, 2023

This PR moves write encryption out of the main task.

Doing so is harder than it sounds, because we want to preserve ordering of incoming BlockReq, but only BlockReq::Write and BlockReq::WriteUnwritten need this encryption offload.

I turned to FuturesOrdered, which allows you to store a set of futures which resolve in FIFO order. For most BlockReq, we await a futures::future::ready(..), which simply provides the same BlockReq back to us. For writes, we offload the work using rayon::spawn, then return the result through a tokio::sync::oneshot channel. This strategy is recommended by this blog post by a Tokio maintainer.

We can then store Either<Ready, oneshot::Receiver> in a FuturesOrdered, which works out nicely.
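The shape of this queue can be sketched with the standard library alone (threads and `mpsc` channels standing in for `rayon::spawn`, `tokio::sync::oneshot`, and `FuturesOrdered`; `Req`, `Deferred`, and `process` are illustrative names, and XOR stands in for real encryption):

```rust
use std::collections::VecDeque;
use std::sync::mpsc;
use std::thread;

// Stand-in for BlockReq: only "writes" need expensive work.
enum Req {
    Write(Vec<u8>),
    Flush,
}

// Each queue entry is either an already-ready result or a pending
// receiver, mirroring Either<Ready, oneshot::Receiver>.
enum Deferred {
    Ready(Req),
    Pending(mpsc::Receiver<Req>),
}

fn process(reqs: Vec<Req>) -> Vec<String> {
    let mut queue: VecDeque<Deferred> = VecDeque::new();

    for req in reqs {
        match req {
            Req::Write(data) => {
                // Offload the expensive step (encryption in the real PR)
                // to another thread, returning the result via a channel.
                let (tx, rx) = mpsc::channel();
                thread::spawn(move || {
                    // Placeholder for the real encryption step.
                    let encrypted: Vec<u8> =
                        data.iter().map(|&b| b ^ 0xff).collect();
                    let _ = tx.send(Req::Write(encrypted));
                });
                queue.push_back(Deferred::Pending(rx));
            }
            other => queue.push_back(Deferred::Ready(other)),
        }
    }

    // Drain in FIFO order: a pending entry blocks until its worker
    // finishes, so results come out in submission order even if a
    // later (non-write) request was ready first.
    let mut out = Vec::new();
    while let Some(entry) = queue.pop_front() {
        let req = match entry {
            Deferred::Ready(r) => r,
            Deferred::Pending(rx) => rx.recv().unwrap(),
        };
        out.push(match req {
            Req::Write(d) => format!("write {:?}", d),
            Req::Flush => "flush".to_string(),
        });
    }
    out
}

fn main() {
    let out = process(vec![
        Req::Write(vec![1, 2, 3]),
        Req::Flush,
        Req::Write(vec![4]),
    ]);
    // The flush stays sandwiched between the two writes.
    assert_eq!(out, vec!["write [254, 253, 252]", "flush", "write [251]"]);
}
```

The real implementation gets the same ordering guarantee without blocking: `FuturesOrdered` polls all entries concurrently but only yields them in insertion order.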

The PR introduces a new generic DeferredQueue to organize these futures. (The DeferredQueue will also be used to defer read decryption in a subsequent PR.) When a BlockReq comes in, we put it into this queue*; a future becoming available on the queue is another event source for the select / apply loop.

This is a significant speedup for large random writes:

| IO | main | offload-write-encryption |
| --- | --- | --- |
| 4K random writes | 27.2 MiB/s | 26.7 MiB/s |
| 1M random writes | 609 MiB/s | 710 MiB/s |
| 4M random writes | 629 MiB/s | 890 MiB/s |

I'm not sure if the performance delta for 4K writes is significant; if folks are worried, we could special-case sufficiently small writes to be executed in the main task.

*as an optimization, we execute non-write BlockOps immediately if there is nothing in the queue, to avoid an extra event

@leftwo (Contributor) left a comment:

Why do we need to preserve the ordering of incoming BlockReq?

@mkeeter (Contributor, Author) commented Jan 2, 2024

This failed a CI check. Since the GitHub UI makes it easy to lose these things, I'm linking it here for reference:
https://github.com/oxidecomputer/crucible/pull/1065/checks?check_run_id=20095256238

Edit: this is #1077 , so unrelated to this branch

@mkeeter (Contributor, Author) commented Jan 2, 2024

@leftwo

Why do we need to preserve the ordering of incoming BlockReq?

This is preserving the order of BlockReq in the period between (1) being sent by the guest and (2) being added to the active jobs list (which establishes inter-job dependencies).

We need to preserve order for correctness. It would be wrong if the guest sent

1. Do a big, encrypted write
2. Read from that same block

and we reordered that to

2. Read from that same block
1. Do the big encrypted write (delayed because encryption is slow)

(Several follow-up comments from @mkeeter, @pfmooney, and @leftwo were marked as outdated.)

@mkeeter mkeeter force-pushed the offload-write-encryption branch 2 times, most recently from eb1a91d to 63bdfe1 Compare January 4, 2024 15:15
@mkeeter (Contributor, Author) commented Jan 4, 2024

After a bunch of discussion, I moved the BlockReq changes to a standalone PR #1085, which must be merged before this one.

(this PR is currently rebased onto that branch so that tests pass, but we'll re-rebase it onto main once #1085 is merged)

mkeeter added a commit that referenced this pull request Jan 4, 2024
#1064 added new logic to the
`BlockReq`: if the notification channel was dropped without a reply having been
sent, Crucible would panic.

After a bunch of discussion, I no longer think this extra complexity is worth
it, and would like to remove it.

## Why did we do this in the first place?
In general, we expect a `BlockReq` to _always_ be answered by Crucible.  If one
is dropped without a reply being sent, it indicates a programmer error.  This
behavior was inspired by similar logic in Propolis
(https://github.com/oxidecomputer/propolis/blob/85dc6db36709981c1d35b0651a18e20caee0b73f/lib/propolis/src/block/mod.rs#L176-L182).

## Propolis and Crucible are different
The `Request` in Propolis is built around
a `device::TrackingMarker`.  If it is dropped without a reply, then the
listening side will **wait forever**; this is a potential liveness issue, so
turning it into a panic is sensible.

The `BlockReq` in Crucible is built around a `tokio::sync::oneshot` channel.  If
it is dropped without a reply, then the `oneshot::Receiver` will return
immediately with an error.  In fact, we handle that case in our
`BlockReqWaiter`, translating this error into `CrucibleError::RecvDisconnected`
(https://github.com/oxidecomputer/crucible/blob/e813da3f/upstairs/src/block_req.rs#L67).
In other words, we're _already_ handling the case of a `BlockReq` being dropped
without a reply (by reporting an error).  There is no potential liveness issue.
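This drop-produces-an-error behavior is easy to demonstrate with the standard library's channels, which behave the same way as `tokio::sync::oneshot` on sender drop (a minimal sketch; the real code uses the async oneshot type):

```rust
use std::sync::mpsc;

fn main() {
    // Stand-in for the BlockReq reply channel.
    let (tx, rx) = mpsc::channel::<u32>();

    // Drop the sending side without ever replying, as a dropped
    // BlockRes would do in Crucible.
    drop(tx);

    // The receiver returns an error immediately instead of waiting
    // forever; Crucible maps the analogous oneshot error into
    // CrucibleError::RecvDisconnected.
    assert_eq!(rx.recv(), Err(mpsc::RecvError));
}
```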

(This also means that the block comment in `BlockRes::drop`,
https://github.com/oxidecomputer/crucible/blob/e813da3f/upstairs/src/block_req.rs#L39-L40
is incorrect)

It's arguably better to panic upon programmer error, versus going through normal
error-handling paths (which defer handling to the caller of the `Guest` API).
However...

## False positives during program exit
There is one time when a `BlockReq` can be dropped without a reply: during
program exit (or scope exit, for unit tests).

As far as I can tell, this panic **hasn't caught any true programmer errors**.
Instead, it's found a bunch of cases where `BlockReq` are dropped during exit,
and we've had to add `Drop` implementations to prevent it from firing.  The
original PR added `impl Drop for Guest`
(https://github.com/oxidecomputer/crucible/blob/e813da3f/upstairs/src/lib.rs#L2424-L2446),
but it missed a sneaky case where `BlockReq` were stored in the `UpstairsState`;
a later PR added `impl Drop for Upstairs`
(https://github.com/oxidecomputer/crucible/blob/e813da3f/upstairs/src/upstairs.rs#L194-L207)
to avoid panics during certain unit tests.

## Future false positives
This check also makes it easy to accidentally create false positives when
writing new code.  The `offload-write-encryption` branch
(#1066) moves some
`BlockReq`s into a thread pool to perform encryption outside of the main task.
However, because we didn't add special handling for program exit, this caused a
panic at shutdown.

## This is hard to do by accident
Remember what we're actually trying to protect against: someone dropping a
`BlockReq` without replying.  Because we've annotated relevant functions with
`#[must_use]`, this would require either ignoring warnings or explicitly
discarding the `BlockReq`.  It's hard to do by accident!
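A hypothetical miniature of such a reply handle (the names `Res` and `make_req` are illustrative, not Crucible's API) shows how `#[must_use]` forces the discard to be explicit:

```rust
use std::sync::mpsc;

// Silently discarding a `Res` triggers a compiler warning, so
// dropping a reply handle without answering requires an explicit
// `let _ = ...` (or ignoring warnings).
#[must_use = "a reply handle must be answered"]
struct Res(mpsc::Sender<u64>);

impl Res {
    // Replying consumes the handle, so it cannot be answered twice.
    fn send_ok(self, v: u64) {
        let _ = self.0.send(v);
    }
}

fn make_req() -> (Res, mpsc::Receiver<u64>) {
    let (tx, rx) = mpsc::channel();
    (Res(tx), rx)
}

fn main() {
    let (res, rx) = make_req();
    res.send_ok(42);
    assert_eq!(rx.recv(), Ok(42));
    // By contrast, the statement `make_req().0;` would warn:
    // unused `Res` that must be used.
}
```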

## Summary
- This `panic!` guards against something that's hard to do by accident
- If we **do** make this mistake, it's reported as a distinct error to the
  caller
  - Unlike Propolis, there's no liveness concern
- So far, 100% of issues it discovered are false positives at program exit
  - It is easy to introduce similar false positives when writing new code
  - New false positives will manifest as panics during program exit
  - If we introduce new false positives, we will not discover the issue until
    runtime (and it may be rare, because Tokio does not destroy tasks in a
    deterministic order)

## Proposal
- Remove the `panic!`
- Remove the `Drop` implementations which work around it at exit
- Add logging for this case:
  - Fire a DTrace probe if a `BlockReq` is dropped without a reply
  - Log when we see a `CrucibleError::RecvError`
Commit messages included in the force-push:

- ha ha ha
- types!
- Docstrings, etc
- Move deferred stuff to a separate function
- Try to minimize the diff against `main`
- Fix typos
- Explain is_empty behavior and fix typo
@mkeeter mkeeter force-pushed the offload-write-encryption branch from 63bdfe1 to b83c1ca Compare January 4, 2024 18:40
@leftwo leftwo self-requested a review January 4, 2024 21:43
@@ -0,0 +1,186 @@
// Copyright 2023 Oxide Computer Company
A contributor commented:

2024!!!!!!!!!!

@mkeeter (Contributor, Author) replied:

Fixed in 85b6994, WELCOME TO THE FUTURE

@mkeeter mkeeter merged commit 45146a8 into oxidecomputer:main Jan 4, 2024
18 checks passed
@mkeeter mkeeter deleted the offload-write-encryption branch January 4, 2024 23:26
mkeeter added a commit that referenced this pull request Jan 23, 2024
Analogous to #1066 , this PR moves read decryption into the rayon thread pool.

It uses exactly the same infrastructure (particularly `DeferredQueue`), and is
implemented in a very similar way.