Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework MGS internals for clarity #1583

Merged
merged 1 commit into from
Aug 17, 2022
Merged

Rework MGS internals for clarity #1583

merged 1 commit into from
Aug 17, 2022

Conversation

jgallagher
Copy link
Contributor

This is the followup PR to #1579, reworking MGS proper to use the SingleSp introduced in that PR for faux-mgs. The changes include:

  • Removal of RecvHandler and all its supporting machinery; instead of
    a monster tokio task reponsible for recv'ing on all sockets, pairing
    up request/response IDs, etc., we spawn a SingleSp task for each
    switch port that manages all communication on that port's socket.
  • Removal of the timeout option to most SP calls. SingleSp already
    has inherent (configured) timeouts for retries. At the HTTP client
    level, the client can already establish its own timeouts (e.g., in the
    event the network connection between client and server goes down); we
    don't need separate timeouts for SP communications (which the client
    has no reason to know how to set).
  • The serial console websocket handling has moved from
    gateway-sp-comms to gateway.

Copy link
Contributor

@andrewjstone andrewjstone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great @jgallagher. The organization changes make this a lot cleaner IMO, and having each SP handled in its own task likely provides better scalability/performance.

@@ -298,6 +317,7 @@ async fn rpc(

#[derive(Debug)]
pub struct AttachedSerialConsole {
key: u64,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the purpose of key? Are multiple serial attachments to the same SP allowed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not simultaneously; the purpose of key here is to distinguish "detach any currently-attached serial console client" (i.e., serial_console_detach() above, which sends a key of None) from "detach this currently-attached serial console client (i.e., AttachedSerialConsoleSend::send() below, which sends a key of Some(self.key)). This is (maybe only partially?) explained below:

// The associated value is the connection key; if Some(_), only detach if
// the currently-attached key number matches. If None, detach any current
// connection. These correspond to "detach the current session" (performed
// automatically when a connection is closed) and "force-detach any session"
// (performed by a user).

In hindsight I should expand this: ultimately this is to avoid a race condition like this:

  1. Client A attaches to the serial console.
  2. Client B tries to attach and fails.
  3. Client B sends a "detach any existing clients", and then successfully attaches.
  4. Client A drops their connection, and sends a "detach my current connection".

With this ordering, we don't want step 4 to detach client B from step 3; the key makes this work because the key sent in step 4 won't match the key established in step 3, so the detach will be a noop (the fact that the key is different means client A's session has already been detached).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, this makes perfect sense. I somewhat gathered that after writing this question from another comment in the code where it talked about attaching after detach, but wasn't 100% sure so left my question.

Base automatically changed from mgs-cleanup-1 to main August 17, 2022 16:36
The changes include:

* Removal of `RecvHandler` and all its supporting machinery; instead of
  a monster tokio task reponsible for recv'ing on all sockets, pairing
  up request/response IDs, etc., we spawn a `SingleSp` task for each
  switch port that manages all communication on that port's socket.
* Removal of the `timeout` option to most SP calls. `SingleSp` already
  has inherent (configured) timeouts for retries. At the HTTP client
  level, the client can already establish its own timeouts (e.g., in the
  event the network connection between client and server goes down); we
  don't need separate timeouts for SP communications (which the client
  has no reason to know how to set).
* The serial console websocket handling has moved from
  `gateway-sp-comms` to `gateway`.
@jgallagher jgallagher enabled auto-merge (squash) August 17, 2022 20:26
@jgallagher jgallagher merged commit a70c681 into main Aug 17, 2022
@jgallagher jgallagher deleted the mgs-cleanup-2 branch August 17, 2022 21:05
leftwo pushed a commit that referenced this pull request Dec 4, 2024
Skip jobs when reinitializing to `Faulted` (#1583)
Clear `repair_check_deadline` if repair is successfully started (#1581)
Update rust-version reqs to reflect reality (#1580)

Propolis:
60886290 update nvme-trace.d to match current probe definitions (#821)
8e5693bf Fix clippy lints for Rust 1.83
leftwo added a commit that referenced this pull request Dec 4, 2024
Crucible:
Skip jobs when reinitializing to `Faulted` (#1583)
Clear `repair_check_deadline` if repair is successfully started (#1581)
Update rust-version reqs to reflect reality (#1580)

Propolis:
Update nvme-trace.d to match current probe definitions (#821) 
Fix clippy lints for Rust 1.83
PHD: use stty to widen the effective terminal for Linux guests (#818)

Co-authored-by: Alan Hanson <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants