Expose Sled Agent API for "control plane disk management", use it #5172

Merged
merged 80 commits into from
Mar 29, 2024

@smklein smklein commented Feb 29, 2024

This PR moves Sled Agent to implement the following:

---
config:
  flowchart:
    curve: "monotoneY"
---
flowchart TD
    subgraph PhysDisk[Disks]
        M2[("Internal Storage (M.2s)")]
        U2[("External Storage (U.2s)")]
    end
    subgraph SledAgent[Sled Agent]
        Manager["Storage Manager"]
        Resources["Storage Resources:\nView of Inventory + Control Plane Disks"]
        Bootstrap["Bootstrap Server"]
        Hardware["Hardware Monitor:\nTrack actual set of disks"]
        HTTP["HTTP Server"]
        subgraph StorageConsumers[Local Storage Consumers]
            Dump["Dump Device Management"]
            ZoneBundle["Zone Bundler"]
            Instance["Instance Manager"]
            Service["Service Manager"]
        end
    end
    subgraph OffDevServices[External Services]
        Nexus{"Nexus"}
        RSS{"Rack Setup Service"}
    end

    Bootstrap -->|Notify: disk decryption keys exist|Manager
    Manager -->|Update set of actual attached disks\nUpdate set of control plane disks| Resources
    Hardware -->|Update set of actual attached disks| Manager
    StorageConsumers -->|Read and Write To Storage|U2
    PhysDisk -->|Update set of actual attached disks| Hardware
    Nexus -->|Query inventory\nUpdate set of control plane disks|HTTP
    RSS -->|Query inventory\nUpdate set of control plane disks|HTTP
    HTTP --> |Query inventory\nUpdate set of control plane disks|Manager
    Resources -->|Select usable U.2s|Dump
    Resources -->|Use debug dataset within usable U.2s|ZoneBundle
    Resources -->|Use zone root filesystems|Instance
    Resources -->|Use zone root filesystems\nUse durable datasets|Service
    Manager -->|On Update:\nStore set of control plane disks|M2
    M2 -->|On Boot:\nRead usable control plane disks|Manager

Overview

Virtual Environment Changes

  • Acting on Disks, not Zpools
    • Previously, sled agent could operate on "user-supplied zpools", which were created by ./tools/virtual_hardware.sh
    • Now that Nexus has more control over zpool allocation, the configuration can supply "virtual devices" instead of "zpools", which lets RSS/Nexus decide when zpools actually get placed on these devices.
    • Impact:
      • sled-agent/src/config.rs
      • smf/sled-agent/non-gimlet/config.toml
      • tools/virtual_hardware.sh
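
To illustrate the difference between the two models, here is a minimal sketch. The `VirtualStorage` type and its method names are hypothetical and are not taken from sled-agent/src/config.rs; the sketch only shows that a config of zpools hands the sled agent ready-made pools, while a config of virtual devices hands it bare devices that wait for a decision from RSS/Nexus.

```rust
use std::path::PathBuf;

/// Hypothetical model of the storage section of a non-gimlet config.
/// These names are illustrative; they are not the real config types.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum VirtualStorage {
    /// Old model: zpools that tooling created ahead of time.
    Zpools(Vec<String>),
    /// New model: backing devices only; no zpool exists on them yet.
    Vdevs(Vec<PathBuf>),
}

impl VirtualStorage {
    /// Zpools that already exist before RSS/Nexus has decided anything.
    pub fn preexisting_zpools(&self) -> &[String] {
        match self {
            VirtualStorage::Zpools(names) => names.as_slice(),
            VirtualStorage::Vdevs(_) => &[],
        }
    }

    /// Devices that show up as disks and still await an allocation decision.
    pub fn unallocated_devices(&self) -> &[PathBuf] {
        match self {
            VirtualStorage::Zpools(_) => &[],
            VirtualStorage::Vdevs(paths) => paths.as_slice(),
        }
    }
}
```

With the `Vdevs` variant, nothing is usable as a zpool at startup, which matches the goal of leaving placement up to RSS/Nexus.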

Sled Agent Changes

  • HTTP API
    • The Sled Agent exposes an API to "set" and "get" the "control plane physical disks" specified by Nexus. The set of control plane physical disks (usable U.2s) is stored in a ledger on the M.2s (as omicron-physical-disks.json). This set also determines which disks are available to the rest of the sled agent.
  • StorageManager
    • Before: When physical U.2 disks are detected by the Sled Agent, they are "auto-formatted if empty", and we notify Nexus about them. This "upserts" them into the DB, so they are basically automatically adopted into the control plane.
    • After: As we've discussed in RFD 457, we want to get to a world where physical U.2 disks are detected by Sled Agent, but not used until RSS/Nexus explicitly tells the Sled Agent to "use this disk as part of the control plane". This set of "in-use control plane disks" is stored in a "ledger" file on the M.2s.
    • Transition: On deployed systems, we need to boot up to Nexus, even though we don't have a ledger of control plane disks. Within the implementation of StorageManager::key_manager_ready, we implement a workaround: if we detect a system with no ledger, but with zpools, we'll use that set of zpools unconditionally until told otherwise. This is a short-term workaround to migrate existing systems, but can be removed when deployed racks reliably have ledgers for control plane disks.
  • StorageManagerTestHarness
    • In an effort to reduce "test fakes" and replace them with real storage, StorageManagerTestHarness provides testing utilities for spinning up vdevs, formatting them with zpools, and managing them. This helps us avoid a fair bit of bifurcation for "test-only synthetic disks" vs "real disks", though it does mean many of our tests in the sled-agent are now 'illumos-only'.
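
The transition workaround described for StorageManager::key_manager_ready can be sketched roughly as follows. This is not the actual implementation: `DiskId`, `DiskSelection`, and `select_control_plane_disks` are hypothetical names, and disks are identified by plain strings purely for illustration. The sketch only captures the decision: a ledger wins when it exists, and a system with zpools but no ledger uses all of those zpools until told otherwise.

```rust
use std::collections::BTreeSet;

/// Hypothetical identifier for a physical disk (illustrative only).
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct DiskId(pub String);

/// Which disks the sled agent should treat as control plane disks.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum DiskSelection {
    /// A ledger exists on the M.2s: use exactly the disks it names.
    FromLedger(BTreeSet<DiskId>),
    /// No ledger, but zpools were found: use all of them (migration path).
    LegacyAllZpools(BTreeSet<DiskId>),
    /// No ledger and no zpools: wait for RSS/Nexus to supply a set.
    Nothing,
}

/// Decide which disks to manage once disk decryption keys are available.
pub fn select_control_plane_disks(
    ledger: Option<&BTreeSet<DiskId>>,
    disks_with_zpools: &BTreeSet<DiskId>,
) -> DiskSelection {
    match ledger {
        // An explicit ledger always takes priority over detected zpools.
        Some(requested) => DiskSelection::FromLedger(requested.clone()),
        // Short-term workaround for already-deployed systems.
        None if !disks_with_zpools.is_empty() => {
            DiskSelection::LegacyAllZpools(disks_with_zpools.clone())
        }
        None => DiskSelection::Nothing,
    }
}
```

Keeping the legacy case as its own variant makes the workaround easy to find and delete once deployed racks reliably have ledgers.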

RSS Changes

  • RSS is now responsible for provisioning "control plane disks and zpools" during initial bootstrapping
  • RSS informs Nexus about the allocation decisions it makes via the RSS handoff

Nexus Changes

  • Nexus exposes a smaller API (no notification of "disk add/remove, zpools add/remove"). It receives a handoff from RSS, and will later be in charge of provisioning decisions based on inventory.
  • Dynamically adding/removing disks/zpools after RSS will appear in a subsequent PR.

Base automatically changed from disk-in-inventory to main March 13, 2024 21:05
@smklein smklein force-pushed the sled-agent-api-to-manage-phys-disks branch from 8efa199 to 3000739 Compare March 14, 2024 03:25

@andrewjstone andrewjstone left a comment


Looks great! Thanks for all the persistence and hard work here @smklein!
