[rss][nexus][sled-agent] Responsibility for deploying services should (where possible) migrate into Nexus #732
Status: Open (7 of 15 tasks). Tracked by #824.
smklein added the "Sled Agent" (Related to the Per-Sled Configuration and Management) and "nexus" (Related to nexus) labels on Mar 7, 2022
smklein added a commit that referenced this issue on Dec 2, 2022:
## Overview

- Implements https://rfd.shared.oxide.computer/rfd/0278
- This PR moves much of the service configuration from the hard-coded `config-rss.toml` file to RSS itself.
- In the future (See: #732) many of these services will be initialized by Nexus. Decoupling their provisioning from the hard-coded versions is the first step in this process.

### What Changed in the Sled Agent

- Sled Agent
  - A new `get_zpools` endpoint is exposed from the Sled Agent. This is invoked by RSS when figuring out where to provision datasets.
  - The UUID for the sled agent is removed from the config file (it's dynamic, and should not be shared among sleds)

### What Changed in RSS

- `HardcodedSledRequest` (and the corresponding entries in `config-rss.toml`) has been removed
- A `plan` module was added, where plans for sled generation ("What sleds should get what addresses?") and service generation ("What services should run where?") are generated.
- Refactor service and dataset initialization to insert entries into DNS
- Invoke the `handoff_to_nexus`, informing it of all previously-owned-by-RSS services.

### What Changed in Nexus

- Expand `RackInitializationRequest` to consider both services and datasets
- `dataset_put` API removed -- beyond the initialization request, Nexus should be responsible for provisioning new datasets, not the sled agent.

Fixes #1148
Part of #732
Part of #824
smklein added a commit that referenced this issue on Feb 21, 2023:
#2358)

# Summary

My long-term goal is to have Nexus be in charge of provisioning all services. For that to be possible, Nexus must be able to internalize all input during the handoff from RSS. This PR extends the RSS -> Nexus handoff to include:

- What "Nexus Services" are being launched?
- What are the ranges of IP addresses that may be used for internal services?
- What external IP addresses, from that pool, are currently in-use for Nexus services?

# Nexus Changes

## Database Records

- Adds a `nexus_service` record, which just includes the information about the in-use external IP address.

## IP Address Allocation

- Adds an `explicit_ip` option, which lets callers perform an allocation with an explicit request for a single IP address. You might ask the question: "Why not just directly create a record with the IP address in question, if you want to create it?" We could! But we'd need to recreate all the logic which validates that the IP address exists within the known-to-the-DB IP ranges within the pool.
- The ability for an operator to "request Nexus execute with a specific IP address" is a feature we want anyway, so this isn't wasted work.
- The implementation and tests for this behavior are mostly within `nexus/src/db/queries/external_ip.rs`

## Rack Initialization

- Populates IP pools and Service records as a part of the RSS handoff.
- Implementation and tests exist within `nexus/src/db/datastore/rack.rs`.

## Populate

- Move the body of some of the "populate" functions into their correct spot in the datastore, which makes it easier to...
- ... call all the populate functions -- rather than just a chunk of them -- from `omicron_nexus::db::datastore::datastore_test`.
- As a consequence, update some tests which assumed the rack would be "half-populated" -- it's either fully populated, or not populated at all.

# Sled Agent changes

- Explicitly pass the "IP pool ranges for internal services" up to Nexus.
- In the future, it'll be possible to pass a larger range of addresses than just those used by running Nexus services.

Fixes: #1958
Unblocks: #732
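The `explicit_ip` reasoning in the commit message above (reuse the allocation path so that range validation is not duplicated) can be illustrated with a small in-memory sketch. This is not the Nexus implementation, which lives in database queries. `IpRange`, `Pool`, `AllocError`, and `allocate` are hypothetical names, and the sketch assumes IPv4 only.

```rust
use std::collections::BTreeSet;
use std::net::Ipv4Addr;

/// Hypothetical inclusive address range belonging to a pool.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct IpRange {
    pub first: Ipv4Addr,
    pub last: Ipv4Addr,
}

impl IpRange {
    pub fn contains(&self, ip: Ipv4Addr) -> bool {
        self.first <= ip && ip <= self.last
    }
}

/// Hypothetical allocation errors.
#[derive(Debug, PartialEq, Eq)]
pub enum AllocError {
    NotInPool,
    InUse,
    Exhausted,
}

/// Hypothetical in-memory pool: known ranges plus the addresses in use.
pub struct Pool {
    pub ranges: Vec<IpRange>,
    pub in_use: BTreeSet<Ipv4Addr>,
}

impl Pool {
    pub fn new(ranges: Vec<IpRange>) -> Self {
        Pool { ranges, in_use: BTreeSet::new() }
    }

    /// One allocation path for both cases, so the "is this address inside
    /// a known range?" check is written once.
    pub fn allocate(
        &mut self,
        explicit_ip: Option<Ipv4Addr>,
    ) -> Result<Ipv4Addr, AllocError> {
        match explicit_ip {
            Some(ip) => {
                // The caller asked for one specific address: validate it
                // against the pool's ranges, then mark it as in use.
                if !self.ranges.iter().any(|r| r.contains(ip)) {
                    return Err(AllocError::NotInPool);
                }
                if !self.in_use.insert(ip) {
                    return Err(AllocError::InUse);
                }
                Ok(ip)
            }
            None => {
                // No preference: take the lowest free address in any range.
                for range in &self.ranges {
                    let first = u32::from(range.first);
                    let last = u32::from(range.last);
                    for raw in first..=last {
                        let ip = Ipv4Addr::from(raw);
                        if self.in_use.insert(ip) {
                            return Ok(ip);
                        }
                    }
                }
                Err(AllocError::Exhausted)
            }
        }
    }
}
```

In this sketch, calling `pool.allocate(Some(ip))` with an address outside every range fails with `NotInPool`, while `pool.allocate(None)` falls back to the next free address. Both go through the same bookkeeping.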
At the time of writing, #686 introduces the first use of RSS, which makes requests to the Sled Agent to create datasets and services.
Aside from the datasets necessary for initializing Nexus (that is, Nexus itself and CRDB), these service requests should be handled by Nexus "as much as possible" instead of by RSS.
Many operations can trigger a need to request these services or partitions.
All of these conditions are ongoing rather than one-time, so they are best handled by Nexus, which maintains a "global" view of the rack and exists beyond initialization.
Fortunately, the APIs defined by the Sled Agent should (more-or-less) remain the same - this issue just addresses the matter of "who calls them".
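As a rough illustration of the "who calls them" point, here is a minimal sketch in which the sled-agent-facing interface is identical for both callers and only ownership of each request differs. Every name here (`Request`, `Caller`, `SledAgent`, `FakeSledAgent`, `owner`, `provision`) is hypothetical and is not the actual Sled Agent API. The ownership rule only encodes the statement above that Nexus itself and CRDB stay with RSS.

```rust
/// Hypothetical stand-in for something a sled agent can be asked to create.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Request {
    Dataset { zpool: String, kind: String },
    Service { name: String },
}

/// Who issues a request. Illustrative only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Caller {
    Rss,
    Nexus,
}

/// Hypothetical sled agent interface. It does not depend on the caller.
pub trait SledAgent {
    fn ensure(&mut self, request: Request) -> Result<(), String>;
}

/// In-memory fake that records what it was asked to create.
#[derive(Default)]
pub struct FakeSledAgent {
    pub applied: Vec<Request>,
}

impl SledAgent for FakeSledAgent {
    fn ensure(&mut self, request: Request) -> Result<(), String> {
        // Repeating a request is a no-op in this sketch.
        if !self.applied.contains(&request) {
            self.applied.push(request);
        }
        Ok(())
    }
}

/// Assumed ownership rule: only what Nexus needs in order to start
/// (Nexus itself and its CRDB dataset) stays with RSS.
pub fn owner(request: &Request) -> Caller {
    match request {
        Request::Service { name } if name.as_str() == "nexus" => Caller::Rss,
        Request::Dataset { kind, .. } if kind.as_str() == "cockroachdb" => Caller::Rss,
        _ => Caller::Nexus,
    }
}

/// Send only the requests owned by `caller`; returns how many were sent.
/// The same function serves RSS and Nexus because the API is shared.
pub fn provision<A: SledAgent>(
    caller: Caller,
    agent: &mut A,
    requests: &[Request],
) -> Result<usize, String> {
    let mut sent = 0;
    for request in requests.iter().filter(|r| owner(r) == caller) {
        agent.ensure(request.clone())?;
        sent += 1;
    }
    Ok(sent)
}
```

Under these assumptions, RSS would call `provision(Caller::Rss, ...)` once during rack setup, and Nexus would call `provision(Caller::Nexus, ...)` whenever its view of the rack changes.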