Configure boundary services with dendrite
Update softnpu & dendrite dependencies
Update commands in helper scripts to use `swadm`
Update sled-agent config.toml to use softnpu override feature
Update a-to-z docs to reflect new process
internet-diglett committed Feb 15, 2023
1 parent 55b64b7 commit 6b27f76
Showing 23 changed files with 450 additions and 194 deletions.
3 changes: 3 additions & 0 deletions .github/buildomat/jobs/package.sh
@@ -18,16 +18,19 @@ cargo --version
rustc --version

ptime -m ./tools/install_builder_prerequisites.sh -yp
ptime -m ./tools/install_softnpu_machinery.sh
ptime -m ./tools/create_self_signed_cert.sh -yp

ptime -m cargo run --locked --release --bin omicron-package -- package

files=(
out/*.tar
out/softnpu/*
package-manifest.toml
smf/sled-agent/config.toml
target/release/omicron-package
tools/create_virtual_hardware.sh
tools/scrimlet/*
)
ptime -m tar cvzf /work/package.tar.gz "${files[@]}"
mkdir -p /work/zones
170 changes: 68 additions & 102 deletions docs/boundary-services-a-to-z.adoc
@@ -13,89 +13,74 @@ Running Omicron (Non-Simulated) document.
== 1. Setup virtual hardware

----
pfexec ./tools/create_virtual_hardware.sh <wan interface>
PHYSICAL_LINK=<wan interface> pfexec ./tools/create_virtual_hardware.sh
----
Note that the `PHYSICAL_LINK` environment variable is optional. If not supplied,
the first link in `dladm show-phys` will be used.
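The fallback described above can be sketched as follows. This is an illustration in Rust, not the script's actual shell logic, which this diff does not show. The function name `resolve_physical_link` is hypothetical, and the sketch assumes the captured `dladm show-phys` output has one header row followed by rows whose first column is the link name.

```rust
/// Pick the physical link to use: an explicit, non-empty override wins;
/// otherwise fall back to the first link listed in the captured
/// `dladm show-phys` output (assumed: one header row, link name in column 1).
fn resolve_physical_link(override_link: Option<&str>, show_phys_output: &str) -> Option<String> {
    if let Some(link) = override_link.filter(|l| !l.is_empty()) {
        return Some(link.to_string());
    }
    show_phys_output
        .lines()
        .skip(1) // skip the header row
        .filter_map(|line| line.split_whitespace().next())
        .next()
        .map(str::to_string)
}
```

A caller would pass something like `std::env::var("PHYSICAL_LINK").ok().as_deref()` as the override, so an unset or empty variable falls through to the first listed link.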

The virtual hardware is a bit different than what's currently being used. What
we'll eventually wind up with looks like this.

image::plumbing.png[]

The `softnpu` zone will be configured and launched during the `create_virtual_hardware.sh`
script.

== 2. Build and install the control plane.

----
./tools/create_self_signed_cert.sh
cargo run --release --bin omicron-package -- package
pfexec cargo run --release --bin omicron-package -- install
cargo build --release --bin omicron-package
./target/release/omicron-package -t switch_variant=softnpu package
pfexec ./target/release/omicron-package -t switch_variant=softnpu install
----

The control plane is now starting. Refer to the Running Omicron (Non-Simulated)
doc for details on determining when things are ready to go.


== 3. Launch and configure the softnpu zone

Launch the zone.

----
pfexec ./tools/scrimlet/create-softnpu-zone.sh
----

Configure the softnpu zone. The following will drop you into a zone shell.

----
pfexec zlogin softnpu
----

Now run softnpu.

----
root@scrimlet:~# cd /stuff/
root@scrimlet:/stuff# ./softnpu softnpu.toml
Config {
p4_program: "/stuff/libsidecar_lite.so",
ports: [
Port {
sidecar: "sc0_0",
scrimlet: "sr0_0",
mtu: 1600,
},
Port {
sidecar: "sc0_1",
scrimlet: "sr0_1",
mtu: 1500,
},
],
}
----

Back in the global zone, softnpu can be configured.

----
ry@korgano: cd /opt/softnpu/stuff
ry@korgano: pfexec ./softnpu-init.sh
[00:00:01] ######################################## 14.31 MiB/14.31 MiB done
local v6:
fe80::aae1:deff:fe01:701c
fe80::aae1:deff:fe01:701d
fd00:99::1
local v4:
router v6:
fd00:1122:3344:101::/64 -> fe80::aae1:deff:fe00:1 (1)
router v4:
0.0.0.0/0 -> 10.100.0.1 (2)
resolver v4:
10.100.0.1 -> 90:ec:77:2e:70:27
resolver v6:
fe80::aae1:deff:fe00:1 -> a8:e1:de:00:00:01
nat_v4:
10.100.0.6 1024/65535 -> fd00:1122:3344:101:: 8717766/a8:40:25:f0:51:75
nat_v6:
port_mac:
1: a8:e1:de:01:70:1c
2: a8:e1:de:01:70:1d
icmp_v6:
icmp_v4:
Once the control plane is running, `softnpu` can be configured via `dendrite`
using `swadm`. An example script is provided in `tools/scrimlet/softnpu-init.sh`.
This script should work without modification for basic development setups,
but feel free to tweak it as needed.

----
$ ./tools/scrimlet/softnpu-init.sh
++ netstat -rn
++ grep default
++ awk -F ' ' '{print $2}'
+ GATEWAY_IP=10.85.0.1
+ echo 'Using 10.85.0.1 as gateway ip'
Using 10.85.0.1 as gateway ip
++ arp 10.85.0.1
++ awk -F ' ' '{print $4}'
+ gateway_mac=68:d7:9a:1f:77:a1
+ ./out/softnpu/swadm -h '[fd00:1122:3344:101::2]' port add 1:0 100G RS
+ ./out/softnpu/swadm -h '[fd00:1122:3344:101::2]' port add 2:0 100G RS
+ ./out/softnpu/swadm -h '[fd00:1122:3344:101::2]' addr add 1:0 fe80::aae1:deff:fe01:701c
+ ./out/softnpu/swadm -h '[fd00:1122:3344:101::2]' addr add 2:0 fe80::aae1:deff:fe01:701d
+ ./out/softnpu/swadm -h '[fd00:1122:3344:101::2]' addr add 1:0 fd00:99::1
+ ./out/softnpu/swadm -h '[fd00:1122:3344:101::2]' route add fd00:1122:3344:0101::/64 1:0 fe80::aae1:deff:fe00:1
+ ./out/softnpu/swadm -h '[fd00:1122:3344:101::2]' arp add fe80::aae1:deff:fe00:1 a8:e1:de:00:00:01
+ ./out/softnpu/swadm -h '[fd00:1122:3344:101::2]' route add 0.0.0.0/0 2:0 10.85.0.1
+ ./out/softnpu/swadm -h '[fd00:1122:3344:101::2]' arp add 10.85.0.1 68:d7:9a:1f:77:a1
+ ./out/softnpu/swadm -h '[fd00:1122:3344:101::2]' port list
NAME MEDIA SPEED FEC ENA LINK MAC
1:0 Copper 100G RS Ena Up a8:40:25:71:e3:82
2:0 Copper 100G RS Ena Up a8:40:25:71:e3:83
+ ./out/softnpu/swadm -h '[fd00:1122:3344:101::2]' addr list
Port IPv4 IPv6
1:0 fd00:99::1
fe80::aae1:deff:fe01:701c
2:0 fe80::aae1:deff:fe01:701d
+ ./out/softnpu/swadm -h '[fd00:1122:3344:101::2]' route list
Subnet Port Gateway
0.0.0.0/0 2:0 10.85.0.1
fd00:1122:3344:101::/64 1:0 fe80::aae1:deff:fe00:1
+ ./out/softnpu/swadm -h '[fd00:1122:3344:101::2]' arp list
host mac age
10.85.0.1 68:d7:9a:1f:77:a1 0s
fe80::aae1:deff:fe00:1 a8:e1:de:00:00:01 0s
----
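The trace above shows the script deriving the gateway IP from `netstat -rn` (second field of the `default` line) and the gateway MAC from `arp` (fourth field). A minimal Rust sketch of that same field extraction, operating on already-captured command output, might look like the following. The function names are hypothetical, and unlike the shell pipeline this sketch keeps only the first matching line.

```rust
/// Return the second whitespace-separated field of the first line that
/// mentions "default", mirroring `grep default | awk '{print $2}'`.
fn default_gateway(netstat_rn: &str) -> Option<&str> {
    netstat_rn
        .lines()
        .filter(|line| line.contains("default"))
        .filter_map(|line| line.split_whitespace().nth(1))
        .next()
}

/// Return the fourth whitespace-separated field of the first non-empty line,
/// mirroring `awk '{print $4}'` applied to the `arp <ip>` output.
fn gateway_mac(arp_output: &str) -> Option<&str> {
    arp_output
        .lines()
        .find(|line| !line.trim().is_empty())
        .and_then(|line| line.split_whitespace().nth(3))
}
```

Both helpers return `None` rather than an empty string when the expected field is missing, which makes it easier to stop before issuing any `swadm` commands with a blank gateway.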

== 4. Populating the system
@@ -108,12 +93,19 @@ to here are the following.
- The address range in the IP pool should be on a subnet in your local network that
can NAT out to the Internet.
- Be sure to set up an external IP for the instance you create.
- You will need to set up `proxy-arp` if your VM external IP addresses are on the
same L2 network as the router or other non-Oxide hosts.

Once your host has been populated with images, you can use the script at
`tools/quickstart.sh` to quickly create a VM and set up the `proxy-arp`. Please
be sure to edit the `ip_pool_start` and `ip_pool_end` variables to match your
desired address ranges.

== 5. Configuring scrimlet/sidecar

At this point we have an instance up and running. At the time of writing there is
no control-plane-driven boundary services automation, so we're going to
configure the scrimlet it by hand.
configure the scrimlet by hand.

First we need to collect some information. In particular we need to know about
the virtual network our instance is sitting on. We can get that info from
@@ -152,44 +144,18 @@ field in the table. Let's assume that is `10.100.0.6` for this example.

Now we need to go tell boundary services about this information.

Log back into the scrimlet VM

----
./out/propolis/propolis-cli --server 127.0.0.1 serial
----

Go back to the `/opt/cargo-bay` and open up `softnpu-init.sh` in an editor.
There are a few things we need to edit here. Locate the line with the following
content.

----
./softnpuadm add-nat4 10.100.0.6 1024 65535 fd00:1122:3344:0101:: 8717766 a8:40:25:f0:51:75
----

Edit this line to use the information we gathered above. For the specific
information I have for this run, this look like:

----
./softnpuadm add-nat4 10.100.0.6 1024 65535 fd00:1122:3344:101::1 15103089 A8:40:25:F2:84:3F
----

While editing this file, also note the comments guiding you to change the
upstream gateway IP and MAC addresses. The MAC address is the same one you would
use for the OPTE hack. The IP address honestly does not matter a whole lot for
this setup since it's a default route. Just make sure the address used for the
gateway IP is the same in both places.

Now run
This can be accomplished via the following `swadm` command:

----
./softnpu-init.sh
./out/softnpu/swadm -h "[fd00:1122:3344:101::2]" nat add \
-e 10.100.0.6 \
-l 1024 \
-h 65535 \
-i fd00:1122:3344:101::1 \
-m A8:40:25:F2:84:3F \
-v 15103089
----
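Note that `-h` appears twice in the command above: once before the subcommand as the `swadm` host, and once after `nat add` as the high port. A small Rust sketch that assembles the same argument list makes the ordering explicit. The `NatEntry` struct, its field names, and `nat_add_args` are hypothetical; the field meanings are my reading of the values gathered earlier in this document, and only the flags shown above are used.

```rust
/// Values gathered for one NAT mapping (field names are illustrative).
struct NatEntry<'a> {
    external_ip: &'a str,
    port_low: u16,
    port_high: u16,
    internal_ip: &'a str,
    inner_mac: &'a str,
    vni: u32,
}

/// Build the argument list for `swadm ... nat add` in the order shown above.
/// The first `-h` is the dendrite host; the second `-h` is the high port.
fn nat_add_args(host: &str, entry: &NatEntry) -> Vec<String> {
    let low = entry.port_low.to_string();
    let high = entry.port_high.to_string();
    let vni = entry.vni.to_string();
    [
        "-h", host, "nat", "add",
        "-e", entry.external_ip,
        "-l", low.as_str(),
        "-h", high.as_str(),
        "-i", entry.internal_ip,
        "-m", entry.inner_mac,
        "-v", vni.as_str(),
    ]
    .iter()
    .map(|s| s.to_string())
    .collect()
}
```

The resulting vector could be handed to `std::process::Command::new("./out/softnpu/swadm").args(...)`, which avoids shell quoting of the bracketed IPv6 host.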

This will reconfigure the ASIC (you could also run just
`./softnpuadm remove-<x> ...` and `./softnpuadm add-<x>` if you feel like
being more surgical) with a boundary services config that will give your
instance access to the Internet.

----
ry@korgano:~/omicron$ ~/propolis/target/release/propolis-cli --server fd00:1122:3344:101::c serial
11 changes: 11 additions & 0 deletions docs/how-to-run.adoc
@@ -48,6 +48,17 @@ file-based ZFS vdevs and ZFS zpools on top of those, and a couple of VNICs. The
vdevs model the actual U.2s that will be in a Gimlet, and the VNICs model the
two Chelsio NIC ports.

Set the `GATEWAY_IP` variable when running the `create_virtual_hardware.sh` script
to override the default logic used to automatically determine the gateway IP.
This variable is used to configure `softnpu` with a default route for external /
internet connectivity. Set the `PHYSICAL_LINK` environment variable to override
the default logic used to automatically determine the physical network connection
used for external communication. For example:

----
$ GATEWAY_IP=10.85.0.1 PHYSICAL_LINK=ixgbe0 pfexec ./tools/create_virtual_hardware.sh
----
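Because `GATEWAY_IP` ends up in a default route, it can be worth validating an override before using it. The sketch below is an assumption about how one might do that in Rust; the script itself is shell and is not shown here to perform this check. The function name `gateway_from_override` is hypothetical.

```rust
use std::net::{AddrParseError, IpAddr};

/// Interpret an optional `GATEWAY_IP` override.
/// - unset or blank: `Ok(None)`, meaning "fall back to auto-detection"
/// - a valid IPv4 or IPv6 address: `Ok(Some(addr))`
/// - anything else: an error, so a typo does not become a bogus route
fn gateway_from_override(raw: Option<&str>) -> Result<Option<IpAddr>, AddrParseError> {
    match raw.map(str::trim).filter(|s| !s.is_empty()) {
        Some(s) => s.parse::<IpAddr>().map(Some),
        None => Ok(None),
    }
}
```

A caller would typically pass `std::env::var("GATEWAY_IP").ok().as_deref()` as the argument.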

You can clean up these resources with `pfexec ./tools/destroy_virtual_hardware.sh`.
This script requires Omicron be uninstalled, e.g., with `pfexec
./target/release/omicron-package uninstall`, and a warning will be printed if
47 changes: 42 additions & 5 deletions package-manifest.toml
@@ -161,8 +161,10 @@ only_for_targets.switch_variant = "stub"
# 3. Use source.type = "manual" instead of "prebuilt"
source.type = "prebuilt"
source.repo = "dendrite"
source.commit = "10e305a52c45bee91ffb16f6d3ad7a5cc3100e73"
source.sha256 = "3142e71f7eb61e258dab3f6bf02e37ccfc2a0b2eeb134aadfe62603092175430"
# TODO: @internet-diglett
# point this back to dendrite `main` before merging
source.commit = "4a288a85e31c737a5b197dea91e543dd475a0fd0"
source.sha256 = "046c14c995579a783edf22e19429729a9099e8a2271cd550fa91568a43c5129a"
output.type = "zone"
output.intermediate_only = true

@@ -179,8 +181,30 @@ only_for_targets.switch_variant = "asic"
# 3. Use source.type = "manual" instead of "prebuilt"
source.type = "prebuilt"
source.repo = "dendrite"
source.commit = "10e305a52c45bee91ffb16f6d3ad7a5cc3100e73"
source.sha256 = "5bf81b8678fde53508d50cda973e2c8eacbe8b29a2e76e3809cdb1156c632702"
# TODO: @internet-diglett
# point this back to dendrite `main` before merging
source.commit = "4a288a85e31c737a5b197dea91e543dd475a0fd0"
source.sha256 = "130df5345ebd7e17ffe9445b564bc5ad020349b8571cc509e8dfad57e678e7dd"
output.type = "zone"
output.intermediate_only = true

[package.dendrite-softnpu]
service_name = "dendrite"
only_for_targets.switch_variant = "softnpu"
# To manually override the package source:
#
# 1. Build the zone image manually
# 1a. cd <dendrite tree>
# 1b. cargo build --features=softnpu --release
# 1c. cargo xtask dist -o -r --features softnpu
# 2. Copy dendrite.tar.gz from dendrite/out to omicron/out/dendrite-softnpu.tar.gz
# 3. Use source.type = "manual" instead of "prebuilt"
source.type = "prebuilt"
source.repo = "dendrite"
# TODO: @internet-diglett
# point this back to dendrite `main` before merging
source.commit = "4a288a85e31c737a5b197dea91e543dd475a0fd0"
source.sha256 = "c127e281d9643f2a7daa74510a96c04d4ebf9b51d45b2ca58a37f3ca96d39899"
output.type = "zone"
output.intermediate_only = true

@@ -197,7 +221,7 @@ output.type = "zone"

# To package and install the stub variant of the switch, do the following:
#
# - Set the sled agent's configuration option "stub_scrimlet" to "true"
# - Set the sled agent's configuration option "scrimlet_override" to "stub"
# - Run the following:
# $ cargo run --release -p omicron-package -- -t switch_variant=stub package
# $ pfexec ./target/release/omicron-package -t switch_variant=stub install
@@ -207,3 +231,16 @@ only_for_targets.switch_variant = "stub"
source.type = "composite"
source.packages = [ "omicron-gateway.tar.gz", "dendrite-stub.tar.gz", "wicketd.tar.gz", "wicket.tar.gz" ]
output.type = "zone"

# To package and install the softnpu variant of the switch, do the following:
#
# - Set the sled agent's configuration option "scrimlet_override" to "softnpu"
# - Run the following:
# $ cargo run --release -p omicron-package -- -t switch_variant=softnpu package
# $ pfexec ./target/release/omicron-package -t switch_variant=softnpu install
[package.switch-softnpu]
service_name = "switch"
only_for_targets.switch_variant = "softnpu"
source.type = "composite"
source.packages = [ "omicron-gateway.tar.gz", "dendrite-softnpu.tar.gz", "wicketd.tar.gz", "wicket.tar.gz" ]
output.type = "zone"
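The manifest entries above use `only_for_targets.switch_variant` so that passing `-t switch_variant=softnpu` picks the `softnpu` packages and skips the `stub` and `asic` ones. A minimal Rust sketch of that kind of filtering follows. It is an illustration under assumptions, not the actual `omicron-package` matching logic: the `Package` struct and `packages_for` are hypothetical, and packages without a constraint are assumed to apply to every target.

```rust
/// A manifest entry reduced to the two fields relevant here.
struct Package<'a> {
    name: &'a str,
    /// Value of `only_for_targets.switch_variant`, if the package sets one.
    switch_variant: Option<&'a str>,
}

/// Return the names of the packages that apply to the requested variant.
/// Assumption: a package with no constraint is selected for every variant.
fn packages_for<'a>(packages: &[Package<'a>], switch_variant: &str) -> Vec<&'a str> {
    packages
        .iter()
        .filter(|p| p.switch_variant.map_or(true, |v| v == switch_variant))
        .map(|p| p.name)
        .collect()
}
```

With entries for `dendrite-stub`, `dendrite-asic`, `dendrite-softnpu`, and `switch-softnpu`, requesting `softnpu` would yield only the last two plus any unconstrained packages.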
10 changes: 8 additions & 2 deletions package/src/bin/omicron-package.rs
@@ -577,8 +577,14 @@ async fn do_clean(
"Removing artifacts from {}",
artifact_dir.to_string_lossy()
);
const ARTIFACTS_TO_KEEP: &[&str] =
&["clickhouse", "cockroachdb", "xde", "console-assets", "downloads", "softnpu"];
const ARTIFACTS_TO_KEEP: &[&str] = &[
"clickhouse",
"cockroachdb",
"xde",
"console-assets",
"downloads",
"softnpu",
];
remove_all_except(artifact_dir, ARTIFACTS_TO_KEEP, &config.log)?;
info!(
config.log,
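The hunk above only reformats `ARTIFACTS_TO_KEEP`; the body of `remove_all_except` is not part of this diff. As a rough idea of what a keep-list cleanup does, here is a std-only sketch. The name `remove_all_except_sketch` and its signature are hypothetical (the real function also takes a logger), and the behavior shown, deleting every top-level entry whose name is not in the keep list, is an assumption.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Delete every entry directly under `dir` whose name is not listed in `keep`.
fn remove_all_except_sketch(dir: &Path, keep: &[&str]) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let name = entry.file_name();
        // Non-UTF-8 names can never match the keep list, so they are removed.
        let kept = name.to_str().map_or(false, |n| keep.iter().any(|k| *k == n));
        if kept {
            continue;
        }
        if entry.file_type()?.is_dir() {
            fs::remove_dir_all(entry.path())?;
        } else {
            fs::remove_file(entry.path())?;
        }
    }
    Ok(())
}
```

Keeping `softnpu` in the list matters for this change because the `swadm` binary used by the helper scripts lives under `out/softnpu`.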
4 changes: 2 additions & 2 deletions sled-agent/src/bootstrap/hardware.rs
@@ -131,15 +131,15 @@ impl HardwareMonitor {
switch_zone_bootstrap_address: Ipv6Addr,
) -> Result<Self, Error> {
let hardware =
HardwareManager::new(log.clone(), sled_config.stub_scrimlet)
HardwareManager::new(log.clone(), sled_config.scrimlet_override)
.map_err(|e| Error::Hardware(e))?;

let service_manager = ServiceManager::new(
log.clone(),
underlay_etherstub.clone(),
underlay_etherstub_vnic.clone(),
bootstrap_etherstub,
sled_config.stub_scrimlet,
sled_config.scrimlet_override,
sled_config.sidecar_revision.clone(),
switch_zone_bootstrap_address,
)
19 changes: 17 additions & 2 deletions sled-agent/src/config.rs
@@ -17,8 +17,8 @@ pub struct Config {
/// Configuration for the sled agent debug log
pub log: ConfigLogging,
/// Optionally force the sled to self-identify as a scrimlet (or gimlet,
/// if set to false).
pub stub_scrimlet: Option<bool>,
/// if set to Disabled).
pub scrimlet_override: Option<ScrimletMode>,
// TODO: Remove once this can be auto-detected.
pub sidecar_revision: String,
/// Optional VLAN ID to be used for tagging guest VNICs.
@@ -32,6 +32,21 @@ pub struct Config {
pub data_link: Option<PhysicalLink>,
}

/// Configuration for forcing a sled to run as a Scrimlet
#[derive(Clone, Debug, Deserialize, Copy)]
#[serde(rename_all = "snake_case")]
pub enum ScrimletMode {
/// Force sled to run as a Gimlet
/// TODO: @internet-diglett
/// this is to preserve the old behavior of `stub_scrimlet = false`,
/// but I haven't found where that logic has actually been leveraged...
Disabled,
/// Force sled to run in Scrimlet mode with a stub switch
Stub,
/// Force sled to run in Scrimlet mode with a Softnpu switch
Softnpu,
}

#[derive(Debug, thiserror::Error)]
pub enum ConfigError {
#[error("Failed to read config from {path}: {err}")]
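With `#[serde(rename_all = "snake_case")]`, the `ScrimletMode` variants are written in `config.toml` as `disabled`, `stub`, and `softnpu`. The sketch below mirrors that mapping using only the standard library, so it can be read without the `serde` and `toml` crates. It is a stand-in for illustration: the real code derives `Deserialize`, and the `FromStr` impl and error message here are hypothetical.

```rust
use std::str::FromStr;

/// Local copy of the enum from the diff, for illustration only.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ScrimletMode {
    Disabled,
    Stub,
    Softnpu,
}

impl FromStr for ScrimletMode {
    type Err = String;

    /// Accept exactly the snake_case spellings that serde would accept.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "disabled" => Ok(Self::Disabled),
            "stub" => Ok(Self::Stub),
            "softnpu" => Ok(Self::Softnpu),
            other => Err(format!("unknown scrimlet_override value: {other}")),
        }
    }
}
```

The match is case-sensitive, so a value such as `Softnpu` is rejected; leaving the option out entirely corresponds to `None` in `Option<ScrimletMode>`.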