This is an umbrella tracking issue for managing the disparate bits of work for implementing a "real" networking stack in Omicron, Dendrite, and OPTE.
Background
As a short-term hack, which turned out to be not-so-short-lived, we merged the host and guest networks. This allowed the guest to piggy-back on the host network, using whatever fabric the host happened to live in for delivering traffic. OPTE abuses the source NAT configuration provided to guests to do this. Instead of encapsulating guest frames in Geneve headers, the addresses (both L3 and L2) are rewritten: the IP address is rewritten to the guest's external NAT address, and the MAC of the host network's gateway is used as the next L2 hop.
This all works, but is both gross and completely different from how guest traffic will flow in the product. Fixing this requires both new functionality and removing various knock-on hacks necessitated by the original hack. This is all tracked here.
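To make the difference concrete, here is a rough sketch of the two paths described above. It is a toy model, not OPTE code: the `Frame` and `GenevePacket` types, both function names, and the underlay address parameters are hypothetical and exist only for illustration. It also assumes outbound traffic, so "the IP address" is taken to mean the source IP. The one external fact used is the IANA-assigned Geneve UDP port, 6081.

```python
from dataclasses import dataclass, replace

GENEVE_UDP_PORT = 6081  # IANA-assigned UDP port for Geneve (RFC 8926)


@dataclass(frozen=True)
class Frame:
    """Toy model of a guest Ethernet frame (hypothetical, illustration only)."""
    src_mac: str
    dst_mac: str
    src_ip: str
    dst_ip: str
    payload: bytes


@dataclass(frozen=True)
class GenevePacket:
    """Toy model of an encapsulated packet: outer headers plus untouched inner frame."""
    outer_src_ip: str
    outer_dst_ip: str
    udp_dst_port: int
    vni: int
    inner: Frame


def rewrite_for_host_network(frame: Frame, external_nat_ip: str, gateway_mac: str) -> Frame:
    """The hack: rewrite L3 and L2 addresses in place, with no encapsulation.

    Assumes an outbound frame, so the source IP becomes the guest's external
    NAT address and the destination MAC becomes the host network's gateway.
    """
    return replace(frame, src_ip=external_nat_ip, dst_mac=gateway_mac)


def encapsulate_geneve(frame: Frame, vni: int, underlay_src: str, underlay_dst: str) -> GenevePacket:
    """The intended path: leave the guest frame alone and wrap it in outer headers."""
    if not 0 <= vni < 2**24:
        raise ValueError("Geneve VNI is a 24-bit value")
    return GenevePacket(underlay_src, underlay_dst, GENEVE_UDP_PORT, vni, frame)
```

The point of the contrast: with the rewrite, the guest's original addresses are gone once the frame leaves the host, whereas encapsulation carries the guest frame intact inside the outer headers.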
Issues and items
https://github.com/oxidecomputer/dendrite/pull/18. A key reason we implemented this hack in the first place was the glacial speed of the Tofino simulator. @rcgoodfellow implemented SoftNPU, a software emulation of a P4 program like the one run on the real Tofino. This should allow us to run an accurate emulation of the ASIC's behavior without requiring a physical ASIC. This PR adds support to Dendrite for using SoftNPU as a backend, which will be crucial for further integration work.
Integrate SoftNPU as virtual hardware #2089. Automated stand-up of SoftNPU environment in Omicron. @rcgoodfellow wrote up instructions for setting up Omicron in a modified way to take advantage of the SoftNPU work. Those scripts need to be removed, and their functionality subsumed as much as possible into either (1) the existing tools/{create,destroy}_virtual_hardware.sh scripts that we run or (2) Omicron itself.
Initial integration with Dendrite #1465. Dendrite is currently run as a service zone, launched by the sled agent. That's as far as the integration with Omicron extends at the moment. The linked tracking issue covers the core set of features required for Nexus to direct Dendrite's management of the switch, including setting up switch ports during initialization and providing information about the addressing of guests.
The Nexus external networking APIs. RFD 267 lays out a bunch of deployment scenarios, describing some likely ways that customers will integrate the rack with their existing networking infrastructure, and describes the currently proposed API. We have a simple way to describe external IP address pools today, but a fair chunk of these endpoints still need to be implemented. The likely first targets are:
Provide IP addresses for our routers within those networks.
Configure VLANs and describe the physical links those run over and the L3 networks they support.
There are no issues here yet, but @internet-diglett is starting work on this.
x4c/SoftNPU: Remote Access Preview MVP p4#2. The other side of SoftNPU support is our P4 compiler and the port of the sidecar.p4 code from Dendrite called sidecar-lite.p4 (eventually, this will give way to just using sidecar.p4). This issue tracks critical capabilities in the compiler and the sidecar-lite.p4 port.
Cleanup from OPTE external IP workaround #1338. Like all hacks, this one metastasized. There are a few places in the sled agent where we implement other hacks. These need to be removed.
Remove the "external IP hack" opte#236. This tracks removing the hack itself from OPTE. There is a list of commits to be reverted or otherwise mitigated, and most of the code is well-marked with XXX-EXT-IP in the various places we need to fix things.
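Since the remaining work is marked in the source with XXX-EXT-IP, a quick way to see what is left is to list every marked line. The following is a minimal sketch: the function names are made up for illustration, and the default `.rs` suffix is an assumption that the markers live in Rust sources. A plain recursive grep for the marker does the same job.

```python
import os

MARKER = "XXX-EXT-IP"


def find_markers(lines, marker=MARKER):
    """Return the 1-based line numbers of the lines that contain the marker."""
    return [n for n, line in enumerate(lines, start=1) if marker in line]


def scan_tree(root, suffix=".rs"):
    """Yield (path, line_number) for every marked line in matching files under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(suffix):
                continue
            path = os.path.join(dirpath, name)
            # Tolerate stray non-UTF-8 bytes rather than aborting the scan.
            with open(path, encoding="utf-8", errors="replace") as f:
                for n in find_markers(f):
                    yield path, n
```

For example, `for path, n in scan_tree("."): print(f"{path}:{n}")` prints one `path:line` entry per remaining marker when run from the root of a checkout.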