OPTE for Control Plane Zone Comms #127
Comments
So, the prior intent was that we would leverage OPTE when a given service needed to communicate with, say, the outside world, but not for in-rack traffic, and that a zone that needed to exist on both would have two interfaces: one with the semantics of a traditional customer-style interface and one without. Can you provide more details about the use of OPTE for non-external traffic? It seems like the underlying issue driving us here is that we want the different zones in exclusive netstacks to be able to talk to each other. Given the whole virtual switch you described, why isn't that just an etherstub?
I guess, just to add another general thought here, the fundamental thing is that if we start using OPTE for something like control plane bootstrap, then we get into a chicken-and-egg scenario. One of the main earlier architectural decisions (which we can revisit) is that OPTE wasn't used in the main implementation of non-external control plane services because of this. One of the ways in which OPTE isn't the same as an etherstub / virtual switch is that (I think) we don't really do any true L2 activity. While there is a local loopback, unlike a traditional virtual switch or etherstub, it has to be told what to do. That is, by default, OPTE has no connectivity between things without being told exactly what it should do, and the thing telling it what to do (in part to avoid split brain) was always designed to be the general control plane, e.g. directives issued by omicron/nexus via sled agent. To try to put together a bit more of an image of this, I'd imagine something that looks somewhat like this for the general-case control plane zones (e.g. things without external connectivity):
This is a bit of a hasty sketch, so maybe not very clear. If we had a zone that needed to communicate both externally and internally, e.g. say something implementing the public API:
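To make the distinction drawn above concrete, here is a toy Python model. It is only a sketch under stated assumptions: `Etherstub` and `RuleDrivenSwitch` are hypothetical names, neither is the illumos or OPTE API, and the second class only loosely stands in for OPTE's "no connectivity until told" behaviour.

```python
class Etherstub:
    """Toy learning L2 switch: learns source MACs, floods unknown destinations."""

    def __init__(self):
        self.ports = {}      # port name -> list of delivered frames
        self.mac_table = {}  # MAC address -> port name it was last seen on

    def attach(self, port):
        self.ports[port] = []

    def send(self, in_port, src_mac, dst_mac, payload):
        # Learn where the sender lives, with no outside configuration.
        self.mac_table[src_mac] = in_port
        out = self.mac_table.get(dst_mac)
        if out is None:
            # Unknown destination: flood to every other port.
            targets = [p for p in self.ports if p != in_port]
        elif out == in_port:
            targets = []
        else:
            targets = [out]
        for p in targets:
            self.ports[p].append((src_mac, dst_mac, payload))
        return targets


class RuleDrivenSwitch:
    """Toy default-deny forwarder: delivers only what a controller has allowed.

    Hypothetical stand-in for 'has to be told what to do'; not OPTE's real API.
    """

    def __init__(self):
        self.ports = {}
        self.rules = {}  # (in_port, dst_ip) -> out_port, installed by a controller

    def attach(self, port):
        self.ports[port] = []

    def allow(self, in_port, dst_ip, out_port):
        self.rules[(in_port, dst_ip)] = out_port

    def send(self, in_port, dst_ip, payload):
        out = self.rules.get((in_port, dst_ip))
        if out is None:
            return []  # no rule installed, so no connectivity
        self.ports[out].append((dst_ip, payload))
        return [out]
```

In this model two zones attached to the `Etherstub` can talk immediately, whereas the `RuleDrivenSwitch` drops everything until something calls `allow`, which is where the chicken-and-egg concern for control plane bootstrap comes from.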
Thanks for the feedback @rmustacc! I did not realize etherstub could be used in this way, and I think this likely simplifies things quite a bit. I'll do a bit of tinkering with this and report back.
Ok this works great. In the GZ
in the system-zone
Comms between hosts using the source/destination addresses of the GZ VNICs work the same way as having the primary underlay address on From the GZ of a host directly connected to the
@smklein is tackling this as part of Omicron #1066, and the work is tracked under Omicron #987. We can probably close this, but I defer to you @rcgoodfellow.
SGTM
The system-level zones we'll be running on sleds (for the control plane, storage, customer instances, dendrite, etc.) will require communication both directly on the underlay and on the boundary services overlay. An example of the latter is Nexus serving off-rack client requests to the user-facing API.
Given the requirement for both underlay and overlay communication, and OPTE's encapsulation capabilities combined with its general-purpose architecture, it seems like a win to leverage OPTE for service-zone communications in addition to customer instance communications.
The general situation would look like this
There are a few notable details in this diagram
lo0.
In an initial implementation, the IP addresses in the zones would be atop VNICs over the xde device. This presents the somewhat awkward situation that we need link-local addresses on these VNICs as well as on the xde device. I've got plans to relax that constraint for on-host communications, but for now, I think it's something we can probably live with.
For underlay traffic, OPTE would mostly be in pass-through mode, letting traffic flow between system-zone instances and external sources or the GZ. When OPTE detects overlay traffic, it behaves as it does for customer instances, performing encap/decap onto/from the boundary services overlay.
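The pass-through versus encap decision described above could be sketched roughly as follows. This is an illustration only: the underlay prefix, the boundary services address, and the `egress_action` function are all hypothetical, and the real classification in OPTE may look quite different.

```python
import ipaddress

# Hypothetical values, chosen only for illustration.
UNDERLAY_PREFIX = ipaddress.ip_network("fd00:1122:3344::/48")
BOUNDARY_SERVICES_ADDR = ipaddress.ip_address("fd00:1122:3344:1::1")


def egress_action(dst):
    """Return (action, outer_destination) for a packet leaving a service zone.

    Underlay destinations pass through untouched; anything else is treated
    as overlay traffic and encapsulated toward boundary services.
    """
    addr = ipaddress.ip_address(dst)
    if addr.version == 6 and addr in UNDERLAY_PREFIX:
        return ("passthrough", None)
    return ("encap", BOUNDARY_SERVICES_ADDR)


# A peer zone on the underlay versus an off-rack API client.
print(egress_action("fd00:1122:3344:101::5"))
print(egress_action("203.0.113.10"))
```

The sketch classifies purely on destination address; a zone serving the public API would hit both branches, which is the case the two-interface alternative in the comments is meant to handle without OPTE on the underlay path.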
Required Work