Related issue: Add support for 100% MCS Compliance
The current Lighthouse implementation differs from the MCS API [1] on two key points for ClusterIP Services:

- There is no VirtualIP (VIP) assigned in the `ServiceImport`.
- `EndpointSlices` do not contain individual PodIPs.
To better align with the MCS API, Lighthouse needs an option to allocate and track VirtualIPs in `ServiceImports` and to use PodIPs in `EndpointSlices`.
When creating a `ServiceExport`, users will have the option to add an annotation that enables the VIP for that Service. This annotation will also be added to the aggregated `ServiceImport`.
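For illustration, a minimal sketch in Go (using the upstream MCS API types) of a `ServiceExport` that opts into a VIP via the `lighthouse.submariner.io/use-clusterset-ip` annotation proposed later in this document; the Service name and namespace are hypothetical:

```Go
package main

import (
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	mcsv1a1 "sigs.k8s.io/mcs-api/pkg/apis/v1alpha1"
)

func main() {
	// Hypothetical ServiceExport opting into a clusterset VIP via the
	// proposed annotation. Only the annotation name comes from this
	// proposal; everything else is illustrative.
	serviceExport := &mcsv1a1.ServiceExport{
		ObjectMeta: metav1.ObjectMeta{
			Name:      "nginx",
			Namespace: "default",
			Annotations: map[string]string{
				"lighthouse.submariner.io/use-clusterset-ip": "true",
			},
		},
	}

	fmt.Println(serviceExport.Annotations)
}
```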
A configuration flag will also be added to the `Submariner` and `ServiceDiscovery` CRDs, and to `subctl`, to set the option at the global level. This is for deployments where the user wants the option set for all exported Services and doesn't want to set the annotation explicitly on each `ServiceExport`.
The default behavior with nothing configured will be the same as it is currently, i.e. no VIP and no PodIPs.

The global flag can't be changed at runtime and requires a reinstall of the whole clusterset. Changing the flag in the CRs will require a restart of Submariner on that cluster, as it changes the behavior of `ServiceExports` without the annotation.
1. A default CIDR of `243.0.0.0/8` will be used for the clusterset. The user can change this at the time of deploying the broker.
2. Each cluster will get a default subset CIDR of `243.0.0.0/16` for allocating VIPs to Services exported from that cluster. The user can change this at the time of join, but the CIDR must be an unallocated subset of the global VIP CIDR.
3. VIP allocation is done by the first cluster to export the `Service` (see the allocator sketch below).
4. When the Service is exported on subsequent clusters, no VIP allocation is done if a VIP is already present on the `ServiceImport`.
5. The VIP is deallocated only when the `ServiceImport` is deleted, i.e. when the `Service` or `ServiceExport` is no longer present on any of the clusters.
6. If Submariner is uninstalled on the cluster that allocated a given VIP, the VIP is not changed on the `ServiceImport`.
7. When the `lighthouse-agent` starts, it should check the `ServiceImports` for any VIPs allocated from its VIP CIDR and add them to its local VIP allocation cache. This also covers the scenario where the cluster that allocated a VIP has uninstalled Submariner and another cluster got the same VIP CIDR on installation.
Note that 1 and 2 above are similar to the Globalnet CIDRs.
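A minimal sketch of what per-cluster VIP allocation from the subset CIDR could look like. This is an illustration with a purely in-memory cache; the actual allocator would persist allocations via the aggregated `ServiceImports` as described above, and all names here are assumptions:

```Go
package main

import (
	"encoding/binary"
	"fmt"
	"net"
)

// vipAllocator hands out VIPs from this cluster's subset CIDR (e.g. 243.0.0.0/16).
type vipAllocator struct {
	network   *net.IPNet
	allocated map[string]bool // VIPs currently in use, keyed by string form
	next      uint32          // offset of the next candidate address within the CIDR
}

func newVIPAllocator(cidr string) (*vipAllocator, error) {
	_, network, err := net.ParseCIDR(cidr)
	if err != nil {
		return nil, err
	}

	return &vipAllocator{network: network, allocated: map[string]bool{}, next: 1}, nil
}

// Reserve marks a VIP as in use, e.g. when rebuilding the cache on startup.
func (a *vipAllocator) Reserve(vip string) {
	a.allocated[vip] = true
}

// Allocate returns the next free VIP from the CIDR, skipping the network address.
func (a *vipAllocator) Allocate() (string, error) {
	base := binary.BigEndian.Uint32(a.network.IP.To4())
	ones, bits := a.network.Mask.Size()
	size := uint32(1) << uint(bits-ones)

	for offset := a.next; offset < size; offset++ {
		ip := make(net.IP, 4)
		binary.BigEndian.PutUint32(ip, base+offset)

		if !a.allocated[ip.String()] {
			a.allocated[ip.String()] = true
			a.next = offset + 1
			return ip.String(), nil
		}
	}

	return "", fmt.Errorf("VIP CIDR %s is exhausted", a.network)
}

func main() {
	allocator, _ := newVIPAllocator("243.0.0.0/16")
	vip, _ := allocator.Allocate()
	fmt.Println("allocated VIP:", vip) // e.g. 243.0.0.1
}
```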
Lighthouse DNS will return the VIP for a `ServiceImport` if available. It will be up to some other solution, external to Submariner, to load-balance this VIP to the PodIPs in the `EndpointSlices`. If no VIP is present, Lighthouse DNS will behave as it does today, i.e. return one of the ClusterIPs from the `EndpointSlices`.
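A sketch of the DNS answer decision under these rules; `serviceRecord` and `resolve` are hypothetical stand-ins for the Lighthouse DNS plugin's actual data structures, and the real ClusterIP selection is round-robin rather than "first entry":

```Go
package main

import "fmt"

// serviceRecord is an illustrative, flattened view of an aggregated
// ServiceImport plus its EndpointSlices.
type serviceRecord struct {
	vip        string   // clusterset VIP, empty if not allocated
	clusterIPs []string // per-cluster ClusterIPs from EndpointSlices
}

// resolve returns the IP Lighthouse DNS would answer with: the VIP when one
// is present, otherwise one of the ClusterIPs (simplified to the first entry).
func resolve(rec serviceRecord) (string, bool) {
	if rec.vip != "" {
		return rec.vip, true
	}

	if len(rec.clusterIPs) > 0 {
		return rec.clusterIPs[0], true
	}

	return "", false
}

func main() {
	withVIP := serviceRecord{vip: "243.0.0.1", clusterIPs: []string{"10.96.0.10"}}
	withoutVIP := serviceRecord{clusterIPs: []string{"10.96.0.10"}}

	ip, _ := resolve(withVIP)
	fmt.Println(ip) // 243.0.0.1

	ip, _ = resolve(withoutVIP)
	fmt.Println(ip) // 10.96.0.10
}
```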
In case of a conflict in the VIP setting when exporting a `Service`, the configuration on the oldest `ServiceExport` will be used. The `ServiceExport` in conflict will be exported with the existing behavior, and the `ServiceExportConflict` condition will be set. The `Valid` condition will also be set to `true`, as the Service itself is exportable.
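A sketch of how the conflicting `ServiceExport`'s conditions could be set, using the standard `metav1.Condition` helpers from apimachinery; the reason and message strings are illustrative, not the agent's actual values:

```Go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	// Conditions as they would live in the ServiceExport status.
	var conditions []metav1.Condition

	// The newer export disagrees with the oldest one, so it carries the
	// conflict condition...
	meta.SetStatusCondition(&conditions, metav1.Condition{
		Type:    "ServiceExportConflict",
		Status:  metav1.ConditionTrue,
		Reason:  "ClustersetIPSettingConflict", // illustrative reason string
		Message: "the clusterset IP setting differs from the oldest exporting cluster",
	})

	// ...while Valid stays true, since the Service itself is exportable.
	meta.SetStatusCondition(&conditions, metav1.Condition{
		Type:   "Valid",
		Status: metav1.ConditionTrue,
		Reason: "ExportValid", // illustrative
	})

	fmt.Printf("%+v\n", conditions)
}
```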
Here are the detailed steps when creating a `ServiceExport` (a simplified sketch in Go follows the list):
- When a new `ServiceExport` is created, check if the enable-VIP annotation is present. If it is not present, use the global flag to determine whether a VIP is needed.
- If the VIP is enabled based on the condition above, check whether a `ServiceImport` exists.
- If the `ServiceImport` doesn't exist, allocate the VIP and add it to the new `ServiceImport`. Add an annotation `lighthouse.submariner.io/clusterset-ip-allocated-by=<cluster-id>` to identify the cluster that allocated the VIP.
- If a `ServiceImport` exists, compare the annotation on the `ServiceExport` and the global flag against the annotation in the `ServiceImport` for any conflicts.
- In case of conflict, set the `ServiceExportConflict` condition on the `ServiceExport` and continue with the export.
- Create `EndpointSlices` with PodIPs instead of the ClusterIP. Follow the existing rules for merging port information in `EndpointSlices`.
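Putting the steps together, a simplified sketch of the export decision flow; all types and helpers here are illustrative stand-ins for the `lighthouse-agent` controller logic:

```Go
package main

import "fmt"

const useVIPAnnotation = "lighthouse.submariner.io/use-clusterset-ip"

// serviceImport is an illustrative, trimmed-down stand-in for the
// aggregated ServiceImport.
type serviceImport struct {
	vip         string
	vipEnabled  bool
	allocatedBy string
}

// exportContext bundles everything the agent needs to decide VIP handling.
type exportContext struct {
	annotations      map[string]string
	globalVIPEnabled bool // from the Submariner/ServiceDiscovery CR
	existingImport   *serviceImport
	clusterID        string
}

// vipEnabledFor applies the first step: the per-export annotation wins;
// otherwise fall back to the global flag.
func vipEnabledFor(ctx exportContext) bool {
	if v, ok := ctx.annotations[useVIPAnnotation]; ok {
		return v == "true"
	}

	return ctx.globalVIPEnabled
}

// onExport sketches the remaining steps: the first exporter allocates, later
// exporters only detect conflicts.
func onExport(ctx exportContext, allocate func() string) (si *serviceImport, conflict bool) {
	enabled := vipEnabledFor(ctx)

	if ctx.existingImport == nil {
		si = &serviceImport{vipEnabled: enabled}
		if enabled {
			// The first exporting cluster allocates the VIP and records itself
			// in lighthouse.submariner.io/clusterset-ip-allocated-by.
			si.vip = allocate()
			si.allocatedBy = ctx.clusterID
		}

		return si, false
	}

	// Subsequent exports never reallocate; a differing setting is a conflict.
	return ctx.existingImport, ctx.existingImport.vipEnabled != enabled
}

func main() {
	first, _ := onExport(exportContext{
		annotations: map[string]string{useVIPAnnotation: "true"},
		clusterID:   "cluster-a",
	}, func() string { return "243.0.0.1" })
	fmt.Printf("first export: %+v\n", first)

	_, conflict := onExport(exportContext{existingImport: first, clusterID: "cluster-b"}, nil)
	fmt.Println("conflict on second export:", conflict)
}
```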
Changing the annotation will not modify the current behavior. It will only change the conflict condition on the `ServiceExport`, adding or resolving the conflict depending on the change. If users wish to change the behavior for a given `Service`, they will need to delete all `ServiceExports` and recreate them with the required annotations. The same behavior also applies on restart after changing the global flag.
Here are the detailed steps when deleting a `ServiceExport` (sketched in Go after the list):
- Check the cluster list on the `ServiceImport` to determine whether this is the last cluster to export the Service.
- If this is the last cluster, deallocate the VIP if it is from the same CIDR as the current cluster, and delete the `ServiceImport`.
- If not, proceed as usual irrespective of the VIP annotation. This means deleting the `EndpointSlice`, recalculating the service ports and updating the cluster list on the `ServiceImport`.
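A sketch of the unexport path under these rules; the helper signatures are hypothetical stand-ins for the agent's actual cleanup logic:

```Go
package main

import (
	"fmt"
	"net"
)

// onUnexport sketches the ServiceExport deletion path: the last exporting
// cluster deletes the ServiceImport and releases the VIP, but only when the
// VIP falls inside this cluster's own CIDR.
func onUnexport(remainingClusters []string, vip, localCIDR string,
	deallocate func(string), deleteImport func()) {
	if len(remainingClusters) > 0 {
		// Not the last exporter: only the EndpointSlice removal, port
		// recalculation and cluster-list update happen (not shown).
		return
	}

	if _, ipNet, err := net.ParseCIDR(localCIDR); err == nil && vip != "" && ipNet.Contains(net.ParseIP(vip)) {
		deallocate(vip)
	}

	deleteImport()
}

func main() {
	onUnexport(nil, "243.0.0.1", "243.0.0.0/16",
		func(vip string) { fmt.Println("deallocated", vip) },
		func() { fmt.Println("ServiceImport deleted") })
}
```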
Globalnet will not be supported yet. Supporting Globalnet with PodIPs would be a significant scale issue, as each Pod backing the Service would require a `GlobalIngressIP`. For any use cases that require Globalnet support with this feature, a separate enhancement can be proposed in the future.
Since there is no change in the default behavior, nothing should be required for migration. If users want to change the VIP behavior on existing `ServiceExports` post-deployment, they will need to delete and recreate all `ServiceExports`.
On restart, every `lighthouse-agent` should check the aggregated `ServiceImports` for IPs allocated from its CIDR and reserve them in its local allocation cache.
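A sketch of this startup pass, assuming the VIPs recorded on aggregated `ServiceImports` have already been listed; `reserve` stands in for the allocator cache described earlier and all names are illustrative:

```Go
package main

import (
	"fmt"
	"net"
)

// rebuildVIPCache scans the VIPs found on aggregated ServiceImports and
// reserves the ones falling inside this cluster's CIDR, so they are never
// handed out twice.
func rebuildVIPCache(localCIDR string, importVIPs []string, reserve func(string)) error {
	_, ipNet, err := net.ParseCIDR(localCIDR)
	if err != nil {
		return err
	}

	for _, vip := range importVIPs {
		if ip := net.ParseIP(vip); ip != nil && ipNet.Contains(ip) {
			reserve(vip)
		}
	}

	return nil
}

func main() {
	vips := []string{"243.0.0.1", "243.1.0.7"} // as listed from ServiceImports
	_ = rebuildVIPCache("243.0.0.0/16", vips, func(vip string) {
		fmt.Println("reserved", vip)
	})
	// Only 243.0.0.1 is reserved; 243.1.0.7 belongs to another cluster's CIDR.
}
```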
When deploying the broker, an `--enable-clusterset-ip` flag will be added. There will also be a `--clusterset-ip-cidr-range` option to set the VIP CIDR for the clusterset to a non-default value. Irrespective of the `--enable-clusterset-ip` setting, the clusterset CIDR range will be set to the default of `243.0.0.0/8` unless set to a different value. This is to support use cases where the global flag is not set but the user configures the VIP on an individual `ServiceExport`.

Similar to Globalnet, this information will also be stored in the `broker-info.subm` file and the `BrokerInfo` ConfigMap.
The following flags will be added to `subctl join`:

- `--enable-clusterset-ip` - set the default behavior to enable the VIP on this cluster. If not specified, the default is picked from the global configuration done during broker deployment.
- `--clusterset-ip-cidr=a.b.c.d/x` - set the VIP CIDR for this cluster. The default will be allocated from the clusterset VIP CIDR range configured during the deploy-broker step.
The following CRDs will need to be modified to support these new flags:

```Go
type BrokerSpec struct {
	// ClustersetIPEnabled sets the clusterset-wide default for enabling VIPs.
	ClustersetIPEnabled bool `json:"clustersetIPEnabled,omitempty"`
	// ClustersetIPCIDRRange is the clusterset-wide range that per-cluster VIP
	// CIDRs are allocated from (defaults to 243.0.0.0/8).
	ClustersetIPCIDRRange []string `json:"clustersetIPCIDRRange,omitempty"`
}

type SubmarinerSpec struct {
	ClustersetIPEnabled bool     `json:"clustersetIPEnabled,omitempty"`
	ClustersetIPCIDR    []string `json:"clustersetIPCIDR,omitempty"`
}

type SubmarinerDiscoverySpec struct {
	ClustersetIPEnabled bool     `json:"clustersetIPEnabled,omitempty"`
	ClustersetIPCIDR    []string `json:"clustersetIPCIDR,omitempty"`
}
```
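Since each cluster's `ClustersetIPCIDR` must be a subset of the broker's `ClustersetIPCIDRRange`, a containment check along these lines would be needed at join time. This is a hedged sketch, not the operator's actual validation code; checking for overlap with subsets already allocated to other clusters is omitted:

```Go
package main

import (
	"fmt"
	"net"
)

// isSubsetCIDR reports whether sub lies entirely within parent: the subset
// must be at least as specific, and its network address must fall inside
// the parent range.
func isSubsetCIDR(parent, sub string) (bool, error) {
	_, parentNet, err := net.ParseCIDR(parent)
	if err != nil {
		return false, err
	}

	subIP, subNet, err := net.ParseCIDR(sub)
	if err != nil {
		return false, err
	}

	parentOnes, _ := parentNet.Mask.Size()
	subOnes, _ := subNet.Mask.Size()

	return subOnes >= parentOnes && parentNet.Contains(subIP), nil
}

func main() {
	ok, _ := isSubsetCIDR("243.0.0.0/8", "243.7.0.0/16")
	fmt.Println(ok) // true: a /16 inside the default clusterset range

	ok, _ = isSubsetCIDR("243.0.0.0/8", "244.0.0.0/16")
	fmt.Println(ok) // false: outside the range
}
```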
`lighthouse.submariner.io/use-clusterset-ip` - use a VIP for the `ServiceExport`.

Pros:

- Doesn't modify existing behaviour.
- Allows users to select the behavior on a per-Service basis.

Cons:

- More configuration options for users to consider.
- More chances of `ServiceExports` on different clusters being in conflict due to wrong annotations etc.
None.
- Only use a global flag for the entire deployment. This was discarded in favor of the annotation-based approach to provide flexibility to users, as they can mix and match Services with and without a VIP.
Existing users will not be impacted in any way. Users who wish to use this feature will need to reinstall Submariner with the new flags or use annotations on `ServiceExports`.