Implement conflict checking for SessionAffinity #1632
Conversation
🤖 Created branch: z_pr1632/tpantelis/session_affinity_conflict
Force-pushed from ee06699 to 486455f
Tested and reviewed the code locally. Other than some thoughts on how to improve the logic for finding the oldest export, looks good.
@@ -442,3 +564,48 @@ func (c *ServiceImportController) localServiceImportLister(transform func(si *mc

	return retList
}

func findClusterWithOldestTimestamp(from map[string]string) string {
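For context on the function discussed below, here is a minimal sketch of what a function with this signature might do. It assumes the aggregated ServiceImport carries one annotation per exporting cluster whose value is the ServiceExport creationTimestamp encoded as Unix nanoseconds; the annotation key format and encoding are assumptions for illustration, not taken from the diff.

```go
package controller

import (
	"strconv"
	"strings"
)

// timestampAnnotationPrefix is a hypothetical key prefix; the actual
// annotation format used by the PR may differ.
const timestampAnnotationPrefix = "timestamp.submariner.io/"

// findClusterWithOldestTimestamp scans the per-cluster timestamp annotations
// (assumed here to hold Unix-nanosecond values keyed by cluster name) and
// returns the cluster whose ServiceExport has the oldest creationTimestamp.
func findClusterWithOldestTimestamp(from map[string]string) string {
	oldestCluster := ""
	var oldest int64

	for key, value := range from {
		if !strings.HasPrefix(key, timestampAnnotationPrefix) {
			continue
		}

		ts, err := strconv.ParseInt(value, 10, 64)
		if err != nil {
			continue // skip malformed values
		}

		if oldestCluster == "" || ts < oldest {
			oldest = ts
			oldestCluster = strings.TrimPrefix(key, timestampAnnotationPrefix)
		}
	}

	return oldestCluster
}
```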
I'm wondering if we can improve this logic. In a scale setup there might be too many clusters exporting a service, and looping through all of them may cause issues.
How about we also add an annotation like timestamp.submariner.io/oldest and use that to check whether we're older or not? Only when the oldest one is un-exported would we need to cycle through all of them to find the new oldest.
Also, this may have some issues when we're using a VIP. If two different clusters export a service at the same time, the VIP will end up being allocated by whichever one was first to successfully create the aggregated ServiceImport. Its timestamp may be newer than that of the one that failed to create it, because it depends on the timing of when the create attempt is made on the broker.
> I'm wondering if we can improve this logic. In a scale setup there might be too many clusters exporting a service, and looping through all of them may cause issues.
I think the time would be negligible with a single loop even with thousands. If it was a nested loop, then we could have an issue. With a single loop, it would take millions of iterations to start to become significant.
> How about we also add an annotation like timestamp.submariner.io/oldest and use that to check whether we're older or not? Only when the oldest one is un-exported would we need to cycle through all of them to find the new oldest.
Yeah, we could do that, i.e. if the oldest cluster is un-exported then also remove the annotation to trigger the other clusters to re-evaluate (a rough sketch of that idea is below). But it adds more complexity that I'm not sure we really need. I can try it out....
> Also, this may have some issues when we're using a VIP. If two different clusters export a service at the same time, the VIP will end up being allocated by whichever one was first to successfully create the aggregated ServiceImport. Its timestamp may be newer than that of the one that failed to create it, because it depends on the timing of when the create attempt is made on the broker.
I wasn't planning on using the oldest timestamp for the VIP because we won't ever re-allocate it once it's set. The first cluster to successfully create the aggregated ServiceImport sets the VIP.
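Coming back to the timestamp.submariner.io/oldest idea, here is a rough sketch of one possible shape for that shortcut, building on the sketch above. It caches the oldest cluster's name in the proposed annotation and only rescans when that cluster's timestamp entry has been removed; the helper name and annotation handling are hypothetical and not part of the PR.

```go
// decideOldestCluster is a hypothetical shape for the proposed optimization:
// trust a cached "oldest" annotation while that cluster is still exporting,
// and only fall back to a full scan when it has been un-exported.
func decideOldestCluster(annotations map[string]string) string {
	const oldestKey = "timestamp.submariner.io/oldest" // proposed cache annotation

	oldest := annotations[oldestKey]
	if oldest != "" && annotations[timestampAnnotationPrefix+oldest] != "" {
		return oldest // cached cluster still has a timestamp entry; no scan needed
	}

	// Cache missing, or the previously-oldest cluster was un-exported:
	// recompute from the per-cluster annotations and re-cache the result.
	oldest = findClusterWithOldestTimestamp(annotations)
	if oldest != "" {
		annotations[oldestKey] = oldest
	}

	return oldest
}
```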
I did some testing of findClusterWithOldestTimestamp with varying numbers of entries in the annotations map:
- 1,000 entries took 38.793 µs
- 10,000 took 244.597 µs
- 1,000,000 took 46.399 ms
- 10,000,000 took 711.979 ms
I also tried calling findClusterWithOldestTimestamp in a nested loop: calling it 1,000 times with 1,000 map entries took 21.924 ms (simulating 1,000 services on 1,000 clusters). With 10,000 and 10,000 it took 2.16 s, which is still pretty negligible.
So I think we should be fine the way it is.
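If we did want to capture those numbers in the repo, a standard Go benchmark along these lines would do it. The entry counts and the annotation layout are placeholders reusing the sketch above, not the measurements from this comment.

```go
package controller

import (
	"fmt"
	"testing"
	"time"
)

// BenchmarkFindClusterWithOldestTimestamp times a single scan over annotation
// maps of increasing size, mirroring the ad-hoc measurements discussed above.
func BenchmarkFindClusterWithOldestTimestamp(b *testing.B) {
	for _, n := range []int{1000, 10000, 100000} {
		annotations := make(map[string]string, n)
		for i := 0; i < n; i++ {
			key := fmt.Sprintf("%scluster%d", timestampAnnotationPrefix, i)
			annotations[key] = fmt.Sprintf("%d", time.Now().UnixNano()-int64(i))
		}

		b.Run(fmt.Sprintf("%d-entries", n), func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				_ = findClusterWithOldestTimestamp(annotations)
			}
		})
	}
}
```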
Would it be worth adding those benchmarks as a test?
What would we actually verify as a test?
This is really just normal looping over an array or map (with some fairly simple string manipulation/parsing), which I'm sure we do similarly in other places throughout the code base. These operations are fast on modern computers at the scales we'd be dealing with (in the thousands), so I don't think we need to be concerned. It's only when you get into the millions that it might become a concern.
The MCS spec's conflict resolution policy states that a "conflict will be resolved by assigning precedence based on each ServiceExport's creationTimestamp, from oldest to newest". We don't have a central MCS controller with access to all clusters' ServiceExports, but we can store each cluster's ServiceExport creationTimestamp as annotations on the aggregated ServiceImport and use them to resolve conflicts. The SessionAffinity and SessionAffinityConfig fields on the aggregated ServiceImport will be set by the cluster with the oldest timestamp. The other clusters will set a ServiceExportConflict condition if their corresponding local service's fields do not match those on the aggregated ServiceImport. If a local service is updated in any cluster, each cluster re-evaluates the updated aggregated ServiceImport and either clears or sets the conflict condition. Also, if the service from the precedent cluster is un-exported, the next precedent cluster will set the fields.

Signed-off-by: Tom Pantelis <[email protected]>
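As a simplified illustration of the comparison described above (types trimmed to the core Kubernetes and MCS API objects, with a hypothetical helper name; the PR's actual controller and condition handling are more involved):

```go
package controller

import (
	"reflect"

	corev1 "k8s.io/api/core/v1"
	mcsv1a1 "sigs.k8s.io/mcs-api/pkg/apis/v1alpha1"
)

// sessionAffinityConflicts reports whether the local Service's session affinity
// settings disagree with those already set on the aggregated ServiceImport by
// the precedent (oldest-exporting) cluster. A true result is what would drive
// setting the ServiceExportConflict condition on the local ServiceExport.
func sessionAffinityConflicts(localSvc *corev1.Service, aggregated *mcsv1a1.ServiceImport) bool {
	if localSvc.Spec.SessionAffinity != aggregated.Spec.SessionAffinity {
		return true
	}

	return !reflect.DeepEqual(localSvc.Spec.SessionAffinityConfig, aggregated.Spec.SessionAffinityConfig)
}
```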
Force-pushed from 486455f to 62ec10b
🤖 Closed branches: [z_pr1632/tpantelis/session_affinity_conflict]
The release notes are here: submariner-io/submariner-website#1166
Related: #1621