fix: Fixes sharding placement algorithm and allows development of alternative algorithms #13018
Codecov Report
Additional details and impacted files
@@ Coverage Diff @@
## master #13018 +/- ##
==========================================
+ Coverage 49.48% 49.57% +0.09%
==========================================
Files 256 256
Lines 43844 43920 +76
==========================================
+ Hits 21695 21773 +78
+ Misses 19988 19985 -3
- Partials 2161 2162 +1
@leoluz gentle ping to re-enable auto-merge for this one. I amended the last commit after updating the manifests to align them with the env variable ARGOCD_CONTROLLER_SHARDING_ALGORITHM.
…pe of functions to filter clusters by shard - Adding unit tests for sharding - Refresh clusters list on DistributionFunction call Signed-off-by: Akram Ben Aissi <[email protected]> Signed-off-by: ishitasequeira <[email protected]>
…it size from [strconv.Atoi](1) to a lower bit size type uint32 without an upper bound check. Signed-off-by: Akram Ben Aissi <[email protected]>
Signed-off-by: Raghavi Shirur <[email protected]> Signed-off-by: Akram Ben Aissi <[email protected]>
…e sharding algo env var Signed-off-by: Akram Ben Aissi <[email protected]>
…ontroller.sharding.algorithm Signed-off-by: Akram Ben Aissi <[email protected]>
…bution methods name, ran codegen on manifests Signed-off-by: Akram Ben Aissi <[email protected]> Signed-off-by: Akram Ben Aissi <[email protected]>
@leoluz amazing that this feature got merged! In which version will it be released, and when?
@bygui86 In the Projects section of this PR (right panel) you can see the roadmap it is associated with and the planned release date. In case you can't see it, it is Argo CD 2.8 (~Aug 7th, 2023).
@leoluz great, thanks, now I see it!
Appreciated!
I'm not sure if other users have this requirement. I believe it's better to let one application controller shard handle in-cluster only, while balancing the other clusters across the other application controller shards.
@bofeng96 may I ask why you have such a requirement?
We have thousands of parent Apps on in-cluster, and that's the reason we want one shard to deal only with those apps. It's too heavy if shard 0 also takes care of other clusters, as the workload on in-cluster is much heavier. The current homogeneous distribution algorithm only assigns the same number of clusters to each shard.
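For illustration only, here is a minimal sketch of the placement requested above, written as a standalone function rather than against the real Argo CD types. The function name `dedicatedInClusterShard` and the `index` parameter (the position of a remote cluster in some stable ordering) are hypothetical; only the in-cluster server URL is taken from Argo CD itself. This is not an algorithm provided by this PR.

```go
package main

import "fmt"

// inClusterServer is the API server URL Argo CD uses for the cluster it runs in.
const inClusterServer = "https://kubernetes.default.svc"

// dedicatedInClusterShard is a hypothetical placement function: shard 0 handles
// only the in-cluster destination, and remote clusters are spread over shards
// 1..replicas-1 by their index in a stable ordering.
func dedicatedInClusterShard(server string, index, replicas int) int {
	// With a single replica (or none configured) everything lands on shard 0.
	if replicas <= 1 || server == inClusterServer {
		return 0
	}
	return 1 + index%(replicas-1)
}

func main() {
	fmt.Println(dedicatedInClusterShard(inClusterServer, 0, 3))
	for i, server := range []string{"https://a.example", "https://b.example", "https://c.example"} {
		fmt.Println(server, "-> shard", dedicatedInClusterShard(server, i, 3))
	}
}
```

The trade-off is that shard 0 no longer counts toward balancing the remote clusters, so at least two replicas are needed for the separation to have any effect.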
@bofeng96 you can still use the not-so-well-documented manual assignment:

apiVersion: v1
kind: Secret
metadata:
  name: my-cluster
type: Opaque
data:
  config: base64config
  name: base64name
  server: base64server
  shard: base64shard
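Since values under `data` in a Secret must be base64-encoded, the `shard` entry holds the encoded decimal shard number. A small sketch of producing that value follows; the helper name `encodeShard` is made up for this example.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strconv"
)

// encodeShard returns the base64 form of a shard number, suitable for the
// `shard` key under `data` in the cluster Secret shown above.
func encodeShard(shard int) string {
	return base64.StdEncoding.EncodeToString([]byte(strconv.Itoa(shard)))
}

func main() {
	// Print the value to paste into the Secret for shard 2.
	fmt.Println(encodeShard(2))
}
```

Using `stringData` instead of `data` in the Secret would avoid the manual encoding step, at the cost of keeping plain values in the manifest.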
clusterShard = sharding.GetShardByID(cluster.ID, replicas)
distributionFunction := sharding.GetDistributionFunction(kubeClient, settingsMgr)
distributionFunction(&cluster)
cluster.Shard = pointer.Int64Ptr(int64(clusterShard))
cluster.Shard should not be updated. That field is meant for users to explicitly set a static shard value, so that the application controller instance with that shard value picks the cluster up.
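One way to read this review comment: the computed shard should be kept in a local value, and a user-set `Shard` should take precedence without ever being overwritten. The sketch below uses a simplified, hypothetical `Cluster` type and a hypothetical `resolveShard` helper; it is not the code merged in this PR.

```go
package main

import "fmt"

// Cluster is a simplified, hypothetical stand-in for v1alpha1.Cluster.
type Cluster struct {
	ID    string
	Shard *int64 // optional static shard set explicitly by the user
}

// DistributionFunction computes a shard for a cluster.
type DistributionFunction func(c *Cluster) int

// resolveShard returns the user-pinned shard when one is set and otherwise
// asks the distribution function. It never writes to c.Shard.
func resolveShard(c *Cluster, distribute DistributionFunction) int {
	if c.Shard != nil {
		return int(*c.Shard)
	}
	return distribute(c)
}

func main() {
	byLength := func(c *Cluster) int { return len(c.ID) % 2 } // placeholder algorithm
	pinned := int64(3)
	fmt.Println(resolveShard(&Cluster{ID: "a", Shard: &pinned}, byLength))
	fmt.Println(resolveShard(&Cluster{ID: "ab"}, byLength))
}
```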
		ctrl.projByNameCache.Delete(key)
		return true
	})
	if ctrl != nil {
Is this nil check really needed?
controller/sharding/sharding.go (outdated)
return func(c *v1alpha1.Cluster) int {
	if c != nil && replicas != 0 {
		clusterIndex, ok := clusterIndexdByClusterId[c.ID]
clusterIndexdByClusterId is initialized only when GetShardByIndexModuloReplicasCountDistributionFunction is created. Any clusters added after that initialization would not be available in the map. This map should be created dynamically within the filter function.
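A rough sketch of what the reviewer is asking for: re-read the cluster list inside the returned function, so that clusters registered later are still found. The `Cluster` and `ClusterLister` types, the `IndexModuloDistribution` name, and the `-1` sentinel for unknown clusters are all assumptions of this example, not the PR's actual code.

```go
package main

import (
	"fmt"
	"sort"
)

// Cluster is a simplified, hypothetical stand-in for v1alpha1.Cluster.
type Cluster struct {
	ID string
}

// ClusterLister is a hypothetical accessor standing in for the cluster database.
type ClusterLister func() []Cluster

// IndexModuloDistribution returns a function that places a cluster on shard
// (index % replicas), where index is the cluster's position in the ID-sorted
// list. The list is re-read on every call instead of once at creation time.
func IndexModuloDistribution(list ClusterLister, replicas int) func(c *Cluster) int {
	return func(c *Cluster) int {
		if c == nil || replicas <= 0 {
			return -1
		}
		// Copy before sorting so the lister's backing slice is not reordered.
		clusters := append([]Cluster(nil), list()...)
		sort.Slice(clusters, func(i, j int) bool { return clusters[i].ID < clusters[j].ID })
		for i, candidate := range clusters {
			if candidate.ID == c.ID {
				return i % replicas
			}
		}
		return -1 // unknown cluster; this sentinel is an assumption of the sketch
	}
}

func main() {
	known := []Cluster{{ID: "a"}, {ID: "b"}}
	shardOf := IndexModuloDistribution(func() []Cluster { return known }, 2)
	known = append(known, Cluster{ID: "c"}) // registered after the function was created
	fmt.Println(shardOf(&Cluster{ID: "c"}))
}
```

Rebuilding the index on each call trades some CPU for correctness; a cache invalidated on cluster add or delete events would be the obvious next step if the list is large.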
case filterFunctionName == "hash":
	distributionFunction = GetShardByIndexModuloReplicasCountDistributionFunction(db)
case filterFunctionName == "legacy":
default:
The default behaviour should be legacy, so can't we just have two cases, case filterFunctionName == "hash" and default? Also, it would be better if this logic could be moved to cmd/argocd-application-controller/controllers/argocd_application_control.go, where it would be possible to switch the algorithm using command line flags.
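The two-case selection suggested here could look roughly like the following. The signature is simplified for illustration: the real function takes a database handle, whereas this sketch receives both candidate functions as parameters so it stays self-contained.

```go
package main

import "fmt"

// DistributionFunction maps a cluster ID to a shard number.
type DistributionFunction func(clusterID string) int

// GetDistributionFunction picks an algorithm by name. Anything other than
// "hash" (including "legacy", an empty value, or a typo) falls back to legacy.
func GetDistributionFunction(name string, byIndex, legacy DistributionFunction) DistributionFunction {
	switch name {
	case "hash":
		return byIndex
	default:
		return legacy
	}
}

func main() {
	byIndex := func(string) int { return 1 } // placeholder algorithm
	legacy := func(string) int { return 0 }  // placeholder algorithm
	for _, name := range []string{"hash", "legacy", "", "typo"} {
		shard := GetDistributionFunction(name, byIndex, legacy)("cluster-a")
		fmt.Printf("%q -> shard %d\n", name, shard)
	}
}
```

Falling back silently on an unrecognized name keeps the controller running, though logging a warning in the default branch would make misconfiguration easier to spot.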
	return clusters
}

func createClusterIndexByClusterIdMap(db db.ArgoDB) map[string]int {
Can we add some comments for the newly added functions?
if replicas > 0 {
	clusterShard = sharding.GetShardByID(cluster.ID, replicas)
	distributionFunction := sharding.GetDistributionFunction(argoDB, common.DefaultShardingAlgorithm)
	distributionFunction(&cluster)
Should this line be clusterShard = distributionFunction(&cluster)?
Haha, that's the workaround I've been using. The catch is that I have to edit new Secrets every month because the number of clusters we manage keeps increasing.
…ernative algorithms (argoproj#13018)

* fix: Extraction of DistributionFunction to allow passing different type of functions to filter clusters by shard - Adding unit tests for sharding - Refresh clusters list on DistributionFunction call
  Signed-off-by: Akram Ben Aissi <[email protected]>
  Signed-off-by: ishitasequeira <[email protected]>
* fix: Incorrect conversion of an integer with architecture-dependent bit size from [strconv.Atoi](1) to a lower bit size type uint32 without an upper bound check.
  Signed-off-by: Akram Ben Aissi <[email protected]>
* Added config to switch to round-robin sharding
  Signed-off-by: Raghavi Shirur <[email protected]>
  Signed-off-by: Akram Ben Aissi <[email protected]>
* Documenting sharding more, adding shuffling tests (skipped), re-enable sharding algo env var
  Signed-off-by: Akram Ben Aissi <[email protected]>
* Allow configuration through argocd-cmd-params-cm configMap and key: controller.sharding.algorithm
  Signed-off-by: Akram Ben Aissi <[email protected]>
* De-duplicate code, remove reflection for default case, shorten distribution methods name, ran codegen on manifests
  Signed-off-by: Akram Ben Aissi <[email protected]>
  Signed-off-by: Akram Ben Aissi <[email protected]>

---------
Signed-off-by: Akram Ben Aissi <[email protected]>
Signed-off-by: ishitasequeira <[email protected]>
Signed-off-by: Raghavi Shirur <[email protected]>
Co-authored-by: Raghavi Shirur <[email protected]>
- distributionFunction: allows developers to introduce other distribution and placement algorithms.
- hash: uses the db.clusterList index as the sharding placement parameter.
- legacy: kept as the default, as it works with integers as strings of cluster uid. In practice, uids are UUIDs, which makes the default algorithm not work well.
- argocd admin shard status: uses the distributionFunction and hence produces the same results as the actual placement.
- string to int conversion.

Fixes #9633 #11537 #13136 #11537 #13226
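On the string to int conversion: the commit message describes a value parsed with strconv.Atoi being narrowed to uint32 without an upper bound check. One common way to avoid that class of bug, shown here as a sketch rather than the fix actually applied in the PR, is to parse with an explicit bit size so that out-of-range input is rejected. The helper name `parseShardNumber` is hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseShardNumber parses a decimal string into a uint32. ParseUint with
// bitSize 32 returns an error for negative or too-large input instead of
// letting the value wrap silently during a later narrowing conversion.
func parseShardNumber(s string) (uint32, error) {
	v, err := strconv.ParseUint(s, 10, 32)
	if err != nil {
		return 0, fmt.Errorf("invalid shard number %q: %w", s, err)
	}
	return uint32(v), nil
}

func main() {
	for _, input := range []string{"7", "4294967296", "-1"} {
		n, err := parseShardNumber(input)
		fmt.Println(n, err)
	}
}
```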