-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Expose memberlist label configs #187
Conversation
Signed-off-by: Mauro Stettler <[email protected]>
Signed-off-by: Mauro Stettler <[email protected]>
Signed-off-by: Mauro Stettler <[email protected]>
Signed-off-by: Mauro Stettler <[email protected]>
Signed-off-by: Mauro Stettler <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see comments.
In general, I'm not sure why you modify test helpers into returning errors instead of failing on them directly? Personally I prefer for test helpers to catch errors, so their callers don't have to check for them (which increases test verbosity).
if err := casWithErr(context.Background(), t, kv, key, updateFn); err != nil { | ||
t.Fatal(err) | ||
} | ||
func cas(kv *Client, key string, updateFn func(*data) (*data, bool, error)) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the change to return an error instead of failing the test on error (as a t.Helper()
)? I think normally it's preferable to catch test errors in test helpers instead of returning them.
func TestMultipleClients(t *testing.T) { | ||
c := dataCodec{} | ||
members := 10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Nit] Might as well be a constant:
members := 10 | |
const members = 10 |
BindAddrs: []string{"localhost"}, | ||
BindPort: 0, // randomize ports | ||
func TestMultipleClientsWithMixedLabelsAndSkipLabelCheck(t *testing.T) { | ||
members := 3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Nit] Might as well be a constant:
members := 3 | |
const members = 3 |
|
||
var clients []*Client | ||
func TestMultipleClientsWithMixedLabelsAndExpectFailure(t *testing.T) { | ||
members := 3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Nit] Might as well be a constant:
members := 3 | |
const members = 3 |
} | ||
|
||
func TestMultipleClientsWithSameLabelWithoutSkipLabelCheck(t *testing.T) { | ||
members := 10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Nit] Might as well be a constant:
members := 10 | |
const members = 10 |
configGen := func(i int) KVConfig { | ||
cfg := defaultKVConfig(i) | ||
|
||
cfg.Label = label |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency, should we set SkipInboundLabelCheck
also here?
cfg.Label = label | |
cfg.SkipInboundLabelCheck = false | |
cfg.Label = label |
require.NoError(t, err) | ||
} | ||
|
||
func testMultipleClientsWithConfigGenerator(t *testing.T, members int, configGen func(memberId int) KVConfig) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not make this into a t.Helper()
that catches errors instead of returning them, it would only make the tests simpler?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because of TestMultipleClientsWithMixedLabelsAndExpectFailure
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes LGTM (modulo few nits).
@@ -594,7 +705,11 @@ func TestMultipleClients(t *testing.T) { | |||
if updates < members { | |||
// in general, at least one update from each node. (although that's not necessarily true... | |||
// but typically we get more updates than that anyway) | |||
t.Errorf("expected to see updates, got %d", updates) | |||
return fmt.Errorf("expected to see %d updates, got %d", members, updates) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return fmt.Errorf("expected to see %d updates, got %d", members, updates) | |
return fmt.Errorf("expected to see at least %d updates, got %d", members, updates) |
require.NoError(t, err) | ||
} | ||
|
||
func testMultipleClientsWithConfigGenerator(t *testing.T, members int, configGen func(memberId int) KVConfig) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because of TestMultipleClientsWithMixedLabelsAndExpectFailure
.
cfg.GossipInterval = 100 * time.Millisecond | ||
cfg.GossipNodes = 3 | ||
cfg.PushPullInterval = 5 * time.Second | ||
err := testMultipleClientsWithConfigGenerator(t, members, configGen) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nit] We can remove members
variable at all.
err := testMultipleClientsWithConfigGenerator(t, members, configGen) | |
err := testMultipleClientsWithConfigGenerator(t, len(labels), configGen) |
return cfg | ||
} | ||
|
||
err := testMultipleClientsWithConfigGenerator(t, members, configGen) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment here.
kv/memberlist/memberlist_client.go
Outdated
@@ -399,6 +404,9 @@ func (m *KV) buildMemberlistConfig() (*memberlist.Config, error) { | |||
mlCfg.AdvertiseAddr = m.cfg.AdvertiseAddr | |||
mlCfg.AdvertisePort = m.cfg.AdvertisePort | |||
|
|||
mlCfg.Label = m.cfg.Label | |||
mlCfg.SkipInboundLabelCheck = m.cfg.SkipInboundLabelCheck |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Below we log:
level.Info(m.logger).Log("msg", "Using memberlist cluster node name", "name", mlCfg.Name)
I would move that log at the end of the function, and change it to something like:
level.Info(m.logger).Log("msg", "Using memberlist cluster label %q and node name %q", mlCfg.Label, mlCfg.Name)
When manually testing this, I found it useful seeing the actual label.
kv/memberlist/memberlist_client.go
Outdated
@@ -193,6 +196,8 @@ func (cfg *KVConfig) RegisterFlagsWithPrefix(f *flag.FlagSet, prefix string) { | |||
f.BoolVar(&cfg.EnableCompression, prefix+"memberlist.compression-enabled", mlDefaults.EnableCompression, "Enable message compression. This can be used to reduce bandwidth usage at the cost of slightly more CPU utilization.") | |||
f.StringVar(&cfg.AdvertiseAddr, prefix+"memberlist.advertise-addr", mlDefaults.AdvertiseAddr, "Gossip address to advertise to other members in the cluster. Used for NAT traversal.") | |||
f.IntVar(&cfg.AdvertisePort, prefix+"memberlist.advertise-port", mlDefaults.AdvertisePort, "Gossip port to advertise to other members in the cluster. Used for NAT traversal.") | |||
f.StringVar(&cfg.Label, prefix+"memberlist.label", mlDefaults.Label, "Label is an optional set of bytes to include on the outside of each packet and stream.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "label" CLI flag and YAML config option name is very generic. I'm wondering if we should call it something like "cluster-label" or "cluster-id" given we use it to authenticate the cluster ID. If so, then I would change the "skip" CLI/YAML option accordingly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cluster-label
has a nice sound to it
I think the config option (which inverts the behavior to memberlist unfortunately) which I would prefer would be
enforce-cluster-label
Although we might want to get more opinions on enforce
i'm not sure how well that would be understood for non native english speakers and if there is something more clear?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Also enforce-cluster-label would default to true IMO)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like "cluster-label" too. While working on the jsonnet changes to configure it, it sounds nice in the config.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
enforce-cluster-label
WDYT about "cluster-label-verification-enabled" instead of "enforce-cluster-label"? While doing change I've the feeling "enforce-cluster-label" may not be that clear
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From Ed on Slack:
sounds good to me! I like that better
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From Ed on Slack:
also you could do it then as
cluster-label-verification-disabled
and the default could be false and it would then match the same semantics as the existing config in memberlist
@replay I pick up the work to do changes to config. |
Signed-off-by: Marco Pracucci <[email protected]>
…cation-disabled' Signed-off-by: Marco Pracucci <[email protected]>
I merged that PR and updated
I pushed few commits to apply my own comments and config renamed as discussed with Ed. Could you review it again, please? |
…gation in unit tests Signed-off-by: Marco Pracucci <[email protected]>
Co-authored-by: Ed Welch <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with a couple fixes to the explanation of the variables (based on our recent renaming)
Thanks @slim-bean for the edits 🤦 Applied. |
I agree with you that it is nicer to just pass |
Thanks for your changes Marco, they make sense to me |
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
@@ -140,6 +140,9 @@ type KVConfig struct { | |||
AdvertiseAddr string `yaml:"advertise_addr"` | |||
AdvertisePort int `yaml:"advertise_port"` | |||
|
|||
ClusterLabel string `yaml:"cluster_label" category:"experimental"` | |||
ClusterLabelVerificationDisabled bool `yaml:"cluster_label_verification_disabled" category:"experimental"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we call this ClusterLabelVerificationDisabled
and make it default to true
? Most of our other options are -enabled
, not -disabled
. Reverse logic is confusing to users.
This makes three changes
skip-inbound-label-check
(PR)TestMultipleClients
Cluster with no labels and skip-inbound-label-check disabled, expected to succeed in joiningTestMultipleClientsWithMixedLabelsAndExpectFailure
Cluster where some members have labels and some don't and skip-inbound-label-check is disabled, this cluster is expected to fail because the members can't join each otherTestMultipleClientsWithMixedLabelsAndSkipLabelCheck
Cluster where some members have labels and some don't and skip-inbound-label-check is enabled, this cluster is expected to succeed in joiningTestMultipleClientsWithSameLabelWithoutSkipLabelCheck
Cluster where all members have the same label and skip-inbound-label-check is disabled, this cluster is expected to succeed in joiningThe above unit tests basically test a migration scenario where we migrate an existing cluster which currently doesn't use labels to use labels by going through the following steps:
Furthermore the unit tests also verify that if skip-inbound-label-check is disabled, processes which have different labels cannot join each other, providing isolation between the processes.