-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Serve metrics under a dedicated port #48
Conversation
lgtm |
etcd.py
Outdated
@@ -253,6 +255,8 @@ def etcd_arguments(self, data_dir, initial_cluster, cluster_state): | |||
self.peer_url, | |||
'-listen-client-urls', | |||
'http://0.0.0.0:{}'.format(self.client_port), | |||
'-listen-metrics-urls', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It breaks compatibility with old etcd, because the same script is used for different images containing different version.
Moreover, it even breaks upgrade 3.2 => 3.3.
Those arguments should be applied conditionally, i.e. only if etcd version is >= 3.3.
You can get version from environment: ETCDVERSION
.
@CyberDem0n Thanks. It now looks at the |
etcd.py
Outdated
# this section handles etcd version specific flags | ||
etcdversion = os.environ.get('ETCDVERSION') | ||
if etcdversion: | ||
etcdversion = list(map(lambda x: int(x), etcdversion.split('.'))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
etcdversion = [int(x) for x in etcdversion.split('.')]
You could also use tuples which are immutable (tuple(int(x) for x in etcdversion.split('.'))
), but don't forget to change the comparison to match.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I changed it.
etcd.py
Outdated
if etcdversion: | ||
etcdversion = list(map(lambda x: int(x), etcdversion.split('.'))) | ||
# don't try to add additonal flags if we encounter an unexpected version format | ||
if len(etcdversion) == 3: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if this is worth it TBH. If they decide to do a 3.3.1.2
, this will suddenly break instead of continuing. I'd just check that the length is >= 2, or even drop the check altogether.
etcd.py
Outdated
# don't try to add additonal flags if we encounter an unexpected version format | ||
if len(etcdversion) == 3: | ||
# etcd >= v3.3: serve metrics on an additonal port | ||
if etcdversion[0] >= 3 and etcdversion[1] >= 3: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
etcdversion >= [3, 3]
@CyberDem0n can we get this merged? |
👍 |
etcd.py
Outdated
@@ -263,6 +266,20 @@ def etcd_arguments(self, data_dir, initial_cluster, cluster_state): | |||
cluster_state | |||
] | |||
|
|||
# this section handles etcd version specific flags | |||
etcdversion = os.environ.get('ETCDVERSION') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is still one corner case left, it breaks upgrade from 3.2 to 3.3 :(
During the upgrade first it starts the old binary in order to get data, than it stops it and starts the new binary.
Current change will try to apply -listen-metrics-urls
to the old binary and fail if the old version is 3.2
In order to solve it, let's introduce one more argument to this method, run_old
and pass its value from https://github.com/linki/stups-etcd-cluster/blob/0a914150be28b93ff77b56747b0cf09a010138dc/etcd.py#L474:
return self.me.etcd_arguments(self.DATA_DIR, peers, cluster_state, self.run_old)
And the line 270 has to be changed to:
etcdversion = os.environ.get('ETCDVERSION_PREV' if run_old else 'ETCDVERSION')
etcd.py
Outdated
@@ -241,8 +243,9 @@ def delete_member(self, member): | |||
self.adjust_security_groups('revoke_ingress', member) | |||
return result | |||
|
|||
def etcd_arguments(self, data_dir, initial_cluster, cluster_state): | |||
return [ | |||
def etcd_arguments(self, data_dir, initial_cluster, cluster_state, run_old)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- def etcd_arguments(self, data_dir, initial_cluster, cluster_state, run_old)):
+ def etcd_arguments(self, data_dir, initial_cluster, cluster_state, run_old):
this is the reason why tests are failing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
damn, sorry about that.
👍 |
@CyberDem0n let me quickly test on our cluster that it still works as expected before merging. |
@CyberDem0n ok, works with |
I need a second 👍 |
@CyberDem0n Did it work for you as well? |
I didn't tried. In order to build image it needs to be merged and we need a second approval for that |
👍 |
etcd v3.3 introduced a new flag to allow serving
/metrics
and/health
under a different port than e.g./v2/keys
. This allows us to protect etcd's data via firewall rules but still let monitoring tools to access the monitoring information.See feature request in etcd repo: etcd-io/etcd#8060.
The implementation landed in v3.3: etcd-io/etcd#8242
This PR instructs etcd to serve metrics and health under the additonal port
2381
unconditionally when the used etcd binary is>=v3.3.x
. However, if not explicitely set in thesenza.yaml
this port won't be mapped to the outside and therefore isn't accessible. It doesn't expose more information than anything under2379
already does. See our intended usage here: zalando-incubator/kubernetes-on-aws#879.Note:
This fails when bundled with etcd lower than. Furthermore, serving this additional endpoint unconditionally should be safe but might not be desired. It keeps the implementation very simple, though. LMKWYTv3.3
/cc @CyberDem0n @aermakov-zalando @mikkeloscar