Scheduler and controller-manager ports for metrics #791

brancz · 2018-12-05T16:50:13Z

This pull request opens the scheduler and controller-manager ports for the openstack and aws platforms.

abhinavdahiya · 2018-12-05T17:22:56Z

@smarterclayton has already opened 9000-9999 for metrics in #683.

Please use those.

brancz · 2018-12-05T17:27:52Z

I was specifically instructed by @smarterclayton to do this patch 🙂 . The scheduler is in the same situation as the other ports around it, which is why those exist as well.

brancz · 2018-12-05T17:34:37Z

/retest

abhinavdahiya · 2018-12-05T17:35:37Z

I was specifically instructed by @smarterclayton to do this patch 🙂 . The scheduler is in the same situation as the other ports around it, which is why those exist as well.

The ports around it are kubelet ports that are probably used for apiserver-to-kubelet communication for exec/logs.

You are opening new ports for metrics. That's different, and we already have a range opened for metrics.

And if these can't be in 9000-9999 range, only then we don't have a choice. :P
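
As a rough illustration of the existing metrics range referred to here, a single security group rule can cover all of 9000-9999. This is only a sketch: the resource name and both variables are hypothetical and are not taken from #683, and it assumes Terraform 0.12 or later syntax.

```hcl
# Hypothetical inputs: the security group that receives scrapes and the one that sends them.
variable "metrics_target_sg_id" {
  type = string
}

variable "metrics_source_sg_id" {
  type = string
}

# One ingress rule covers the whole shared metrics range (9000-9999).
resource "aws_security_group_rule" "metrics_range_ingress" {
  type                     = "ingress"
  security_group_id        = var.metrics_target_sg_id
  source_security_group_id = var.metrics_source_sg_id
  protocol                 = "tcp"
  from_port                = 9000
  to_port                  = 9999
}
```

A component that listens outside this range is not covered by such a rule and needs a rule of its own, which is the situation this PR addresses.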

brancz · 2018-12-05T17:37:56Z

They can't be today, which is why I was instructed to do so. In the future (when 1.12 rebase lands) the master team may be able to fix this, but to get metrics into a working state now we're going for this.

brancz · 2018-12-05T18:51:20Z

/retest

brancz · 2018-12-05T19:51:45Z

/refresh

brancz · 2018-12-05T20:08:45Z

/refresh

brancz · 2018-12-05T20:09:19Z

umm ... I opened the link and everything passed ... any ideas?

/refresh doesn't seem to help

brancz · 2018-12-05T20:53:48Z

/refresh

smarterclayton · 2018-12-05T21:02:51Z

/retest

wking · 2018-12-05T21:13:50Z

data/data/aws/vpc/sg-master.tf

@@ -148,6 +148,46 @@ resource "aws_security_group_rule" "master_ingress_kubelet_insecure" {
  self      = true
 }

+resource "aws_security_group_rule" "master_ingress_kube_scheduler_from_worker" {


Do workers need direct access to the scheduler? This is surprising to me, but I'm not familiar with the scheduler implementation.

(cluster-monitoring) prometheus is running on workers, and prometheus needs to scrape the metrics

... and prometheus needs to scrape the metrics

Ugh. Ok. But it really feels like we should get Prometheus and other monitoring out into a different subnet or something then. Having these ports and the 9000-9999 range on all machines open to all workers seems like one more thing to worry about for security. Not something that's going to happen this week though.

These services are already protected; we need to do a better job of scanning them to catch security regressions, though.

We should not be depending on security groups for security. We need to find a way to ensure teams can own both exposing their metrics and securing their metrics. In the core control plane case we've already done that, but we need to keep improving.

We should not be depending on security groups for security.

No, but the more barriers there are between a given bug and a useful exploit, the better.

Agreed. My understanding is that once the 1.12 rebase lands, the masters team will change the secure port to something in the open range (9000-9999).

Though the default secure ports are 10259 for the scheduler, and 10257 for the controller-manager. As we don't change this for the kubelet, I'd be hesitant to deviate from the defaults for other components.
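
The worker-to-master access discussed in this thread can be sketched as one ingress rule per component port. This is a minimal sketch, not the PR's actual diff: the variables, the local map, and the resource name are hypothetical, the port numbers are the upstream default secure ports quoted above (the ports this PR opens may differ), and `for_each` on resources assumes Terraform 0.12.6 or later.

```hcl
# Hypothetical inputs: IDs of the master and worker security groups.
variable "master_sg_id" { type = string }
variable "worker_sg_id" { type = string }

locals {
  # Upstream default secure ports quoted in this thread; the PR's ports may differ.
  control_plane_metrics_ports = {
    kube_scheduler          = 10259
    kube_controller_manager = 10257
  }
}

# One single-port ingress rule per component, so that Prometheus running
# on workers can scrape the control plane components on masters.
resource "aws_security_group_rule" "master_ingress_control_plane_metrics_from_worker" {
  for_each = local.control_plane_metrics_ports

  type                     = "ingress"
  security_group_id        = var.master_sg_id
  source_security_group_id = var.worker_sg_id
  protocol                 = "tcp"
  from_port                = each.value
  to_port                  = each.value
}
```

Restricting the source to the worker security group, rather than a CIDR block, keeps the exposure limited to the machines that run the scraper, although, as noted above, security groups should not be the only layer of protection.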

abhinavdahiya · 2018-12-05T21:59:37Z

/approve

brancz · 2018-12-06T18:04:53Z

/retest

brancz · 2018-12-07T05:31:30Z

/retest

wking · 2018-12-07T08:56:31Z

/lgtm

brancz · 2018-12-08T17:37:08Z

/retest

brancz · 2018-12-08T18:56:57Z

/retest

brancz · 2018-12-08T21:39:27Z

/retest

brancz · 2018-12-08T23:22:28Z

/retest

brancz · 2018-12-09T01:21:09Z

Can someone spot any errors in the PR? It seems to me that the CI failures are flakes, but if that's the case it's very flaky.

/retest

brancz · 2018-12-09T02:35:52Z

/retest

brancz · 2018-12-09T22:20:09Z

/retest

brancz · 2018-12-10T19:39:27Z

It seems this was a real failure, I believe it should be fixed now.

wking · 2018-12-10T19:45:37Z

It seems this was a real failure, I believe it should be fixed now.

There were a number of CI issues over the weekend, which is what I'd thought was going on here. And it looks like you didn't actually change anything?

$ diff -u <(git show 69c4da3) <(git show 1d866fe)
--- /dev/fd/63	2018-12-10 11:43:46.879281433 -0800
+++ /dev/fd/62	2018-12-10 11:43:46.880281438 -0800
@@ -1,4 +1,4 @@
-commit 69c4da3017737f48f7fa5abb37d8b31782268853
+commit 1d866fe4b42d1a656b2c7f08ee39c9c45b3d5421
 Author: Frederic Branczyk <[email protected]>
 Date:   Wed Dec 5 17:30:02 2018 +0100
 
$ diff -u <(git show 69c4da3^) <(git show 1d866fe^)
--- /dev/fd/63	2018-12-10 11:44:36.581510484 -0800
+++ /dev/fd/62	2018-12-10 11:44:36.581510484 -0800
@@ -1,4 +1,4 @@
-commit 6d69febaac81921314e719073035496ca643f2d9
+commit cad1f25a176afe9593c8db6d6406a055c4e78d31
 Author: Frederic Branczyk <[email protected]>
 Date:   Wed Dec 5 17:29:22 2018 +0100

Did you expect to have fixed a bug in your PR that maybe you forgot to commit? Or are you just talking about the external CI issues? Anyway, still looks good to me:

/lgtm

openshift-ci-robot · 2018-12-10T19:45:54Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: abhinavdahiya, brancz, wking

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [abhinavdahiya,wking]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

brancz · 2018-12-10T19:48:12Z

I was just talking about external issues. I didn't change anything, but I rebased to make sure I wasn't missing changes that might be important for CI to pass. Sorry if that was unnecessary.

wking · 2018-12-10T20:09:03Z

No worries, I was just making sure I was on the same page.

openshift-ci-robot added the size/M label (denotes a PR that changes 30-99 lines, ignoring generated files) Dec 5, 2018

openshift-ci-robot requested review from tomassedovic and wking December 5, 2018 16:50

brancz mentioned this pull request Dec 5, 2018

Change kube-controllers targets to separated scheduler and controller-manager openshift/cluster-monitoring-operator#177

Merged

wking reviewed Dec 5, 2018

data/data/aws/vpc/sg-master.tf (outdated)

wking reviewed Dec 5, 2018

openshift-ci-robot added the approved label (indicates a PR has been approved by an approver from all required OWNERS files) Dec 5, 2018

brancz force-pushed the sched-cm-ports branch from f3b8e90 to ab5bd6b December 6, 2018 17:15

openshift-ci-robot assigned wking Dec 7, 2018

openshift-ci-robot added the lgtm label (indicates that a PR is ready to be merged) Dec 7, 2018

brancz force-pushed the sched-cm-ports branch from ab5bd6b to 69c4da3 December 8, 2018 20:07

openshift-ci-robot removed the lgtm label (indicates that a PR is ready to be merged) Dec 8, 2018

brancz added 2 commits December 10, 2018 11:38

openstack: Open controller-manager and scheduler ports for metrics

cad1f25

aws: Open controller-manager and scheduler ports for metrics

1d866fe

brancz force-pushed the sched-cm-ports branch from 69c4da3 to 1d866fe December 10, 2018 19:39

openshift-ci-robot added the lgtm label (indicates that a PR is ready to be merged) Dec 10, 2018

openshift-merge-robot merged commit b744c17 into openshift:master Dec 10, 2018

brancz deleted the sched-cm-ports branch December 10, 2018 20:49

wking mentioned this pull request Dec 10, 2018

CHANGELOG: Document changes since v0.5.0 #841

Merged
