Scheduler and controller-manager ports for metrics #791
@smarterclayton already has opened Please use those. |
I was specifically instructed by @smarterclayton to do this patch 🙂 . The scheduler is in the same situation as the other ports around it, which is why those exist as well. |
/retest |
The ports around it are kubelet ports that are probably used for apiserver-to-kubelet communication. You are opening new ports for metrics. That's different, and we already have a range opened for metrics. And if these can't be in |
They can't be today, which is why I was instructed to do so. In the future (when 1.12 rebase lands) the master team may be able to fix this, but to get metrics into a working state now we're going for this. |
/retest |
/refresh |
/refresh |
umm ... I open the link and everything passed ... any ideas? /refresh doesn't seem to help |
/refresh |
/retest |
@@ -148,6 +148,46 @@ resource "aws_security_group_rule" "master_ingress_kubelet_insecure" {
  self = true
}

resource "aws_security_group_rule" "master_ingress_kube_scheduler_from_worker" {
Do workers need direct access to the scheduler? This is surprising to me, but I'm not familiar with the scheduler implementation.
(cluster-monitoring) prometheus is running on workers, and prometheus needs to scrape the metrics
> ... and prometheus needs to scrape the metrics
Ugh. Ok. But it really feels like we should get Prometheus and other monitoring out into a different subnet or something then. Having these ports and the 9000-9999 range on all machines open to all workers seems like one more thing to worry about for security. Not something that's going to happen this week though.
These services are already protected, we need to do a better job of scanning them to catch security regressions though.
We should not be depending on security groups for security. We need to find a way to ensure teams can own both exposing their metrics and securing their metrics. In the core control plane case we've already done that, but we need to keep improving.
> We should not be depending on security groups for security.
No, but the more barriers there are between a given bug and a useful exploit, the better.
Agreed. My understanding is that once the 1.12 rebase lands the masters team will change the secure port to something in the open range (9000-10000).
Though the default secure ports are 10259 for the scheduler, and 10257 for the controller-manager. As we don't change this for the kubelet, I'd be hesitant to deviate from the defaults for other components.
/approve |
f3b8e90 to ab5bd6b
/retest |
/retest |
/lgtm |
/retest |
/retest |
ab5bd6b to 69c4da3
/retest |
/retest |
Can someone spot any errors in the PR? It seems to me that the CI failures are flakes, but if that's the case it's very flaky. /retest |
/retest |
/retest |
69c4da3 to 1d866fe
It seems this was a real failure, I believe it should be fixed now. |
There were a number of CI issues over the weekend, which is what I'd thought was going on here. And it looks like you didn't actually change anything?
$ diff -u <(git show 69c4da3) <(git show 1d866fe)
--- /dev/fd/63 2018-12-10 11:43:46.879281433 -0800
+++ /dev/fd/62 2018-12-10 11:43:46.880281438 -0800
@@ -1,4 +1,4 @@
-commit 69c4da3017737f48f7fa5abb37d8b31782268853
+commit 1d866fe4b42d1a656b2c7f08ee39c9c45b3d5421
Author: Frederic Branczyk <[email protected]>
Date: Wed Dec 5 17:30:02 2018 +0100
$ diff -u <(git show 69c4da3^) <(git show 1d866fe^)
--- /dev/fd/63 2018-12-10 11:44:36.581510484 -0800
+++ /dev/fd/62 2018-12-10 11:44:36.581510484 -0800
@@ -1,4 +1,4 @@
-commit 6d69febaac81921314e719073035496ca643f2d9
+commit cad1f25a176afe9593c8db6d6406a055c4e78d31
Author: Frederic Branczyk <[email protected]>
Date: Wed Dec 5 17:29:22 2018 +0100
Did you expect to have fixed a bug in your PR that maybe you forgot to commit? Or are you just talking about the external CI issues? Anyway, still looks good to me: /lgtm |
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: abhinavdahiya, brancz, wking |
I was just talking about external issues. I wasn't changing anything, but wanted to make sure I didn't miss out on changes in case they are important for CI to pass so I rebased, sorry if that was unnecessary. |
No worries, I was just making sure I was on the same page. |
This pull request opens the scheduler and controller-manager ports for the openshift and aws platforms.
cc @smarterclayton
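For readers without the full diff, here is a rough sketch of what worker-to-master ingress rules of this kind could look like on AWS. It is not the PR's actual content. The security group references (`aws_security_group.master`, `aws_security_group.worker`) and the second resource name are hypothetical, and the port numbers are simply the default secure ports mentioned in the review (10259 for the scheduler, 10257 for the controller-manager), which may differ from the ports this PR opens.

```hcl
# Illustrative sketch only; group references and ports are assumptions.

# Let Prometheus on the workers reach the scheduler's metrics endpoint.
resource "aws_security_group_rule" "master_ingress_kube_scheduler_from_worker" {
  type                     = "ingress"
  security_group_id        = aws_security_group.master.id # hypothetical master group
  source_security_group_id = aws_security_group.worker.id # hypothetical worker group

  protocol  = "tcp"
  from_port = 10259 # assumed: default kube-scheduler secure port
  to_port   = 10259
}

# Same idea for the controller-manager (resource name is hypothetical).
resource "aws_security_group_rule" "master_ingress_kube_controller_manager_from_worker" {
  type                     = "ingress"
  security_group_id        = aws_security_group.master.id
  source_security_group_id = aws_security_group.worker.id

  protocol  = "tcp"
  from_port = 10257 # assumed: default kube-controller-manager secure port
  to_port   = 10257
}
```

Scoping each rule to a single port and to the worker security group, rather than a CIDR range, keeps the exposure as narrow as the scraping use case allows, which is in line with the concern raised in the review about extra open ports.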