Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose hubble agent when hubble is enabled #11314

Merged
merged 1 commit into from
Apr 30, 2021

Conversation

olemarkus
Copy link
Member

/kind bug

@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Apr 24, 2021
@@ -190,6 +190,9 @@ data:
enable-hubble: "true"
# UNIX domain socket for Hubble server to listen to.
hubble-socket-path: "/var/run/cilium/hubble.sock"
# An additional address for Hubble server to listen to (e.g. ":4244").
hubble-listen-address: ":4244"
hubble-disable-tls: "true"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a big fan of exposing things without TLS.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nor do I. But it is a choice of this staying broken in 1.20 or having the feature, but no tls. For 1.21 this will hopefully be fixed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer not having the feature. There is nothing as permanent as a temporary shortcut.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you suggest blocking hubble from being enabled through validation? The flag is already there, so we cannot remove this altogether.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll implement serving TLS, but mTLS will be more tricky as kops-controller doesn't have any way of provisioning certs to Deployments.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could have kops provision a Keypair and then render it to a Secret using a template function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...ah, that would make the addons manifest secret. We've had a conversation like this before. Perhaps an auxiliary manifest-like thing which provisions secrets from the keystore.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, kops-controller certs have lifetimes that are tied to nodes. Deployments would need a different lifetime and renewal mechanism. This is why people tend to use JWT over mTLS--Kubernetes has the infrastructure for the former.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out cilium starts normal operations, then wait for the availability of the certificate. So there really isn't any chicken/egg issue here wrt cert-manager.

I have deployed addon PKI + 3 certs (hubble server cert, relay client cert, relay server cert) and tried to get them to work, but something in the mTLS setup is failing. It works when I connect with hubble CLI using the same certs and flags that hubble-relay is supposed to have.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My problems were caused by our addon CAs only setting common name. This PR contains a workaround now, but we need to resolve the CA certificate problem later on.

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. area/api and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Apr 27, 2021
@olemarkus olemarkus force-pushed the cilium-fix-hubble branch 2 times, most recently from 2fbecb4 to 5cfe5f5 Compare April 27, 2021 09:21
path: server.key
name: tls
---
apiVersion: cert-manager.io/v1alpha2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be v1 already?

Copy link
Member Author

@olemarkus olemarkus Apr 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ouch yes. not sure how v1alpha2 made it there. thanks for catching

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 28, 2021
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 28, 2021
@codablock
Copy link
Contributor

fyi, I have this PR merged and running in one of our clusters and already succesfully used it to debug network failures.

@olemarkus olemarkus requested a review from johngmyers April 30, 2021 12:55
@olemarkus
Copy link
Member Author

I am not sure why hubble-ui is failing. but this still solves the hubble issue, so I think we can just merge it.

@codablock
Copy link
Contributor

Looks like hubble-ui is failing for 2 different reasons:

  1. See cilium Slack: https://cilium.slack.com/archives/CQRL1EPAA/p1619706586139400
    This is fixed in the hubble-ui image v0.7.7
  2. hubble-ui seems to only work when hubble-relay is running without TLS and there is currently no way to enable TLS in hubble-relay. At the same time, you have fixed the hubble-relay service-port to 80, even though TLS is enabled, which seems to be misleading. Looking into the cilium Helm chart, I see that they use port 443 when TLS is enabled.

So, for this PR, I assume that at least 2. should be fixed first.

Also enables PKI for the addon
@olemarkus
Copy link
Member Author

There. Now it works for me.

We'll just have to live with relay TLS false when the UI doesn't support TLS.

@hakman
Copy link
Member

hakman commented Apr 30, 2021

/test all

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 30, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hakman

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 30, 2021
@k8s-ci-robot k8s-ci-robot merged commit d8de9fc into kubernetes:master Apr 30, 2021
@k8s-ci-robot k8s-ci-robot added this to the v1.21 milestone Apr 30, 2021
k8s-ci-robot added a commit that referenced this pull request May 1, 2021
…314-origin-release-1.20

Automated cherry pick of #11314: Expose hubble agent when hubble is enabled
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/addons area/api cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants