-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for the VenafiConnection CRD so that users can start using the Workload Identity Federation authentication ("secretless") #552
Conversation
pkg/client/client_venconn.go
Outdated
func loadRESTConfig(path string) (*rest.Config, error) { | ||
switch path { | ||
// If the kubeconfig path is not provided, use the default loading rules | ||
// so we read the regular KUBECONFIG variable or create a non-interactive | ||
// client for agents running in cluster | ||
case "": | ||
loadingrules := clientcmd.NewDefaultClientConfigLoadingRules() | ||
cfg, err := clientcmd.NewNonInteractiveDeferredLoadingClientConfig( | ||
loadingrules, &clientcmd.ConfigOverrides{}).ClientConfig() | ||
if err != nil { | ||
return nil, errors.WithStack(err) | ||
} | ||
return cfg, nil | ||
// Otherwise use the explicitly named kubeconfig file. | ||
default: | ||
cfg, err := clientcmd.NewNonInteractiveDeferredLoadingClientConfig( | ||
&clientcmd.ClientConfigLoadingRules{ExplicitPath: path}, | ||
&clientcmd.ConfigOverrides{}).ClientConfig() | ||
if err != nil { | ||
return nil, errors.WithStack(err) | ||
} | ||
return cfg, nil | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Self-review: I copy-pasted this from somewhere else in the Agent codebase. I'll find a better way than copy-pasting.
@@ -0,0 +1,176 @@ | |||
package client |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is mostly a copy-paste from the file client_oauth.go
. Since both use one form or another of the OAuth flow, I think it would be good to merge both later on.
9a35e7a
to
1deb969
Compare
1ac29bc
to
bd26101
Compare
ReviewWhat will be the UX?
This sounds like a pain, but later I suppose we might
Does it need to be repeated for each component too?
How does this relate to the
Please add a more realistic name and serviceAccount name and audiences here.
|
I asked whether I should let users use an API key with a VenafiConnection resource. Atanas answered:
I'll show an error if someone tries using an API key using a VenafiConnection resource in the Venafi Kubernetes Agent. |
Discussed today during the afternoon standup:
I've discussed this with Adam, and the plan is to do the following:
|
Yes, the minimum number of service accounts to be created in the UI or using
Note that it might be possible to create a single SA instead of two by using VCP's serviceaccount API directly, but I haven't tried that. So, if you have 20 clusters, you will have to either go through the UI 40 times or run
The client ID is only needed by the Private Key JWT authentication that already exists in the agent. The Workoad Identity Federation service accounts only require the "company ID" (or "tenant ID"). Thus, the client ID that is displayed in the UI isn't useful to the user: On the other hand, the "company ID" (or "tenant ID") is important but isn't displayed on the UI. One has to extract it from the Token URL. What's weird is that both methods (private key and workoad identity federation) use exactly the same mechanism under the hood, but one requires a client ID to figure out which service account it is, and the other uses a company ID along with the public key to figure out the service account. I guess this technical difference doesn't matter a lot, but the difference in experience will be unexpected to users. |
Tim pointed out that my "envtest" tests could have been written using the ConnectionHandler fake. He shared the example of signer_test.go that uses this fake. Unfortunately, I won't have time to move from envtest to the fake by the end of tomorrow. I plan to do that later on. Would it be OK if we go ahead with these slow envtests as a "first step" so we can review/merge this PR? I'd like to finish this feature by August 19th. My remaining work days are Aug 13th and Aug 19th. Tomorrow, I'll work on testing https://github.com/jetstack/venafi-connection-lib/pull/220. |
c5de925
to
4303312
Compare
Let's pin venafi-connection-lib to the latest commit on main until we decide to release a new version of venafi-connection-lib (if we actually do).
Note that I should probably have gone with a fake of the ConnectionHandler instead of an envtest. We will move to the fake later on. I added the venaficonnection CRDs manually for now. I have a PR to automate pulling these CRDs from the venafi-connection-lib project: #556 For now, I added these manifests manually with the following commands: gh pr checkout 556 git checkout - git checkout step1-makefile-modules -- deploy/charts/venafi-kubernetes-agent/templates/venafi-connection-crd{,.without-validations}.yaml
CI is fixed and PR is ready to be reviewed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @maelvls
I've done quite a lot of testing of the VenafiConnection feature while updating the Helm chart in #559, but I haven't tested whether the agent still works in traditional Service account mode.
I left a few comments and optional suggestions. Ping me for another review if you choose to address those here.
/lgtm
/approve
/hold in case you prefer to make changes here.
deploy/charts/venafi-kubernetes-agent/templates/venafi-connection-crd.without-validations.yaml
Outdated
Show resolved
Hide resolved
return func(t *testing.T) { | ||
fakeVenafiCloud, certCloud := fakeVenafiCloud(t) | ||
fakeTPP, certTPP := fakeTPP(t) | ||
_, restconf, kclient := startEnvtest(t) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The returned envtest is discarded here, but I think you should t.Cleanup(env.Stop)
. I see envtest processes lingering after I run these tests
$ pidof etcd kube-apiserver
374054 373960
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's odd.
The func startEnvtest
already calls t.Cleanup(envtest.Stop)
to stop the kube-apiserver and etcd processes:
func startEnvtest(t testing.TB) (_ *envtest.Environment, _ *rest.Config, kclient ctrlruntime.WithWatch) {
// ...
t.Cleanup(func() {
t.Log("Waiting for envtest to exit")
err = envtest.Stop()
require.NoError(t, err)
})
Not sure why it isn't cleaned up. I'll have to investigate. Would it be OK if we merge this PR even if this oddity is still there?
- secret: | ||
name: accesstoken | ||
fields: [accesstoken]`), | ||
expectReadyCondMsg: "ea744d098c2c1c6044e4c4e9d3bf7c2a68ef30553db00f1714886cedf73230f1", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a confusing Ready condition message! What does it mean? Where does it come from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey. I've looked at the official Venafi Connection API reference page and wasn't able to find where this is documented. I don't think it is documented anywhere.
I found an example of status here. It looks like this:
kind: VenafiConnection
status:
conditions:
- type: VenafiEnhancedIssuerReady
status: "True"
reason: Generated a token
message: 8db2c15ace6f4c7b59138909b6b69d6caca69e2d308e695206b5e15ddaf28e81
tokenValidUntil: "2123-10-02T19:57:36Z"
What (I think) happens is that venafi-connection-lib stores some form of hash in the message
field... The reason
is actually a message
(reason
should be TokenGenerated or something like that), and the message
contains a non-readable string that should rather be in some custom tokenHash
field. Funnily enough, the official Venafi Connection API reference says that message
is a human-readable explanation:
Message is a human readable description of the details of the last transition, complementing reason.
I don't think this quirk is a big deal, but IMO it should be clearly stated in the API reference or somewhere else that message
contains an opaque string used by venafi-connection-lib to work, and that reason
contains the actual human-readable string. For example, in case of error, it shows:
kind: VenafiConnection
status:
conditions:
- type: VenafiEnhancedIssuerReady
status: "False"
reason: |
connection is not ready yet (Venafi self-test failed): error authenticating
with Venafi: vcert error: server error: server unavailable: Get "https://api.venafi.cloud/v1/useraccounts":
net/http: request canceled (Client.Timeout exceeded while awaiting headers)
message: ""
tokenValidUntil: ""
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now, the message field contains a hash of the token.
We can change this, there is no logic that depends on this behavior (only the human experience).
Co-authored-by: Richard Wall <[email protected]>
Signed-off-by: Richard Wall <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I merged the Helm templates into this branch. And the tests are passing.
Please merge unless you want to address any thing else here.
This PR is good to go! Three things I'd like to address in later PRs:
|
That's 100 years away! I noticed the same when testing this branch. Is it a bug in VenafConnection lib? |
// TODO(mael): The rest of the codebase uses the standard "log" package, | ||
// venafi-connection-lib uses "go-logr/logr", and client-go uses "klog". We | ||
// should standardize on one of them, probably "slog". | ||
ctrlruntimelog.SetLogger(logr.Logger{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This TODO has been addressed in #626:
Today, @hawksight asked: why do we recommend using a single venafi-components? Peter made the point that he would most likely recommend customers to use separate service accounts, one for each component. Peter said it would be OK for demo purposes, but not for production. I think I said something about a well-known name:
Should we not rather show the recommended way on the website instead of |
We have decided to make the Venafi Kubernetes Agent compatible with the VenafiConnection CRD in order to bring support for secretless authentication to VCP (also called workload identity federation). When using the following flags in the Agent's deployment, the Agent uses the authentication method provided in the given VenafiConnection to connect to VCP:
This PR relies on the PR https://github.com/jetstack/venafi-connection-lib/pull/220. Most of the actual changes are visible in this other PR.
Refs:
Work done in this PR
tpp
field.vcp
field using the API key method. Even though the authorization headertppl-api-key: <apikey>
is supported for uploading cluster data (see https://github.com/jetstack/venafi-connection-lib/pull/220#discussion_r1692041030), we will still show an error message. Slack message from Atanas:Work to be done in follow-up PRs
venafiConnection
to the Helm chart → VC-35374vCert
for all logs → VC-35375What will be the UX?
To use the secretless auth (which is the entire point for bringing the Venafi Connection CRD into the agent), the user will first need to install the Venafi Connection CRD Helm chart.
Then, they will go to the UI to create an "Agent" Workload Identity Federation service account by filling in the Kubernetes cluster's issuer and JWKS URI (which needs to be repeated for each Kubernetes cluster).
Then, they will look for their "company ID" (which is called "tenant ID" in Venafi Connection) by extracting it from the "Token URL" visible in the UI. For example, the token URL in the UI may look like:
With this company ID, the user will create a VenafiConnection resource:
Finally, they will install the agent's own Helm chart using the following command:
Testing Manually
For this test, we will be using the test tenant https://ven-tlspk.venafi.cloud (US). The user we will be using is
[email protected]
, and the password is at the top of the document Production Accounts.First, go to https://ven-tlspk.venafi.cloud/platform-settings/user-preferences?key=api-keys to get the API key. You will need it in the next step. Set the env var APIKEY:
export APIKEY=...
Now, we need a Kind cluster for which the OIDC endpoint is reachable from the internet. We will use
kind-tailscale
for this. First, create the cluster:Then, install Tailscale.
Then, install
kind-tailscale
(I wrote this) and run itcurl -sSLO https://gist.githubusercontent.com/maelvls/adf680ae01612ff79658872c7dca013f/raw/kind-tailscale install kind-tailscale ~/bin
Run
kind-tailscale
(please review the contents of the script first):Now, curl the OIDC configuration:
oidc_conf=$(curl https://$(tailscale status --json | jq -r .Self.DNSName):8443/.well-known/openid-configuration -sS --fail-with-body | tee /dev/stderr)
Now, you are ready to create the Workload Identity Federation Service Account:
Note
If you need to delete the service account, use the command:
Now, use the following command to clone the project venafi-connection-lib in the same folder as your clone of
jetstack-secure
:Now, install the VenafiConnection CRD:
Then, do:
Then, create the necessary RBAC and VenafiConnection:
Then, create the VenafiConnection:
Run this:
Finally, run: