_nomad_si token getting expired or revoked. #24354

Closed
danm-talkscriber opened this issue Nov 1, 2024 · 3 comments

Comments

@danm-talkscriber

Nomad version

Nomad v1.8.4

Operating system and Environment details

Debian 12

Issue

This issue affects Consul Connect-enabled jobs. To establish connections between Connect proxies, Nomad requests a token from Consul: the _nomad_si token. Periodically that token expires or is revoked, which causes a service outage because communication between services breaks.

Reproduction steps

There are no specific reproduction steps. You just need to wait for a certain period of time, until the token expires or is revoked.

Expected Result

The token should either be renewed or be replaced with a new one.

Actual Result

A service outage occurs.

Job file (if appropriate)

Nomad Server logs (if appropriate)

Nomad Client logs (if appropriate)

@tgross
Member

tgross commented Nov 1, 2024

@danm-talkscriber this is described in the docs for configuring your auth method: https://developer.hashicorp.com/nomad/docs/integrations/consul/acl#consul-auth-method

Nomad cannot recreate Consul tokens that have been deleted. The auth method configuration should never set the MaxTokenTTL field.
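As a rough illustration of the advice above, here is a minimal Python sketch that inspects an auth method definition (as JSON) and flags a non-zero `MaxTokenTTL`. The sample document, the `nomad-workloads` name, and the set of values treated as "unset" are assumptions for illustration; check the JSON your own Consul cluster returns before relying on it.

```python
import json


def max_token_ttl_is_set(auth_method: dict) -> bool:
    """Return True if the auth method JSON carries a non-zero MaxTokenTTL."""
    ttl = auth_method.get("MaxTokenTTL")
    # Assumption: a missing field, an empty string, numeric zero, and the
    # zero-duration strings "0" / "0s" all mean "no TTL configured".
    return ttl not in (None, "", 0, "0", "0s")


# Hypothetical sample shaped like an auth method read back from Consul as JSON.
sample = json.loads(
    '{"Name": "nomad-workloads", "Type": "jwt", "MaxTokenTTL": "1h0m0s"}'
)

if max_token_ttl_is_set(sample):
    print(
        f"auth method {sample['Name']!r} sets MaxTokenTTL="
        f"{sample['MaxTokenTTL']}; tokens created through it will expire"
    )
```

A check like this could run in CI against exported auth method definitions, so that a TTL added by mistake is caught before it causes an outage.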

Closing as duplicate of #20185

tgross closed this as not planned on Nov 1, 2024
@danm-talkscriber
Author

@tgross, thanks for such a rapid reply. Going through the docs now. I have a damn good federated cluster setup, and this is the only issue that is killing me, once a month or so.

@tgross
Member

tgross commented Nov 4, 2024

From your email:

  1. I have configured the Consul secrets engine in Vault and am using it to generate tokens for Nomad. Those tokens expire periodically. I know where that is configured, and I know how to tune the secrets engine to extend the TTL for those tokens. Not too many questions here.

You can have whatever TTL you'd like for the Nomad agent's token, so long as you have the means to renew it and reload the Nomad agent out-of-band. Nomad does not renew this token on its own.

  1. _nomad_si tokens: these are much more of a mystery to me. I understand that the token is requested by Nomad. I checked the token and it has a policy that is not really visible; I'm assuming it is some sort of implicit policy, created by ... whom? Secondly, as we discussed on GitHub, that token is getting expired or revoked by ... Nomad? Is any type of control over that process exposed to the end user? Can I change, for example, the TTL for such tokens?

The binding rules you created in Consul will configure this. The Consul docs on Service Identity get into the default policies. But the TTL is determined by the Consul Auth Method config (which you should have created as described here). You should not configure a TTL on those tokens. If you insist on doing so, the TTL needs to exceed the allocation lifetime.
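To make the "TTL needs to exceed the allocation lifetime" condition concrete, here is a small Python sketch that compares two Go-style duration strings. The helper names and the supported units (h, m, s, ms) are assumptions for illustration, not part of Nomad or Consul.

```python
import re
from datetime import timedelta

# Seconds per supported unit. Only a subset of Go duration units is handled.
_UNITS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 1e-3}
_PART = re.compile(r"(\d+(?:\.\d+)?)(ms|h|m|s)")


def parse_duration(text: str) -> timedelta:
    """Parse a Go-style duration such as '720h' or '1h30m0s'."""
    pos, seconds = 0, 0.0
    for match in _PART.finditer(text):
        if match.start() != pos:  # reject gaps or stray characters
            raise ValueError(f"unsupported duration: {text!r}")
        seconds += float(match.group(1)) * _UNITS[match.group(2)]
        pos = match.end()
    if pos == 0 or pos != len(text):
        raise ValueError(f"unsupported duration: {text!r}")
    return timedelta(seconds=seconds)


def ttl_outlives_allocation(max_token_ttl: str, longest_alloc_lifetime: str) -> bool:
    """True if the token TTL is strictly longer than the longest allocation."""
    return parse_duration(max_token_ttl) > parse_duration(longest_alloc_lifetime)


# Hypothetical values: a 30-day TTL against allocations that live at most a day.
print(ttl_outlives_allocation("720h", "24h"))
```

For long-running services there is usually no known upper bound on allocation lifetime, so no finite TTL can be shown to pass this check, which fits the recommendation to leave the TTL unset.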

Now, in that picture, where does workload identity play a role? Nomad agents (clients and servers) still use Consul tokens to register themselves: the tokens configured in the "consul" block. I'm not providing any Consul tokens to my primary task (for example, an HTTP server), BUT I think such a token (the _nomad_si token) is provided to the connect-proxy task so it can talk to Consul. Am I heading in the right direction?
With workload identity configured, the connect-proxy task will be issued a Consul token (?). Who is going to request that token: Nomad, or will it just be issued by Consul based on the authentication method? From my understanding it will be, from a functionality standpoint, the same as the _nomad_si token; only the method of getting that token is completely different. What are my options to control such tokens, like TTL/MaxTTL? Is it in the auth-method configuration?

During allocation startup, the Nomad client "logs in" to Consul using the allocation's Workload Identity, and receives the SI token in exchange. The Nomad client uses this SI token to register the workload's services and bootstrap the Envoy proxy. See the Consul ACLs integration docs.
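The login exchange described above can be sketched in Python as follows. This only builds the HTTP request for Consul's ACL login endpoint and does not send it. The Consul address, the `nomad-workloads` auth method name, and the placeholder JWT are assumptions for illustration; Nomad performs this step itself, so the sketch is only meant to show the shape of the exchange.

```python
import json
import urllib.request


def build_login_request(
    consul_addr: str, auth_method: str, workload_jwt: str
) -> urllib.request.Request:
    """Build a POST to Consul's ACL login endpoint carrying a workload JWT."""
    body = json.dumps(
        {"AuthMethod": auth_method, "BearerToken": workload_jwt}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{consul_addr.rstrip('/')}/v1/acl/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical values; the JWT would be the allocation's signed workload identity.
req = build_login_request(
    "http://127.0.0.1:8500", "nomad-workloads", "<workload-identity-jwt>"
)
print(req.get_method(), req.full_url)
```

On success, Consul responds with an ACL token whose policies and service identities come from the binding rules matched by the JWT's claims, and whose expiry, if any, comes from the auth method configuration.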

(Aside: in the future, I'd appreciate it if you asked questions on GitHub or Discuss, rather than sending email to my personal email address. Thanks!)
