Vault integration upgrade based on Workload Identity #15617

Closed
mikenomitch opened this issue Dec 22, 2022 · 6 comments

Comments

@mikenomitch
Contributor

mikenomitch commented Dec 22, 2022

Proposal

Once Workload Identity upgrades make it into Nomad, we can redo the Vault integration to use these tokens as the source of auth instead of manually provided Vault tokens.

Using these tokens, Nomad users would have a one-time setup process to integrate Nomad workloads with Vault.

The general flow for setting up the Vault-Nomad integration would be:

  • Set up Vault
    • Create a Policy for Nomad in Vault
    • Enable the JWT Auth Method
    • Configure Vault to use Nomad’s public keys by passing in the keys directly, a JWKS URL, or an OIDC config URL
    • Create a Vault Role for Nomad
  • Set up Nomad
    • Pass a Vault URL into Nomad Server config in a new configuration block (or v2 of the existing vault block). (Note: no token needed)
  • Deploy Job
    • Job is configured to use new Vault integration
    • Nomad, recognizing that the new integration is being used, automatically requests a token for this job using the JWT auth method.

This would involve an up-front cost to set up roles in Vault, but after that no token management would be needed.
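
As a rough illustration of the flow above, the following sketch lists the Vault HTTP API calls that the one-time setup and the per-task login might translate to. It only builds the requests and sends nothing. The JWKS URL, role name, policy name, policy body, `user_claim`, and audience value are all assumptions made for this example, not values taken from the proposal.

```python
# Hypothetical values for illustration only; none of them come from the issue.
NOMAD_JWKS_URL = "https://nomad.example.com:4646/.well-known/jwks.json"
ROLE_NAME = "nomad-workloads"
POLICY_NAME = "nomad-workloads"
# Assumed example policy: read-only access to a KV v2 mount at "secret/".
POLICY_HCL = 'path "secret/data/*" { capabilities = ["read"] }'


def vault_setup_requests(jwks_url=NOMAD_JWKS_URL, role=ROLE_NAME, policy=POLICY_NAME):
    """Return (method, path, body) tuples for the one-time Vault setup."""
    return [
        # Create a policy for Nomad workloads.
        ("PUT", f"/v1/sys/policies/acl/{policy}", {"policy": POLICY_HCL}),
        # Enable the JWT auth method at the default "jwt" mount.
        ("POST", "/v1/sys/auth/jwt", {"type": "jwt"}),
        # Point Vault at Nomad's public keys through a JWKS URL. Passing the
        # keys directly or using an OIDC config URL are the other options.
        ("POST", "/v1/auth/jwt/config", {"jwks_url": jwks_url}),
        # Create a role that maps verified workload identities to the policy.
        (
            "POST",
            f"/v1/auth/jwt/role/{role}",
            {
                "role_type": "jwt",
                "user_claim": "sub",             # assumed claim
                "bound_audiences": ["vault.io"],  # assumed audience
                "token_policies": [policy],
            },
        ),
    ]


def login_request(workload_jwt, role=ROLE_NAME):
    """Build the JWT login call that would be made on a task's behalf."""
    return ("POST", "/v1/auth/jwt/login", {"role": role, "jwt": workload_jwt})
```

Note that no Vault token appears anywhere in the Nomad-side login call: the signed workload identity JWT is the only credential presented.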

Use Cases & Advantages

This would be advantageous in many ways:

  • Nomad Users would not have to manage Vault token issuing, rotation, and revocation for Nomad clients.
  • Nomad Users would not have to manage Vault tokens for each Nomad workload.
  • Workload identity tokens could be time-bound and automatically rotated
  • Workload identity tokens could provide fine-grained access at the task level
  • Workload identity tokens could be automatically removed once the task has stopped
  • Multiple Vault clusters could be configured to use the same token. This could allow Nomad to talk to multiple clusters at once, handle performance replicas better, and handle disaster recovery better. Failovers could happen without token rotation in Nomad.
  • Vault could be more easily deployed as a Nomad job, as client tokens/config would not be needed.
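
Two of the points above, time-bound tokens and reuse of one token across several Vault clusters, can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names, the cluster addresses, and the role name are hypothetical, and the claim decoding deliberately skips signature verification (Vault would perform that against Nomad's public keys).

```python
import base64
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as used in JWT segments."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_unsigned_token(claims: dict) -> str:
    """Build a demo token with a placeholder signature (illustration only)."""
    header = _b64url(json.dumps({"alg": "none"}).encode())
    payload = _b64url(json.dumps(claims).encode())
    return f"{header}.{payload}.placeholder"


def jwt_claims(token: str) -> dict:
    """Decode a JWT payload WITHOUT verifying its signature."""
    payload = token.split(".")[1]
    padded = payload + "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def is_expired(token: str, now: Optional[float] = None) -> bool:
    """True when the token's `exp` claim is at or before `now`."""
    now = time.time() if now is None else now
    return jwt_claims(token)["exp"] <= now


def login_targets(vault_addrs, token, role="nomad-workloads"):
    """Pair each cluster's JWT login URL with the same login body."""
    body = {"role": role, "jwt": token}
    return [(addr.rstrip("/") + "/v1/auth/jwt/login", body) for addr in vault_addrs]
```

Because every cluster validates the same JWT against Nomad's keys, a failover in this sketch only changes which address is used, not which credential.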

Potential simultaneous improvements

While not directly related, there are a few other Vault improvements that should be considered while we do this upgrade:

  • Supporting batch tokens in Vault
  • Using Vault secrets in jobspecs, either in artifact stanza or Docker auth (I think this is likely unrelated, but such a popular feature that it's worth considering while we implement)
@tgross
Member

tgross commented Mar 24, 2023

Example of a motivating use case: #16639

@tgross tgross self-assigned this Aug 2, 2023
@tgross tgross added this to the 1.7.0 milestone Aug 17, 2023
@mikenomitch mikenomitch moved this from Later release shortlist (uncommitted) to 1.7 - Beta (ETA mid-Oct) in Nomad Roadmap Aug 17, 2023
tgross added a commit that referenced this issue Oct 25, 2023
Submitting a Consul or Vault token with a job is deprecated in Nomad 1.7 and
intended for removal in Nomad 1.9. Add a deprecation warning to the CLI when the
user passes in the appropriate flag or environment variable.

Nomad agents will no longer need a Vault token when configured with workload
identity, and we'll ignore Vault tokens in the agent config after Nomad 1.9. Log
a warning at agent startup.

Ref: #15617
Ref: #15618
tgross added a commit that referenced this issue Oct 26, 2023
@brucelok brucelok removed the status in Nomad Roadmap Oct 27, 2023
@brucelok brucelok modified the milestone: 1.7.0 Oct 27, 2023
@tgross
Member

tgross commented Nov 1, 2023

Shipped in Nomad 1.7.0-beta.1

@tgross tgross closed this as completed Nov 1, 2023
@tomqwpl

tomqwpl commented Feb 25, 2024

Hi, would it be expected that to successfully retrieve the JWKS, you need to have verify_https_client set to "false"? If I have that set to true, it is expected that the client present a TLS certificate (mutual TLS), but Vault isn't going to do that, so the retrieval of JWKS fails?
Thanks.

@tgross
Member

tgross commented Feb 28, 2024

@tomqwpl yeah that's right. We should probably update the verify_https_client docs to point out that it's safe to have verify_https_client=true for the API in general in the Nomad security model (so long as you've got ACLs enabled).

@tomqwpl

tomqwpl commented Feb 29, 2024

> that it's safe to have verify_https_client=true for the API in general

Do you mean "true" here? I'm assuming you mean "false"? I see in fact that the docs say "false".
I suspect we've enabled it because we're doing no further authentication to nomad, but I would need to clarify that.

Thanks.

@tgross
Member

tgross commented Mar 1, 2024

> Do you mean "true" here? I'm assuming you mean "false"? I see in fact that the docs say "false".

🤦 yes, sorry.

> I suspect we've enabled it because we're doing no further authentication to nomad, but I would need to clarify that.

Yeah in that case you really do want to enable TLS verification for the HTTP API. But it's not recommended to have ACLs disabled.
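
To make the interaction discussed in this exchange concrete, here is a small preflight check over a Nomad agent `tls` block that has already been parsed into a dict. The function is a hypothetical helper written for this example and is not part of Nomad; it only encodes the behavior described above, where a JWKS fetch that presents no client certificate fails when `verify_https_client` is true.

```python
def jwks_fetch_warnings(tls_config: dict) -> list:
    """Return warnings for a parsed Nomad agent `tls` block (as a dict)."""
    warnings = []
    # The setting only matters when TLS is enabled for the HTTP API.
    if tls_config.get("http") and tls_config.get("verify_https_client"):
        warnings.append(
            "verify_https_client is true: the HTTP API requires a client "
            "certificate, so a JWKS fetch that presents none will fail."
        )
    return warnings


# Example: mutual TLS on the HTTP API triggers the warning.
for message in jwks_fetch_warnings({"http": True, "verify_https_client": True}):
    print(message)
```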
