Add support for transparent authentication to the Task API #18125
Comments
Hi @aofei! As you might have guessed, the current behavior is intentional. Let me share the rationale from our internal design doc (here for HashiCorp folks):
The Future Work section described there is:
I think you're potentially onto an interesting problem with Identity Expiration and Rotation though. I'm going to tag @schmichael (who's actively working on that) and @mikenomitch as a heads up on this to see if they have thoughts on how we might approach this.
Excellent writeup and use case @aofei. Thanks for pinging me Tim.

Workaround
socat, haproxy, or some other "simple" proxy could be used in a sidecar task to perform transparent authentication for other tasks in its network namespace. haproxy should allow reloading credentials via SIGHUP to work around Traefik's missing feature there. I mention this workaround not because it's nice -- it's not -- but because it should work until we get something official shipped. This workaround will continue to work in perpetuity for people who don't really mind a single purpose sidecar sitting around.

Secure by default
One quick note on the aspiration of making all Nomad requests authenticated: the JWKS endpoint added in #18035 will likely always need to be available unauthenticated (and potentially some other similar endpoints). However, I'd love to have to explicitly special case unauthenticated endpoints and have the default be auth'd.

Transparent Task API auth
Back to the subject at hand! Can we tie a Task API socket to an

I like the opt-in design proposed so the default can continue to require explicit authentication. That's the only way to remain secure by default. However, by tying a unix socket to an identity, you're accomplishing the same goal from the user's perspective: they're explicitly authenticating, just via the jobspec instead of at runtime! Neat! In some sense this is more secure as there's no sensitive material inside the container that could be exfiltrated! You can even statically detect this behavior since it's in a jobspec presumably on disk somewhere. Sentinel (in enterprise) could be used to prevent (or enforce!) its use.

Not sure

Right now I can't think of a reason you would want to use an alternate identity for your Task API, so maybe the validation is as simple as "only the default identity can have

If we wanted to kill 2
task_api {
  unix_socket = true
  transparent_auth = false
}
where idk... just brainstorming. Those names are a bit awkward, and it's a shame it allows for the invalid
One more neat side effect of transparent auth with the default token: expiration, and therefore rotation, wouldn't be necessary! No secrets would leave Nomad, and the socket's identity would be valid as long as the allocation is non-terminal. I feel like I might be overlooking some gotcha here because it almost seems too good to be true. 😅

(Note that expiration, and therefore rotation, are absolutely necessary for identities used with Consul, Vault, and other 3rd parties as they're not able to perform the "is this for a valid alloc?" association that Nomad itself is. We must rely on all the OIDC-ish JWT and JWKS infrastructure there.)
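For anyone who wants to try the sidecar workaround mentioned earlier in this thread before anything official ships, here is a rough Python sketch of a token-injecting proxy. It is only an illustration under several assumptions: `UnixHTTPConnection` and `make_proxy_handler` are made-up names (nothing here is part of Nomad), it forwards GET requests only, and it takes the token as a fixed string, so it does not handle rotation by itself.

```python
import http.client
import http.server
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that talks to a Unix domain socket instead of TCP."""

    def __init__(self, socket_path, timeout=10):
        # "localhost" only fills the Host header; the socket path decides
        # where the bytes actually go.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def make_proxy_handler(socket_path, default_token):
    """Build a handler that forwards GETs to the socket (hypothetical helper).

    Requests that arrive without an X-Nomad-Token header get `default_token`.
    """

    class ProxyHandler(http.server.BaseHTTPRequestHandler):
        def do_GET(self):
            # Keep a caller-supplied token; otherwise fall back to the default.
            token = self.headers.get("X-Nomad-Token") or default_token
            upstream = UnixHTTPConnection(socket_path)
            try:
                upstream.request("GET", self.path, headers={"X-Nomad-Token": token})
                resp = upstream.getresponse()
                status = resp.status
                content_type = resp.getheader("Content-Type")
                body = resp.read()
            finally:
                upstream.close()

            self.send_response(status)
            if content_type:
                self.send_header("Content-Type", content_type)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep the sketch quiet

    return ProxyHandler
```

To run it, something like `http.server.ThreadingHTTPServer(("127.0.0.1", port), make_proxy_handler(path, token)).serve_forever()` should do, with `path` pointing at the task's `api.sock` and `token` read from `NOMAD_TOKEN`. A real sidecar would also need to re-read the token when it changes.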
Hi @tgross! Thanks for sharing the internal design docs, especially the Future Work section. I'm glad to know that Agent API UDS is already on your roadmap. I completely agree that client agents shouldn't bind to any port.

I'm actually in favor of the secure-by-default design. I haven't used the agent's HTTP endpoint since the Task API was introduced. And a while back, I made all requests go through the Task API. This was mainly because I encountered a bug that led to an access control bypass (which your team later identified as #16775). I thought I was the weird one, but it turns out you guys agree with this approach.

I must say, this use case could also benefit from the transparent authentication support. Here's my current Nomad HTTP jobspec:
job "nomad-http" {
  group "nomad-http" {
    network {
      port "nomad_http" {}
    }
    task "nomad-http" {
      driver = "docker"
      config {
        image = "caddy"
        args = ["caddy", "run", "--config", "/local/caddy/Caddyfile"]
        ports = ["nomad_http"]
      }
      identity { env = true }
      template {
        destination = "local/caddy/Caddyfile"
        data = <<EOF
:{{env "NOMAD_PORT_nomad_http"}}
reverse_proxy unix//{{env "NOMAD_SECRETS_DIR"}}/api.sock {
  header_up +X-Nomad-Token "{{env "NOMAD_TOKEN"}}"
}
EOF
      }
    }
  }
}
The Task API rejects all requests without a token, which makes the Nomad UI inaccessible: it shows "Unauthorized" and offers no way to set the token in the browser. So, as you can see, I had to use a solution like Caddy to proxy requests for the Task API. This allowed me to set a default token for all requests without one. However, with transparent authentication support, the implementation can be simplified to:
job "nomad-http" {
  group "nomad-http" {
    network {
      port "nomad_http" {}
    }
    task "nomad-http" {
      driver = "docker"
      config {
        image = "alpine/socat"
        args = [
          "TCP-LISTEN:${NOMAD_PORT_nomad_http},fork,reuseaddr",
          "UNIX-CONNECT:${NOMAD_SECRETS_DIR}/api.sock",
        ]
        ports = ["nomad_http"]
      }
    }
  }
}
Regarding this proposal, I initially forgot about #16436 and overlooked #18123. Given these two issues, it does seem that @schmichael's idea of introducing a new
Proposal
Given that #16872 is on the way and #16258 is likely planned, I'm thinking it might be a good idea to add support for transparent authentication to the Task API (aka ${NOMAD_SECRETS_DIR}/api.sock).
I'm not sure if this proposal is a good security practice. I just think it makes things easier.
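For context, this is roughly what explicit authentication against the Task API looks like from inside a task today. It is a minimal Python sketch, not an official client: `UnixHTTPConnection` and `task_api_get` are illustrative names, and it assumes the task has `NOMAD_SECRETS_DIR` and `NOMAD_TOKEN` in its environment (e.g. via `identity { env = true }`).

```python
import http.client
import os
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that talks to a Unix domain socket instead of TCP."""

    def __init__(self, socket_path, timeout=10):
        # "localhost" only fills the Host header; the socket path decides
        # where the bytes actually go.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def task_api_get(path, socket_path=None, token=None):
    """GET `path` over the Task API socket, sending the token explicitly.

    Hypothetical helper: defaults are read from the task's environment.
    Returns a (status, body_bytes) tuple.
    """
    if socket_path is None:
        socket_path = os.path.join(os.environ["NOMAD_SECRETS_DIR"], "api.sock")
    if token is None:
        token = os.environ["NOMAD_TOKEN"]

    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request("GET", path, headers={"X-Nomad-Token": token})
        resp = conn.getresponse()
        return resp.status, resp.read()
    finally:
        conn.close()
```

With transparent authentication, the `X-Nomad-Token` header (and therefore the need to have `NOMAD_TOKEN` inside the task at all) would go away; the rest of the client would stay the same.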
Use-cases
For a couple of my current use cases for Workload Identity, #16258 is going to cause some trouble. For example, with #16258, every Workload Identity rotation will cause the Traefik job to restart. Unfortunately, that's the only way Traefik can use the rotated NOMAD_TOKEN (since Traefik has no reload mechanism).
I was thinking that with #16872, we could make Traefik take advantage of the Task API (traefik/traefik#10044). By adding transparent authentication support to the Task API, we might be able to solve all such problems at once.
Think about this (assuming both #16872 and #16258 are fixed, along with traefik/traefik#10044):
Attempted Solutions
Perhaps add an api_sock option to the identity block: