[RFC] Model privileged execution environments in the task-driver API #6554
Comments
@endocrimes Such a thing would be really great. Where/how can the community help to move this forward?
@apollo13 I left HashiCorp last year so unfortunately I'm not sure about the current state of thinking here (and no longer have my notes on this). @schmichael should be able to be a little more helpful than me though :)
@schmichael Any chance of getting some feedback on this? If the suggested approach looks fine, I might be able to start working on a PR.
Hi @apollo13! I'm happy to get the conversation rolling... I think this is decidedly trickier from an architectural standpoint, so I'd love to get your thoughts here.

Gotcha 1: The major architectural barrier comes down to the fact that there can be more than one version of a task driver on the cluster, and in #11807 we were able to take advantage of the task driver name.

Gotcha 2: This is less of a blocker and more a design space to explore. The concept of "privileged" isn't really a boolean!
Hi @tgross, I fully agree that this one will not be that easy. I do not think it makes sense to expose all the options you listed in gotcha 2 as capabilities. I also do not have a good idea on how to solve gotcha 1, but I do have an idea: what if it were possible to configure drivers like this in the client config?
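A hypothetical sketch of that idea (assuming Nomad allowed the same plugin to be configured twice under different names, which it does not today; `docker-privileged` is an illustrative alias, while `allow_privileged` is the existing docker driver option):

```hcl
# Hypothetical client config sketch — not supported by Nomad today.
# "docker-privileged" is an illustrative alias for a second instance of the
# docker plugin; "allow_privileged" is the existing docker driver option.
plugin "docker" {
  config {
    allow_privileged = false
  }
}

plugin "docker-privileged" {
  config {
    allow_privileged = true
  }
}
```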
In the task one would then specify "docker-privileged" as the driver name… This would allow us to push the granularity of what is and is not allowed into the task driver itself, without the servers (or namespaces) needing to know more than a plugin name. I am not sure yet how feasible this is code-wise (i.e., dispatching a plugin twice with different configs), and it would certainly need a bit of code so that the plugins are properly fingerprinted, but aside from that it should be opaque (for lack of a better word) to Nomad. Is that a direction worth pursuing?
I threw together https://github.com/sorenisanerd/nomad-docker-driver-external as a workaround. It's the internal docker driver, but built as an external plugin. This lets you apply different configuration to the internal and external driver, and then you can restrict the one that allows privileged containers to specific namespaces.
Ha, that is a nice idea. Does this also work in conjunction with Connect jobs, etc.?
I made sure to import the docker driver and commit it, and then layered my changes on top: sorenisanerd/nomad-docker-driver-external@9685fc8. As you can see, I haven't really changed anything, I just glued it together. Anything that works with the docker driver should work with this one. Just specify `driver = "docker-ext"` in your job description. Make sure to heed the advice in the README, and you should be good to go.
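A minimal sketch of a task stanza (inside a job's group) using that external driver; the task name and image are placeholders:

```hcl
# Illustrative task stanza using the external "docker-ext" driver from the
# workaround above. Task name and image are placeholders.
task "web" {
  driver = "docker-ext"

  config {
    image = "redis:7"
  }
}
```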
Note: This issue mostly contains initial background thoughts to prompt discussion and is not yet a well-defined proposal.
Background
Currently, Nomad's permission model around the runtime permissions of a job exists only within the implementation of a driver. This means that we do not include these permissions in our ACL system, and the scheduler does not take into account whether the features are enabled on a particular client.
This is fine if a Nomad cluster is a uniform fleet, but that is rarely the case in larger clusters; today it requires users to add additional metadata to privileged clients and constraints to the jobs that require them. It also allows anyone with the `submit-job` permission in any namespace to get privileged access to those hosts.

As part of #5378, however, access to privileged containers will become more common, as CSI plugins require privileged containers in order to create new mounts in the system for publishing volumes. Although we operate on a trusted-operator security model, there are many valid cases where CSI plugins may be deployed without granting trivial Docker privilege escalation to everyone.
Proposal
I'm proposing introducing a Nomad-level API for modelling process isolation levels. This change would introduce a `privileged` option at the `task` level in a Nomad job, which would signal to Nomad that the job may only be placed on nodes where the driver exposes the `Privileged` execution capability. It would also allow the introduction of a `privileged-execution` capability to the ACL system.

Task Configuration
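A minimal sketch of what the proposed task configuration might look like; the `privileged` field does not exist in Nomad today, and the job, task, and image names are placeholders:

```hcl
# Sketch of the proposed task-level option — "privileged" is not a real
# Nomad jobspec field today; everything else is a standard docker task.
job "csi-plugin" {
  group "nodes" {
    task "plugin" {
      driver     = "docker"
      privileged = true

      config {
        image = "example/csi-plugin:latest"
      }
    }
  }
}
```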
Driver API
This is currently mostly undefined, but it would mostly involve updating the `DriverCapabilities` to introduce a new field that plugins may opt in to, and introducing a privileged option to `TaskConfig`.

Opting nodes into privileged execution
Currently, Nomad requires you to configure drivers with support for privileged execution modes in the client configuration. After this change, you will still be required to enable support on each individual client, but by default privileged execution modes will require using the new configuration.

To allow for backwards compatibility and a cleaner upgrade path, we will also offer an option in the driver configuration to retain the existing behavior for using privileged execution environments.
Docker
Example Config
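A hypothetical sketch of a client plugin configuration under this proposal; `allow_privileged` is the existing docker driver option, while `legacy_privileged_behavior` is the option proposed here and does not exist today:

```hcl
# Sketch of a client configuration opting the docker driver into the new
# behavior. "legacy_privileged_behavior" is the option proposed by this RFC;
# "allow_privileged" is the existing docker driver option.
plugin "docker" {
  config {
    allow_privileged           = true
    legacy_privileged_behavior = false
  }
}
```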
Raw Exec
The Raw Exec driver will begin exposing an `Unconstrained` isolation capability when `legacy_privileged_behavior` is `false`, which will require that a user has access to `privileged` execution modes.