Make the base of CSI staging_target_paths and target_paths configurable #13263
Hi @ejweber! Thanks for opening this issue. I'm happy to see CSI plugin authors taking a look at Nomad. So just for my understanding of the problem, the plugin is in compliance with the spec because it accepts the
It looks like I never opened a new issue to track that, as we just didn't think it would be something anyone cared about. Mea culpa. We can treat this issue as the feature request and try to get it roadmapped. It's not a huge fix by any means; we'll just have to juggle it in with the other priorities. We'd also be thrilled to review a patch for it if you're interested!
We mount the root file system read-only to the plugin container like

The requested change comes into play because code "inside" the container and the utilities "outside" the container need to see the same staging_target_path and target_path. If we know ahead of time that a staging_target_path will be like

We recognize that this is a bit of a strange use case. If all userspace utilities shipped in the container image (as is generally expected), we would be able to use the Nomad CSI functionality as currently implemented. However, while trying to understand what changed between our Nomad CSI beta implementation and Nomad CSI GA, we came across the comment suggesting the very change we need to get working the way we already do in Kubernetes. If it is implemented, we have a straightforward path to supporting Nomad "again" (we have some example manifests that no longer work).
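For context, the bind-mount-plus-chroot technique described above can be sketched roughly as follows. This is an illustrative Go helper (hypothetical names, not the driver's actual code) that builds, but does not run, a command executing a node-installed utility such as `mount` or `beegfs-ctl` inside a chroot to the host root bind-mounted into the container:

```go
package main

import (
	"fmt"
	"os/exec"
)

// hostCmd builds (but does not run) a command that executes a utility
// installed on the node inside a chroot to the host root file system,
// which is assumed to be bind-mounted into the plugin container at
// hostRoot. Hypothetical helper for illustration only.
func hostCmd(hostRoot, utility string, args ...string) *exec.Cmd {
	chrootArgs := append([]string{hostRoot, utility}, args...)
	return exec.Command("chroot", chrootArgs...)
}

func main() {
	// The staging path passed to the host-side mount must resolve
	// identically inside and outside the chroot, which is why
	// predictable staging_target_paths matter to this driver.
	cmd := hostCmd("/host", "mount", "-t", "beegfs", "beegfs_nodev",
		"/var/lib/kubelet/plugins/kubernetes.io/csi/pv/vol1")
	fmt.Println(cmd.Args)
}
```

Running the wrapped command requires root and the actual bind mount; the sketch only shows how the invocation is assembled.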
Apologies for the delay. I've been quite interested in contributing a patch that moves this issue along, but I needed to navigate my company's process for approving Open Source contributions. I submitted the above PR (#13919) to that end. As it's my first contribution to Nomad, I suspect it will need some work.
Hi, not trying to derail this issue, but given the hoops you have to go through to make this CSI driver work (chroot, etc.), I wonder if it might not be an option for you (and maybe even easier) to run with the
Hello @apollo13, This is an interesting thought that we haven't considered. On the surface at least, it sounds like this approach would make it easy to run the driver binary without the added complexity associated with paths and containerization. However, there are a couple of things that make me less inclined to take this approach.
If this issue is ultimately closed without the changes we are hoping for, we may certainly revisit other options (and the one you are suggesting can now be added to the list). However, given that @tgross was already planning something to this effect and given that the issue is in stage/accepted, we're hopeful we've already identified the path forward.
Hi @ejweber,
Sure, Kubernetes wants everything to be containerized :D I mainly suggested

While I understand that you are hesitant to go down a second road when you already have something working, I personally prefer simplicity in all things. In that sense, I do not think that chroot and chwrap make things easier to debug when they fail.

Regarding your descriptions here, I am left with one question though: even if you can control

Out of curiosity: is there a deployment example for BeeGFS (the fs itself, not the CSI) that runs inside Nomad?
You're right that it's a bit brittle (as we had to deal with in this issue). We considered implementing some large(ish) changes in the driver itself with some configuration flags that would skirt the issue by doing path transformations between internal and external paths, but this would also involve user input (if the external paths they used differed from the defaults we expected). Our experience with Kubernetes is that every distribution has used the same external paths going back many versions, and we generally expect the same with Nomad now that CSI has gone GA. We don't expect most users to have to set the stage_publish_base_dir field, as the base manifests we provide will already have it preconfigured for most clusters.

Hopefully we will eventually be able to release a larger container that simply ships the utilities we need (the decision to go distroless is a policy one), but raw_exec is certainly interesting. I plan to give it a shot when I can as an alternative approach.

I haven't ever deployed BeeGFS on Nomad (the BeeGFS CSI driver presumes an existing BeeGFS file system, and we have many such file systems set up in our test environment already), but it should be relatively straightforward depending on the use case. I'm not sure exactly what would be gained by doing it, as I tend to see Nomad jobs as somewhat ephemeral and BeeGFS file systems as largely permanent. However, if BeeGFS were installed on all Nomad nodes and a job wanted to set up an ephemeral file system using node-local storage targets (maybe the nodes have unused NVMe drives or something like that), it could be interesting. Or if an administrator were all in on Nomad and just wanted to use Nomad to manage all aspects of their infrastructure (we definitely see this with some Kubernetes distributions/uses), it might be something they'd want.
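The path-transformation workaround dismissed above could look something like this sketch. It assumes the plugin knows both its internal base (e.g. `/local/csi`) and a user-supplied external base; the function name and bases are illustrative, not the driver's actual configuration:

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// translatePath rewrites a staging_target_path as seen inside the
// container (rooted at internalBase) into the corresponding path on
// the node (rooted at externalBase), so chrooted host utilities can
// operate on the same location. Illustrative sketch only.
func translatePath(p, internalBase, externalBase string) (string, error) {
	if !strings.HasPrefix(p, internalBase) {
		return "", fmt.Errorf("path %q is not under %q", p, internalBase)
	}
	rel := strings.TrimPrefix(p, internalBase)
	return filepath.Join(externalBase, rel), nil
}

func main() {
	external, err := translatePath(
		"/local/csi/staging/vol1",
		"/local/csi",
		"/opt/nomad/client/csi/monolith/beegfs-plugin",
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(external)
	// Prints: /opt/nomad/client/csi/monolith/beegfs-plugin/staging/vol1
}
```

As noted above, the drawback is that this still requires the operator to supply the external base whenever it differs from an expected default.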
I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues. |
In the linked PR, @tgross effectively ensured that containers running CSI plugins would receive staging_target_paths like `/local/csi/staging`. The work was aimed at giving multiple CSI plugins unique locations outside their containers to stage mounts at (while inside their containers, each plugin would see an isolated `/local/csi/staging` directory). The assumption (I think) was that CSI plugins should not care what path a mount is staged at.

In the below comment, @tgross mentioned he would eventually make this `/local/csi` path configurable in the `csi_plugin` jobspec. Though there was some discussion on whether or not there was even a use case for this, it was implied that the work would be done.

Originally posted by @tgross in #12078 (comment)

For better or worse, the BeeGFS CSI driver does need this change to work with the GA Nomad CSI implementation. (We were originally able to run in the beta implementation.) For a variety of reasons, the BeeGFS CSI driver container image does not ship with the mount utility or certain other required userspace utilities (e.g. beegfs-ctl). Instead, we use a bind mount and a chroot to execute the versions of these utilities that are already installed on a running node. This works fine in Kubernetes, where staging_target_paths are reasonably predictable and provided as absolute paths that resolve in the node's mount namespace (e.g. `/var/lib/kubelet/plugins/kubernetes.io/csi/pv/`). A simple bind mount like

`/var/lib/kubelet/plugins/kubernetes.io/csi/pv/ : /var/lib/kubelet/plugins/kubernetes.io/csi/pv/`

ensures the plugin code and the chrooted utilities have the same view of the mount location.

Because Nomad provides staging_target_paths like `/local/csi/staging/...`, a similar solution is not possible. We would like to be able to configure Nomad as @tgross suggested. For example, where Nomad today creates the bind mount

`/opt/nomad/opt/nomad/client/csi/monolith/beegfs-plugin : /local/csi`

we would instead like it to create the bind mount

`/opt/nomad/opt/nomad/client/csi/monolith/beegfs-plugin : /opt/nomad/opt/nomad/client/csi/monolith/beegfs-plugin`

and provide staging_target_paths like

`/opt/nomad/opt/nomad/client/csi/monolith/beegfs-plugin/staging/...`
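A hypothetical jobspec shape for the requested knob might look like the fragment below. The `stage_publish_base_dir` name comes from later discussion in this thread; its exact name, placement, and semantics are assumptions, not a committed Nomad API:

```hcl
task "plugin" {
  driver = "docker"

  csi_plugin {
    id        = "beegfs-csi-driver"
    type      = "monolith"
    mount_dir = "/local/csi" # today's default internal mount point

    # Hypothetical field: make the host-side base directory configurable
    # so staging_target_paths resolve identically inside and outside the
    # plugin container.
    stage_publish_base_dir = "/opt/nomad/client/csi/monolith/beegfs-plugin"
  }
}
```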
Is the referenced jobspec change still planned?