Ability to have nodes which can be used only for jobs with a specific constraint #2299
Comments
Why do you need to put
@dvusboy We want no jobs to run on C except those with a specified constraint. Currently we can direct some jobs specifically to C with the constraint "${node.class}" = "bar", but all the jobs without any constraint could still be run on A, B, or C. We want them to run only on A or B, so we have to put the constraint "${node.class}" = "foo" on them.
Or
@dvusboy That's what I want indeed.
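The pattern described in this exchange can be sketched in jobspec HCL. The class names "foo"/"bar" come from the comment; everything else (job names, datacenter) is illustrative:

```hcl
# Jobs that must land on the special nodes (class "bar"):
job "special" {
  datacenters = ["dc1"]

  constraint {
    attribute = "${node.class}"
    value     = "bar"
  }
  # ... task groups ...
}

# Every other job must carry the opposite constraint by hand,
# otherwise it may still be placed on the class "bar" nodes:
job "ordinary" {
  datacenters = ["dc1"]

  constraint {
    attribute = "${node.class}"
    value     = "foo"
  }
  # ... task groups ...
}
```

The pain point of the issue is exactly that second block: nothing enforces it, so one forgotten constraint leaks workloads onto the reserved nodes.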
This would be a very helpful feature. 👍
I agree, this would be a very nice feature. We have a redundant Nomad setup, but a few nodes are highly special (different network config and similar; basically bridging our environment to other environments). If we could set up a default constraint on those special nodes (only jobs with class "bridge-node", for example), we could make sure that no one deploys anything on those nodes unless they really mean to. Ideally this should also be protected through Vault integration, so we could limit who is allowed to deploy to these network-bridging nodes. The only solution right now seems to be a dedicated Consul/Nomad cluster for these nodes, just to make mistakes hard.
This would be ideal. Here's what I've run into: I've got a handful of specialized build servers in my cluster. These are really big machines (16+ cores and a ton of RAM), and additionally they have specific hardware supporting a specific set of CPU instructions that I'm using to build and test a suite of applications optimized for machines with those instruction sets. Ideally I'd like only jobs tagged with a very specific constraint to deploy onto those machines. I can't trust that the other developers I give access to (deploying jobs through tools that interface with Nomad) are always going to add a constraint to their jobs so that they don't get scheduled onto my specialized machines. Something has to be there so that nodes can enforce a set of node-specific rules on which jobs will get scheduled onto them when job constraints aren't defined in a job definition.
Would be very useful for us too.
Any news on this feature? There seem to be several good use cases. I'm interested in how other people are solving this on their clusters.
This is the first time I've read this request, and it's interesting to me: Nomad doesn't really have a way to disable placements on a node by default except for jobs which explicitly target it. You could use datacenters today to achieve this: give the special nodes their own datacenter, separate from your normal one. It's a little hacky, but may be easier than trying to enforce proper constraints on every job.
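The code snippets in this comment didn't survive, but the datacenter workaround it describes would presumably look something like this (datacenter and job names are illustrative, not from the original comment):

```hcl
# Agent config on the special nodes: give them their own datacenter.
datacenter = "dc1-special"

client {
  enabled = true
}

# Normal jobs list only the regular datacenter and therefore can
# never be placed on the special nodes:
#   job "normal"  { datacenters = ["dc1"]         ... }
#
# Jobs that should run on the special nodes opt in explicitly:
#   job "special" { datacenters = ["dc1-special"] ... }
```

Because a job is only eligible for the datacenters it lists, the default-deny behavior falls out without any per-job constraints.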
@schmichael I like this workaround, thanks for the idea!
Looks like in v0.9 you can manage job placement via affinities:
Affinities are in the 0.9 beta and will be in the final release coming soon; closing this.
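For context, an affinity is a soft scheduling preference rather than a hard exclusion. A sketch (job and class names illustrative):

```hcl
job "batch-analytics" {
  datacenters = ["dc1"]

  # Prefer nodes of this class, but do not require them: the
  # weight only biases the scheduler's scoring, so the job may
  # still be placed elsewhere if no such node has capacity.
  affinity {
    attribute = "${node.class}"
    value     = "high-memory"
    weight    = 100
  }
  # ... task groups ...
}
```

This softness is the crux: an affinity cannot guarantee that unrelated jobs stay off a reserved node.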
@preetapan, I think we should reopen the issue.
I would also opt to have this re-opened, since affinities are soft. If I understood correctly, this issue is about a hard, cluster-wide, implied constraint.
@preetapan @schmichael what do you think about reopening the issue?
How is the datacenter workaround insufficient? #2299 (comment)
For a start, it's unobvious, and as you noted yourself, it's a workaround.
Agreed this should remain open. I think another compelling use case is to actually suggest people allow Nomad servers to be clients as well. Then using this feature you could ensure the servers aren't considered for the vast majority of workloads, but you could still use system jobs for log shippers and monitoring tools on the servers.
Another use case: access to restricted data. I have a Redis cluster that can only be accessed by certain services. Access to Redis is controlled by iptables, so I need a subset of my Nomad agents to have iptables rules allowing access to that Redis cluster.
Option 1: new datacenter. I'd rather not use the datacenter workaround, because I'm already using the datacenter primitive for my two co-located DCs. There is a 1:1 mapping between Consul DC and Nomad DC, and deviating from that would require training developers about the exception.
Option 2: Consul Connect + Envoy. Theoretically Envoy would be a viable option: I could use Consul ACLs to restrict access to the Redis cluster to certain workloads. But I've tried, and it's not possible to use a proxy with Envoy here. There is experimental support for native Redis in Envoy, but it doesn't work with the traditional Redis clustering I use.
Taking a look at this - thanks all for the input.
@schmichael the datacenter option sounds fine in theory, but that would mean I need a separate set of Consul server(s) for the extra datacenter. For me, the Consul servers are also Nomad servers for my actual DC, and the use case is exactly what you have mentioned here: #2299 (comment). I want to run some trivial system jobs on those servers.
Agree that this should really be a client option. There are many use cases where one wants to prevent jobs from being scheduled on a set of Nomad clients with a certain node class. Very simple example: you may want to dedicate a set of workers to running Elasticsearch nodes, and you'd use a node class to pin those jobs to that set of nodes. Even though this can be accomplished using datacenters, that approach has other issues with features like balancing placement using
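The node-class pinning described here takes two pieces: a client-side setting and a per-job constraint. A sketch (class and job names illustrative):

```hcl
# Agent config on the dedicated Elasticsearch workers:
client {
  enabled    = true
  node_class = "elasticsearch"
}

# Jobspec: pin the workload to that class.
#
#   job "elasticsearch" {
#     datacenters = ["dc1"]
#     constraint {
#       attribute = "${node.class}"
#       value     = "elasticsearch"
#     }
#     ...
#   }
```

As the comment notes, this only pulls the Elasticsearch job onto those nodes; nothing pushes other jobs away, which is why a client-side "only accept jobs that target me" option keeps coming up.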
+1 for this feature, please.
This is our exact use case. We want to run a monitoring job on the Nomad cluster servers, but no other job should ever be scheduled onto those clients. We work around it at the moment by specifying a constraint on every job.
My current workaround is to have my cluster nodes in a different datacenter than my member nodes, and it seems to work well.
@axsuul Did you have to run a separate Consul DC as well?
All my nodes are within the same Consul datacenter, so they can still communicate with each other; Consul datacenters don't seem to be related to Nomad datacenters. Just to clarify, all my Consul nodes are within the
TIL! Thanks for the specifics! I have stuck to a fixed configuration since forever, and I guess I had a mental association between those config parameters.
Just chiming in with support for this feature. I was quite surprised to realise that you can set up namespaces or flag nodes with a class, but then can't use those in ACLs and job restrictions. Having a Nomad client only accept jobs from one namespace, and then adding an ACL on that namespace to restrict who can launch jobs, would be very useful.
Hi, just wanted to provide an update here. We are planning to ship a feature called Node Pools in 1.6 that will address some of this! We can post a technical design doc later with more details, but the idea is that you can have an additional (and optional) attribute on nodes called "node_pool". It will work similarly to node_class, except you will have to opt in to placing jobs onto non-default node_pools. You will also be able to tie node_pool placement to namespaces, so you can tie ACL policies to placing jobs on the pool. This might not be exactly what was requested, but I think it's close enough in spirit to add the 1.6 milestone. |
Hey everybody, as I mentioned earlier, we're planning a relatively simple version of this request for 1.6. Each node can have an additional (and optional) attribute called "node_pool". It will work similarly to node_class, except you will have to opt in to placing jobs onto non-default node pools.

One of the constraints of this approach is that each node can only be part of a single node pool. This works if you want to force job-writers to opt into specific pools for something like prod/dev/test, or to exclude a set of nodes by default ("you must opt into the GPU node pool explicitly"), but there are some more complex use cases it doesn't support. For instance, you couldn't have a series of "taints" and have to tolerate all of them at the jobspec level (excuse the K8s terminology 😄).

We're interested in learning more about how people would use more complex opt-in constraints, where you would have to opt in to multiple "pre-set" constraints (an "AND"). Perhaps there's even a case where you might have to opt into one of several constraints (an "OR")? Or instances where you might be mutating the node's constraint-set quite a bit. If you've got a use case like this, please let us know in a comment! Also, if you feel like talking through your use case with the team, feel free to grab a time and we can chat!
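Based on the description above, a rough sketch of how the proposed node_pool attribute might be used (names are illustrative, and the final shipped syntax may differ):

```hcl
# Agent config: place this node in a non-default pool.
client {
  enabled   = true
  node_pool = "gpu"
}

# Jobspec: jobs must opt in to the pool explicitly; a job that
# says nothing stays in the default pool and is never placed on
# "gpu" nodes.
#
#   job "training" {
#     node_pool = "gpu"
#     ...
#   }
```

The key inversion versus node_class is the default: with node_class the special nodes are open unless every job excludes them, while with node pools the special nodes are closed unless a job opts in.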
Not an exact match, but adding this link here so we can close this out once it has shipped: #11041
Shipped!
Reference: https://groups.google.com/forum/#!topic/nomad-tool/Nmv8LiMUnEg
It would be great to have a way to prevent jobs from running on a node unless they specify a constraint!
Quoted from the mailing list discussion:
Thanks!