It's been reported privately that jobs in a non-default namespace can cause node drains to hang. The behavior doesn't appear to be reproducible when the jobs are in the default namespace. It has been reported against Nomad 1.0.2 and reproduced on Nomad 1.0.4.
Similar symptoms to #7432, but in that case the job was in the default namespace.
This is easily reproducible on a cluster with two clients. Create 3 namespaces:
$ nomad namespace apply ns1
$ nomad namespace apply ns2
$ nomad namespace apply ns3
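Prepare a minimal jobspec and create three copies of it, one per namespace. The original attachment isn't reproduced here; a sketch consistent with the output below (job "example", group "web"; the driver and image are assumptions, and the namespace could equally be set with the -namespace CLI flag) might look like:

```hcl
# example1.nomad -- example2.nomad and example3.nomad differ only in
# the namespace (ns2, ns3). Driver and image are illustrative.
job "example" {
  namespace   = "ns1"
  datacenters = ["dc1"]

  group "web" {
    task "web" {
      driver = "docker"

      config {
        image = "nginx:alpine"
      }
    }
  }
}
```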
Check the client status, then drain the first node before running any jobs (to ensure they all end up on the same client):
$ nomad node status
ID DC Name Class Drain Eligibility Status
6cb8547c dc1 ip-172-31-7-19 <none> false eligible ready
bb027927 dc1 ip-172-31-15-185 <none> false eligible ready
# drain the first node
$ nomad node drain -enable -yes 6cb8547c
2021-03-12T14:33:45-05:00: Ctrl-C to stop monitoring: will not cancel the node drain
2021-03-12T14:33:45-05:00: Node "6cb8547c-8de2-ec90-453f-63f6b6abd572" drain strategy set
2021-03-12T14:33:45-05:00: Drain complete for node 6cb8547c-8de2-ec90-453f-63f6b6abd572
2021-03-12T14:33:45-05:00: All allocations on node "6cb8547c-8de2-ec90-453f-63f6b6abd572" have stopped
Run all three jobs:
$ nomad job run ./example1.nomad
==> Monitoring evaluation "308e5823"
Evaluation triggered by job "example"
==> Monitoring evaluation "308e5823"
Evaluation within deployment: "2eb1d3bd"
Allocation "a5bb2d98" created: node "bb027927", group "web"
Evaluation status changed: "pending" -> "complete"
==> Evaluation "308e5823" finished with status "complete"
$ nomad job run ./example2.nomad
==> Monitoring evaluation "7d69aa4d"
Evaluation triggered by job "example"
==> Monitoring evaluation "7d69aa4d"
Evaluation within deployment: "589639ce"
Allocation "294844e2" created: node "bb027927", group "web"
Evaluation status changed: "pending" -> "complete"
==> Evaluation "7d69aa4d" finished with status "complete"
$ nomad job run ./example3.nomad
==> Monitoring evaluation "bdb9f1f1"
Evaluation triggered by job "example"
==> Monitoring evaluation "bdb9f1f1"
Evaluation within deployment: "c7437065"
Allocation "236225aa" created: node "bb027927", group "web"
Evaluation status changed: "pending" -> "complete"
==> Evaluation "bdb9f1f1" finished with status "complete"
# verify they're running
$ nomad job status -namespace ns1
ID Type Priority Status Submit Date
example service 50 running 2021-03-12T14:34:13-05:00
$ nomad job status -namespace ns2
ID Type Priority Status Submit Date
example service 50 running 2021-03-12T14:34:16-05:00
$ nomad job status -namespace ns3
ID Type Priority Status Submit Date
example service 50 running 2021-03-12T14:34:19-05:00
Unset the node drain to give us a place to migrate the allocs:
$ nomad node drain -disable -yes 6cb8547c
Drain the node that has the allocs. It will reliably hang here:
$ nomad node drain -enable -yes -deadline 5m bb027927
2021-03-12T14:35:48-05:00: Ctrl-C to stop monitoring: will not cancel the node drain
2021-03-12T14:35:48-05:00: Node "bb027927-354b-cab2-776d-692cbc24d131" drain strategy set
2021-03-12T14:35:49-05:00: Alloc "236225aa-bea5-26ca-aa12-80a4d6985c60" marked for migration
2021-03-12T14:35:49-05:00: Alloc "294844e2-26b2-6926-0781-64b88a897d67" marked for migration
2021-03-12T14:35:49-05:00: Alloc "a5bb2d98-4d66-8741-6c27-5f1732cf2f38" marked for migration
2021-03-12T14:35:49-05:00: Alloc "a5bb2d98-4d66-8741-6c27-5f1732cf2f38" draining
2021-03-12T14:35:54-05:00: Alloc "a5bb2d98-4d66-8741-6c27-5f1732cf2f38" status running -> complete
If we check the drain status, we see one job has drained, but not the others:
$ nomad job status -namespace ns1 example
...
Allocations
ID Node ID Task Group Version Desired Status Created Modified
6d18ef66 6cb8547c web 0 run running 37s ago 18s ago
a5bb2d98 bb027927 web 0 stop complete 2m13s ago 31s ago
$ nomad job status -namespace ns2 example
...
Allocations
ID Node ID Task Group Version Desired Status Created Modified
294844e2 bb027927 web 0 run running 2m43s ago 2m33s ago
$ nomad job status -namespace ns3 example
...
Allocations
ID Node ID Task Group Version Desired Status Created Modified
236225aa bb027927 web 0 run running 3m1s ago 2m50s ago
After the 5 minute deadline passes, one more of the allocations drains and the node reports that the drain is now complete. But the last allocation is still running on the node:
...
2021-03-12T14:40:49-05:00: Drain complete for node bb027927-354b-cab2-776d-692cbc24d131
2021-03-12T14:40:49-05:00: Alloc "294844e2-26b2-6926-0781-64b88a897d67" draining
2021-03-12T14:40:54-05:00: Alloc "294844e2-26b2-6926-0781-64b88a897d67" status running -> complete
I've pulled both the Nomad server logs and the Nomad client logs and found nothing of particular note, so this isn't throwing unexpected errors. I've grabbed a debug bundle and uploaded it here (this cluster was for testing only and has been destroyed, so nothing private is exposed).
This fixes a bug affecting node drains, where allocs could fail to be
migrated if they belonged to different namespaces but shared the same
job name.
The root cause is that the helper function that creates the migration
evals indexed the allocs by job ID without accounting for their
namespaces. When job IDs clash across namespaces, an eval is created
for only one of them and the remaining allocs are left running on the
node.
Fixes #10172
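The faulty indexing can be illustrated with a minimal Go sketch. The types and function names below are illustrative stand-ins, not Nomad's actual drainer code; the point is only the choice of map key:

```go
package main

import "fmt"

// Alloc is a stand-in for Nomad's allocation struct (illustrative only).
type Alloc struct {
	ID, Namespace, JobID string
}

// groupBuggy mimics the pre-fix behavior: allocs are indexed by job ID
// alone, so same-named jobs in different namespaces clobber one key.
func groupBuggy(allocs []Alloc) map[string][]Alloc {
	m := map[string][]Alloc{}
	for _, a := range allocs {
		m[a.JobID] = append(m[a.JobID], a)
	}
	return m
}

// groupFixed mimics the fix: the namespace is part of the key, so each
// (namespace, job ID) pair gets its own entry.
func groupFixed(allocs []Alloc) map[[2]string][]Alloc {
	m := map[[2]string][]Alloc{}
	for _, a := range allocs {
		k := [2]string{a.Namespace, a.JobID}
		m[k] = append(m[k], a)
	}
	return m
}

func main() {
	// The three allocs from the repro: same job name, three namespaces.
	allocs := []Alloc{
		{"a5bb2d98", "ns1", "example"},
		{"294844e2", "ns2", "example"},
		{"236225aa", "ns3", "example"},
	}
	// One migration eval would be created per map key, so the buggy
	// grouping yields a single eval for three distinct jobs.
	fmt.Println(len(groupBuggy(allocs)), len(groupFixed(allocs))) // 1 3
}
```

With one key per (namespace, job ID) pair, each job gets its own migration eval and all three allocs drain.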
Debug bundle: nomad-debug-2021-03-12-193800Z.tar.gz