Service tags are silently dropped when defined in separate blocks within the same task #5827

Closed
hobochili opened this issue Jun 12, 2019 · 3 comments

@hobochili
Contributor

hobochili commented Jun 12, 2019

Nomad version

# nomad version
Nomad v0.9.2 (028326684b9da489e0371247a223ef3ae4755d87)

Operating system and Environment details

NAME="Ubuntu"
VERSION="18.04.2 LTS (Bionic Beaver)"

Issue

In 0.8.6 we were able to define multiple service blocks with the same name but different tags. We rely on this to map service tags to distinct ports. In 0.9.2, only the tags defined in the last service block end up attached to the Consul service; the tags from the other service blocks are silently dropped. This can lead to subtle failures during the upgrade process because the service checks don't enter a failed state; they just disappear completely.

Reproduction steps

Run a job in 0.8.6 with a task that contains two or more service blocks with the same name but different tags and query Consul to see that all service tags are registered. Repeat in 0.9.2 to see that only tags from the last service block are registered.
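
For anyone repeating the comparison, here is a minimal sketch of how the "query Consul" step could be automated. It only post-processes the JSON returned by `/v1/catalog/service/<name>` and relies on the `ServiceID` and `ServiceTags` fields shown in the results below. The function names and the trimmed sample entry are illustrative assumptions, not part of Nomad or Consul.

```python
from collections import defaultdict


def tags_by_service_id(catalog_entries):
    """Map each Consul ServiceID to the sorted tags registered for it."""
    tags = defaultdict(set)
    for entry in catalog_entries:
        tags[entry["ServiceID"]].update(entry.get("ServiceTags") or [])
    return {service_id: sorted(found) for service_id, found in tags.items()}


def missing_tags(catalog_entries, expected_tags):
    """Return the expected tags that no registered service instance carries."""
    registered = set()
    for entry in catalog_entries:
        registered.update(entry.get("ServiceTags") or [])
    return sorted(set(expected_tags) - registered)


# Trimmed copy of the single catalog entry reported under 0.9.2.
sample_092 = [
    {
        "ServiceID": "_nomad-task-a04567a0-b3b3-6210-ed69-44473e502d54-dummy-dummy",
        "ServiceName": "dummy",
        "ServiceTags": ["baz"],
    }
]

print(tags_by_service_id(sample_092))
print(missing_tags(sample_092, ["foo", "bar", "baz"]))  # -> ['bar', 'foo']
```

In practice the list would come from `json.load()` on the saved `curl` output rather than from an inline sample.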

Job file

job "dummy" {
  region      = "us-east-1"
  datacenters = ["us-east-1a"]

  constraint {
    attribute = "${attr.nomad.version}"
    value     = "0.9.2"
  }

  group "dummy" {
    count = 1

    task "dummy" {
      driver = "docker"

      config {
        image   = "ubuntu:bionic"
        command = "/bin/sh"
        args    = ["-c", "/usr/bin/tail -f /dev/null"]
      }

      service {
        name = "dummy"
        port = "foo"

        tags = ["foo"]

        check {
          name     = "foo check"
          type     = "script"
          command  = "/bin/true"
          interval = "10s"
          timeout  = "2s"
        }
      }

      service {
        name = "dummy"
        port = "bar"

        tags = ["bar"]

        check {
          name     = "bar check"
          type     = "script"
          command  = "/bin/true"
          interval = "10s"
          timeout  = "2s"
        }
      }

      service {
        name = "dummy"
        port = "baz"

        tags = ["baz"]

        check {
          name     = "baz check"
          type     = "script"
          command  = "/bin/true"
          interval = "10s"
          timeout  = "2s"
        }
      }

      resources {
        cpu    = 100
        memory = 32

        network {
          mbits = 10

          port "foo" {}
          port "bar" {}
          port "baz" {}
        }
      }
    }
  }
}
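
Since the tags are dropped without any log entry, a pre-flight check on the job itself may help during an upgrade. The sketch below flags service names that are reused within one task. It assumes the job has already been converted to the JSON shape used by Nomad's HTTP API (for example with `nomad job run -output`, if your version supports it), with services under `TaskGroups[].Tasks[].Services[]`. That layout and the helper name `duplicate_service_names` are assumptions to verify against your Nomad version.

```python
from collections import Counter


def duplicate_service_names(job):
    """List (group, task, service name, count) for names reused within a task."""
    duplicates = []
    for group in job.get("TaskGroups") or []:
        for task in group.get("Tasks") or []:
            names = Counter(svc["Name"] for svc in task.get("Services") or [])
            for name, count in names.items():
                if count > 1:
                    duplicates.append((group["Name"], task["Name"], name, count))
    return duplicates


# Hand-written JSON-style equivalent of the job file above (assumed layout).
job = {
    "Name": "dummy",
    "TaskGroups": [
        {
            "Name": "dummy",
            "Tasks": [
                {
                    "Name": "dummy",
                    "Services": [
                        {"Name": "dummy", "PortLabel": "foo", "Tags": ["foo"]},
                        {"Name": "dummy", "PortLabel": "bar", "Tags": ["bar"]},
                        {"Name": "dummy", "PortLabel": "baz", "Tags": ["baz"]},
                    ],
                }
            ],
        }
    ],
}

for group, task, name, count in duplicate_service_names(job):
    print(f"group={group} task={task}: service {name!r} defined {count} times")
```

Any hit from this check is a job that registered several services in 0.8.6 but would be at risk of collapsing to one in 0.9.2.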

Logs

I have not found any relevant log entries for this issue.

Nomad 0.8.6 Results

All three checks are created:

root@node01:/var/lib/consul/checks# grep dummy /var/lib/consul/checks/*
/var/lib/consul/checks/5f2d537f72b7f814eaef480666ad22ec:{"Check":{"Node":"node01","CheckID":"b535926bb7450ada076eb00d753c6f87c772edb0","Name":"baz check","Status":"critical","Notes":"","Output":"","ServiceID":"_nomad-task-mlixuvo3phniqlz4pykppgd2ci4ktums","ServiceName":"dummy","ServiceTags":["baz"],"Definition":{},"CreateIndex":0,"ModifyIndex":0},"ChkType":{"CheckID":"b535926bb7450ada076eb00d753c6f87c772edb0","Name":"baz check","Status":"","Notes":"","ScriptArgs":null,"HTTP":"","Header":null,"Method":"","TCP":"","Interval":0,"AliasNode":"","AliasService":"","DockerContainerID":"","Shell":"","GRPC":"","GRPCUseTLS":false,"TLSSkipVerify":false,"Timeout":2000000000,"TTL":41000000000,"DeregisterCriticalServiceAfter":0},"Token":""}
/var/lib/consul/checks/a6282c84c6f73dd52fe49778164743ff:{"Check":{"Node":"node01","CheckID":"5860e3aab0aca098d47f53da65f07552e74a8de2","Name":"bar check","Status":"critical","Notes":"","Output":"","ServiceID":"_nomad-task-ntvpzts5fcetv7kdygrr4kkjktizpfmt","ServiceName":"dummy","ServiceTags":["bar"],"Definition":{},"CreateIndex":0,"ModifyIndex":0},"ChkType":{"CheckID":"5860e3aab0aca098d47f53da65f07552e74a8de2","Name":"bar check","Status":"","Notes":"","ScriptArgs":null,"HTTP":"","Header":null,"Method":"","TCP":"","Interval":0,"AliasNode":"","AliasService":"","DockerContainerID":"","Shell":"","GRPC":"","GRPCUseTLS":false,"TLSSkipVerify":false,"Timeout":2000000000,"TTL":41000000000,"DeregisterCriticalServiceAfter":0},"Token":""}
/var/lib/consul/checks/bf70985be258d37c1d911b511648ff54:{"Check":{"Node":"node01","CheckID":"94da0eec43ed59200e59885194babecfc5792e35","Name":"foo check","Status":"critical","Notes":"","Output":"","ServiceID":"_nomad-task-v5qp7gpz4pcfpjanderga7snvgfc63sq","ServiceName":"dummy","ServiceTags":["foo"],"Definition":{},"CreateIndex":0,"ModifyIndex":0},"ChkType":{"CheckID":"94da0eec43ed59200e59885194babecfc5792e35","Name":"foo check","Status":"","Notes":"","ScriptArgs":null,"HTTP":"","Header":null,"Method":"","TCP":"","Interval":0,"AliasNode":"","AliasService":"","DockerContainerID":"","Shell":"","GRPC":"","GRPCUseTLS":false,"TLSSkipVerify":false,"Timeout":2000000000,"TTL":41000000000,"DeregisterCriticalServiceAfter":0},"Token":""}
grep: /var/lib/consul/checks/state: Is a directory

All checks are registered in Consul with the appropriate tags:

$ curl -s $CONSUL_HTTP_ADDR/v1/health/checks/dummy | jq .
[
  {
    "Node": "node01",
    "CheckID": "5860e3aab0aca098d47f53da65f07552e74a8de2",
    "Name": "bar check",
    "Status": "passing",
    "Notes": "",
    "Output": "",
    "ServiceID": "_nomad-task-ntvpzts5fcetv7kdygrr4kkjktizpfmt",
    "ServiceName": "dummy",
    "ServiceTags": [
      "bar"
    ],
    "Definition": {},
    "CreateIndex": 307197905,
    "ModifyIndex": 307197910
  },
  {
    "Node": "node01",
    "CheckID": "94da0eec43ed59200e59885194babecfc5792e35",
    "Name": "foo check",
    "Status": "passing",
    "Notes": "",
    "Output": "",
    "ServiceID": "_nomad-task-v5qp7gpz4pcfpjanderga7snvgfc63sq",
    "ServiceName": "dummy",
    "ServiceTags": [
      "foo"
    ],
    "Definition": {},
    "CreateIndex": 307197907,
    "ModifyIndex": 307197967
  },
  {
    "Node": "node01",
    "CheckID": "b535926bb7450ada076eb00d753c6f87c772edb0",
    "Name": "baz check",
    "Status": "passing",
    "Notes": "",
    "Output": "",
    "ServiceID": "_nomad-task-mlixuvo3phniqlz4pykppgd2ci4ktums",
    "ServiceName": "dummy",
    "ServiceTags": [
      "baz"
    ],
    "Definition": {},
    "CreateIndex": 307197906,
    "ModifyIndex": 307198002
  }
]

And there is a unique ServiceID for each tag:

[
  {
    "ID": "c2a7be61-3243-f0aa-00bd-3e412a0cc238",
    "Node": "node01",
    "Address": "10.0.100.86",
    "Datacenter": "us-east-1",
    "TaggedAddresses": {
      "lan": "10.0.100.86",
      "wan": "10.0.100.86"
    },
    "NodeMeta": {
      "consul-network-segment": ""
    },
    "ServiceKind": "",
    "ServiceID": "_nomad-task-mlixuvo3phniqlz4pykppgd2ci4ktums",
    "ServiceName": "dummy",
    "ServiceTags": [
      "baz"
    ],
    "ServiceAddress": "10.0.100.86",
    "ServiceWeights": {
      "Passing": 1,
      "Warning": 1
    },
    "ServiceMeta": {},
    "ServicePort": 31921,
    "ServiceEnableTagOverride": false,
    "ServiceProxyDestination": "",
    "ServiceProxy": {},
    "ServiceConnect": {},
    "CreateIndex": 307197904,
    "ModifyIndex": 307197904
  },
  {
    "ID": "c2a7be61-3243-f0aa-00bd-3e412a0cc238",
    "Node": "node01",
    "Address": "10.0.100.86",
    "Datacenter": "us-east-1",
    "TaggedAddresses": {
      "lan": "10.0.100.86",
      "wan": "10.0.100.86"
    },
    "NodeMeta": {
      "consul-network-segment": ""
    },
    "ServiceKind": "",
    "ServiceID": "_nomad-task-ntvpzts5fcetv7kdygrr4kkjktizpfmt",
    "ServiceName": "dummy",
    "ServiceTags": [
      "bar"
    ],
    "ServiceAddress": "10.0.100.86",
    "ServiceWeights": {
      "Passing": 1,
      "Warning": 1
    },
    "ServiceMeta": {},
    "ServicePort": 24662,
    "ServiceEnableTagOverride": false,
    "ServiceProxyDestination": "",
    "ServiceProxy": {},
    "ServiceConnect": {},
    "CreateIndex": 307197902,
    "ModifyIndex": 307197902
  },
  {
    "ID": "c2a7be61-3243-f0aa-00bd-3e412a0cc238",
    "Node": "node01",
    "Address": "10.0.100.86",
    "Datacenter": "us-east-1",
    "TaggedAddresses": {
      "lan": "10.0.100.86",
      "wan": "10.0.100.86"
    },
    "NodeMeta": {
      "consul-network-segment": ""
    },
    "ServiceKind": "",
    "ServiceID": "_nomad-task-v5qp7gpz4pcfpjanderga7snvgfc63sq",
    "ServiceName": "dummy",
    "ServiceTags": [
      "foo"
    ],
    "ServiceAddress": "10.0.100.86",
    "ServiceWeights": {
      "Passing": 1,
      "Warning": 1
    },
    "ServiceMeta": {},
    "ServicePort": 24126,
    "ServiceEnableTagOverride": false,
    "ServiceProxyDestination": "",
    "ServiceProxy": {},
    "ServiceConnect": {},
    "CreateIndex": 307197903,
    "ModifyIndex": 307197903
  }
]

Nomad 0.9.2 Results

All three checks are created, but they all have the same tag:

root@node02:~# grep dummy /var/lib/consul/checks/*
/var/lib/consul/checks/21cff8b711bdbae5bdf980f5367f0808:{"Check":{"Node":"node02","CheckID":"_nomad-check-56a784fa95cf7e9230a6db208374cd1f3afeffb2","Name":"bar check","Status":"critical","Notes":"","Output":"","ServiceID":"_nomad-task-a04567a0-b3b3-6210-ed69-44473e502d54-dummy-dummy","ServiceName":"dummy","ServiceTags":["baz"],"Definition":{},"CreateIndex":0,"ModifyIndex":0},"ChkType":{"CheckID":"_nomad-check-56a784fa95cf7e9230a6db208374cd1f3afeffb2","Name":"bar check","Status":"","Notes":"","ScriptArgs":null,"HTTP":"","Header":null,"Method":"","TCP":"","Interval":0,"AliasNode":"","AliasService":"","DockerContainerID":"","Shell":"","GRPC":"","GRPCUseTLS":false,"TLSSkipVerify":false,"Timeout":2000000000,"TTL":41000000000,"DeregisterCriticalServiceAfter":0},"Token":""}
/var/lib/consul/checks/3fab462753753eb415cfa5427321f30b:{"Check":{"Node":"node02","CheckID":"_nomad-check-7f6999dbb0aae17e9badb5729196fedcc72d6df5","Name":"baz check","Status":"critical","Notes":"","Output":"","ServiceID":"_nomad-task-a04567a0-b3b3-6210-ed69-44473e502d54-dummy-dummy","ServiceName":"dummy","ServiceTags":["baz"],"Definition":{},"CreateIndex":0,"ModifyIndex":0},"ChkType":{"CheckID":"_nomad-check-7f6999dbb0aae17e9badb5729196fedcc72d6df5","Name":"baz check","Status":"","Notes":"","ScriptArgs":null,"HTTP":"","Header":null,"Method":"","TCP":"","Interval":0,"AliasNode":"","AliasService":"","DockerContainerID":"","Shell":"","GRPC":"","GRPCUseTLS":false,"TLSSkipVerify":false,"Timeout":2000000000,"TTL":41000000000,"DeregisterCriticalServiceAfter":0},"Token":""}
/var/lib/consul/checks/928007e77b53302eb62c6d2f7e01bca4:{"Check":{"Node":"node02","CheckID":"_nomad-check-027a9bc8edc6511702dddebc7cdabaa25cf06c63","Name":"foo check","Status":"critical","Notes":"","Output":"","ServiceID":"_nomad-task-a04567a0-b3b3-6210-ed69-44473e502d54-dummy-dummy","ServiceName":"dummy","ServiceTags":["baz"],"Definition":{},"CreateIndex":0,"ModifyIndex":0},"ChkType":{"CheckID":"_nomad-check-027a9bc8edc6511702dddebc7cdabaa25cf06c63","Name":"foo check","Status":"","Notes":"","ScriptArgs":null,"HTTP":"","Header":null,"Method":"","TCP":"","Interval":0,"AliasNode":"","AliasService":"","DockerContainerID":"","Shell":"","GRPC":"","GRPCUseTLS":false,"TLSSkipVerify":false,"Timeout":2000000000,"TTL":41000000000,"DeregisterCriticalServiceAfter":0},"Token":""}
grep: /var/lib/consul/checks/state: Is a directory
$ curl -s $CONSUL_HTTP_ADDR/v1/health/checks/dummy | jq .
[
  {
    "Node": "node02",
    "CheckID": "_nomad-check-027a9bc8edc6511702dddebc7cdabaa25cf06c63",
    "Name": "foo check",
    "Status": "passing",
    "Notes": "",
    "Output": "",
    "ServiceID": "_nomad-task-a04567a0-b3b3-6210-ed69-44473e502d54-dummy-dummy",
    "ServiceName": "dummy",
    "ServiceTags": [
      "baz"
    ],
    "Definition": {},
    "CreateIndex": 307201226,
    "ModifyIndex": 307201373
  },
  {
    "Node": "node02",
    "CheckID": "_nomad-check-56a784fa95cf7e9230a6db208374cd1f3afeffb2",
    "Name": "bar check",
    "Status": "critical",
    "Notes": "",
    "Output": "context deadline exceeded",
    "ServiceID": "_nomad-task-a04567a0-b3b3-6210-ed69-44473e502d54-dummy-dummy",
    "ServiceName": "dummy",
    "ServiceTags": [
      "baz"
    ],
    "Definition": {},
    "CreateIndex": 307201227,
    "ModifyIndex": 307201430
  },
  {
    "Node": "node02",
    "CheckID": "_nomad-check-7f6999dbb0aae17e9badb5729196fedcc72d6df5",
    "Name": "baz check",
    "Status": "critical",
    "Notes": "",
    "Output": "context deadline exceeded",
    "ServiceID": "_nomad-task-a04567a0-b3b3-6210-ed69-44473e502d54-dummy-dummy",
    "ServiceName": "dummy",
    "ServiceTags": [
      "baz"
    ],
    "Definition": {},
    "CreateIndex": 307201228,
    "ModifyIndex": 307201330
  }
]

Which leads to the following Consul service registration:

$ curl -s $CONSUL_HTTP_ADDR/v1/catalog/service/dummy | jq .
[
  {
    "ID": "6d64c443-ba1e-457f-2dfc-c5727f783d4d",
    "Node": "node02",
    "Address": "10.0.100.90",
    "Datacenter": "us-east-1",
    "TaggedAddresses": {
      "lan": "10.0.100.90",
      "wan": "10.0.100.90"
    },
    "NodeMeta": {
      "consul-network-segment": ""
    },
    "ServiceKind": "",
    "ServiceID": "_nomad-task-a04567a0-b3b3-6210-ed69-44473e502d54-dummy-dummy",
    "ServiceName": "dummy",
    "ServiceTags": [
      "baz"
    ],
    "ServiceAddress": "10.0.100.90",
    "ServiceWeights": {
      "Passing": 1,
      "Warning": 1
    },
    "ServiceMeta": {
      "external-source": "nomad"
    },
    "ServicePort": 21656,
    "ServiceEnableTagOverride": false,
    "ServiceProxyDestination": "",
    "ServiceProxy": {},
    "ServiceConnect": {},
    "CreateIndex": 307201225,
    "ModifyIndex": 307201225
  }
]
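
One possible reading of these results, sketched as a toy model rather than Nomad's actual code: in 0.8.6 each service block produced its own opaque ServiceID, while the 0.9.2 ID looks like the allocation ID followed by two name components, with no tag or port in it. If that inference holds, service blocks sharing a name map to the same ID and the last registration wins. `model_service_id` and `register_all` are hypothetical names, and treating the two components as task name and service name is an assumption, since everything in this job is called `dummy`.

```python
def model_service_id(alloc_id, task_name, service_name):
    """Hypothetical reconstruction of the ID shape seen in the 0.9.2 output."""
    return f"_nomad-task-{alloc_id}-{task_name}-{service_name}"


def register_all(alloc_id, task_name, services):
    """Register services keyed by ID; a repeated ID overwrites the earlier entry."""
    registry = {}
    for svc in services:
        service_id = model_service_id(alloc_id, task_name, svc["name"])
        registry[service_id] = {"tags": svc["tags"], "port": svc["port"]}
    return registry


services = [
    {"name": "dummy", "port": "foo", "tags": ["foo"]},
    {"name": "dummy", "port": "bar", "tags": ["bar"]},
    {"name": "dummy", "port": "baz", "tags": ["baz"]},
]

registry = register_all("a04567a0-b3b3-6210-ed69-44473e502d54", "dummy", services)
for service_id, svc in registry.items():
    print(service_id, svc["tags"])
```

Under this model only one entry survives, tagged `baz`, which matches the single catalog entry above. Giving each service block a distinct name would avoid the collision in the model, though that changes how consumers look the service up.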
@alexdulin

This sounds like the same issue as #5819.

@hobochili
Contributor Author

Whoops, yeah. Same issue. I searched for it yesterday and didn't find anything, but neglected to refresh my search today before filing. Closing in favor of #5819.

@github-actions

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 21, 2022