
bug: Multiple services with common tag not being registered with Consul #5819

Closed
alexdulin opened this issue Jun 12, 2019 · 9 comments · Fixed by #5829


@alexdulin

Nomad version

Nomad: Nomad v0.9.2 (028326684b9da489e0371247a223ef3ae4755d87)
Consul: Consul v1.5.1

Operating system and Environment details

Ubuntu 16.04

Issue

When submitting a job that has multiple service stanzas using the same service name and a common tag, only the last service stanza is registered; none of the other services sharing that name and tag are registered.

In the example below (based on nomad job init), only the last service, with tags global and bar, will be registered, and none of the others. Applying the same job to Nomad 0.9.1 results in all three services being registered in the Consul catalog with the same name and all corresponding tags.

Reproduction steps

  1. Fire up a Nomad agent running version 0.9.2 and a Consul agent running 1.5.1 (I have not tested older Consul agents, but since the issue does not occur on Nomad 0.9.1, I do not presume the Consul version matters)
  2. Submit the job below
  3. Observe that only one of the three services below is registered

Job file (if appropriate)

job "example" {
  datacenters = ["dc1"]
  type = "service"


  group "cache" {
    count = 1

    task "redis" {
      driver = "docker"

      config {
        image = "redis:3.2"
        port_map {
          db = 6379
          foo = 6380
          bar = 6381
        }
      }


      resources {
        cpu    = 500
        memory = 256
        network {
          mbits = 10
          port "db" {}
          port "foo" {}
          port "bar" {}
        }
      }

      service {
        name = "redis-cache"
        tags = ["global", "cache"]
        port = "db"
        check {
          name     = "alive"
          type     = "tcp"
          interval = "10s"
          timeout  = "2s"
        }
      }

      service {
        name = "redis-cache"
        tags = ["global", "foo"]
        port = "foo"

        check {
          name     = "alive"
          type     = "tcp"
          port     = "db"
          interval = "10s"
          timeout  = "2s"
        }
      }

      service {
        name = "redis-cache"
        tags = ["global", "bar"]
        port = "bar"

        check {
          name     = "alive"
          type     = "tcp"
          port     = "db"
          interval = "10s"
          timeout  = "2s"
        }
      }
    }
  }
}
@alexdulin
Author

alexdulin commented Jun 12, 2019

After testing this some more, it seems that even if you remove the common tag global, only the last service stanza's tags are registered on the service. I could be wrong, but my best guess is that the changes from this PR no longer take the tags into account when creating the service name.

@scalp42
Contributor

scalp42 commented Jun 12, 2019

We can see it too 💥 @alexdulin

@notnoop
Contributor

notnoop commented Jun 12, 2019

Thanks for reporting this bug. @alexdulin, your assessment is correct in that, with #5536, we expect task services to have unique names. We are evaluating a proper fix, but we'd recommend using different service names if you upgrade to 0.9.2 or 0.9.3.
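
For anyone blocked on the upgrade in the meantime, here is a minimal sketch of that rename workaround applied to the reproduction job above (the specific names are illustrative, not a confirmed fix):

# Workaround sketch for 0.9.2/0.9.3: give each service stanza a unique name
# instead of reusing "redis-cache", keeping the tags for discovery.
service {
  name = "redis-cache-db"
  tags = ["global", "cache"]
  port = "db"
}

service {
  name = "redis-cache-foo"
  tags = ["global", "foo"]
  port = "foo"
}

service {
  name = "redis-cache-bar"
  tags = ["global", "bar"]
  port = "bar"
}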

For some context on our side, would you mind elaborating on your use case? What advantage do you get from reusing the service name across multiple services?

@alexdulin
Author

@notnoop The use case for us is with tasks that listen on multiple ports for different purposes. Some of our real examples include Elasticsearch, which listens on separate ports for HTTP requests and TCP cluster-level communication, or Logstash, which uses multiple ports for various inputs. We distinguish between these using tags, such as http.elasticsearch.service.consul or tcp.elasticsearch.service.consul.

This is the same way that Nomad servers register themselves as rpc.nomad.service.consul and http.nomad.service.consul, and how Vault uses active.vault.service.consul and standby.vault.service.consul to distinguish between leader and follower nodes.

This feature is a very critical element for our usage of Nomad, and I have a feeling it is for others as well. Without being able to register tasks using a common service name, much of the value of Nomad's integration with Consul is lost.
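
As a concrete sketch of that pattern (the port labels here are illustrative, not our exact config), two service stanzas share one name but differ by tag and port:

service {
  name = "elasticsearch"
  tags = ["http"]
  port = "http"       # resolvable via Consul DNS as http.elasticsearch.service.consul
}

service {
  name = "elasticsearch"
  tags = ["tcp"]
  port = "transport"  # resolvable via Consul DNS as tcp.elasticsearch.service.consul
}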

@ivantopo

Adding my 2 cents of context as another user here: we have a very similar situation to the one @alexdulin described: a single service exposes more than one port for different purposes (in our case, an Akka application that requires one port for the HTTP API and another port for cluster management). This was working just fine with 0.8.4, and today we got bitten by this while upgrading to 0.9.2.

If you folks need some extra info or help testing a patch, I would be happy to participate.

@scalp42
Contributor

scalp42 commented Jun 12, 2019

Adding my 2 cents here: the same applies to Sensu, for example, with its API port and Redis port.

@notnoop
Contributor

notnoop commented Jun 12, 2019

Thanks for the useful feedback. We'll aim to fix this in the next patch release.

@Neha-Maurya95

Hey,
I have an issue with a service registered in Consul.
I am creating two instances of the same job, and they are being scheduled on two different client nodes, because I want to run them in high availability.
I have added a single service in the Nomad job configuration file. The job is deployed successfully and the service is also created.
But the service is conflicting: it is getting registered and deregistered frequently, sometimes pointing to client node 1 and sometimes to client node 2.
I don't understand why this is happening.
[Note: if I stop one node and then run the job, it works fine.]

Please suggest how I can resolve this issue.

@tgross
Member

tgross commented Jun 10, 2021

@Neha-Maurya95 this issue has been closed for a couple years now. Please open a new issue or post on Discuss if you need help.

@hashicorp locked as resolved and limited conversation to collaborators Jun 10, 2021