consul/connect: add initial support for ingress gateways #8709

Merged (8 commits) on Aug 26, 2020

@shoenig shoenig commented Aug 21, 2020

This PR adds initial support for running Consul Connect Ingress Gateways (CIGs) in Nomad. These gateways are declared as part of a task group level service definition within the connect stanza.

service {
  connect {
    gateway {
      proxy {
        // envoy proxy configuration
      }
      ingress {
        // ingress-gateway configuration entry
      }
    }
  }
}

A gateway can be run in bridge or host networking mode, with the caveat that host networking necessitates manually specifying the Envoy admin listener (which cannot be disabled) via the service port value.

Currently Envoy is the only supported gateway implementation in Consul, and Nomad only supports running Envoy as a gateway using the docker driver.

When the gateway.ingress field is set, Nomad writes or updates the corresponding Configuration Entry in Consul on job submission. Because Configuration Entries are global in scope within Consul, and more than one Nomad cluster may be communicating with the same Consul, Nomad assumes that any ingress gateway defined for a particular service is the same across Nomad clusters. Consul may provide a mechanism for more fine-grained control over Configuration Entries in the future.

Gateways require Consul 1.8.0+, checked by an additional constraint.

Aims to address #8294 and tangentially #8647

shoenig commented Aug 21, 2020

Still TODO

@shoenig shoenig force-pushed the f-cc-ingress branch 2 times, most recently from 599f064 to 173fedf Compare August 24, 2020 13:43
@shoenig shoenig marked this pull request as ready for review August 24, 2020 14:42
@shoenig shoenig mentioned this pull request Aug 24, 2020

@schmichael schmichael left a comment

Will finish up tomorrow

Comment on lines +194 to +197
!> **Warning:** There is no way to disable the Envoy admin interface, which will be
accessible to any workload running on the same Nomad client. The admin interface exposes
information about the proxy, including a Consul Service Identity token if Consul ACLs
are enabled.

If we can use the "pick a random port and write it to a file" trick then we can mention that filename here in a note instead of this big warning.

Comment on lines +274 to +279
for _, service := range tg.Services {
	if service.Name == serviceName {
		_, port = tg.Networks.Port(service.PortLabel)
		break
	}
}

Since this is bound on localhost we don't have to reserve a port and instead could bind to 127.0.0.1:0, let the OS pick a free port, and have Envoy write that to a file with -admin-address-path: https://www.envoyproxy.io/docs/envoy/latest/operations/cli#cmdoption-admin-address-path

client/client.go (resolved)
return nil
}

var bindAddresses map[string]*structs.ConsulGatewayBindAddress = nil

Suggested change
- var bindAddresses map[string]*structs.ConsulGatewayBindAddress = nil
+ var bindAddresses map[string]*structs.ConsulGatewayBindAddress
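
The suggestion works because the zero value of a Go map type is already `nil`, so the explicit `= nil` is redundant. A small sketch, using a hypothetical `bindAddress` type as a stand-in for `structs.ConsulGatewayBindAddress`:

```go
package main

import "fmt"

// bindAddress is a hypothetical stand-in for structs.ConsulGatewayBindAddress.
type bindAddress struct {
	Address string
	Port    int
}

// declared relies on the zero value of the map type, which is nil.
func declared() map[string]*bindAddress {
	var m map[string]*bindAddress
	return m
}

// explicit spells out the nil initializer; the result is identical.
func explicit() map[string]*bindAddress {
	var m map[string]*bindAddress = nil
	return m
}

func main() {
	a, b := declared(), explicit()
	// Reading from and taking len of a nil map is safe; only writes panic.
	fmt.Println(a == nil, b == nil, len(a), a["missing"] == nil)
	// prints: true true 0 true
}
```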

c.lock.Unlock()

if stopped {
return errors.New("client stopped and may not longer create config entries")

@schmichael schmichael Aug 24, 2020

Is there a risk of this getting emitted at shutdown?

I can try to determine that but need to stop for the day.

Update: Ah, it will get emitted to the remote caller which is useful! I was worried it would spew error messages to the agent's log at shutdown which gets scary and confusing for operators.

@nickethier nickethier left a comment

I think the implementation here looks solid!

@@ -288,9 +388,16 @@ func (e envoyBootstrapArgs) args() []string {
 	"envoy",
 	"-grpc-addr", e.grpcAddr,
 	"-http-addr", e.consulConfig.HTTPAddr,
-	"-admin-bind", e.envoyAdminBind,
+	"-admin-bind", e.envoyAdminBind, // bleh

Suggested change
- "-admin-bind", e.envoyAdminBind, // bleh
+ "-admin-bind", e.envoyAdminBind,

client/client.go (resolved)
//
// Every job update will re-write the Configuration Entry into Consul.
for service, entry := range args.Job.ConfigEntries() {
ctx := context.Background()

Should this be a WithTimeout? I guess the consul client may already have a built in timeout, just thinking about blocking job submission forever if Consul is not responding.

Yeah, even if it's high (30 seconds? 1 minute?), dying with a 500 seems preferable to hanging until the user presses Ctrl-C and wonders what state things are in.

var consulACLsAPI mockConsulACLsAPI
s1.consulACLs = &consulACLsAPI

// replace consul Config API with a mock for tracking calls in tests
// var consulConfigsAPI mock

might be able to remove this

@@ -981,6 +1023,15 @@ func (p *ConsulProxy) Copy() *ConsulProxy {
return newP
}

// opaqueMapsEqual compares map[string]interface{} commonly used for opaque
//// config blocks. Interprets nil and {} as the same.

Suggested change
- //// config blocks. Interprets nil and {} as the same.
+ // config blocks. Interprets nil and {} as the same.
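
The function body is not shown in this thread. One plausible way to get the behaviour the comment describes (nil and `{}` compare equal) is sketched below; it is an assumption, not a copy of the PR's code.

```go
package main

import (
	"fmt"
	"reflect"
)

// opaqueMapsEqual compares two opaque config maps, treating a nil map and an
// empty map as equal. Sketch only: the body is assumed from the doc comment.
func opaqueMapsEqual(a, b map[string]interface{}) bool {
	// len of a nil map is 0, so this covers nil vs {} in either order.
	if len(a) == 0 && len(b) == 0 {
		return true
	}
	// Otherwise fall back to a deep comparison of keys and nested values.
	return reflect.DeepEqual(a, b)
}

func main() {
	empty := map[string]interface{}{}
	cfg := map[string]interface{}{"connect_timeout": "500ms"}

	fmt.Println(opaqueMapsEqual(nil, empty))                        // true
	fmt.Println(opaqueMapsEqual(cfg, map[string]interface{}{"connect_timeout": "500ms"})) // true
	fmt.Println(opaqueMapsEqual(nil, cfg))                          // false
}
```

The length check is needed because `reflect.DeepEqual` on its own reports a nil map and an empty map as different.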

@shoenig shoenig merged commit 48ae1de into master Aug 26, 2020
@shoenig shoenig deleted the f-cc-ingress branch August 26, 2020 20:50
mikemorris added 4 commits to hashicorp/consul that referenced this pull request Sep 10, 2020

waszi commented Sep 23, 2020

I tried to set up an ingress gateway with a wildcard service, but I get an error:

job "ingress" {
  datacenters = ["dc1"]

  constraint {
    attribute = "${attr.unique.hostname}"
    operator = "regexp"
    value = "^lb0[0-9]$"
  }

  group "ingress-group" {
    network {
      mode = "bridge"
      port "inbound" {
        static = 8080
      }
    }

    service {
      name = "ingress-service"
      port = "8080"

      connect {
        gateway {
          proxy {
            connect_timeout = "500ms"
          }
          ingress {
            tls {
              enabled = false
            }
            listener {
              port = 8080
              protocol = "http"
              service {
                name = "*"
              }
            }
          }
        }
      }
    }
  }
}
Error submitting job: Unexpected response code: 500 (rpc error: rpc error: 1 error occurred:
	* Task group ingress-group validation failed: 1 error occurred:
	* Task group service validation failed: 1 error occurred:
	* Service[0] ingress-service validation failed: 1 error occurred:
	* Consul Ingress Service requires one or more hosts when using HTTP protocol

My goal is a configuration similar to traefik/fabio. Either:

  1. "Export" everything using a wildcard: SERVICE.mydomain.com -> SERVICE
    or
  2. Configure each service individually, with the configuration kept separately in each service's own job.
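
The error quoted above suggests a validation rule along the following lines. This is a sketch inferred from the error message alone: the `ingressService` type and `validateIngressService` function are hypothetical, and only the error string is taken from the output above.

```go
package main

import (
	"errors"
	"fmt"
)

// ingressService is a hypothetical, simplified model of one service entry
// under an ingress listener.
type ingressService struct {
	Name  string
	Hosts []string
}

// validateIngressService (hypothetical) rejects a service on an HTTP
// listener that lists no hosts, mirroring the error reported above.
func validateIngressService(s ingressService, protocol string) error {
	if protocol == "http" && len(s.Hosts) == 0 {
		return errors.New("Consul Ingress Service requires one or more hosts when using HTTP protocol")
	}
	return nil
}

func main() {
	// The wildcard service from the job above: HTTP listener, no hosts.
	fmt.Println(validateIngressService(ingressService{Name: "*"}, "http"))

	// The same check does not fire for a TCP listener.
	fmt.Println(validateIngressService(ingressService{Name: "web"}, "tcp"))
}
```

Under this assumed rule, the job above fails because its `service` block under the HTTP listener sets `name = "*"` without any hosts.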

shoenig commented Nov 25, 2020

Hi @waszi, sorry, I just noticed this. Were you able to resolve the problem? If not, could you open a new issue to report it?

github-actions bot commented Dec 8, 2022

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 8, 2022