-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
consul/connect: add initial support for ingress gateways #8709
Conversation
This PR adds initial support for running Consul Connect Ingress Gateways (CIGs) in Nomad. These gateways are declared as part of a task group level service definition within the connect stanza. ```hcl service { connect { gateway { proxy { // envoy proxy configuration } ingress { // ingress-gateway configuration entry } } } } ``` A gateway can be run in `bridge` or `host` networking mode, with the caveat that host networking necessitates manually specifying the Envoy admin listener (which cannot be disabled) via the service port value. Currently Envoy is the only supported gateway implementation in Consul, and Nomad only supports running Envoy as a gateway using the docker driver. Aims to address #8294 and tangentially #8647
Still TODO
|
599f064
to
173fedf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will finish up tomorrow
!> **Warning:** There is no way to disable the Envoy admin interface, which will be | ||
accessible to any workload running on the same Nomad client. The admin interface exposes | ||
information about the proxy, including a Consul Service Identity token if Consul ACLs | ||
are enabled. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we can use the "pick a random port and write it to a file" trick then we can mention that filename here in a note instead of this big warning.
for _, service := range tg.Services { | ||
if service.Name == serviceName { | ||
_, port = tg.Networks.Port(service.PortLabel) | ||
break | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is bound on localhost
we don't have to reserve a port and instead could bind to 127.0.0.1:0
, let the OS pick a free port, and have Envoy write that to a file with -admin-address-path
: https://www.envoyproxy.io/docs/envoy/latest/operations/cli#cmdoption-admin-address-path
command/agent/job_endpoint.go
Outdated
return nil | ||
} | ||
|
||
var bindAddresses map[string]*structs.ConsulGatewayBindAddress = nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
var bindAddresses map[string]*structs.ConsulGatewayBindAddress = nil | |
var bindAddresses map[string]*structs.ConsulGatewayBindAddress |
c.lock.Unlock() | ||
|
||
if stopped { | ||
return errors.New("client stopped and may not longer create config entries") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a risk of this getting emitted at shutdown?
I can try to determine that but need to stop for the day.
Update: Ah, it will get emitted to the remote caller which is useful! I was worried it would spew error messages to the agent's log at shutdown which gets scary and confusing for operators.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the implementation here looks solid!
@@ -288,9 +388,16 @@ func (e envoyBootstrapArgs) args() []string { | |||
"envoy", | |||
"-grpc-addr", e.grpcAddr, | |||
"-http-addr", e.consulConfig.HTTPAddr, | |||
"-admin-bind", e.envoyAdminBind, | |||
"-admin-bind", e.envoyAdminBind, // bleh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"-admin-bind", e.envoyAdminBind, // bleh | |
"-admin-bind", e.envoyAdminBind, |
nomad/job_endpoint.go
Outdated
// | ||
// Every job update will re-write the Configuration Entry into Consul. | ||
for service, entry := range args.Job.ConfigEntries() { | ||
ctx := context.Background() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be a WithTimeout? I guess the consul client may already have a built in timeout, just thinking about blocking job submission forever if Consul is not responding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, even if it's high (30 seconds? 1 minute?), dying with a 500 seems preferable to hanging until the user presses Ctrl-C and wonders what state things are in.
nomad/leader_test.go
Outdated
var consulACLsAPI mockConsulACLsAPI | ||
s1.consulACLs = &consulACLsAPI | ||
|
||
// replace consul Config API with a mock for tracking calls in tests | ||
// var consulConfigsAPI mock |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might be able to remove this
nomad/structs/services.go
Outdated
@@ -981,6 +1023,15 @@ func (p *ConsulProxy) Copy() *ConsulProxy { | |||
return newP | |||
} | |||
|
|||
// opaqueMapsEqual compares map[string]interface{} commonly used for opaque | |||
//// config blocks. Interprets nil and {} as the same. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
//// config blocks. Interprets nil and {} as the same. | |
// config blocks. Interprets nil and {} as the same. |
Required by usage of t.Cleanup introduced in hashicorp/nomad#8709
Required by usage of t.Cleanup introduced in hashicorp/nomad#8709
Required by usage of t.Cleanup introduced in hashicorp/nomad#8709
Required by usage of t.Cleanup introduced in hashicorp/nomad#8709
I have tried to setup ingress gateway with wildcard, but I get error:
My goal is to have similar configuration to traefik/fabio.
|
Hi @waszi, sorry I just noticed this. We're you able to resolve the problem? Or if not, could you open a new issue to report it? |
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions. |
This PR adds initial support for running Consul Connect Ingress Gateways (CIGs) in Nomad. These gateways are declared as part of a task group level service definition within the connect stanza.
A gateway can be run in
bridge
orhost
networking mode, with the caveat that host networking necessitates manually specifying the Envoy admin listener (which cannot be disabled) via the service port value.Currently Envoy is the only supported gateway implementation in Consul, and Nomad only supports running Envoy as a gateway using the docker driver.
When the
gateway.ingress
field is set, Nomad will write/update the Configuration Entry into Consul on job submission. Because Configurations are global in Consul scope and there may be more than one Nomad cluster communicating with Consul, there is an assumption that any ingress gateway defined in Nomad for a particular service will be the same among Nomad clusters. Consul may provide a mechanism for more fine-grained control over Configuration Entries in the future.Gateways require Consul 1.8.0+, checked by an additional constraint.
Aims to address #8294 and tangentially #8647