Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow binding to multiple addresses. #13954

Closed
wants to merge 7 commits into from

Conversation

rmuir
Copy link
Contributor

@rmuir rmuir commented Oct 6, 2015

For consistency and simplicity this means that es.network.host/bind_host/publish_host can all take arrays.

Yes, we are still limited to a single publish host, and only the "best" address from publish_host will be used, I do not change that, but its already the case today (even by default!: _local_), and its just an internal limitation we can later fix. Instead we just continue with the same logic of "magic selection of publish host", and users can always be explicit, and we are just clear to the user about what is happening.

I did it this way, because network stuff is already overwhelmed, and i mean really overwhelmed with settings. We need to make this easy on the user to do the right thing and not make that worse.

Examples:

bin/elasticsearch # defaults

[elasticsearch] [2015-10-06 01:15:28,671][WARN ][common.network           ] [Todd Arliss] publish host: [_local_] resolves to multiple addresses, auto-selecting {127.0.0.1} as single publish address
[elasticsearch] [2015-10-06 01:15:28,672][INFO ][transport                ] [Todd Arliss] publish_address {127.0.0.1:9300}, bound_addresses {[fe80::1]:9300}, {[::1]:9300}, {127.0.0.1:9300}

bin/elasticsearch -Des.network.host=_lo0_,_en0_ # multiple interfaces

[2015-10-06 01:17:08,970][WARN ][common.network           ] [Torso] publish host: [_lo0_, _en0_] resolves to multiple addresses, auto-selecting {192.168.0.19} as single publish address
[2015-10-06 01:17:08,972][INFO ][transport                ] [Torso] publish_address {192.168.0.19:9300}, bound_addresses {[fe80::1]:9300}, {[::1]:9300}, {127.0.0.1:9300}, {[fe80::3e15:c2ff:fee5:d26c]:9300}, {192.168.0.19:9300}

bin/elasticsearch -Des.network.host=192.168.0.19,_local_ -Des.network.publish_host=192.168.0.19 # being explicit

[2015-10-06 01:19:28,603][INFO ][transport                ] [Thunderbird] publish_address {192.168.0.19:9300}, bound_addresses {[fe80::1]:9300}, {[::1]:9300}, {127.0.0.1:9300}, {192.168.0.19:9300}

Closes #13592

For consistency and simplicity this means that es.network.host/bind_host/publish_host can all take arrays.

Yes, we are still limited to a single publish host, and only the "best" address from publish_host will be used,
I do not change that, but its alread the case today (even by default: _local_ !), and its just an internal
limitation we can later fix. Instead we just continue with the same logic of "magic selection of publish host",
and users can always be explicit, and weare just clear to the user about what is happening.

I did it this way, because network stuff is already overwhelmed, and i mean really overwhelmed with settings.
We need to make this easy on the user to do the right thing and not make that worse.

Examples:

bin/elasticsearch # defaults
```
[elasticsearch] [2015-10-06 01:15:28,671][WARN ][common.network           ] [Todd Arliss] publish host: [_local_] resolves to multiple addresses, auto-selecting {127.0.0.1} as single publish address
[elasticsearch] [2015-10-06 01:15:28,672][INFO ][transport                ] [Todd Arliss] publish_address {127.0.0.1:9300}, bound_addresses {[fe80::1]:9300}, {[::1]:9300}, {127.0.0.1:9300}
```

bin/elasticsearch -Des.network.host=_lo0_,_en0_ # multiple interfaces
```
[2015-10-06 01:17:08,970][WARN ][common.network           ] [Torso] publish host: [_lo0_, _en0_] resolves to multiple addresses, auto-selecting {192.168.0.19} as single publish address
[2015-10-06 01:17:08,972][INFO ][transport                ] [Torso] publish_address {192.168.0.19:9300}, bound_addresses {[fe80::1]:9300}, {[::1]:9300}, {127.0.0.1:9300}, {[fe80::3e15:c2ff:fee5:d26c]:9300}, {192.168.0.19:9300}
```

bin/elasticsearch -Des.network.host=192.168.0.19,_local_ -Des.network.publish_host=192.168.0.19 # being explicit
```
[2015-10-06 01:19:28,603][INFO ][transport                ] [Thunderbird] publish_address {192.168.0.19:9300}, bound_addresses {[fe80::1]:9300}, {[::1]:9300}, {127.0.0.1:9300}, {192.168.0.19:9300}
```

Closes elastic#13592
* Deprecate _non_loopback_. This is far too arbitrary
* Fix _local_ to mean loopback addresses, not addresses on loopback interfaces. This removes the confusion
  of e.g. binding to link-local addresses on the mac for example.
* Remove internal use of _non_loopback_, instead just "expand wildcard addresses"
* Add _site_ and _global_ scopes as counterparts to _local_.

This means users can configure networking by address(es), interface(es), or scope (local, private, public),
or any combination thereof.
@rmuir
Copy link
Contributor Author

rmuir commented Oct 6, 2015

I pushed another commit that fixes the _local_ to behave more sanely, instead of "all the addresses of loopback interfaces" it means "all the loopback addresses". This is an important difference, e.g. it means we no longer confusingly bind to link-local addresses like fe80::1%lo0 on the mac.

I added _site_ and _global_ to match _local_. This means users can configure network binding by reachability scope(s), interface(s), address(es), or any combination of those.

e.g. stuff like -Des.network.host=_local_,_site_ will bind to 127.0.0.1, ::1, and 192.168.1.1, but not public addresses like 92.1.2.3

I am hoping this is enough to prevent confusing options in issues such as #13612 and #13972

If we need to fix this for 2.0 in order to fix those issues, then fine, lets do things right. But our network configuration needs to be easy and make logical sense :)

@nik9000
Copy link
Member

nik9000 commented Oct 6, 2015

[2015-10-06 01:17:08,972][INFO ][transport                ] [Torso] publish_address {192.168.0.19:9300}, bound_addresses {[fe80::1]:9300}, {[::1]:9300}, {127.0.0.1:9300}, {[fe80::3e15:c2ff:fee5:d26c]:9300}, {192.168.0.19:9300}

Nice!

If we need to fix this for 2.0 in order to fix those issues, then fine, lets do things right. But our network configuration needs to be easy and make logical sense :)

I'm +1 on getting this back to 2.1. I'd love to get it into 2.0.1 when that comes around too.

if (bindHost == null) {
bindHost = DEFAULT_NETWORK_HOST;
if (bindHosts == null) {
bindHosts = new String[] { DEFAULT_NETWORK_HOST };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it become DEFAULT_NETWORK_HOSTS?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

our default is a single host specification: _local_. One point of this change is that you can use multiple: e.g. _local_,12.1.4.2,foo.bar.com

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah - I suppose that is fine reasoning.

@rmuir
Copy link
Contributor Author

rmuir commented Oct 6, 2015

@nik9000 and for the record i don't really want to rush this stuff in. I just mention that in response to things like #13972 which are set for 2.0

Maybe it is those that need to be reconsidered instead as far as versioning? I am heavily concerned about being able to fix this stuff, I don't want to hear whimpering and whining about how fixing it in 2.2 is "breaking bwc", and i also don't want 2.0 to go out with massive security problems: I am doing everything I can to prevent that.

@dadoonet
Copy link
Member

dadoonet commented Oct 6, 2015

@rmuir There is IMO absolutely no rush to fix #13969 #13970 and #13972 although I marked them initially for 2.0.0. Just because people can use _local_ or _ec2_ for now so they have a workaround.

So if there is any security concern, let's move those issues to a later version. It's only a nice to have I think.

@rmuir
Copy link
Contributor Author

rmuir commented Oct 6, 2015

OK I just feel like i missed something bigtime: some discussion that we are reverting the "bind-to-localhost" default in terms of ease of use.

This should really be discussed in more detail and planned more thoroughly. I saw a bunch of 2.0 issues come up suddenly and just got super worried about it.

On the other hand binding to localhost only as a default always, its very safe and easy to understand.

@nik9000
Copy link
Member

nik9000 commented Oct 6, 2015

Maybe it is those that need to be reconsidered instead as far as versioning? I am heavily concerned about being able to fix this stuff, I don't want to hear whimpering and whining about how fixing it in 2.2 is "breaking bwc", and i also don't want 2.0 to go out with massive security problems: I am doing everything I can to prevent that.

If we're concerned about breaking backwards compatibility we can make those network addresses either strings or arrays. That seems simple enough and might be worth doing anyway.

On the other hand binding to localhost only as a default always, its very safe and easy to understand.

Yeah. I think it'd be nice to have some "this is how you make a real cluster" docs somewhere for users who want a helping hand going from playing around on one box to multiple boxes.

@nik9000
Copy link
Member

nik9000 commented Oct 6, 2015

Yeah. I think it'd be nice to have some "this is how you make a real cluster" docs somewhere for users who want a helping hand going from playing around on one box to multiple boxes.

But not as part of this PR. That'd be silly.

@rmuir rmuir removed the v2.2.0 label Oct 6, 2015
@rmuir
Copy link
Contributor Author

rmuir commented Oct 6, 2015

I'm removing 2.2 and will just remove the garbage like _non_loopback_ in our network configuration completely here in additional commits. It is far more important to clean up master. Otherwise things like deprecation and leniency hang around for too long.

@rmuir
Copy link
Contributor Author

rmuir commented Oct 15, 2015

I pushed additional commits with the cleanups I wanted to do. I think this is good for master. if we want to backport it (e.g. to 2.2) then we should deprecate+warn about the heuristicy _non_loopback_ that will be removed.

@rmuir
Copy link
Contributor Author

rmuir commented Oct 23, 2015

I'm not going to maintain this branch, its too much work

@rmuir rmuir closed this Oct 23, 2015
@s1monw
Copy link
Contributor

s1monw commented Oct 23, 2015

I looked through it and it LGTM I am sorry I missed the last pings 8 days ago, can you merge it when it's up-to-date?

@rmuir
Copy link
Contributor Author

rmuir commented Oct 23, 2015

i resynced this to master.

@rmuir rmuir reopened this Oct 23, 2015
@jaymode
Copy link
Member

jaymode commented Oct 23, 2015

LGTM too

@rmuir rmuir added the v2.2.0 label Oct 23, 2015
@s1monw
Copy link
Contributor

s1monw commented Oct 23, 2015

LGTM thanks rob

@rmuir rmuir closed this in 6c8e290 Oct 24, 2015
rmuir added a commit that referenced this pull request Oct 24, 2015
* Allow for multiple host specifications (e.g. _en0_,192.168.1.2,_site_).
* Add _site_ and _global_ scopes as counterparts to _local_.
* Warn on heuristic selection of publish address.
* Remove the arbitrary _non_loopback_ setting.

Closes #13954

Conflicts:
	docs/reference/migration/migrate_3_0.asciidoc
	plugins/cloud-gce/src/test/java/org/elasticsearch/discovery/gce/GceNetworkTests.java
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Coordination/Network Http and internode communication implementations >feature v2.2.0 v5.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow binding to multiple addresses
6 participants