[Bug] ES 7.10 cannot pass environment variable to setup a coordinating only node #65577

Closed
kunisen opened this issue Nov 30, 2020 · 14 comments · Fixed by #85186
Labels
>bug :Core/Infra/Settings Settings infrastructure and APIs help wanted adoptme Team:Core/Infra Meta label for core/infra team v7.10.0

Comments

@kunisen
Contributor

kunisen commented Nov 30, 2020

Elasticsearch version (bin/elasticsearch --version):

ES 7.10

Plugins installed:

N/A

JVM version (java -version):

Bundled JDK

OS version (uname -a if on a Unix-like system):

MacOS

Description of the problem including expected versus actual behavior:

I have the setting below, and I want to pass $ENV_VAR so that the node starts as a coordinating-only node.

node.roles: ${ENV_VAR}

When I set $ENV_VAR to "", the node starts with the full set of roles.

$ curl localhost:9200/_cat/nodes
127.0.0.1 31 100 29 2.23   cdhilmrstw * senmac

When I set $ENV_VAR to " " (with one space inside), I get this error message.

java.lang.IllegalArgumentException: unknown role []

When I set $ENV_VAR to [], I get this error message.

java.lang.IllegalArgumentException: unknown role [[]]

Even if I don't set ENV_VAR (I used an undefined variable, I_DIDNOT_SET_THIS_ENV), I also get an error.

Exception in thread "main" java.lang.IllegalArgumentException: Could not resolve placeholder 'I_DIDNOT_SET_THIS_ENV'

Steps to reproduce:

  1. Download ES 7.10
  2. Set an environment variable and reference it in node.roles in elasticsearch.yml
  3. Start ES

Provide logs (if relevant):

@kunisen kunisen added >bug :Core/Infra/Settings Settings infrastructure and APIs needs:triage Requires assignment of a team area label v7.10.0 labels Nov 30, 2020
@elasticmachine elasticmachine added the Team:Core/Infra Meta label for core/infra team label Nov 30, 2020
@elasticmachine
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@rjernst
Member

rjernst commented Nov 30, 2020

There are a couple of issues here. I don't believe any of these are new in 7.10.

Could not resolve placeholder 'I_DIDNOT_SET_THIS_ENV'

This is expected. Any text of the form ${FOO} expects a substitution to exist for FOO and will error if it does not exist.

When I set $ENV_VAR to [], I get this error message.
java.lang.IllegalArgumentException: unknown role [[]]

IIRC these are side effects of the internal representation of parsed settings. All settings are parsed into a flat String to String map. This means lists must be serialized into a String, and then re-parsed later when needing to look at the values. We use a simple comma separated list as the format. Note that this is not a json list, as it does not contain the enclosing brackets. When a substitution happens, we have already parsed the json structure, so the thing returned by the substitution must be the comma separated list.
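To make the failure modes above concrete, here is a toy Python model of the behavior described in this comment. It is a sketch, not Elasticsearch's actual code: the names `KNOWN_ROLES`, `substitute`, and `parse_roles` are hypothetical, the role set is an illustrative subset, and treating an empty string as "not set" is an assumption made only to mirror the reported result for `""`.

```python
# Toy model of the flat String -> String settings map (not Elasticsearch code).
KNOWN_ROLES = {"master", "data", "ingest", "ml", "voting_only"}  # illustrative subset


def substitute(value, env):
    """Replace a whole-value ${NAME} placeholder; fail if NAME is not defined."""
    if value.startswith("${") and value.endswith("}"):
        name = value[2:-1]
        if name not in env:
            raise ValueError(f"Could not resolve placeholder '{name}'")
        return env[name]
    return value


def parse_roles(raw):
    """Re-parse the flat string as a comma-separated list (no enclosing brackets)."""
    if raw == "":
        # Assumption: an empty string behaves like "setting not present",
        # so the caller falls back to the default (all roles).
        return None
    roles = [part.strip() for part in raw.split(",")]
    for role in roles:
        if role not in KNOWN_ROLES:
            raise ValueError(f"unknown role [{role}]")
    return roles
```

In this model, substituting `[]` yields the single string `[]`, which is looked up as a role name and rejected as `unknown role [[]]`; a single space is trimmed to an empty role name and rejected as `unknown role []`.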

This is sort of noted in the documentation, but phrased as a suggestion (to not use brackets), rather than a rule. For now we should fix the documentation to note brackets should not be used.

Long term, we need a better structure for list values that does not rely on ambiguous serialization to strings, but the scope of such a project is unclear, and, from what I remember, past attempts to resolve this issue were given up due to complexity of maintaining current behavior.

When I set $ENV_VAR to " " (with one space inside), I get this error message.
java.lang.IllegalArgumentException: unknown role []

This looks like a bug in default handling, though may be related to the complexities of the above issue.

@kunisen
Contributor Author

kunisen commented Dec 1, 2020

Thanks for the quick and detailed pointers!! @rjernst

@rmb938

rmb938 commented Jun 30, 2021

Was this issue ever fixed?

When I set $ENV_VAR to "", the node starts with the full set of roles.
$ curl localhost:9200/_cat/nodes
127.0.0.1 31 100 29 2.23 cdhilmrstw * senmac

When I set $ENV_VAR to " " (with one space inside), I get this error message.
java.lang.IllegalArgumentException: unknown role []

I am still running into it in 7.13. Any suggestions on how to work around the issue?

@yvbondarenko

To make everything work, I used the service file
/usr/lib/systemd/system/elasticsearch.service
[screenshot not preserved]
or
[screenshot not preserved]

In /etc/elasticsearch/elasticsearch.yml
[screenshot not preserved]

@giantjunkbox

giantjunkbox commented Dec 19, 2021

Any progress on this?

I am having the described issue in 7.15.2 and am unable to come up with a workaround other than not using this feature.

I hit the same issue when trying to pass an empty set of roles via the command line: -Enode.roles= results in "ERROR: setting [node.roles] must not be empty".

Seems like support needs to be added to allow for optionally enclosing the values in square brackets and recognizing "[ ]" as an empty set.

@giantjunkbox

giantjunkbox commented Dec 19, 2021

The best workaround that I have come up with so far, by virtue of being the only workaround I have come up with, is to set the node.roles in the elasticsearch.yml file to [ ] and, for any node that is not a coordinating only node, set the role via the command line (e.g. -Enode.roles=master).
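A minimal sketch of that workaround as a launcher helper, assuming elasticsearch.yml already contains `node.roles: [ ]`. The function name `build_command` and the default binary path are hypothetical, and joining several roles with commas is an assumption based on the comma-separated list format described earlier in this thread.

```python
def build_command(roles, binary="bin/elasticsearch"):
    """Build the argument list for starting a node.

    Assumes elasticsearch.yml sets `node.roles: [ ]`, so a node started with
    no override is coordinating-only. Any other node overrides the setting
    on the command line, e.g. -Enode.roles=master.
    """
    args = [binary]
    if roles:
        # Only pass an override when there is at least one role; an empty
        # -Enode.roles= is rejected, as reported above.
        args.append("-Enode.roles=" + ",".join(roles))
    return args
```

With this helper, `build_command([])` returns `["bin/elasticsearch"]` and `build_command(["master"])` returns `["bin/elasticsearch", "-Enode.roles=master"]`.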

@IgorOhrimenko

Only deprecated options are working:

    environment:
      - node.master=false
      - node.voting_only=false
      - node.data=false
      - node.ingest=false
      - node.ml=false

Part of the log when the node was added:
"message": "added {{elasticsearch-coordinator}{OntAm_sFT6uIaSE1ud6Tbw}{h1WEUCmlQb-TAbfuex_9zw}{10.10.10.10}{10.10.10.10:9300}{r}{xpack.installed=true, transform.node=false}}

All other variants do not work:

- node.roles=\
- node.roles= ,
- node.roles=" "
- node.roles=' '

and return an ERROR with 'unknown role'.

Empty and space symbol variants:

- node.roles= <-space symbol
- node.roles=<-empty line

also do not work: the setting is skipped at startup and Elasticsearch starts with all roles:
"message": "added {{elasticsearch-coordinator}{OntAm_sFT6uIaSE1ud6Tbw}{U3RLRbRdR2eG2dJNh3I7Qw}{10.10.10.10}{10.10.10.10:9300}{dilmrt}{ml.machine_memory=50527862784, ml.max_open_jobs=20, xpack.installed=true, transform.node=true}},

@mark-vieira
Contributor

mark-vieira commented Dec 30, 2021

@IgorOhrimenko as mentioned, there is currently no way to specify an "empty list" via environment variables. If you must configure this you'll need to configure via elasticsearch.yml.

@dakrone what are your thoughts on introducing an explicit coordinating role vs relying on the implicit lack of other roles? Alternatively, would just using voting_only effectively do the same thing, or are there other side effects of a coordinating node also having that role?

edited by @jakelandis to ping the correct Lee

@giantjunkbox

@IgorOhrimenko as mentioned, there is currently no way to specify an "empty list" via environment variables. If you must configure this you'll need to configure via elasticsearch.yml.

@dakrone what are your thoughts on introducing an explicit coordinating role vs relying on the implicit lack of other roles? Alternatively, would just using voting_only effectively do the same thing, or are there other side effects of a coordinating node also having that role?

edited by @jakelandis to ping the correct Lee

I think I tried voting_only, and it won't take it unless you also include the master role.

I pulled apart the code looking for a coordinating role that was perhaps not documented, and of course didn't find one; however, in the process it became obvious why it is not there - if there were a coordinating role, then not including it would imply that the node was not a coordinating node, and since every node is a coordinating node, it just doesn't make much sense.

IMHO - the real problem is that there needs to be a way to pass an empty set.

@mark-vieira
Contributor

I think I tried voting_only, and it won't take it unless you also include the master role.

That certainly doesn't seem right. The purpose of that role is to indicate that a node can participate in master election but not itself be considered to be master-eligible.

I pulled apart the code looking for a coordinating role that was perhaps not documented, and of course didn't find one; however, in the process it became obvious why it is not there - if there were a coordinating role, then not including it would imply that the node was not a coordinating node, and since every node is a coordinating node, it just doesn't make much sense.

Yes, essentially every node is a coordinating node. There is no way to keep a node from participating in that behavior. The implicit "coordinating-only" node just means a node that has no other explicit roles. I think we could still find a way to make this more explicit.

IMHO - the real problem is that there needs to be a way to pass an empty set.

I agree this is definitely a significant limitation that needs to be sorted out.

@giantjunkbox

Apologies, I don't think I was clear.

Setting the role to JUST voting_only fails and will prevent the instance from starting up. For the node to become a voting only node, the master role must also be included. This is the documented behavior.

My point was more that you could not set the node role to voting_only, omit master, and have the node do nothing other than be a coordinating node. It is totally possible to set the role to "master, voting_only" and have a voting only coordinating node... but then you end up with an extra vote... which changes quorum.... which is not to say you could not game it out to make this work, it's just really ugly. The above workaround is only slightly ugly ;)
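The constraint described here can be sketched as a small validation check. This is an illustration of the rule as stated in this comment, not Elasticsearch's implementation; the function name `validate_roles` and the error text are hypothetical.

```python
def validate_roles(roles):
    """Reject a role list that names voting_only without master."""
    if "voting_only" in roles and "master" not in roles:
        # Hypothetical message; the real startup error is worded differently.
        raise ValueError("voting_only requires the master role")
    return list(roles)
```

Under this rule, `["voting_only"]` alone is rejected, `["master", "voting_only"]` is accepted, and an empty list (the coordinating-only case) passes through unchanged.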

I think I tried voting_only, and it won't take it unless you also include the master role.

That certainly doesn't seem right. The purpose of that role is to indicate that a node can participate in master election but not itself be considered to be master-eligible.

@mark-vieira
Contributor

It is totally possible to set the role to "master, voting_only" and have a voting only coordinating node... but then you end up with an extra vote... which changes quorum.... which is not to say you could not game it out to make this work, it's just really ugly. The above workaround is only slightly ugly ;)

Understood, and agreed.

I plan to revisit this with @rjernst.

@gbit-is

gbit-is commented Mar 9, 2022

Just ran into this issue. I am trying to keep my cluster in sync with Ansible using a single config file populated with environment variables, with dedicated Elasticsearch nodes for Kibana and default Java settings, and that is not possible.

rjernst added a commit to rjernst/elasticsearch that referenced this issue Mar 21, 2022
Environment variables are allowed as substitutions within
elasticsearch.yml. Additionally, command line settings are added into
the parsed settings. However, both of these are raw strings applied
after the node yml file has been parsed.

This commit moves environment substitution to occur before parsing
elasticsearch.yml, and override processing to happen with a separate
yaml parser so that both can allow yaml parsing. Note that environment
substitution is not the only type of substitution (there is also setting
substitution, which needs parsed setting keys), so the existing
replacement mechanism is not touched here except to remove environment
handling.

closes elastic#65577
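The ordering change described in this commit message can be illustrated with a toy Python sketch: substitute environment variables into the raw text first, then parse. This is not the actual change. `substitute_env`, `parse_line`, and `load_setting` are hypothetical names, and `parse_line` is a deliberately tiny stand-in for a YAML parser that handles only one `key: value` line with an optional bracketed list.

```python
import re

PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


def substitute_env(text, env):
    """Replace every ${NAME} in the raw text before any parsing happens."""
    def repl(match):
        name = match.group(1)
        if name not in env:
            raise ValueError(f"Could not resolve placeholder '{name}'")
        return env[name]
    return PLACEHOLDER.sub(repl, text)


def parse_line(line):
    """Toy stand-in for a YAML parser: one 'key: value' line, optional [a, b] list."""
    key, _, value = line.partition(":")
    value = value.strip()
    if value.startswith("[") and value.endswith("]"):
        inner = value[1:-1].strip()
        items = [item.strip() for item in inner.split(",")] if inner else []
        return key.strip(), items
    return key.strip(), value


def load_setting(line, env):
    """Substitute first, then parse, so the substituted text is seen by the parser."""
    return parse_line(substitute_env(line, env))
```

Because the parser now sees the substituted text, `load_setting("node.roles: ${ENV_VAR}", {"ENV_VAR": "[]"})` returns `("node.roles", [])` in this sketch, which is the empty list the earlier comments were unable to express.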
rjernst added a commit that referenced this issue Mar 22, 2022
closes #65577