No easy way to fix incorrect persistent cluster setting #47038
Comments
Pinging @elastic/es-core-features
Pinging @elastic/es-distributed
In this case setting
Updating steps to reproduce:
tagging @e-mars
@DaveCTurner confirming per Sherry's comment that this did not work - because no cluster node was able to load the state file from disk. Each node was printing the following error regularly:
Calling
Attempting to update these settings returned 400. Calling
The only way we were able to recover the state was to do something like
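(The exact requests used are not quoted above, and as noted the update attempts returned 400. For general context only: on a responsive cluster, the usual way to clear an incorrect persistent setting is to set it to null via the cluster settings API. A minimal sketch, assuming an unsecured node reachable at localhost:9200:)

```sh
# Sketch: clear the bad persistent setting by assigning it null.
# Assumes a reachable node at localhost:9200 with no authentication.
curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "xpack.monitoring.exporters.cloud_monitoring.host": null
  }
}'
```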
Today we log and swallow exceptions during cluster state application, but such an exception should not occur. This commit adds assertions of this fact, and updates the Javadocs to explain it. Relates elastic#47038
I am struggling to reproduce this from the information given. Here are the steps I'm following. I started up a new empty 3-node 7.3.0 cluster and ran the following command:
I confirmed that all three nodes were stuck in a loop emitting exceptions like this:
I restarted all three nodes and confirmed that they remained stuck in the same loop. I then added
Finally I removed
That's interesting - the advice we received from support (and also my understanding) was that
As such, we didn't attempt to override the persistent settings using elasticsearch.yml. Are you saying that elasticsearch.yml can override persistent settings? In which cases, and where is this documented? I'd also argue that the cluster getting into a persistent crash loop as a result of a change to the monitoring configuration is a bug that needs to be fixed. Really appreciate your time investigating this!
That's correct for each setting in isolation. However, we are not overriding any single setting in a way that contradicts this.
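(For illustration only: as confirmed above, for any single key a persistent cluster setting takes precedence over the same key in elasticsearch.yml, so the workaround is not to override the broken host value itself but to add a related node-level setting. The particular setting below is a hypothetical example, not necessarily the one used in this thread:)

```sh
# Hypothetical sketch: add a related node-level setting to elasticsearch.yml
# rather than trying to override the broken persistent key directly.
# (Whether disabling the exporter is the appropriate related setting here is an assumption.)
echo 'xpack.monitoring.exporters.cloud_monitoring.enabled: false' >> config/elasticsearch.yml
```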
Agreed.
Closed by #50694.
Add validation for the following logfile audit settings:
xpack.security.audit.logfile.events.include
xpack.security.audit.logfile.events.exclude
xpack.security.audit.logfile.events.ignore_filters.*.users
xpack.security.audit.logfile.events.ignore_filters.*.realms
xpack.security.audit.logfile.events.ignore_filters.*.roles
xpack.security.audit.logfile.events.ignore_filters.*.indices
Closes #52357
Relates #47711 #47038
Follows the example from #47246
Elasticsearch version (bin/elasticsearch --version): 7.3.0
Plugins installed: []
JVM version (java -version):

OS version (uname -a if on a Unix-like system):

Description of the problem including expected versus actual behavior:
While working to consolidate monitoring of the clusters, we applied the following dynamic setting to the cluster persistent settings: xpack.monitoring.exporters.cloud_monitoring.host. Unfortunately, there was a mistake in the host value, as we included a / at the end; for example, https://myhost:9243/.
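(A sketch of the kind of request described, assuming the standard cluster settings API and an unsecured node at localhost:9200; the host and port are placeholders:)

```sh
# Sketch: applying the exporter host with the unintended trailing slash.
curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "xpack.monitoring.exporters.cloud_monitoring.host": "https://myhost:9243/"
  }
}'
```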
The command was accepted and applied by the cluster. A bit later, the cluster became unresponsive. Upon investigation, we saw the following error message in the log:
Three issues here:
Steps to reproduce:
Set xpack.monitoring.exporters.cloud_monitoring.host to https://myhost:9243/ (please note the / at the end).

Provide logs (if relevant):