failed to serialize outbound message [Response{778394113}{false}{false}{false}{class org.elasticsearch.action.admin.cluster.stats.ClusterStatsNodeResponse #77973
Comments
Pinging @elastic/es-data-management (Team:Data Management)
Tracing that back to the code, I'm seeing these lines from the stacktrace:
So my thinking is
Errrmmmmmm, no, thinking about it more, it's meant to be a count of the number of ingest actions currently in flight (hence
@ptamba would you perhaps be willing to share your ingest pipeline definitions (
@joegallo I wouldn't mind sharing it, but we have some specific client information in the pipeline (just customer names really, but I prefer to share it in a more controlled manner). We have a support contract with Elastic and I can share it through that channel.
Great! Thanks, @ptamba!
Also very much related to #52339
I merged #81450 to help us detect this earlier at dev-time, but so far we haven't been able to find any other evidence of this happening for other customers in the wild. Rebooting an affected node would have the effect of restarting the counter at 0, so it wouldn't be negative (and would therefore serialize just fine).
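To illustrate why a negative in-flight counter would break serialization while a freshly restarted counter would not, here is a minimal sketch. It assumes the counter is written with a variable-length encoding that only accepts non-negative values. `VLongSketch` and its `writeVLong` are hypothetical stand-ins written for this example, not Elasticsearch's actual stream classes.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical stand-in for a stream writer; NOT Elasticsearch's real implementation.
final class VLongSketch {

    // Encodes a non-negative long in 7-bit groups, low-order group first.
    // Assumption: negative values are rejected rather than encoded.
    static byte[] writeVLong(long value) {
        if (value < 0) {
            throw new IllegalStateException("negative value not supported: " + value);
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7FL) != 0) {
            // More groups follow, so set the continuation bit.
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        long inFlight = 0;
        inFlight--; // one decrement too many leaves the counter at -1

        try {
            writeVLong(inFlight);
        } catch (IllegalStateException e) {
            System.out.println("failed to serialize: " + e.getMessage());
        }

        // After a node restart the counter starts again at 0 and encodes fine.
        long afterRestart = 0;
        System.out.println("bytes written after restart: " + writeVLong(afterRestart).length);
    }
}
```

Under these assumptions, a single stray decrement is enough to make every later stats response from that node fail until the node is restarted.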
Is there a good person to engage with for work on this issue? We found an ECE customer that's seeing this and I think it's causing internal collection of stack monitoring data to fail. Additionally, I can see 7 ESS clusters currently exhibiting the error message (versions ranging from 7.17.5 to 8.1.3).
@elastic/es-data-management this is an old ticket, but it seems to be causing a problem for an ECE customer, as @matschaffer said above. Is it possible to take a look at this?
@joegallo it looks like we can double-decrement in CompoundProcessor, doesn't it? Wouldn't that happen if an exception was thrown somewhere in https://github.com/elastic/elasticsearch/blob/v7.17.4/server/src/main/java/org/elasticsearch/ingest/CompoundProcessor.java#L141-L147?
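The double-decrement hypothesis above can be sketched in isolation. The code below is not the actual CompoundProcessor logic; `DoubleDecrementSketch`, `runBuggy`, and `runGuarded` are hypothetical names that only show the general shape: a counter is decremented on completion, downstream code then throws, and the failure path decrements a second time. The guarded variant is one possible way to make the decrement happen at most once, not a description of any fix that was merged.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration of a double decrement; NOT the real CompoundProcessor code.
final class DoubleDecrementSketch {

    // Buggy shape: the success path decrements, then downstream code throws,
    // and the failure path decrements the same counter again.
    static void runBuggy(AtomicLong inFlight, Runnable downstream) {
        inFlight.incrementAndGet();
        try {
            inFlight.decrementAndGet(); // action considered finished
            downstream.run();           // may throw after the decrement
        } catch (RuntimeException e) {
            inFlight.decrementAndGet(); // second decrement for the same action
        }
    }

    // Guarded shape: a per-action flag ensures the decrement runs at most once.
    static void runGuarded(AtomicLong inFlight, Runnable downstream) {
        inFlight.incrementAndGet();
        AtomicBoolean finished = new AtomicBoolean(false);
        try {
            if (finished.compareAndSet(false, true)) {
                inFlight.decrementAndGet();
            }
            downstream.run();
        } catch (RuntimeException e) {
            if (finished.compareAndSet(false, true)) {
                inFlight.decrementAndGet();
            }
        }
    }

    public static void main(String[] args) {
        Runnable failing = () -> { throw new IllegalStateException("boom"); };

        AtomicLong buggy = new AtomicLong();
        runBuggy(buggy, failing);
        System.out.println("buggy counter: " + buggy.get());

        AtomicLong guarded = new AtomicLong();
        runGuarded(guarded, failing);
        System.out.println("guarded counter: " + guarded.get());
    }
}
```

In this sketch the buggy variant ends below zero after one failing action, which matches the symptom discussed in the thread; whether the real code follows this shape is exactly the open question in the comment.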
Elasticsearch version (bin/elasticsearch --version): 7.14.0
Plugins installed: []
JVM version (java -version): openjdk version "1.8.0_242"
OpenJDK Runtime Environment (build 1.8.0_242-b08)
OpenJDK 64-Bit Server VM (build 25.242-b08, mixed mode)
OS version (uname -a if on a Unix-like system): Linux 3.10.0-1127.el7.x86_64 #1 SMP Tue Feb 18 16:39:12 EST 2020 x86_64 x86_64 x86_64 GNU/Linux
Description of the problem including expected versus actual behavior:
Cluster status is green; however, cluster stats shows two failed nodes.
This also causes the monitoring cluster to display nothing (not sure whether the two are related).
The log below is displayed in the node logs.
Provide logs (if relevant):