sql: Spring Boot generated query potentially led to internal error #41077
Comments
This is related to #31361, which we have seen in the past when a cluster is overloaded. The |
@nvanbenschoten this is the same cluster with spikes in log commit latency: Hard to tell without a mouseover but it could easily be ~07:30. So might end up being the same root cause. |
This doesn't seem to have anything to do with placeholders. This happens on node failure, or on cluster overload. Looking at the logs, you see frequent pairs of:
The first line indicates the timeout that generates the error that's ultimately returned to the client. One likely reason why this might happen is by running into the
@asubiotto @ajwerner do we all agree to get rid of this flow queuing? Let's just do it; I think it does more harm than good, particularly now that we've isolated node liveness heartbeats somewhat.
As far as this customer issue is concerned, I think we should close it unless we continue seeing these errors without also seeing the "flow scheduler enqueuing flow" messages in logspy.
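For anyone following along who hasn't looked at the queuing behavior being discussed, here is a toy model of it. This is only a sketch and not the CockroachDB implementation: `ToyFlowScheduler`, `max_running`, `schedule`, and `finish` are hypothetical names made up for illustration. It assumes a fixed cap on concurrently running flows with a FIFO queue for the overflow, and shows why a flow that arrives on an overloaded node may sit waiting long enough for a timeout elsewhere to fire.

```python
from collections import deque


class ToyFlowScheduler:
    """Toy model (hypothetical): at most max_running flows run at once;
    any additional flows wait in FIFO order until a slot frees up."""

    def __init__(self, max_running):
        self.max_running = max_running
        self.running = set()
        self.queue = deque()

    def schedule(self, flow_id):
        """Start the flow if there is capacity, otherwise enqueue it.
        Returns True if the flow started immediately."""
        if len(self.running) < self.max_running:
            self.running.add(flow_id)
            return True
        # No capacity: this is the "enqueuing flow" case.
        self.queue.append(flow_id)
        return False

    def finish(self, flow_id):
        """Mark a running flow as done and start the oldest queued flow, if any.
        Returns the id of the flow that was started, or None."""
        self.running.remove(flow_id)
        if self.queue:
            next_flow = self.queue.popleft()
            self.running.add(next_flow)
            return next_flow
        return None
```

In this model a queued flow makes no progress until some running flow finishes, so under sustained overload the wait is unbounded unless something times it out.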
I'm personally for removing this queuing behavior but I think it's up to @ajwerner to determine whether it's useful at all or not wrt admission control. In this particular case I don't think the queuing is having an effect. To be sure, the number of running flows can be inspected through the admin UI ( |
Should we close this issue now that #31361 is closed?
This can still happen. It should be rarer but the root cause still exists, especially if the query is run in parallel many times. We can close it and associate it with #34229.
Describe the problem
A user encountered an internal error. Their Spring Boot logs show the following (SQL has been anonymized):
It's not clear whether the internal error was due to an issue on our end, or whether it was caused by the query using placeholders instead of actual values, which we did not know how to handle.
We do see that Node 6 was fairly unhappy:
W190925 07:20:49.827664 417 storage/node_liveness.go:523 [n6,hb] slow heartbeat took 38.3s
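To check how often a node logs warnings like the one above, a small script can pull the duration out of each matching line. The following is a minimal sketch, assuming the log lines have exactly the layout shown above and that the duration is printed in seconds with an `s` suffix. The function name `parse_slow_heartbeat` and the returned field names are made up for illustration.

```python
import re

# Matches lines shaped like the warning above:
#   W190925 07:20:49.827664 417 storage/node_liveness.go:523 [n6,hb] slow heartbeat took 38.3s
# Assumption: the duration is in seconds ("38.3s"); other units such as "500ms" will not match.
SLOW_HEARTBEAT = re.compile(
    r"^(?P<severity>[IWEF])(?P<date>\d{6}) (?P<time>\d{2}:\d{2}:\d{2}\.\d+) "
    r"\d+ (?P<file>\S+):(?P<line>\d+)\s+\[(?P<tags>[^\]]*)\] "
    r"slow heartbeat took (?P<seconds>\d+(?:\.\d+)?)s$"
)


def parse_slow_heartbeat(log_line):
    """Return a dict describing a slow-heartbeat warning, or None if the line does not match."""
    match = SLOW_HEARTBEAT.match(log_line.strip())
    if match is None:
        return None
    return {
        "date": match.group("date"),
        "time": match.group("time"),
        "tags": match.group("tags").split(","),
        "seconds": float(match.group("seconds")),
    }
```

Running this over each node's log and sorting by `seconds` gives a quick view of which nodes were struggling and when.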
Expected behavior
Queries should execute correctly without any negative impact on the cluster.
Additional data / screenshots
Debug zip can be found here
If a node in your cluster encountered a fatal error, supply the contents of the
log directories (at minimum of the affected node(s), but preferably all nodes).
Note that log files can contain confidential information. Please continue
creating this issue, but contact [email protected] to submit the log
files in private.
If applicable, add screenshots to help explain your problem.
Environment:
cockroach sql, JDBC, ...]
Additional context
User was unable to access admin UI from node 3 and Node 6 had periodic issues of unavailability and under-replicated ranges. Node 6 was restarted, and cluster and application were back to a healthy state.