roachtest: import/tpcc/warehouses=4000/geo failed #81186
Most proximately, the test appears to have failed because the health-checker process that the test starts failed.
However, the journalctl logs show that the oom-killer was invoked at 9:47:32, which resulted in cockroach being killed.
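For anyone repeating this kind of triage, the journalctl check described above can be sketched as a small filter over kernel log lines. This is a minimal sketch, not the tooling used in this investigation: the function name `find_oom_events` and the sample lines are hypothetical, and the two patterns assume the usual Linux kernel wording ("invoked oom-killer" and "Killed process PID (name)").

```python
import re

# Substrings the Linux kernel typically logs when the OOM killer runs.
# Assumption: the journal text uses this standard wording.
OOM_INVOKED = re.compile(r"(\S+) invoked oom-killer")
OOM_KILLED = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def find_oom_events(lines):
    """Return (kind, process_name, pid) tuples for OOM-related lines.

    kind is "invoked" (pid is None) or "killed" (pid is an int).
    """
    events = []
    for line in lines:
        killed = OOM_KILLED.search(line)
        if killed:
            events.append(("killed", killed.group(2), int(killed.group(1))))
            continue
        invoked = OOM_INVOKED.search(line)
        if invoked:
            events.append(("invoked", invoked.group(1), None))
    return events


# Hypothetical sample lines for illustration only; these are NOT taken
# from the artifacts of this test run.
SAMPLE = [
    "May 11 09:00:00 host kernel: example invoked oom-killer: gfp_mask=0x0, order=0",
    "May 11 09:00:00 host kernel: Out of memory: Killed process 1234 (example) total-vm:1kB",
    "May 11 09:00:01 host systemd[1]: Started something unrelated.",
]

if __name__ == "__main__":
    for event in find_oom_events(SAMPLE):
        print(event)
```

Feeding it the saved journalctl output from each node, one line at a time, would show which node's process was killed and when, which can then be lined up against the tsdump.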
From the tsdump we see an increase in memory usage and a sharp increase in disk IO right before this. The last available memory profile looks to me like it may be similar to those seen in the various issues linked to #73376.

Other Notes

On node 7, in the 20 minutes leading up to the OOM, I see a long stream of messages like the following:
We also see a good number of slow AddSSTable RPCs across all nodes:
I'm going to remove the release-blocker label here. I've assigned kv in case there is anything useful here for their ongoing work on this, but I imagine we can just close this one since we have a few issues related to this already.
@tbg this looks like what you've been investigating. Mind making a call on whether we keep this open?
We can close this. This is likely a follower-writes issue, which we are tracking in #79215.
roachtest.import/tpcc/warehouses=4000/geo failed with artifacts on master @ 7f3c06f5f2c26bc84705430a3622f92ec1444e9d:
Help
See: roachtest README
See: How To Investigate (internal)
Same failure on other branches
This test on roachdash | Improve this report!
Jira issue: CRDB-15309