#46832 seems to have somehow caused a regression for tests shutting down. Every now and then, shutting down a server seems to take ~15s, and the logs show:
W200403 19:36:34.837125 3038 sql/temporary_schema.go:430 [n1] error during schema cleanup, retrying: node unavailable; try another peer
W200403 19:36:36.577356 3038 sql/temporary_schema.go:430 [n1] error during schema cleanup, retrying: node unavailable; try another peer
W200403 19:36:41.068366 3038 sql/temporary_schema.go:430 [n1] error during schema cleanup, retrying: node unavailable; try another peer
W200403 19:36:49.157471 3038 sql/temporary_schema.go:430 [n1] error during schema cleanup, retrying: node unavailable; try another peer
W200403 19:36:49.157505 3038 sql/temporary_schema.go:551 [n1] failed to clean temp objects: node unavailable; try another peer
I200403 19:36:49.157515 3038 sql/temporary_schema.go:562 [n1] temporary object cleaner next scheduled to run at 2020-04-03 20:06:33.717927 +0000 UTC
I've bisected it pretty conclusively to that PR. I'm looking superficially at the PR though, and I can't tell what's wrong.
It seems to affect tests at random. The effect is big enough to cause the tests for the sql package to take a lot longer than they used to.
To repro, for example, you can run:
make testshort PKG='./pkg/sql' TESTFLAGS="-v --count=50" TESTS=TestSavepointMetric
One out of every 10 runs or so will be very slow.
The PR in question was backported to 20.1 too. I'm gonna mark it as a release blocker cause I find it pretty scary, but it might turn out to not be too bad.
andreimatei added the C-bug (Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.) label on Apr 3, 2020