error about null value in jobs table during SHOW JOBS #36851
Comments
I hit the same problem today (also seen in the admin UI):
The problem fixed itself within a few minutes.
Saw this out in the wild from a user running v19.1.5. We previously saw this in #34878 and assumed it was fixed by #35371, but that is not the case, since v19.1.5 contains #35371.
It's interesting to note that, in the debug zip retrieved from above, the faulty Job IDs straddle range boundaries. For example, we have the following:
Where
Ok, I'm fairly certain I understand what's happening here. The fact that the buggy Job IDs straddled range boundaries was a good (read: lucky) thread to pull on. Seeing as how we most recently observed this panic in v19.1.5, I took a look through our release notes since then to land on https://www.cockroachlabs.com/docs/releases/v19.2.2.html. There I found the following:
Looking through the original PR that was backported to 19.2 (#42833) and the discussion in #42056, the issue stems from the fact that we seem to have split a range between column families of the same row, which is evidently illegal. Note the fetcher behavior described in #42056 (comment), specifically:
What I suspect happened here was more or less the same. When attempting to fetch the specific row, we hit the range boundary and assumed we'd fetched all the KVs for the row. In this case the "status" KV was stored in the next range, and we failed to fetch it. What the fetcher then observed was a NULL column value for a non-NULLable column, so it errored out as observed above.
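To make the suspected failure mode concrete, here is a toy model in Python. It is not CockroachDB's fetcher code: the range layout, the column family names (`payload`, `status`), the job IDs, and both helper functions are hypothetical and only illustrate how stopping at a range boundary can drop one column family of a row.

```python
from typing import Dict, List, Tuple

# A KV is ((row_id, column_family), value). Each range is a sorted,
# contiguous slice of the keyspace. All names and values are made up.
KV = Tuple[Tuple[int, str], str]

# Row 42 straddles the boundary: its "payload" KV is the last key of the
# first range and its "status" KV is the first key of the second range.
RANGES: List[List[KV]] = [
    [((41, "payload"), "p41"), ((41, "status"), "succeeded"),
     ((42, "payload"), "p42")],
    [((42, "status"), "running"),
     ((43, "payload"), "p43"), ((43, "status"), "pending")],
]


def fetch_row_buggy(ranges: List[List[KV]], row_id: int) -> Dict[str, str]:
    """Collect a row's KVs, but treat the end of a range as the end of the row."""
    row: Dict[str, str] = {}
    for rng in ranges:
        for (rid, family), value in rng:
            if rid == row_id:
                row[family] = value
        if row:
            # Suspected bug: once some KVs were found, hitting the range
            # boundary is taken to mean the row is complete.
            break
    return row


def fetch_row_correct(ranges: List[List[KV]], row_id: int) -> Dict[str, str]:
    """Collect a row's KVs across every range that may hold them."""
    row: Dict[str, str] = {}
    for rng in ranges:
        for (rid, family), value in rng:
            if rid == row_id:
                row[family] = value
    return row
```

In this sketch, `fetch_row_buggy(RANGES, 42)` returns only the `payload` family, so a decoder would see `status` as NULL even though the KV exists in the next range, while rows 41 and 43, which sit entirely inside one range, are read correctly by both functions. That would also be consistent with reading the row directly and seeing non-null values.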
I'm going to go ahead and (optimistically) close this issue as it's already been fixed.
On a roachprod cluster, I imported a tpcc fixture using `fixtures import`. One of the table imports appeared to be hanging. When I ran `SHOW JOBS`, I got this error:
I can read the row from the jobs table, though, and the values are not null:
The job seems to have completed, and the imported table shows up in the output of `SHOW TABLES`.