roachtest: multitenant-upgrade failed #81517
roachtest.multitenant-upgrade failed with artifacts on master @ 7dd1c51f6b5802e32bafd82e46747f349836592f:
|
right before the stack trace:
|
It's from here: cockroach/pkg/cmd/roachtest/tests/multitenant_upgrade.go Lines 176 to 182 in ed4c6b4
So in serving this query, someone (probably the tenant pod for t11) was trying to dial n1 but ended up reaching n2. That seems unlikely, since we're not shuffling node addresses around here. I wonder if there's some confusion between SQL instance IDs and NodeIDs.
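The symptom described here (dialing n1 but reaching n2) is what an identity check at connection time would report. A minimal sketch of such a check follows. The names `NodeID`, `PingRequest`, and `checkTargetNode` are hypothetical and chosen for illustration; this is not CockroachDB's actual RPC code.

```go
package main

import "fmt"

// NodeID identifies a KV node. Hypothetical type for illustration only.
type NodeID int

// PingRequest is a hypothetical handshake message in which the dialer
// states which node it believes it is connecting to.
type PingRequest struct {
	TargetNodeID NodeID // 0 means the dialer has no expectation
}

// checkTargetNode returns an error when the dialer expected a different
// node than the one that actually received the request.
func checkTargetNode(self NodeID, req PingRequest) error {
	if req.TargetNodeID != 0 && req.TargetNodeID != self {
		return fmt.Errorf("dialed n%d but reached n%d", req.TargetNodeID, self)
	}
	return nil
}

func main() {
	// A dialer that resolved "n1" to n2's address trips the check.
	fmt.Println(checkTargetNode(2, PingRequest{TargetNodeID: 1}))
	// A dialer that reached the node it expected passes.
	fmt.Println(checkTargetNode(1, PingRequest{TargetNodeID: 1}))
}
```

If a SQL instance ID were ever passed where a NodeID is expected, a check of this shape would fail in the same way even though no address had moved.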
roachtest.multitenant-upgrade failed with artifacts on master @ 48e48db89eb3cd6bc38f3631364c516181811280:
|
roachtest.multitenant-upgrade failed with artifacts on master @ 12198a51408e7333cd4f96b221e6734239479765:
|
roachtest.multitenant-upgrade failed with artifacts on master @ 7a17498a9679853612cb88d82a4a3952d1015f94:
|
roachtest.multitenant-upgrade failed with artifacts on master @ d9fe5f67c75b1b59fc297bf4509a139c640b835b:
|
roachtest.multitenant-upgrade failed with artifacts on master @ 1e2cc61b58dc14386bb68dca59814874648931c2:
|
roachtest.multitenant-upgrade failed with artifacts on master @ e6815947a050e32f21e983aa30dc74ab2a247af3:
|
roachtest.multitenant-upgrade failed with artifacts on master @ 662cc5c3070e6d64d155a9cc9f33253ee5d99ee9:
Same failure on other branches
|
roachtest.multitenant-upgrade failed with artifacts on master @ 1cea73c8a18623949b81705eb5f75179e6cd8d86:
Same failure on other branches
|
@mwang1026 this should be given some attention: it's failing like clockwork, and multitenant upgrades are important.
roachtest.multitenant-upgrade failed with artifacts on master @ 5a54758ce89e866b4fe28c0df74bd610973c6918:
|
roachtest.multitenant-upgrade failed with artifacts on master @ 2181204e9c7ac6b316573073b6b8010f43920f8b:
|
roachtest.multitenant-upgrade failed with artifacts on master @ 90b5db50e1e1cdb0315d8b094081d261e6dcb336:
|
roachtest.multitenant-upgrade failed with artifacts on master @ 0fdac09ca1119b494661e1e1f64ea291d8649782:
|
roachtest.multitenant-upgrade failed with artifacts on master @ 2c08debd6d8c89019f26a3e58cf3b5d3d97b6495:
|
roachtest.multitenant-upgrade failed with artifacts on master @ 0dd438d3dc0b42543890455945a7a6b42811def1:
Parameters: |
roachtest.multitenant-upgrade failed with artifacts on master @ d25cb57ccd9bc643ce9058ebd2057cab36b69ad5:
Parameters: |
roachtest.multitenant-upgrade failed with artifacts on master @ 41db784cb97d2749b162020c2c821979094f87b1:
Parameters: |
roachtest.multitenant-upgrade failed with artifacts on master @ f4042d47fa8062a612c38d4696eb6bee9cee7c21:
Parameters: |
roachtest.multitenant-upgrade failed with artifacts on master @ b173a16715e71e94115820374da1eb350b3b459d:
Parameters: |
roachtest.multitenant-upgrade failed with artifacts on master @ 5c2c62ecf1bea60c807edc6b4da22d900ad4ae03:
Parameters: |
roachtest.multitenant-upgrade failed with artifacts on master @ a0d8839aa6164af81a9ebb140147d3baf5321287:
Parameters: |
roachtest.multitenant-upgrade failed with artifacts on master @ cb55144cdec54d2a70f074ad64b4eca5e6c6891a:
Parameters: |
roachtest.multitenant-upgrade failed with artifacts on master @ e6a7dc2f8ee39549e186bd05626c4c375b76fd04:
Parameters: |
roachtest.multitenant-upgrade failed with artifacts on master @ aaf50e920ceff3c2863ab96b9e3614b8434b70a8:
Parameters: |
roachtest.multitenant-upgrade failed with artifacts on master @ 80c274877a917580af62be6eb0cd48c8c7ae9c08:
Parameters: |
roachtest.multitenant-upgrade failed with artifacts on master @ 003c0360de8b64319b5f0f127b99be91dbdca8a3:
Parameters: |
roachtest.multitenant-upgrade failed with artifacts on master @ 524fd14da3fefcd849f44a835cc5f88f5dbdadcc:
Parameters: |
roachtest.multitenant-upgrade failed with artifacts on master @ 770ff3c545a51752490403da64d56fb397f49c5e:
Parameters: |
roachtest.multitenant-upgrade failed with artifacts on master @ f59620ec646d1181d358d0dc41ab60815ecf59c9:
Parameters: |
roachtest.multitenant-upgrade failed with artifacts on master @ 3b16435371a43d603d193a1e2b480a23fba3f07a:
Parameters: |
Currently this is failing because it is running on v22.1.6, which includes the change in #85719. I think the test needs to be updated to account for that. I'll work on that now.
Oh, I misread. The error is:
cc @RichardJCai that seems related to #81457. Could there be a problem with that in mixed-version multitenant clusters?
roachtest.multitenant-upgrade failed with artifacts on master @ d35c174b71065264b8a3910df3f488d10741f788:
Parameters: |
Bisecting shows that the test started failing with #80353. The error is:
It's failing at step 11 of the sequence below.
I also noticed that the test needs to be updated so that it does the version upgrades in the correct order. I'll make those changes, but we still have the problem described above.
roachtest.multitenant-upgrade failed with artifacts on master @ 448352ce4ed71a58e7701e82c786dd5152498310:
Parameters: |
I added some additional logs. It looks like the request is from tenant 11 heartbeating to itself using a stale sqlInstanceID, so maybe the old sqlInstanceID was not properly cleaned up when tenant 11 was restarted:
So I'll look into the tenant 11 shutdown next. However, I think that even after an unclean shutdown of a tenant pod, we wouldn't want this kind of failure mode.
I found the problem: the sqlinstance cache was not being updated with the correct timestamps. The fix is in #87111.
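To illustrate why per-entry timestamps matter for a cache like this, here is a minimal sketch. It is not the code from #87111. The types `entry` and `instanceCache` and their methods are hypothetical, and the sketch assumes a cache that accepts an update only when its timestamp is newer than the cached one.

```go
package main

import "fmt"

// entry is a hypothetical cached row for one SQL instance.
type entry struct {
	addr    string
	ts      int64 // timestamp of the update that produced this entry
	deleted bool  // tombstone: the instance was released
}

// instanceCache is a hypothetical in-memory view of the instances table.
type instanceCache struct {
	rows map[int]entry
}

func newInstanceCache() *instanceCache {
	return &instanceCache{rows: make(map[int]entry)}
}

// apply records an update only if it is newer than what is cached.
// With wrong timestamps on updates, a stale row could overwrite or
// outlive a newer one, and readers would see an instance that is gone.
func (c *instanceCache) apply(id int, addr string, ts int64, deleted bool) {
	if cur, ok := c.rows[id]; ok && cur.ts >= ts {
		return // stale or duplicate update; ignore it
	}
	c.rows[id] = entry{addr: addr, ts: ts, deleted: deleted}
}

// lookup returns the address of a live instance, if one is cached.
func (c *instanceCache) lookup(id int) (string, bool) {
	e, ok := c.rows[id]
	if !ok || e.deleted {
		return "", false
	}
	return e.addr, true
}

func main() {
	c := newInstanceCache()
	c.apply(1, "10.0.0.1:26257", 10, false) // instance 1 starts
	c.apply(1, "", 20, true)                // instance 1 shuts down
	c.apply(1, "10.0.0.1:26257", 10, false) // late replay of the old row
	_, ok := c.lookup(1)
	fmt.Println("instance 1 live:", ok)
}
```

In this sketch the late replay is dropped because its timestamp is older than the tombstone, so a lookup never returns the stale instance.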
roachtest.multitenant-upgrade failed with artifacts on master @ b316a5ed5fe7253d113174d9d95ddebf1143b4e4:
Parameters: |
roachtest.multitenant-upgrade failed with artifacts on master @ ce66acdbba801de88f1dd645eaedeb3834f23dbd:
Jira issue: CRDB-15107