roachtest: acceptance/version-upgrade failed #44732
(roachtest).acceptance/version-upgrade failed on master@5c37cf2d12bafd4a01d4ac7f1725a0ae6e498c9c:
Artifacts: /acceptance/version-upgrade See this test on roachdash |
(roachtest).acceptance/version-upgrade failed on master@1674c8aa4d955738365087d33ff707525e8a2b94:
Artifacts: /acceptance/version-upgrade See this test on roachdash |
(roachtest).acceptance/version-upgrade failed on master@003fa437fcce1038085bcc314aa3be61084bbbbe:
Artifacts: /acceptance/version-upgrade See this test on roachdash |
^- v1.0.6 crashed with a consistency failure. We should change this test to not run versions that have reached end-of-life. |
(roachtest).acceptance/version-upgrade failed on master@69dc87d68addedf2fabfb2b14c098cfb35b5f3d0:
Artifacts: /acceptance/version-upgrade See this test on roachdash |
(roachtest).acceptance/version-upgrade failed on master@49bf810c1ad531ddc75c6a9ce60e81b7dd810726:
Artifacts: /acceptance/version-upgrade See this test on roachdash |
(roachtest).acceptance/version-upgrade failed on master@92495ffb2182671b3338a8c6cd2734cb6de5d392:
Artifacts: /acceptance/version-upgrade See this test on roachdash |
Cc @lucy-zhang |
I'm not sure what the best way is to fix this. (For context, we're banning almost all schema changes (e.g., |
The cluster has fully migrated to 19.2 when the error occurs. I think the check is probably just using |
So the cluster version is 19.2, but we're running a 20.1 binary, right? I think the issue is that the schema change can't proceed because the version gate present in the 20.1 binary is for 20.1. |
I think I understand what the problem is. The test runs these steps cockroach/pkg/cmd/roachtest/upgrade.go Lines 466 to 477 in 8e4d7b2
and between the steps it "tests the features": cockroach/pkg/cmd/roachtest/upgrade.go Lines 514 to 544 in 8e4d7b2
When we upgrade the binaries, we rely on auto-upgrades, that is, the nodes automatically bump to the latest version. If this hasn't happened yet by the time we run DDL after starting the cluster into 20.1, we see the error here. This test hasn't aged very well. I think originally it was intentionally targeting these steps at the mixed clusters, but thanks to the auto-upgrade, it mostly hits non-mixed clusters (though if it hits one now, it will fail with the issue we're discussing). I have to touch this test anyway. Will take a look at making this work better. |
Hmm, I looked at the test now (and mostly rewrote it), but I am back to square one. We check that the auto-upgrade completed and is visible on all nodes. This means we should never run into the version gate. |
This test used to start a v1.0 cluster and migrate it all the way to the master branch. As a result, the test would take longer to run with each release, but additionally it would sometimes flake in ancient releases that we don't support any more. Instead of just ditching the old releases altogether, we now start the test up from store directories that were initiated at v1.0 and were upgraded all the way through the lowest version that we care to run in this test. This is different from bootstrapping a new cluster at that version since older versions may have left old data around, or never gotten rid of idiosyncrasies that we have since fixed. Touches cockroachdb#44732. Release note: None
Heh, no. My previous comment was wrong. We definitely do this thing and it will continue to flake. Once #46924 is merged, I know how to fix it. |
We weren't checking whether n4 had actually upgraded. Fixes the failure cockroachdb#44732 (comment) (but there are others, so not closing). Release note: None
^- this last failure fixed in https://github.com/cockroachdb/cockroach/pull/47201/files (problem in the test) |
47164: kvserver,roachtest/hotspotsplits: adjust constants to ensure backpressure r=nvanbenschoten a=ajwerner See individual commits. Fixes #46957. 47201: roachtest: fix one-off in version-upgrade r=nvanbenschoten,andreimatei a=tbg We weren't checking whether n4 had actually upgraded. Fixes the failure #44732 (comment) (but there are others, so not closing). Release note: None Co-authored-by: Andrew Werner <[email protected]> Co-authored-by: Tobias Schottdorf <[email protected]>
Turns out there was more than a one-off, so keeping this open also for #47235 (comment). |
The recent changes to this test seem to have uncovered a latent [bug], as a result of which the test was skipped. This commit lets the test start from v19.2 fixtures (instead of v19.1 before) which experimentally were verified to not exhibit the bug, so the test is unskipped. While I was there, I automated the base version discovery of this test. On a branch at version X, the test will now use the fixtures for the last major release preceding X. We'll occasionally have to bake new fixtures and update the mapping in PredecessorVersion() but this is better than having the test rot on older branches whenever we update it for a new dev cycle. The root cause of the bug will be fixed separately; this is tracked in issue cockroachdb#44732. [bug]: cockroachdb#47235 (comment) Release note: None
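The base version discovery described above amounts to a lookup from the branch's major version to the preceding major release. A minimal sketch follows, assuming a plain string-keyed table; the lowercase `predecessorVersion` and the map entries are illustrative and are not the actual contents or signature of `PredecessorVersion()`.

```go
package main

import "fmt"

// predecessors maps a major release to the major release before it.
// The entries are illustrative only; the real mapping has to be extended
// whenever new fixtures are baked for a new dev cycle.
var predecessors = map[string]string{
	"20.1": "19.2",
	"19.2": "19.1",
}

// predecessorVersion returns the release whose fixtures a test running on
// the given major version should start from. It fails loudly when the
// mapping has not been updated, rather than silently using stale fixtures.
func predecessorVersion(major string) (string, error) {
	pred, ok := predecessors[major]
	if !ok {
		return "", fmt.Errorf("no predecessor fixtures registered for %s", major)
	}
	return pred, nil
}
```

Returning an error for an unknown version is one possible design choice: it turns a forgotten mapping update into an immediate, readable test failure.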
47232: opt: corrects function doc for canMaybeConstrainIndex r=mgartner a=mgartner #### opt: corrects function doc for canMaybeConstrainIndex This commit corrects the function documentation for canMaybeConstrainIndex. Prior to this change, it did not mention a third check, for tight filter constraints, that is performed to determine the possibility that an index could be constrained by a filter. Also, the documentation now correctly maps to the logic of the function. Previously it falsely claimed that if any of the checks were false then the function would return false. Now it correctly states that if any of the checks are true, then the function returns true. Release note: None
47268: roachtest: improve and de-flake version-upgrade r=spaskob a=tbg roachtest: automate and de-flake version-upgrade The recent changes to this test seem to have uncovered a latent [bug], as a result of which the test was skipped. This commit lets the test start from v19.2 fixtures (instead of v19.1 before) which experimentally were verified to not exhibit the bug, so the test is unskipped. While I was there, I automated the base version discovery of this test. On a branch at version X, the test will now use the fixtures for the last major release preceding X. We'll occasionally have to bake new fixtures and update the mapping in PredecessorVersion() but this is better than having the test rot on older branches whenever we update it for a new dev cycle. The root cause of the bug will be fixed separately; this is tracked in issue #44732. [bug]: #47235 (comment) Release note: None Co-authored-by: Marcus Gartner <[email protected]> Co-authored-by: Tobias Schottdorf <[email protected]>
I thought the [bug] would not repro after I bumped the base fixture, but it still does (though maybe one in fifteen only) and I'm running out of work day, so skip the test again. Tomorrow's a latter day. [bug]: cockroachdb#44732 (comment) Release note: None
(roachtest).acceptance/version-upgrade failed on master@a63a670b1a591be2ed38af7f98c1f85e32546596:
Artifacts: /acceptance/version-upgrade
See this test on roachdash |
(roachtest).acceptance/version-upgrade failed on master@b5b4a9a55f1122ef9c82b968aa5c8cc137c7e281:
Artifacts: /acceptance/version-upgrade
See this test on roachdash |
(roachtest).acceptance/version-upgrade failed on master@ce5d511368b741555344ea28efa40fb9facfa577:
Artifacts: /acceptance/version-upgrade
See this test on roachdash |
(roachtest).acceptance/version-upgrade failed on master@666a0ac62832b6884bc6b039b4c944dbb42924aa:
Artifacts: /acceptance/version-upgrade
See this test on roachdash |
(roachtest).acceptance/version-upgrade failed on master@5ad50d887ad95e6b17b0d20fa98bc30319a67c0d:
Artifacts: /acceptance/version-upgrade
See this test on roachdash |
47690: roachtest: deflake acceptance/version-upgrade r=spaskob a=tbg Remove a workaround that wasn't necessary any more (since I regenerated the fixtures last week) but which caused flakes of its own because it was re-uploading the binary for predecessorVersion while processes were running using that binary (resulting in occasional 'text file busy' on linux). Closes #44732. Touches #47024. Release note: None Co-authored-by: Tobias Schottdorf <[email protected]>
(roachtest).acceptance/version-upgrade failed on master@bc31cf16d80dd1820488479666261586a22170e6:
Artifacts: /acceptance/version-upgrade
See this test on roachdash
powered by pkg/cmd/internal/issues