pkg/upgrade/upgrademanager/upgrademanager_test: TestMigrationFailure failed #106648
Comments
seems to be the same as #105767 |
This test was recently added in 4f2eca6, so ccing @jeffswenson for any ideas. |
pkg/upgrade/upgrademanager/upgrademanager_test.TestMigrationFailure failed with artifacts on master @ 6a6ebcb950c050652ad8cded716dcd242e3a729c:
|
pkg/upgrade/upgrademanager/upgrademanager_test.TestMigrationFailure failed with artifacts on master @ c21cfc2823728fde7fa6aecc9506a1afa1f83191:
|
pkg/upgrade/upgrademanager/upgrademanager_test.TestMigrationFailure failed with artifacts on master @ 2a70922c5ce3d30373adfc52a46f869f9685497c:
|
pkg/upgrade/upgrademanager/upgrademanager_test.TestMigrationFailure failed with artifacts on master @ 7339e0f900eba75db210874612d6e5aa561c854a:
|
I was looking into one of the failures with this error because it looked to me like some of the test logic might be wrong:
But digging into the server logs, it has the same "job_type" error.
So I'm beginning to suspect the test is actually correct and there is a bug in the test cluster infrastructure or a version gate. |
pkg/upgrade/upgrademanager/upgrademanager_test.TestMigrationFailure failed with artifacts on master @ 269b9e31c4befe53181340503663ae6e3e8ed07e:
|
pkg/upgrade/upgrademanager/upgrademanager_test.TestMigrationFailure failed with artifacts on master @ 2627dc50694837984d7bf7d6c6f25e294ae36039:
|
Also hit in #106574 |
I'll skip this for now. I believe I understand the root cause and I think the test is correct. The issue is version flags and |
Skip TestMigrationFailure until the source of flakes can be fixed. Release note: None Part of: cockroachdb#106648
The trigger for the flakes is #105832. It adds an upgrade precondition that uses an
The actual bug is in the
My first thought for how to fix this is to have the
This is similar to an issue that I identified when working on the RBR system table migrations. I wrote a utility for reading the cluster version in a transaction. But I'm realizing now that the optimization in version guard to skip reading the version setting after the migration completes depends on query serializability and is broken if used within the context of an |
106623: ui: fix timescale for rolling window r=maryliag a=maryliag

The change introduced in #105157 had an undesired effect on the Metrics page. We want to show the end period of the selection, but setting the fixed window end for that caused the Metrics page to stop loading new data. We want the Metrics page to keep updating when any of the rolling windows is selected, and we also want to know the time a time period was requested, so it can be used to provide more information on SQL Activity pages.

This commit removes the setting of the fixedWindowEnd, since that was not the best approach, and instead creates a value in local storage for the requestTime, which can be used to create the label for the timescale without interfering with the Metrics page (or any other page that updates automatically).

Epic: none
Release note (bug fix): Fix the Metrics page that was not updating automatically on rolling window options.
Release note (bug fix): Statement Diagnostics not always showing is now fixed and they show for the correct time period selected.

106757: upgrademanager: skip TestMigrationFailure r=JeffSwenson a=JeffSwenson

Skip TestMigrationFailure until the source of flakes can be fixed.

Release note: None
Part of: #106648

Co-authored-by: maryliag <[email protected]>
Co-authored-by: Jeff <[email protected]>
This test reliably reproduces the source of flakes in cockroachdb#106648. Release note: None Part of: cockroachdb#106648
This also removes `TODOTestTenantDisabled`. (I have verified the test works, albeit still flaky due to cockroachdb#106648, by temporarily removing the skip.) Release note: None
106925: upgrademanager: simplify TestMigrationFailure r=yuzefovich a=knz

Informs #76378
Epic: CRDB-18499

This also removes `TODOTestTenantDisabled`. (I have verified the test works, albeit still flaky due to #106648, by temporarily removing the skip.)

Release note: None

107319: opt: add session setting for join elimination optimization r=DrewKimball a=DrewKimball

We recently added support for column remapping in the join elimination rules that allows columns from the eliminated input of the join to be mapped to equivalent columns from the preserved input. This allows joins to be eliminated in more cases - in particular, the self-join patterns that can arise from an `UPDATE ... FROM` statement where the table in the `FROM` clause is the same as the table being updated.

This patch adds a setting, `optimizer_use_improved_join_elimination`, which gates the column-remapping logic for the join-elimination rules. The plan is to backport the column-remapping changes to 23.1 behind this setting turned off by default.

Informs #102614
Release note: None

107392: go.mod: bump Pebble to 94f91669 r=RaduBerinde a=RaduBerinde

94f91669 db: rename Experimental.SharedStorage to RemoteStorage in Options
c62c9127 objstorage: rename objstorage/shared package to remote
692f3b61 vfs: move errorfs package to vfs dir to allow use in CockroachDB
5e550364 cmd/pebble: allow passing an explicit checkpoint directory path
9c3f337a cmd/replay: always log checkpoint initialization
0cd8f1b8 replay: calculate separate write stall metrics for each reason
f9d08867 cmd/pebble: use non-default block cache size in replay tool
d0e583f8 *: improve humanize format
5e76dfab db: minor update of comment for TargetByteDeletionRate
456a2a2a objstorage: rename Shared to Remote in the objstorage provider API

Release note (general change): Formatting of byte figures in Pebble logs has been improved; any tools that parse these logs might need updating.
Epic: none

Co-authored-by: Raphael 'kena' Poss <[email protected]>
Co-authored-by: Drew Kimball <[email protected]>
Co-authored-by: Radu Berinde <[email protected]>
@adityamaru can you unskip this test once the fix to #106762 merges? |
#107570 seems to have fixed the same spot this test was failing at. 20 minutes of stressing and I don't see a failure yet. |
cc @cockroachdb/disaster-recovery |
108029: upgrademanager: unskip TestMigrationFailure r=msbutler a=adityamaru

The test no longer fails with our change in #107570.

Fixes: #106648
Fixes: #106762
Release note: None

108192: ccl/sqlproxyccl: deflake TestConnectionMigration r=darinpp a=jaylim-crl

Fixes #106885.

This test flake seems extremely rare, and it's unclear why it occurred in the first place. The past 1000 runs (all of what TC has) have been successful. Regardless, this commit attempts to deflake TestConnectionMigration. Given that some subtests transfer connections through `transferConnWithRetries`, it is possible that the transfer was retried, causing the metric to be incremented.

Release note: None
Epic: none

Co-authored-by: adityamaru <[email protected]>
Co-authored-by: Jay <[email protected]>
pkg/upgrade/upgrademanager/upgrademanager_test.TestMigrationFailure failed with artifacts on master @ 87cd65352a9903c292d6b6a2e9856cc1f57bb267:
Help
See also: How To Investigate a Go Test Failure (internal)
This test on roachdash | Improve this report!
Jira issue: CRDB-29654