unknown: (unknown) failed #65168

Closed
cockroach-teamcity opened this issue May 13, 2021 · 2 comments · Fixed by #65867

@cockroach-teamcity

unknown.(unknown) failed with artifacts on master @ d551b50c415d7021de53a0bb030eb25f8124fafa:

Slow failing tests:
TestChangefeedInitialScan/enterprise/cursor_-_with_initial_scan - 439.79s

Slow passing tests:
TestTenantLogic - 1238.18s
TestLogic - 904.68s
TestRegionChangeRacingRegionalByRowChange - 146.28s
TestAlterTableLocalityRegionalByRowError - 144.94s
TestCCLLogic - 116.50s
TestRestoreMidSchemaChange - 102.09s
TestTypeChangeJobCancelSemantics - 92.86s
TestExecBuild - 64.80s
TestRemoveDeadReplicas - 62.03s
TestBTreeDeleteInsertCloneEachTime - 61.12s
TestBTreeDeleteInsertCloneEachTime - 57.94s
TestImportIntoCSV - 56.25s
TestConcurrentAddDropRegions - 54.01s
TestIndexCleanupAfterAlterFromRegionalByRow - 53.57s
TestRingBuffer - 52.80s
TestImportData - 52.76s
Example_demo - 51.81s
TestTelemetry - 50.24s
TestImportCSVStmt - 47.05s
TestExternalSortRandomized - 46.81s
Reproduce

To reproduce, try:

make stressrace TESTS=(unknown) PKG=./pkg/unknown TESTTIMEOUT=5m STRESSFLAGS='-timeout 5m' 2>&1

Parameters in this failure:

  • GOFLAGS=-json


@cockroach-teamcity cockroach-teamcity added branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot. labels May 13, 2021
@cockroach-teamcity

unknown.(unknown) failed with artifacts on master @ 4e3437fb275f1eb7645ca04d942c82a596c3826e:

Slow failing tests:
TestChangefeedInitialScan/enterprise/cursor_-_with_initial_scan - 411.81s

Slow passing tests:
TestTenantLogic - 1261.96s
TestLogic - 902.90s
TestAlterTableLocalityRegionalByRowError - 143.21s
TestRegionChangeRacingRegionalByRowChange - 132.84s
TestAllRegisteredImportFixture - 131.04s
TestCCLLogic - 120.59s
TestTypeChangeJobCancelSemantics - 93.41s
TestLearnerSnapshotFailsRollback - 92.52s
TestRestoreMidSchemaChange - 91.70s
TestBTreeDeleteInsertCloneEachTime - 66.89s
TestExecBuild - 64.90s
TestRemoveDeadReplicas - 63.87s
TestImportIntoCSV - 61.54s
TestImportData - 58.44s
TestBTreeDeleteInsertCloneEachTime - 57.48s
TestConcurrentAddDropRegions - 55.05s
Example_demo - 54.75s
TestImportCSVStmt - 51.75s
TestBackupRestoreDataDriven - 51.46s
TestTelemetry - 48.15s
Reproduce

To reproduce, try:

make stressrace TESTS=(unknown) PKG=./pkg/unknown TESTTIMEOUT=5m STRESSFLAGS='-timeout 5m' 2>&1

Parameters in this failure:

  • GOFLAGS=-json


@miretskiy miretskiy self-assigned this May 27, 2021
miretskiy pushed a commit to miretskiy/cockroach that referenced this issue May 28, 2021
Fix flaky test and re-enable it to run under stress.
The problem was that the transaction executed by the table feed can
be restarted. If that happens, we see the same keys again, but because
the transaction had side effects (marking the keys as seen), we would
not emit those keys, causing the test to hang.
The stress race run was failing because of both transaction restarts and
the 10ms resolved timestamp frequency (with so many resolved timestamps
being generated, the table feed transaction was always getting
restarted).

In addition, modify the job feed's close method to ask the resumer
to exit directly instead of relying on job cancellation.

Fixes cockroachdb#57754
Fixes cockroachdb#65168

Release Notes: None
@cockroach-teamcity

unknown.(unknown) failed with artifacts on master @ 55877f386a04ed5b6c7722259ff0b3ac3ad545f6:

Slow failing tests:
TestChangefeedInitialScan/enterprise/no_cursor_-_no_initial_scan - 363.50s

Slow passing tests:
TestTenantLogic - 1307.36s
TestLogic - 945.78s
TestRegionChangeRacingRegionalByRowChange - 150.42s
TestCCLLogic - 125.84s
TestRestoreMidSchemaChange - 107.48s
TestAlterTableLocalityRegionalByRowError - 104.82s
TestTypeChangeJobCancelSemantics - 94.44s
TestLearnerSnapshotFailsRollback - 92.67s
TestExecBuild - 69.44s
TestIndexCleanupAfterAlterFromRegionalByRow - 68.74s
TestRemoveDeadReplicas - 62.93s
TestTelemetry - 61.07s
TestImportData - 59.97s
TestConcurrentAddDropRegions - 57.88s
TestBTreeDeleteInsertCloneEachTime - 57.46s
TestBTree - 51.27s
TestBackupRestoreDataDriven - 50.25s
Example_demo - 50.09s
TestRegionAddDropWithConcurrentBackupOps - 48.60s
TestMergeQueueSeesNonVoters - 46.90s
Reproduce

To reproduce, try:

make stressrace TESTS=(unknown) PKG=./pkg/unknown TESTTIMEOUT=5m STRESSFLAGS='-timeout 5m' 2>&1

Parameters in this failure:

  • GOFLAGS=-json


craig bot pushed a commit that referenced this issue Jun 1, 2021
65867: changefeedccl: Fix flaky tests. r=miretskiy a=miretskiy

Fix flaky test and re-enable it to run under stress.
The problem was that the transaction executed by the table feed can
be restarted. If that happens, we see the same keys again, but because
the transaction had side effects (marking the keys as seen), we would
not emit those keys, causing the test to hang.
The stress race run was failing because of both transaction restarts and
the 10ms resolved timestamp frequency (with so many resolved timestamps
being generated, the table feed transaction was always getting
restarted).

Fixes #57754
Fixes #65168

Release Notes: None

65868: storage: expose pebble.IteratorStats through {MVCC,Engine}Iterator r=sumeerbhola a=sumeerbhola

These will potentially be aggregated before exposing in trace
statements, EXPLAIN ANALYZE etc.

Release note: None

65900: roachtest: fix ruby-pg test suite r=rafiss a=RichardJCai

Update blocklist with passing test.
A test that no longer fails shows up as not run, and that not-run
entry was causing the suite failure.

Release note: None

65910: sql/gcjob: retry failed GC jobs r=ajwerner a=sajjadrizvi

In the previous implementation, failed GC jobs were not retried, regardless of
whether the failure was permanent or transient. As a result, a GC job's failure
risked leaving orphaned data that could not be reclaimed.

This commit adds a mechanism to retry GC jobs whose failures are not permanent. No
limit is set on the number of retries. For the time being, the failure type is
determined based on the failure categorization of schema-change jobs. This
behavior is expected to change once an exponential backoff mechanism is
implemented for failed jobs (#44594).

Release note: None

Fixes: #65000

Release note (<category, see below>): <what> <show> <why>

65925: ccl/importccl: skip TestImportPgDumpSchemas/inject-error-ensure-cleanup r=tbg a=adityamaru

Refs: #65878

Reason: flaky test

Generated by bin/skip-test.

Release justification: non-production code changes

Release note: None

65933: kv/kvserver: skip TestReplicateQueueDeadNonVoters under race r=sumeerbhola a=sumeerbhola

Refs: #65932

Reason: flaky test

Generated by bin/skip-test.

Release justification: non-production code changes

Release note: None

65934: kv/kvserver: skip TestReplicateQueueSwapVotersWithNonVoters under race r=sumeerbhola a=sumeerbhola

Refs: #65932

Reason: flaky test

Generated by bin/skip-test.

Release justification: non-production code changes

Release note: None

65936: jobs: fix flakey TestMetrics r=fqazi a=ajwerner

Fixes #65735

The test needed to wait for the job to be fully marked as paused.

Release note: None

Co-authored-by: Yevgeniy Miretskiy
Co-authored-by: sumeerbhola
Co-authored-by: richardjcai
Co-authored-by: Sajjad Rizvi
Co-authored-by: Aditya Maru
Co-authored-by: Andrew Werner
@craig craig bot closed this as completed in 10a60f6 Jun 1, 2021