roachtest: cdc/bank failed #139109

Closed
cockroach-teamcity opened this issue Jan 15, 2025 · 8 comments · Fixed by #139354
Labels
A-cdc Change Data Capture B-runtime-assertions-enabled branch-master Failures and bugs on the master branch. branch-release-25.1 C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot. P-3 Issues/test failures with no fix SLA T-cdc

Comments

@cockroach-teamcity
Member

cockroach-teamcity commented Jan 15, 2025

Note: This build has runtime assertions enabled. If the same failure was hit in a run without assertions enabled, there should be a similar failure without this message. If there isn't one, then this failure is likely due to an assertion violation or (assertion) timeout.

roachtest.cdc/bank failed with artifacts on master @ 0b4d620740733ec61cf50ca26d19814299d91f8e:

topic bank does not match expected table bank.bank (repeated 48 times)
(cluster.go:2478).Run: context canceled
test artifacts and logs in: /artifacts/cdc/bank/run_1

Parameters:

  • arch=amd64
  • cloud=azure
  • coverageBuild=false
  • cpu=4
  • encrypted=false
  • fs=ext4
  • localSSD=true
  • metamorphicLeases=default
  • runtimeAssertionsBuild=true
  • ssd=0
Help

See: roachtest README

See: How To Investigate (internal)

Grafana is not yet available for azure clusters

/cc @cockroachdb/cdc

This test on roachdash | Improve this report!

Jira issue: CRDB-46498

@cockroach-teamcity cockroach-teamcity added B-runtime-assertions-enabled branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. T-cdc labels Jan 15, 2025
@blathers-crl blathers-crl bot added the A-cdc Change Data Capture label Jan 15, 2025
@wenyihu6
Contributor

wenyihu6 commented Jan 15, 2025

This seems to be caused by #137947. Do you mind taking a look at it @aerfrei?

@wenyihu6 wenyihu6 added P-3 Issues/test failures with no fix SLA and removed release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. labels Jan 15, 2025
@aerfrei
Contributor

aerfrei commented Jan 16, 2025

@wenyihu6 This commit in this PR should resolve this: 90c3cb0. Validating the topic within the BeforeAfterValidator (which runs in this roachtest) is failing.
I think we should just work to merge that PR, but I can also put in a separate one with just that commit if we want this merged and resolved a little quicker.

@wenyihu6
Contributor

@aerfrei Do you want to put up a fix for it first since the test is now failing on master?

@cockroach-teamcity
Member Author

roachtest.cdc/bank failed with artifacts on master @ f2696730a8e7fdf45ec4432e57893fce43754cc2:

topic bank does not match expected table bank.bank (repeated 48 times)
(cluster.go:2481).Run: context canceled
test artifacts and logs in: /artifacts/cdc/bank/run_1

Parameters:

  • arch=amd64
  • cloud=azure
  • coverageBuild=false
  • cpu=4
  • encrypted=false
  • fs=ext4
  • localSSD=true
  • metamorphicLeases=epoch
  • runtimeAssertionsBuild=false
  • ssd=0
Help

See: roachtest README

See: How To Investigate (internal)

Grafana is not yet available for azure clusters

Same failure on other branches

This test on roachdash | Improve this report!

@cockroach-teamcity
Member Author

roachtest.cdc/bank failed with artifacts on master @ f2696730a8e7fdf45ec4432e57893fce43754cc2:

topic bank does not match expected table bank.bank (repeated 48 times)
(cluster.go:2481).Run: context canceled
test artifacts and logs in: /artifacts/cdc/bank/run_1

Parameters:

  • arch=amd64
  • cloud=gce
  • coverageBuild=false
  • cpu=4
  • encrypted=false
  • fs=ext4
  • localSSD=true
  • metamorphicLeases=default
  • runtimeAssertionsBuild=false
  • ssd=0
Help

See: roachtest README

See: How To Investigate (internal)

See: Grafana

Same failure on other branches

This test on roachdash | Improve this report!

craig bot pushed a commit that referenced this issue Jan 17, 2025
137743: stop: grow stacks for async tasks r=tbg a=tbg

Surprisingly, this seems to have helped with alloc/op, but not with CPU time, at least in the microbenchmark.

```
name                                   old time/op    new time/op    delta
Sysbench/SQL/3node/oltp_read_write-24    11.6ms ± 1%    11.6ms ± 2%    ~     (p=0.201 n=14+15)

name                                   old alloc/op   new alloc/op   delta
Sysbench/SQL/3node/oltp_read_write-24    2.20MB ± 3%    2.16MB ± 1%  -1.88%  (p=0.047 n=15+12)

name                                   old allocs/op  new allocs/op  delta
Sysbench/SQL/3node/oltp_read_write-24     11.0k ± 2%     10.8k ± 0%    ~     (p=0.069 n=15+12)
```

End-to-end testing via the `sysbench-settings` roachtest (10 runs) shows a significant improvement in qps[^1]:

| | max | mean | median | % change max | % change mean | % change median |
| -- | -- | -- | -- | -- | -- | -- |
| old | 23673.17 | 22775.40 | 22684.17 | | | |
| new | 23955.21 | 23154.03 | 23256.25 | 1.19% | 1.66% | 2.52% |

[^1]: https://docs.google.com/spreadsheets/d/1QUBZHllhk5CtDfcsJqc5eOIqUO4jVdhoYXMWj4CRQaQ/edit?gid=0#gid=0

Closes #130663.

Epic: CRDB-43584

139327: roachtest/cdc: skip cdc/bank r=rharding6373 a=wenyihu6

This patch skips the cdc/bank test, as it has been broken since #137947. This
patch disables the test while we work on the fix.

Informs: #139109
Release note: none
Epic: none

Co-authored-by: Tobias Grieger <[email protected]>
Co-authored-by: Wenyi Hu <[email protected]>
craig bot pushed a commit that referenced this issue Jan 17, 2025
138372: roachtest: task error groups r=DarrylWong a=herkolategan

Previously, the wait call on a task group would not return anything, and any
error that occurred in a task would be reported to the test framework and fail
the test.

This change gives test authors more control: they can create an error group
when they need to wait on task errors themselves, rather than have the test
framework handle the errors directly.

A context can also now be specified to override the default context passed by
the test framework for the task manager.

Informs: #118214

Epic: None
Release note: None

139327: roachtest/cdc: skip cdc/bank r=rharding6373 a=wenyihu6

This patch skips the cdc/bank test, as it has been broken since #137947. This
patch disables the test while we work on the fix.

Informs: #139109
Release note: none
Epic: none

Co-authored-by: Herko Lategan <[email protected]>
Co-authored-by: Wenyi Hu <[email protected]>
aerfrei added a commit to aerfrei/cockroach that referenced this issue Jan 17, 2025
Before, we were validating the topics of test feed messages
inside the beforeAfter validator which was being used in the
cdc/bank roachtest. That validation should have been put in
its own validator. This commit moves that validation and
the key_in_value validation into their own validators.

Fixes: cockroachdb#139109

Release note: None
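
As a rough illustration of the refactor described in this commit message, the sketch below separates the topic check into its own validator so that a test can opt in or out of it. The `Validator` interface, `topicValidator`, and `Validators` types are simplified, hypothetical stand-ins and do not reproduce the actual cdctest API. Only the failure message text is taken from the test output in this issue.

```go
package main

import (
	"fmt"
	"strings"
)

// Validator is a simplified, hypothetical validator interface; it is not
// the real cdctest interface.
type Validator interface {
	NoteRow(topic, key, value string)
	Failures() []string
}

// topicValidator checks only that each message's topic matches the
// expected table name. Keeping this check separate means tests that use
// other validators do not inherit it implicitly.
type topicValidator struct {
	table    string
	failures []string
}

func (v *topicValidator) NoteRow(topic, key, value string) {
	if topic != v.table {
		v.failures = append(v.failures,
			fmt.Sprintf("topic %s does not match expected table %s", topic, v.table))
	}
}

func (v *topicValidator) Failures() []string { return v.failures }

// Validators fans each row out to every validator it contains.
type Validators []Validator

func (vs Validators) NoteRow(topic, key, value string) {
	for _, v := range vs {
		v.NoteRow(topic, key, value)
	}
}

func (vs Validators) Failures() []string {
	var out []string
	for _, v := range vs {
		out = append(out, v.Failures()...)
	}
	return out
}

func main() {
	// A test that opts in to topic validation sees the mismatch.
	strict := Validators{&topicValidator{table: "bank.bank"}}
	strict.NoteRow("bank", "[1]", `{"after": {"id": 1}}`)
	fmt.Println(strings.Join(strict.Failures(), "\n"))

	// A test that does not include the topic validator is unaffected.
	relaxed := Validators{}
	relaxed.NoteRow("bank", "[1]", `{"after": {"id": 1}}`)
	fmt.Println(len(relaxed.Failures()))
}
```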
craig bot pushed a commit that referenced this issue Jan 17, 2025
139026: licences: update THIRD-PARTY-NOTICES.txt r=celiala a=rail

Fixes: REL-1744
Release note: None

139192: roachtest: disable 4K block size intent resolution test r=pav-kv a=andrewbaptist

With a 4K block size, the intent resolution test will cause unacceptable slowdowns. This commit changes the perturbation/*/intents tests to only test block sizes up to 1024.

This also reduces the max perturbation duration to 10 minutes.

Informs: #139187
Informs: #139188
Informs: #137590 
Informs: #135934
Fixes: #137590

Epic: none

Release note: None

139200: roachprod: Create/Destroy should avoid listing _all_ providers r=golgeek,RaduBerinde a=srosenberg

Previously, both `Create` and `Destroy` would attempt to list VMs across _all_ active cloud providers. Not only is it inefficient, but `Create` may also fail if
the user isn't re-authenticated to an unrelated
provider. E.g., `create --clouds gce` may fail
if AWS SSO token expired.

This change derives the set of required providers
from either the user-specified `--clouds` CLI option or the local cluster cache. The set is then used with `ListCloud` to skip listing unrelated providers.
The use of the local cache is sound at this time;
cluster's providers are immutable.

Epic: none

Release note: None
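
The provider selection described in this commit message might look roughly like the following sketch. The `requiredProviders` helper and its parameters are hypothetical and are not roachprod's actual code. The precedence (the `--clouds` value wins, with the cache as fallback) is an assumption made for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// requiredProviders returns the sorted, de-duplicated set of cloud
// providers that need to be listed. Hypothetical helper: cliClouds stands
// for the values of a --clouds option, cachedClouds for the providers
// recorded in a local cluster cache. Assumption: CLI values take
// precedence, and the cache is used only when none were given.
func requiredProviders(cliClouds, cachedClouds []string) []string {
	source := cliClouds
	if len(source) == 0 {
		source = cachedClouds
	}
	seen := make(map[string]struct{}, len(source))
	out := make([]string, 0, len(source))
	for _, c := range source {
		if _, ok := seen[c]; ok {
			continue
		}
		seen[c] = struct{}{}
		out = append(out, c)
	}
	sort.Strings(out)
	return out
}

func main() {
	// With an explicit cloud list, unrelated providers are never listed.
	fmt.Println(requiredProviders([]string{"gce"}, []string{"aws", "gce"}))
	// Without one, the cached providers of the cluster are used.
	fmt.Println(requiredProviders(nil, []string{"gce", "aws", "gce"}))
}
```

Restricting the listing to this set means an expired credential for an unrelated provider can no longer cause a create or destroy call to fail.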

139246: ui: show job messages on the job detail page r=dt a=dt

Release note (ui change): Jobs can now choose to emit messages that are shown on the job detail page in 25.1+.

Epic: none.

<img width="1513" alt="Screenshot 2025-01-16 at 14 46 12" src="https://github.com/user-attachments/assets/1718ccdd-e8a1-4ec2-a032-a646f410f918" />


139327: roachtest/cdc: skip cdc/bank r=rharding6373 a=wenyihu6

This patch skips the cdc/bank test, as it has been broken since #137947. This
patch disables the test while we work on the fix.

Informs: #139109
Release note: none
Epic: none

Co-authored-by: Rail Aliiev <[email protected]>
Co-authored-by: Andrew Baptist <[email protected]>
Co-authored-by: Stan Rosenberg <[email protected]>
Co-authored-by: David Taylor <[email protected]>
Co-authored-by: Wenyi Hu <[email protected]>
@celiala celiala added release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. branch-release-25.1 labels Jan 17, 2025
@celiala
Collaborator

celiala commented Jan 17, 2025

FYI - added the release-blocker label for release-25.1, as this blocks release-infra from generating weekly 25.1 test releases.

Slack thread: https://cockroachlabs.slack.com/archives/C9TGBJB44/p1737151366170999?thread_ts=1737124548.317869&cid=C9TGBJB44

@wenyihu6
Contributor

Removing the release blocker as we have skipped this test on 25.1 - #139357.

@wenyihu6 wenyihu6 removed release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. branch-release-25.1 labels Jan 21, 2025
aerfrei added a commit to aerfrei/cockroach that referenced this issue Jan 21, 2025
Before, we were validating the topics of test feed messages
inside the beforeAfter validator which was being used in the
cdc/bank roachtest. That validation should have been put in
its own validator. This commit moves that validation and
the key_in_value validation into their own validators.

Fixes: cockroachdb#139109

Release note: None
aerfrei added a commit to aerfrei/cockroach that referenced this issue Jan 21, 2025
Before, we were validating the topics of test feed messages
inside the beforeAfter validator which was being used in the
cdc/bank roachtest. That validation should have been put in
its own validator. This commit moves that validation and
the key_in_value validation into their own validators.

Fixes: cockroachdb#139109

Release note: None
craig bot pushed a commit that referenced this issue Jan 22, 2025
139354: cdctest: fix cdc/bank roachtest r=wenyihu6 a=aerfrei

Before, we were validating the topics of test feed messages inside the beforeAfter validator which was being used in the cdc/bank roachtest. That validation should have been put in its own validator. This commit moves that validation and the key_in_value validation into their own validators.

Fixes: #139109

Release note: None

139487: crosscluster/logical: permanent job errors should fail LDR job r=msbutler a=msbutler

Previously, permanent job errors would pause the LDR job, like PCR. Since LDR
doesn't have a cutover step, we should instead fail the job to provide a
clearer UX to the user.

Epic: none

Release note: none

Co-authored-by: Aerin Freilich <[email protected]>
Co-authored-by: Michael Butler <[email protected]>
@craig craig bot closed this as completed in e6b1d96 Jan 22, 2025

blathers-crl bot commented Jan 22, 2025

Based on the specified backports for linked PR #139354, I applied the following new label(s) to this issue: branch-release-25.1. Please adjust the labels as needed to match the branches actually affected by this issue, including adding any known older branches.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

blathers-crl bot pushed a commit that referenced this issue Jan 22, 2025
Before, we were validating the topics of test feed messages
inside the beforeAfter validator which was being used in the
cdc/bank roachtest. That validation should have been put in
its own validator. This commit moves that validation and
the key_in_value validation into their own validators.

Fixes: #139109

Release note: None