roachtest: disk-full failed #111902

Closed
cockroach-teamcity opened this issue Oct 6, 2023 · 4 comments · Fixed by #111915
Labels
branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot. T-testeng TestEng Team
Milestone

Comments

@cockroach-teamcity
Member

cockroach-teamcity commented Oct 6, 2023

roachtest.disk-full failed with artifacts on master @ d926f1e51e23d477bcf3027adc6beac11513e58a:

(disk_full.go:124).2: COMMAND_PROBLEM: exit status 1
(cluster.go:2171).Run: context canceled
(monitor.go:153).Wait: monitor failure: monitor user task failed: t.Fatal() was called
test artifacts and logs in: /artifacts/disk-full/run_1

Parameters: ROACHTEST_arch=amd64 , ROACHTEST_cloud=gce , ROACHTEST_cpu=4 , ROACHTEST_encrypted=false , ROACHTEST_ssd=0

Help

See: roachtest README

See: How To Investigate (internal)

See: Grafana

/cc @cockroachdb/storage

This test on roachdash | Improve this report!

Jira issue: CRDB-32125

@cockroach-teamcity cockroach-teamcity added branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. T-storage Storage Team labels Oct 6, 2023
@cockroach-teamcity cockroach-teamcity added this to the 23.2 milestone Oct 6, 2023
@jbowens
Collaborator

jbowens commented Oct 6, 2023

COMMAND_PROBLEM: exit status 1
(1) attached stack trace
  -- stack trace:
  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerDiskFull.func1.2
  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/disk_full.go:124
  | main.(*monitorImpl).Go.func1
  | 	main/pkg/cmd/roachtest/monitor.go:119
  | golang.org/x/sync/errgroup.(*Group).Go.func1
  | 	golang.org/x/sync/errgroup/external/org_golang_x_sync/errgroup/errgroup.go:75
  | runtime.goexit
  | 	src/runtime/asm_amd64.s:1598
Wraps: (2) secondary error attachment
  | COMMAND_PROBLEM: exit status 1
  | (1) Node 1. Command with error:
  |   | ```
  |   | systemctl status cockroach.service | grep 'Main PID' | grep -oE '\((.+)\)'
  |   | ```
  |   | stdout: <empty>
  |   | stderr:Unit cockroach.service could not be found.
  | Wraps: (2) COMMAND_PROBLEM
  | Wraps: (3) exit status 1
  | Error types: (1) *hintdetail.withDetail (2) errors.Cmd (3) *exec.ExitError
Wraps: (3) COMMAND_PROBLEM: exit status 1
Error types: (1) *withstack.withStack (2) *secondary.withSecondaryError (3) *errutil.leafError

@jbowens jbowens removed the release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. label Oct 6, 2023
@jbowens
Collaborator

jbowens commented Oct 6, 2023

@cockroachdb/test-eng are you aware of any changes that might result in the systemd cockroach.service unit not being found?

@renatolabs
Contributor

Erm yes, that would be #111064 -- with virtualization, we might not have just one cockroach process in the node, so the unit names changed. It's not great that tests name these services directly as it's arguably "internal" API. Let me think about how to fix this.

@renatolabs renatolabs added the T-testeng TestEng Team label Oct 6, 2023
@blathers-crl

blathers-crl bot commented Oct 6, 2023

cc @cockroachdb/test-eng

@renatolabs renatolabs removed the T-storage Storage Team label Oct 6, 2023
craig bot pushed a commit that referenced this issue Oct 6, 2023
111915: roachtest: remove direct references to cockroach.service from tests r=herkolategan a=renatolabs

After #111064, we started naming the systemd unit for cockroach processes differently, since we now might have multiple cockroach processes in the same VM in the context of SQL server processes and cluster virtualization.

That change broke a few tests that directly referenced the `cockroach.service` unit. In this change, we create a util function that is responsible for encapsulating that name and replace direct references to the systemd unit name with calls to this new function.

Fixes: #111902

Release note: None
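The approach described in 111915 can be sketched roughly as follows. This is a hedged illustration only: the helper names `systemdUnitName` and `mainPIDCommand` and the `cockroach-<label>` naming scheme are assumptions made for the example, since the thread does not state the actual function name or the new unit names.

```go
package main

import "fmt"

// systemdUnitName is a hypothetical helper that encapsulates how the systemd
// unit for a cockroach process is named. The "cockroach-<label>" scheme is an
// assumption for illustration; the real scheme is not given in this thread.
func systemdUnitName(label string) string {
	return fmt.Sprintf("cockroach-%s", label)
}

// mainPIDCommand builds the shell pipeline from the failing test, taking the
// unit name as a parameter instead of hard-coding `cockroach.service`.
func mainPIDCommand(unit string) string {
	return fmt.Sprintf(
		`systemctl status %s | grep 'Main PID' | grep -oE '\((.+)\)'`, unit)
}

func main() {
	// "system" is an assumed label for the main cockroach process.
	fmt.Println(mainPIDCommand(systemdUnitName("system")))
}
```

With the name behind a single function, a later change to the unit naming only needs to touch that helper rather than every test that shells out to `systemctl`.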

111937: kv: add conflicting txn_meta to error message r=nvanbenschoten a=aadityasondhi

This patch adds back the conflicting txn_meta to the error message printed out from a `TransactionRetryError` if available.

Fixes #110689.

Release note: None

Co-authored-by: Renato Costa <[email protected]>
Co-authored-by: Aaditya Sondhi <[email protected]>
@craig craig bot closed this as completed in a02352a Oct 6, 2023
renatolabs added a commit to renatolabs/cockroach that referenced this issue Oct 20, 2023
After cockroachdb#111064, we started naming the systemd unit for cockroach
processes differently, since we now might have multiple cockroach
processes in the same VM in the context of SQL server processes and
cluster virtualization.

That change broke a few tests that directly referenced the
`cockroach.service` unit. In this change, we create a util function
that is responsible for encapsulating that name and replace direct
references to the systemd unit name with calls to this new function.

Fixes: cockroachdb#111902

Release note: None
@jbowens jbowens moved this to Done in [Deprecated] Storage Jun 4, 2024