roachtest: multitenant/distsql failed with bad tls certificate #117150

Closed
cockroach-teamcity opened this issue Dec 29, 2023 · 9 comments · Fixed by #117505
Assignees
Labels
branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot. skipped-test T-testeng TestEng Team
Milestone

Comments

@cockroach-teamcity
Member

cockroach-teamcity commented Dec 29, 2023

roachtest.multitenant/distsql/instances=20/bundle=off/timeout=0 failed with artifacts on master @ c316d6a615fa02a05357a20bf03b8a197ab27810:

(cluster.go:2036).StartServiceForVirtualCluster: COMMAND_PROBLEM: exit status 1
test artifacts and logs in: /artifacts/multitenant/distsql/instances=20/bundle=off/timeout=0/run_1

Parameters:

  • ROACHTEST_arch=amd64
  • ROACHTEST_cloud=gce
  • ROACHTEST_coverageBuild=false
  • ROACHTEST_cpu=4
  • ROACHTEST_encrypted=false
  • ROACHTEST_metamorphicBuild=false
  • ROACHTEST_ssd=0
Help

See: roachtest README

See: How To Investigate (internal)

See: Grafana

/cc @cockroachdb/sql-queries

This test on roachdash | Improve this report!

Jira issue: CRDB-35006

@cockroach-teamcity cockroach-teamcity added branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. T-sql-queries SQL Queries Team labels Dec 29, 2023
@cockroach-teamcity cockroach-teamcity added this to the 24.1 milestone Dec 29, 2023
@github-project-automation github-project-automation bot moved this to Triage in SQL Queries Dec 29, 2023
@DrewKimball
Collaborator

remote error: tls: bad certificate

@rharding6373
Collaborator

The last time we encountered an error like this was #96649, which we closed because it seemed to be a transient failure and we didn't see it again. I'm concerned that we see multiple roachtests in this suite failing every couple days for the last week. If the logs like ‹http: TLS handshake error from 10.0.0.3:33376: remote error: tls: bad certificate› are indeed a symptom, this may be out of our depth. Could test eng help take a look?

I'm removing the release-blocker label for now, since I've seen no other evidence of a failure in the logs so far.

@blathers-crl blathers-crl bot added the T-testeng TestEng Team label Jan 6, 2024

blathers-crl bot commented Jan 6, 2024

cc @cockroachdb/test-eng

@stevendanna
Collaborator

I'm unsure what is causing a failure here, but I am fairly sure:

http: TLS handshake error from 10.0.0.3:33376: remote error: tls: bad certificate

Is a red herring. This error comes from the non-HTTPS poller on our prometheus instance which scrapes all roachprod clusters. Insecure clusters never see this error, but secure clusters do because the HTTP scraper doesn't know to exclude secure clusters.
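
If the handshake message really is scraper noise, one way to look for the actual failure is to filter it out of the collected logs and read what is left. The snippet below is only a rough sketch of that idea: the regular expression is an assumption derived from the single sample line quoted above, and `split_log_lines` is a hypothetical helper, not part of roachtest or roachprod.

```python
import re

# Assumed pattern for the scraper noise, based only on the one sample line
# quoted in this thread (IPv4 source address, port, fixed message text).
BENIGN_TLS = re.compile(
    r"http: TLS handshake error from \d{1,3}(?:\.\d{1,3}){3}:\d+: "
    r"remote error: tls: bad certificate"
)


def split_log_lines(lines):
    """Hypothetical helper: separate suspected scraper noise from other lines.

    Returns a (noise, remaining) pair of lists, preserving input order.
    """
    noise, remaining = [], []
    for line in lines:
        # search() rather than match() so lines wrapped in redaction
        # markers or carrying a log prefix are still recognised.
        (noise if BENIGN_TLS.search(line) else remaining).append(line)
    return noise, remaining


sample = [
    "http: TLS handshake error from 10.0.0.3:33376: remote error: tls: bad certificate",
    "‹http: TLS handshake error from 10.0.0.3:33376: remote error: tls: bad certificate›",
    "(cluster.go:2036).StartServiceForVirtualCluster: COMMAND_PROBLEM: exit status 1",
]
noise, remaining = split_log_lines(sample)
```

With the sample above, both handshake lines land in `noise` and only the `StartServiceForVirtualCluster` line is left in `remaining`, which is the line worth investigating.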

@yuzefovich
Member

Thanks Steven for taking a look. It seems that Darryl has a tentative fix in #117505, so I assigned the issue to him.

craig bot pushed a commit that referenced this issue Jan 8, 2024
117480: roachtest: skip multitenant/distsql for now r=yuzefovich a=yuzefovich

This test is flaking with some infra issue.

Informs: #117150.
Fixes: #117461.
Fixes: #117462.
Fixes: #117463.
Fixes: #117464.

Release note: None

117506: kv: skip TestStoreLeaseTransferTimestampCacheTxnRecord r=nvanbenschoten a=nvanbenschoten

Informs #117486.

Skip until I can fix the test.

Release note: None

117510: rowenc: fix up a recent commit r=yuzefovich a=yuzefovich

This commit optimizes recently added code a bit (by using the slightly more efficient `PeekValueLengthWithOffsetsAndType` method) and adds a missing word in a comment.

Epic: None

Release note: None

Co-authored-by: Yahor Yuzefovich <[email protected]>
Co-authored-by: Nathan VanBenschoten <[email protected]>
@craig craig bot closed this as completed in 05d1395 Jan 10, 2024
Status: Done
6 participants