Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

roachtest: c.ConnectToLiveNode can return connection to decommissioned/drained node #102603

Closed
smg260 opened this issue Apr 28, 2023 · 1 comment · Fixed by #102622
Closed

roachtest: c.ConnectToLiveNode can return connection to decommissioned/drained node #102603

smg260 opened this issue Apr 28, 2023 · 1 comment · Fixed by #102622
Assignees
Labels
A-testing Testing tools and infrastructure C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-testeng TestEng Team

Comments

@smg260
Copy link
Contributor

smg260 commented Apr 28, 2023

During test teardown, the runner connects to a "live" node to perform further post test validations. In some cases, however, a connection to a drained/decommissioned node is returned instead, which will cause said validations to fail.

See #101620

One solution is to change ConnectToLiveNode to check the admin endpoint of nodes (health?ready=1) for a 200 http status code. Decomm/Drained nodes will not return a 200.

Jira issue: CRDB-27554

@smg260 smg260 added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-testing Testing tools and infrastructure T-testeng TestEng Team labels Apr 28, 2023
@smg260 smg260 self-assigned this Apr 28, 2023
@blathers-crl
Copy link

blathers-crl bot commented Apr 28, 2023

cc @cockroachdb/test-eng

craig bot pushed a commit that referenced this issue May 8, 2023
102622: roachtest: use health endpoints to determine whether a node is suitable r=erikgrinaker a=smg260

for running post test validations. This avoids connecting to drained or decommissioned nodes.

Will follow up at some stage with a PR to remove the `ConnectToLiveNode` in favour of reporting the health status of all nodes which would be useful to see in the logs. The post validations can then just use the first (or any) one with a status of `200`

Limiting changes to this PR to make backporting easier.


Epic: none
Release note: none
Fixes: #102603

Co-authored-by: Miral Gadani <[email protected]>
@craig craig bot closed this as completed in 94c5d9a May 8, 2023
smg260 pushed a commit to smg260/cockroach that referenced this issue May 8, 2023
of all nodes in test teardown. Validations can use a db connection
from any node which has a status of 200.

Epic: none
Release note: none
Informs: cockroachdb#102603
Fixes: #TBD
smg260 pushed a commit to smg260/cockroach that referenced this issue May 16, 2023
of all nodes in test teardown. Validations can use a db connection
from any node which has a status of 200.

Epic: none
Release note: none
Informs: cockroachdb#102603
smg260 pushed a commit to smg260/cockroach that referenced this issue May 16, 2023
of all nodes in test teardown. Validations can use a db connection
from any node which has a status of 200.

Epic: none
Release note: none
Informs: cockroachdb#102603
renatolabs pushed a commit to renatolabs/cockroach that referenced this issue Jun 14, 2023
for running post test validations. This avoids connecting to drained
or decommissioned nodes.

Epic: none
Release note: none
Fixes: cockroachdb#102603
renatolabs pushed a commit to renatolabs/cockroach that referenced this issue Jun 14, 2023
of all nodes in test teardown. Validations can use a db connection
from any node which has a status of 200.

Epic: none
Release note: none
Informs: cockroachdb#102603
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-testing Testing tools and infrastructure C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-testeng TestEng Team
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant