[v23.2.x] CI Failure (Exceeded 30 redirects) in EndToEndShadowIndexingTest.test_recover #15887

Closed
nvartolomei opened this issue Dec 22, 2023 · 3 comments
Labels
area/cloud-storage Shadow indexing subsystem ci-failure kind/backport PRs targeting a stable branch kind/bug Something isn't working

Comments

@nvartolomei
Contributor

https://buildkite.com/redpanda/redpanda/builds/43299#018c918c-5a77-4b67-aac0-c4a67d633192

Module: rptest.tests.e2e_shadow_indexing_test
Class:  EndToEndShadowIndexingTest
Method: test_recover
test_id:    rptest.tests.e2e_shadow_indexing_test.EndToEndShadowIndexingTest.test_recover
status:     FAIL
run time:   1 minute 8.589 seconds


    TooManyRedirects('Exceeded 30 redirects.')
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/ducktape/tests/runner_client.py", line 135, in run
    data = self.run_test()
  File "/usr/local/lib/python3.10/dist-packages/ducktape/tests/runner_client.py", line 227, in run_test
    return self.test_context.function(self.test)
  File "/root/tests/rptest/utils/mode_checks.py", line 63, in f
    return func(*args, **kwargs)
  File "/root/tests/rptest/services/cluster.py", line 159, in wrapped
    self.redpanda.stop_and_scrub_object_storage()
  File "/root/tests/rptest/services/redpanda.py", line 3663, in stop_and_scrub_object_storage
    wait_until(all_partitions_uploaded_manifest,
  File "/usr/local/lib/python3.10/dist-packages/ducktape/utils/util.py", line 53, in wait_until
    raise e
  File "/usr/local/lib/python3.10/dist-packages/ducktape/utils/util.py", line 44, in wait_until
    if condition():
  File "/root/tests/rptest/services/redpanda.py", line 3637, in all_partitions_uploaded_manifest
    status = self._admin.get_partition_cloud_storage_status(
  File "/root/tests/rptest/services/admin.py", line 984, in get_partition_cloud_storage_status
    return self._request("GET",
  File "/root/tests/rptest/services/admin.py", line 334, in _request
    r = self._session.request(verb, url, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/requests/sessions.py", line 530, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.10/dist-packages/requests/sessions.py", line 665, in send
    history = [resp for resp in gen]
  File "/usr/local/lib/python3.10/dist-packages/requests/sessions.py", line 665, in <listcomp>
    history = [resp for resp in gen]
  File "/usr/local/lib/python3.10/dist-packages/requests/sessions.py", line 166, in resolve_redirects
    raise TooManyRedirects('Exceeded {} redirects.'.format(self.max_redirects), response=resp)
requests.exceptions.TooManyRedirects: Exceeded 30 redirects.
@nvartolomei nvartolomei added kind/bug Something isn't working ci-failure labels Dec 22, 2023
@nvartolomei
Contributor Author

This seems to be a race between three events: the old node being removed from the raft configuration, a new leader being elected for the topic, and the new node receiving up-to-date information.

rpk sends the request to the IP of the old leader (node_id: 1), which became node_id: 4 after the restart. node_id: 4 redirects the request to what it thinks is the leader (node_id: 1), but that redirect lands on the same node, producing a redirect loop.
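
To illustrate the loop described above, here is a minimal sketch in plain Python. It does not use the real admin client or `requests`; the `fetch` callable, the `make_stale_node` helper, the `RedirectLoop` exception, and the address are all hypothetical stand-ins. It only models a node that redirects to an address that resolves back to itself, and shows how a client that remembers visited URLs could fail after one hop instead of after 30.

```python
from typing import Callable, Optional, Tuple


class RedirectLoop(Exception):
    """Raised when a redirect chain revisits a URL or runs too long."""


# Hypothetical fetch signature: url -> (status_code, Location header or None).
Fetch = Callable[[str], Tuple[int, Optional[str]]]

REDIRECT_CODES = (301, 302, 303, 307, 308)


def follow_redirects(url: str, fetch: Fetch, max_hops: int = 30) -> str:
    """Follow redirects manually and return the URL that finally answers.

    Raises RedirectLoop as soon as a URL repeats, rather than issuing
    max_hops identical requests first.
    """
    seen = set()
    for _ in range(max_hops + 1):
        if url in seen:
            raise RedirectLoop(f"redirect loop detected at {url}")
        seen.add(url)
        status, location = fetch(url)
        if status not in REDIRECT_CODES or location is None:
            return url  # a non-redirect response ends the chain
        url = location
    raise RedirectLoop(f"exceeded {max_hops} redirects")


def make_stale_node(believed_leader_url: str) -> Fetch:
    """Simulate a node with a stale leadership view (assumed behaviour):
    it answers every request with a redirect to its believed leader."""
    def fetch(url: str) -> Tuple[int, Optional[str]]:
        return 307, believed_leader_url
    return fetch


# Illustrative address only. The restarted node (now node_id 4) still
# listens here, and it believes node_id 1 (the same address) is the leader.
addr = "http://old-leader.example:9644/status"
stale_node = make_stale_node(believed_leader_url=addr)

try:
    follow_redirects(addr, stale_node)
except RedirectLoop as exc:
    loop_message = str(exc)
```

In this model the loop is caught on the second hop. A test helper polling for status could treat such an error as "leadership not settled yet" and retry, which is an assumption about a possible mitigation, not what the actual fix does.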

@michael-redpanda
Contributor

Do not use this issue to track dev failures. If you observe a similar failure on dev, create a new issue; this issue is only for backports.

@dotnwat dotnwat added the area/cloud-storage Shadow indexing subsystem label Jan 16, 2024
@piyushredpanda
Contributor

This issue hasn't reoccurred for more than 2 months; closing.


4 participants