CI Failure (Timeout waiting for delete_records validation) in ControllerUpgradeTest. test_updating_cluster_when_executing_operations #11944

Closed
twmb opened this issue Jul 7, 2023 · 4 comments

@twmb
Contributor

twmb commented Jul 7, 2023

https://buildkite.com/redpanda/redpanda/builds/32654#01892d26-fa86-447a-934a-fd8855a4d428
https://buildkite.com/redpanda/redpanda/builds/32654#01892d34-c9c5-40eb-96fa-0f1a8d5e90b1

Module: rptest.tests.controller_upgrade_test
Class:  ControllerUpgradeTest
Method: test_updating_cluster_when_executing_operations
====================================================================================================
test_id:    rptest.tests.controller_upgrade_test.ControllerUpgradeTest.test_updating_cluster_when_executing_operations
status:     FAIL
run time:   1 minute 15.236 seconds


    TimeoutError("Timeout waiting for {'type': 'delete_records', 'properties': {'truncate_points': {'fuzzy-operator-1-gfmbtb': {0: 161}}}} operation validation")
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/ducktape/tests/runner_client.py", line 135, in run
    data = self.run_test()
  File "/usr/local/lib/python3.10/dist-packages/ducktape/tests/runner_client.py", line 227, in run_test
    return self.test_context.function(self.test)
  File "/root/tests/rptest/services/cluster.py", line 82, in wrapped
    r = f(self, *args, **kwargs)
  File "/root/tests/rptest/tests/controller_upgrade_test.py", line 100, in test_updating_cluster_when_executing_operations
    admin_fuzz.wait(5, 240)
  File "/root/tests/rptest/services/admin_ops_fuzzer.py", line 835, in wait
    wait_until(check, timeout_sec=timeout, backoff_sec=2)
  File "/usr/local/lib/python3.10/dist-packages/ducktape/utils/util.py", line 53, in wait_until
    raise e
  File "/usr/local/lib/python3.10/dist-packages/ducktape/utils/util.py", line 44, in wait_until
    if condition():
  File "/root/tests/rptest/services/admin_ops_fuzzer.py", line 823, in check
    raise self.error
  File "/root/tests/rptest/services/admin_ops_fuzzer.py", line 664, in thread_loop
    self.execute_one()
  File "/root/tests/rptest/services/admin_ops_fuzzer.py", line 726, in execute_one
    raise e
  File "/root/tests/rptest/services/admin_ops_fuzzer.py", line 708, in execute_one
    wait_until(
  File "/usr/local/lib/python3.10/dist-packages/ducktape/utils/util.py", line 57, in wait_until
    raise TimeoutError(err_msg() if callable(err_msg) else err_msg) from last_exception
ducktape.errors.TimeoutError: Timeout waiting for {'type': 'delete_records', 'properties': {'truncate_points': {'fuzzy-operator-1-gfmbtb': {0: 161}}}} operation validation
@twmb added the kind/bug and ci-failure labels Jul 7, 2023
@twmb changed the title from "CI Failure (key symptom) in Class.method" to "CI Failure (Timeout waiting for delete_records validation) in ControllerUpgradeTest. test_updating_cluster_when_executing_operations" Jul 7, 2023
@graphcareful self-assigned this Jul 7, 2023
@graphcareful
Contributor

graphcareful commented Jul 7, 2023

I believe this failure is due to the fact that delete-records requests are only handled after successful upgrade, so no bug here, just necessary to fix the test:

[DEBUG - 2023-07-06 22:09:34,959 - kcl - _cmd - lineno:453]: BROKER  TOPIC  PARTITION  NEW LOW WATERMARK  ERROR
5                                            unable to issue request: broker is too old; the broker has already indicated it will not know how to handle the request
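One way the test-side fix described above could look is to keep the fuzzer from choosing operations the cluster cannot serve until the upgrade has finished. The following is only a minimal sketch under that assumption: `Operation`, `requires_upgrade`, `pick_operation`, and the operation names other than `delete_records` are hypothetical and are not taken from `admin_ops_fuzzer.py`.

```python
import random
from dataclasses import dataclass


# Hypothetical model of a fuzzer operation; not the real admin_ops_fuzzer types.
@dataclass(frozen=True)
class Operation:
    name: str
    # True if brokers only handle this request once the upgrade has completed.
    requires_upgrade: bool = False


def pick_operation(ops, upgrade_complete, rng):
    """Pick a random operation, skipping ones the cluster cannot serve yet."""
    eligible = [op for op in ops if upgrade_complete or not op.requires_upgrade]
    if not eligible:
        raise ValueError("no eligible operations")
    return rng.choice(eligible)


# Illustrative operation set; only delete_records appears in this issue.
OPS = [
    Operation("create_topic"),
    Operation("delete_topic"),
    Operation("delete_records", requires_upgrade=True),
]

if __name__ == "__main__":
    rng = random.Random(0)
    # While the upgrade is in progress, delete_records is never selected.
    print(pick_operation(OPS, upgrade_complete=False, rng=rng).name)
```

Filtering at selection time keeps the validation step unchanged; an alternative would be to tolerate the "broker is too old" error while the upgrade is still in progress.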

@andijcr
Contributor

andijcr commented Jul 7, 2023

same failure mode #11942

@travisdowns
Member

> I believe this failure is due to the fact that delete-records requests are only handled after successful upgrade, so no bug here, just necessary to fix the test:

@graphcareful if it's a functional issue like that wouldn't it be failing 100% of the time in release and debug? I do see it failing almost every time in debug, but not in release.

Is there randomness involved in selecting the operations to apply? I guess that could explain the occasional pass.

@graphcareful
Contributor

> > I believe this failure is due to the fact that delete-records requests are only handled after successful upgrade, so no bug here, just necessary to fix the test:
>
> @graphcareful if it's a functional issue like that wouldn't it be failing 100% of the time in release and debug? I do see it failing almost every time in debug, but not in release.
>
> Is there randomness involved in selecting the operations to apply? I guess that could explain the occasional pass.

Yeah, there is. I will look into that though; maybe something else is going on.
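To make the randomness argument concrete, here is a rough sketch of how often a run could pass simply because `delete_records` is never drawn before the upgrade completes. It assumes operations are chosen uniformly and independently, which this thread does not confirm; `pass_probability` and the numbers used are purely illustrative.

```python
def pass_probability(num_op_types: int, ops_before_upgrade: int) -> float:
    """Chance that none of the operations drawn before the upgrade completes
    is delete_records, assuming uniform and independent selection."""
    if num_op_types < 1 or ops_before_upgrade < 0:
        raise ValueError("invalid arguments")
    return ((num_op_types - 1) / num_op_types) ** ops_before_upgrade


if __name__ == "__main__":
    # Illustrative values only: 10 operation types, varying draw counts.
    for draws in (1, 5, 20):
        print(draws, round(pass_probability(10, draws), 3))
```

Under these assumptions the pass rate drops quickly as more operations run before the upgrade finishes, which would be consistent with occasional passes rather than a 100% failure rate.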
