CI Failure (UNKNOWN_SERVER_ERROR) in ControllerUpgradeTest.test_updating_cluster_when_executing_operations #8102

Closed
rystsov opened this issue Jan 7, 2023 · 2 comments · Fixed by #8183

rystsov commented Jan 7, 2023

https://buildkite.com/redpanda/redpanda/builds/20780#01858972-1674-4ff6-8a39-3240ba11585a

Module: rptest.tests.controller_upgrade_test
Class:  ControllerUpgradeTest
Method: test_updating_cluster_when_executing_operations
test_id:    rptest.tests.controller_upgrade_test.ControllerUpgradeTest.test_updating_cluster_when_executing_operations
status:     FAIL
run time:   1 minute 48.982 seconds

    RpkException("Unexpected output, expected 'fuzzy-operator-2661-mavhxo\\s+OK' got 'fuzzy-operator-2661-mavhxo  UNKNOWN_SERVER_ERROR' on setting fuzzy-operator-2661-mavhxo segment.bytes=51535634")
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/ducktape/tests/runner_client.py", line 135, in run
    data = self.run_test()
  File "/usr/local/lib/python3.10/dist-packages/ducktape/tests/runner_client.py", line 227, in run_test
    return self.test_context.function(self.test)
  File "/root/tests/rptest/services/cluster.py", line 35, in wrapped
    r = f(self, *args, **kwargs)
  File "/root/tests/rptest/tests/controller_upgrade_test.py", line 98, in test_updating_cluster_when_executing_operations
    admin_fuzz.wait(num_executed_before_restart + 2, 240)
  File "/root/tests/rptest/services/admin_ops_fuzzer.py", line 559, in wait
    wait_until(check, timeout_sec=timeout, backoff_sec=2)
  File "/usr/local/lib/python3.10/dist-packages/ducktape/utils/util.py", line 53, in wait_until
    raise e
  File "/usr/local/lib/python3.10/dist-packages/ducktape/utils/util.py", line 44, in wait_until
    if condition():
  File "/root/tests/rptest/services/admin_ops_fuzzer.py", line 548, in check
    raise self.error
  File "/root/tests/rptest/services/admin_ops_fuzzer.py", line 470, in thread_loop
    if self.execute_with_retries(op_type, op):
  File "/root/tests/rptest/services/admin_ops_fuzzer.py", line 508, in execute_with_retries
    raise error
  File "/root/tests/rptest/services/admin_ops_fuzzer.py", line 501, in execute_with_retries
    return op.execute(self.operation_ctx)
  File "/root/tests/rptest/services/admin_ops_fuzzer.py", line 176, in execute
    ctx.rpk().alter_topic_config(self.topic, self.property,
  File "/root/tests/rptest/clients/rpk.py", line 333, in alter_topic_config
    raise RpkException(
rptest.clients.rpk.RpkException: RpkException<Unexpected output, expected 'fuzzy-operator-2661-mavhxo\s+OK' got 'fuzzy-operator-2661-mavhxo  UNKNOWN_SERVER_ERROR' on setting fuzzy-operator-2661-mavhxo segment.bytes=51535634>
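
The exception message suggests that the rpk wrapper matches the command output against a `<topic>\s+OK` pattern and raises when the match fails. A minimal sketch of that kind of check follows, assuming nothing about the real `rpk.py` beyond the message above; `check_alter_output` is a hypothetical helper, and `RpkException` here is a local stand-in rather than the class from `rptest`.

```python
import re


class RpkException(Exception):
    """Hypothetical stand-in for rptest.clients.rpk.RpkException."""


def check_alter_output(output: str, topic: str) -> None:
    """Raise RpkException unless `output` reports OK for `topic`."""
    # Escape the topic name so characters such as '-' or '.' match literally.
    expected = re.escape(topic) + r"\s+OK"
    if re.search(expected, output) is None:
        raise RpkException(
            f"Unexpected output, expected '{expected}' got '{output}'")
```

With this check, an output line ending in `UNKNOWN_SERVER_ERROR` is reported as a failure instead of being silently accepted.
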
rystsov added the kind/bug and ci-failure labels Jan 7, 2023

rystsov commented Jan 7, 2023

#8099 fixed rpk so that it no longer treats a failure as a success, and that fix revealed this hidden issue.

When we upgrade from an older version we pull the head build, which rejects alter_topic_config while the cluster is in a mixed-version state. Eventually the test runs out of retries and fails:

INFO  2023-01-06 23:47:13,995 [shard 0] cluster - topics_frontend.cc:183 - Refusing to update topics as not all cluster nodes are running v22.3
TRACE 2023-01-06 23:47:13,995 [shard 0] kafka - request_context.h:168 - [172.16.16.27:47502] sending 44:incremental_alter_configs for {rpk}, response {throttle_time_ms=0 responses={{error_code={ error_code: unknown_server_error [-1] } error_message={nullopt} resource_type=2 resource_name=fuzzy-operator-2661-mavhxo}}}
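
Why a persistent rejection ends in a test failure can be illustrated with a bounded retry loop. This is only a sketch of the general pattern, not the actual `execute_with_retries` from `admin_ops_fuzzer.py`; the signature, the default retry count, and the backoff parameter are assumptions.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def execute_with_retries(execute: Callable[[], T],
                         retries: int = 5,
                         backoff_sec: float = 0.0) -> T:
    """Call `execute` up to `retries` times and re-raise the last error."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    error: Optional[Exception] = None
    for _ in range(retries):
        try:
            return execute()
        except Exception as e:
            # Remember the failure and try again after a short pause.
            error = e
            time.sleep(backoff_sec)
    # Every attempt failed, e.g. because the broker keeps answering
    # UNKNOWN_SERVER_ERROR for as long as the cluster is mixed-version.
    assert error is not None
    raise error
```

Retrying only helps with transient errors. If the broker refuses the request for the whole mixed-version window, every attempt fails and the last error propagates to the test.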

mmaslankaprv added a commit to mmaslankaprv/redpanda that referenced this issue Jan 10, 2023
As lots of test became unstable after fixing validation in admin
operations fuzzer. We disable validating result of topic configuration
alteration not to disturb normal development process with constantly
failing tests.

Related: redpanda-data#8083, redpanda-data#8102

Signed-off-by: Michal Maslanka <[email protected]>
mmaslankaprv added a commit to mmaslankaprv/redpanda that referenced this issue Jan 10, 2023
As lots of test became unstable after fixing validation in admin
operations fuzzer. We disable validating result of admin
operations not to disturb normal development process with constantly
failing tests.

Related: redpanda-data#8083, redpanda-data#8102

Signed-off-by: Michal Maslanka <[email protected]>
rystsov added a commit to rystsov/redpanda that referenced this issue Jan 10, 2023
rystsov added a commit to rystsov/redpanda that referenced this issue Jan 10, 2023

rystsov commented Jan 12, 2023

@piyushredpanda Michal was going to take a look at it, assigning to him

mmaslankaprv added a commit to mmaslankaprv/redpanda that referenced this issue Jan 12, 2023
When cluster is being upgraded from version `22.2.x` to version `22.3.x`
all topic configuration updates are blocked. Preventing admin operations
fuzzer from executing topic properties update operations when updating
cluster from `22.2.x`.

Fixes: redpanda-data#8102

Signed-off-by: Michal Maslanka <[email protected]>
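
The approach described in the commit message above, keeping the fuzzer from issuing topic property updates when the upgrade starts from `22.2.x`, could look roughly like the sketch below. All names (`OpType`, `allowed_operations`, the version tuple) are hypothetical and chosen for illustration; the actual change is in the commit that fixes this issue.

```python
from enum import Enum
from typing import List, Tuple


class OpType(Enum):
    # Hypothetical operation names; the real fuzzer may use different ones.
    CREATE_TOPIC = "create_topic"
    DELETE_TOPIC = "delete_topic"
    UPDATE_TOPIC_PROPERTIES = "update_topic_properties"


def allowed_operations(initial_version: Tuple[int, int]) -> List[OpType]:
    """Return the operation types the fuzzer may run during the upgrade."""
    ops = list(OpType)
    # Topic configuration updates are refused while upgrading from 22.2.x,
    # so leave them out instead of retrying until the test fails.
    if initial_version == (22, 2):
        ops.remove(OpType.UPDATE_TOPIC_PROPERTIES)
    return ops
```

Filtering the operation set up front keeps result validation enabled for every operation the fuzzer still runs.
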
mmaslankaprv added a commit to mmaslankaprv/redpanda that referenced this issue Jan 16, 2023
When cluster is being upgraded from version `22.2.x` to version `22.3.x`
all topic configuration updates are blocked. Preventing admin operations
fuzzer from executing topic properties update operations when updating
cluster from `22.2.x`.

Fixes: redpanda-data#8102

Signed-off-by: Michal Maslanka <[email protected]>
mmaslankaprv added a commit to mmaslankaprv/redpanda that referenced this issue Apr 11, 2023
When cluster is being upgraded from version `22.2.x` to version `22.3.x`
all topic configuration updates are blocked. Preventing admin operations
fuzzer from executing topic properties update operations when updating
cluster from `22.2.x`.

Fixes: redpanda-data#8102

Signed-off-by: Michal Maslanka <[email protected]>
(cherry picked from commit ce3a798)
mmaslankaprv added a commit to mmaslankaprv/redpanda that referenced this issue Apr 14, 2023
When cluster is being upgraded from version `22.2.x` to version `22.3.x`
all topic configuration updates are blocked. Preventing admin operations
fuzzer from executing topic properties update operations when updating
cluster from `22.2.x`.

Fixes: redpanda-data#8102

Signed-off-by: Michal Maslanka <[email protected]>
(cherry picked from commit ce3a798)