CI Failure (NodeCrash - Segment hydration succeded but file isn't available) in ShadowIndexingWhileBusyTest.test_create_or_delete_topics_while_busy #11060

Closed
NyaliaLui opened this issue May 26, 2023 · 3 comments · Fixed by #11298
Labels: area/cloud-storage (Shadow indexing subsystem), ci-failure, kind/bug (Something isn't working), sev/high (loss of availability, pathological performance degradation, recoverable corruption)

Comments

@NyaliaLui
Contributor

https://buildkite.com/redpanda/vtools/builds/7787#01885486-739d-4f89-8fe3-5dfbcfc24458

Module: rptest.tests.e2e_shadow_indexing_test
Class:  ShadowIndexingWhileBusyTest
Method: test_create_or_delete_topics_while_busy
Arguments:
{
  "cloud_storage_type": 1,
  "short_retention": true
}
test_id:    rptest.tests.e2e_shadow_indexing_test.ShadowIndexingWhileBusyTest.test_create_or_delete_topics_while_busy.short_retention=True.cloud_storage_type=CloudStorageType.S3
status:     FAIL
run time:   5 minutes 44.276 seconds


    <NodeCrash ip-172-31-0-95: ERROR 2023-05-25 22:27:24,173 [shard 1] assert - Assert failure: (/var/lib/buildkite-agent/builds/buildkite-arm64-builders-i-05de279f910198181-1/redpanda/redpanda/src/v/cloud_storage/remote_segment.cc:802) 'is_state_materialized() || err' Segment hydration succeded but file isn't available
>
Traceback (most recent call last):
  File "/home/ubuntu/redpanda/tests/rptest/services/cluster.py", line 49, in wrapped
    r = f(self, *args, **kwargs)
  File "/home/ubuntu/redpanda/tests/rptest/utils/mode_checks.py", line 63, in f
    return func(*args, **kwargs)
  File "/home/ubuntu/redpanda/tests/rptest/tests/e2e_shadow_indexing_test.py", line 791, in test_create_or_delete_topics_while_busy
    self.redpanda.wait_until(create_or_delete_until_producer_fin,
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda.py", line 1629, in wait_until
    wait_until(wrapped,
  File "/usr/local/lib/python3.10/dist-packages/ducktape/utils/util.py", line 53, in wait_until
    raise e
  File "/usr/local/lib/python3.10/dist-packages/ducktape/utils/util.py", line 44, in wait_until
    if condition():
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda.py", line 1626, in wrapped
    assert self.all_up()
AssertionError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/ducktape/tests/runner_client.py", line 135, in run
    data = self.run_test()
  File "/usr/local/lib/python3.10/dist-packages/ducktape/tests/runner_client.py", line 227, in run_test
    return self.test_context.function(self.test)
  File "/usr/local/lib/python3.10/dist-packages/ducktape/mark/_mark.py", line 481, in wrapper
    return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
  File "/home/ubuntu/redpanda/tests/rptest/services/cluster.py", line 66, in wrapped
    redpanda.raise_on_crash()
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda.py", line 2036, in raise_on_crash
    raise NodeCrash(crashes)
rptest.services.utils.NodeCrash: <NodeCrash ip-172-31-0-95: ERROR 2023-05-25 22:27:24,173 [shard 1] assert - Assert failure: (/var/lib/buildkite-agent/builds/buildkite-arm64-builders-i-05de279f910198181-1/redpanda/redpanda/src/v/cloud_storage/remote_segment.cc:802) 'is_state_materialized() || err' Segment hydration succeded but file isn't available
>
NyaliaLui added the kind/bug, ci-failure, and area/cloud-storage labels May 26, 2023
jcsp added the sev/high label Jun 5, 2023
@jcsp
Contributor

jcsp commented Jun 5, 2023

This looks the same: also on arm64, dedicated nodes.

https://buildkite.com/redpanda/vtools/builds/7919#01888802-8884-46fd-ae0e-980a77943ce5

andijcr self-assigned this Jun 5, 2023
@VladLazar
Contributor

@andijcr
Contributor

andijcr commented Jun 7, 2023

In https://buildkite.com/redpanda/vtools/builds/7944#01888fbc-d4c8-4453-b4df-12771fbef0f0 both variants, short_retention=True and short_retention=False, fail.

short_retention=True fails without an assertion but with a backtrace. The error is similar to #10931.

andijcr added a commit to andijcr/redpanda that referenced this issue Jun 13, 2023
previously, a couple of failure paths would return a default true. this is a
problem for code after this function that assumes the existence of objects
once it has returned

related to redpanda-data#11060
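
The failure mode described in the commit message can be sketched in a few lines of Python. This is only an illustration of the pattern (a failure path that falls through with a default `True`), not the Redpanda code: every name below (`download_buggy`, `download_fixed`, `hydrate`, `SegmentNotAvailable`) is hypothetical, and the invariant check only mirrors the shape of the `is_state_materialized() || err` assertion from the crash log.

```python
# Hypothetical illustration; none of these names come from the Redpanda source.

class SegmentNotAvailable(AssertionError):
    """Stands in for the 'hydration succeeded but file isn't available' assert."""


def download_buggy(store, key, cache):
    """Return True on success, but also (incorrectly) on a failure path."""
    ok = True  # default result
    data = store.get(key)
    if data is None:
        return ok  # bug: the failure path returns the default True
    cache[key] = data
    return ok


def download_fixed(store, key, cache):
    """Return True only if the object was actually placed in the cache."""
    data = store.get(key)
    if data is None:
        return False  # failure is reported to the caller
    cache[key] = data
    return True


def hydrate(download, store, key, cache):
    """Run a download, then check an invariant shaped like the one in the log."""
    err = not download(store, key, cache)
    materialized = key in cache
    # Either the object is available, or an error must have been reported.
    if not (materialized or err):
        raise SegmentNotAvailable("hydration succeeded but file isn't available")
    return materialized
```

With an empty `store`, `hydrate(download_buggy, {}, "seg", {})` trips the invariant because the download claims success without producing the object, while `hydrate(download_fixed, {}, "seg", {})` returns `False` and lets the caller handle the error.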

4 participants