CI Failure (unknown error) in cloud_storage_rpfixture
#12120
Comments
Just noting the failure:
Thanks @andrwng!
My guess so far is that this is an edge case in the consume path: with exactly the right set of bytes per batch and per segment, we go down the "stuck consumer" path and return.
@abhijat I think the side effect of this is that we return early, but it shouldn't block clients from making progress, right? I'm also wondering whether this is an existing bug or whether it could be related to chunk hydration.
Looking at the logs, it seems the test worked as expected; I am not sure why it was reported as an error.
In these tests chunk hydration does not happen because the impostor does not have the index, so the fallback path is used:
The download request is for the full segment:
The segment size is reported as
This basically cuts the body returned by the impostor in half. Looking at the batch consumer logs, the segment is consumed up to that size (summing up the batch sizes: 1808+6466+2098+1498+8740+7017):
At this point the downloaded segment is fully consumed, and when the next
The test asserts that an exception should be thrown:
But it fails, perhaps the exception thrown does not match
so it should not fail because of this change. I will try to run this test locally and see what error is thrown during the test.
Running the test locally shows the following error being thrown:
This error originates from the parser trying to read an iobuf that does not have the expected number of bytes. The parser returns an error, and the partition reader throws the system error, which matches the test assertion. In the failing unit test it seems that the truncation performed by the test happened exactly at a batch boundary, so the batches in the truncated segment were intact and the system error was never thrown.
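To make the boundary condition concrete, here is a small Python sketch (not Redpanda code) built on the batch sizes quoted earlier in this thread. The helper `classify_truncation` and its return labels are hypothetical, and the model is deliberately simplified: it assumes the quoted sizes are the full on-disk sizes of consecutive batches starting at offset 0 of the segment body.

```python
from itertools import accumulate

# Batch sizes taken from the batch consumer logs quoted in this thread.
BATCH_SIZES = [1808, 6466, 2098, 1498, 8740, 7017]


def classify_truncation(batch_sizes, truncated_len):
    """Return 'intact' if the cut lands on a batch boundary, else 'partial'.

    Hypothetical helper: a simplified model, not the real parser logic.
    """
    boundaries = set(accumulate(batch_sizes))
    boundaries.add(0)  # an empty body also contains no partial batch
    return "intact" if truncated_len in boundaries else "partial"


total = sum(BATCH_SIZES)
print(total)                                        # 27627
print(classify_truncation(BATCH_SIZES, total))      # intact
print(classify_truncation(BATCH_SIZES, total - 1))  # partial
```

Under these assumptions, a cut at 27627 bytes leaves six whole batches and nothing for the parser to trip over, while a cut one byte earlier leaves a partial batch.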
This is what seems to be happening in the test. Maybe the test should assert for either the system error or the stuck reader exception.
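A rough sketch of what "accept either outcome" could look like, written in Python purely for illustration. The real test is a C++ fixture test; the exception classes and the `raises_accepted` helper below are hypothetical stand-ins, not the actual Redpanda types.

```python
# Hypothetical stand-ins for the two outcomes discussed in this thread;
# these are NOT the real Redpanda exception types.
class ParserSystemError(Exception):
    """Models the system error raised when a batch is cut mid-way."""


class StuckReaderError(Exception):
    """Models the stuck reader outcome when the cut lands on a batch boundary."""


ACCEPTED = (ParserSystemError, StuckReaderError)


def raises_accepted(fn):
    """Return True if fn raises one of ACCEPTED, False if it returns normally.

    Any other exception type propagates and would fail the test.
    """
    try:
        fn()
    except ACCEPTED:
        return True
    return False


def cut_mid_batch():
    raise ParserSystemError("not enough bytes in iobuf")


def cut_on_boundary():
    raise StuckReaderError("reader made no progress")


print(raises_accepted(cut_mid_batch))    # True
print(raises_accepted(cut_on_boundary))  # True
print(raises_accepted(lambda: None))     # False
```

The point of the design is that the assertion no longer depends on where the truncation happens to fall relative to batch boundaries.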
This might be the same:
https://buildkite.com/redpanda/redpanda/builds/33591#018971d7-5d7f-4e9c-b6b6-600d21dad304 (23.1.x)
I wonder if the resetter had any effect on this: #13146
Seen on #14113
This issue hasn't seen activity in 3 months. If you want to keep it open, post a comment or remove the
This issue was closed due to lack of activity. Feel free to reopen if it's still relevant.
https://buildkite.com/redpanda/redpanda/builds/33139#0189548e-5b3f-430d-b310-22a98bb116ed
Analysis of the log did not point to which test case failed within cloud_storage_rpfixture. More analysis required. Created issue to move #12088 forward.