
CI Failure (critical check uploaded.size() != 0 has failed) in test_archival_service_rpfixture test_manifest_spillover test_archival_service_rpfixture.test_manifest_spillover #13275

Closed
dotnwat opened this issue Sep 5, 2023 · 13 comments · Fixed by #15277 or #18411
Labels: area/cloud-storage (Shadow indexing subsystem), area/storage, ci-failure, kind/bug (Something isn't working), rpunit (unit test ci-failure, not ducktape), sev/medium (Bugs that do not meet criteria for high or critical, but are more severe than low)

Comments

@dotnwat
Member

dotnwat commented Sep 5, 2023

https://buildkite.com/redpanda/redpanda/builds/36337#018a5e9c-fcf7-47f9-8fb5-96b3f48b6a88

The following tests FAILED:
  105 - test_archival_service_rpfixture (Failed)

JIRA Link: CORE-2803

@dotnwat dotnwat added kind/bug Something isn't working ci-failure area/storage area/cloud-storage Shadow indexing subsystem labels Sep 5, 2023
@dotnwat
Member Author

dotnwat commented Sep 5, 2023

/var/lib/buildkite-agent/builds/buildkite-amd64-builders-i-08753b44c7350ab0a-1/redpanda/redpanda/src/v/archival/tests/ntp_archiver_test.cc(1766): fatal error: in "test_manifest_spillover": critical check uploaded.size() != 0 has failed

@dotnwat dotnwat added the sev/medium Bugs that do not meet criteria for high or critical, but are more severe than low. label Sep 5, 2023
@piyushredpanda
Contributor

@Lazin was looking into this?

@rystsov rystsov added the rpunit unit test ci-failure (not ducktape) label Sep 6, 2023
@andijcr
Contributor

andijcr commented Sep 8, 2023

#12726 same issue

@andijcr andijcr changed the title CI Failure test_archival_service_rpfixture CI Failure (critical check uploaded.size() != 0 has failed) in test_archival_service_rpfixture test_manifest_spillover test_archival_service_rpfixture.test_manifest_spillover Sep 8, 2023

andrwng added a commit to andrwng/redpanda that referenced this issue Dec 2, 2023
It was previously possible for the archiver to be started before the Raft
term had been confirmed, in which case the subsequent spillover exits early
because the archiver isn't synced yet.

This commit fixes this by syncing the manifest.

Fixes redpanda-data#13275
@bharathv
Contributor

bharathv commented Dec 5, 2023

@graphcareful
Contributor

Reopening as this issue was seen a few times with recent work to update Seastar; I don't believe these changes are related to the test failures.

https://buildkite.com/redpanda/vtools/builds/13472#018f4e3c-aac9-470b-97c2-1e3a8a38ee81

@graphcareful graphcareful reopened this May 6, 2024
@andrwng
Contributor

andrwng commented May 6, 2024

> Reopening as this issue was seen a few times with recent work to update Seastar; I don't believe these changes are related to the test failures.
>
> https://buildkite.com/redpanda/vtools/builds/13472#018f4e3c-aac9-470b-97c2-1e3a8a38ee81

Looking at that build, I'm seeing:

DEBUG 2024-05-06 16:26:00,459 [shard 0:main] http - [/10000000/meta/kafka/test-topic/42_0/manifest.bin.0.256.0.257.0.256] - client.cc:156 - about to start connecting, false, is-closed false
DEBUG 2024-05-06 16:26:00,459 [shard 0:main] dns_resolver - Query name 127.0.0.1 (ANY)
TRACE 2024-05-06 16:26:00,459 [shard 0:main] storage - readers_cache.cc:340 - {kafka/test-topic/42} - removing reader: [1,1538] lower_bound: 1539
TRACE 2024-05-06 16:26:00,459 [shard 0:main] raft - [group_id:1, {kafka/test-topic/42}] state_machine_manager.cc:358 - updating _next offset with: 1539
TRACE 2024-05-06 16:26:00,459 [shard 0:main] http - error connecting to 127.0.0.1:4430 - seastar::timed_out_error (timedout)
TRACE 2024-05-06 16:26:00,459 [shard 0:main] http - Connection error: seastar::timed_out_error (timedout)
TRACE 2024-05-06 16:26:00,460 [shard 0:main] http - [/?list-type=2&prefix=cluster_metadata/f9e2c112-cf76-4d30-9b3f-43646a81c436/manifests/&delimiter=/] - client.cc:181 - connection timeout
TRACE 2024-05-06 16:26:00,460 [shard 0:main] http - error connecting to 127.0.0.1:4441 - seastar::timed_out_error (timedout)
TRACE 2024-05-06 16:26:00,460 [shard 0:main] http - Connection error: seastar::timed_out_error (timedout)
TRACE 2024-05-06 16:26:00,460 [shard 0:main] http - [/10000000/meta/kafka/test-topic/42_0/manifest.bin.0.256.0.257.0.256] - client.cc:181 - connection timeout
DEBUG 2024-05-06 16:26:00,460 [shard 0:main] http - [/10000000/meta/kafka/test-topic/42_0/manifest.bin.0.256.0.257.0.256] - client.cc:187 - connected, false
WARN  2024-05-06 16:26:00,460 [shard 0:main] http - [/10000000/meta/kafka/test-topic/42_0/manifest.bin.0.256.0.257.0.256] - client.cc:136 - make_request timed-out connection attempt /10000000/meta/kafka/test-topic/42_0/manifest.bin.0.256.0.257.0.256
DEBUG 2024-05-06 16:26:00,460 [shard 0:main] dns_resolver - Query name 127.0.0.1 (INET)
DEBUG 2024-05-06 16:26:00,460 [shard 0:main] dns_resolver - Query success: 127.0.0.1/127.0.0.1
TRACE 2024-05-06 16:26:00,461 [shard 0:main] dns_resolver - Poll sockets
TRACE 2024-05-06 16:26:00,461 [shard 0:main] dns_resolver - ares_fds: 0
TRACE 2024-05-06 16:26:00,461 [shard 0:main] raft - coordinated_recovery_throttle.cc:216 - Coordination tick: unused bandwidth: 104857600, deficit bandwidth: 0
DEBUG 2024-05-06 16:26:00,461 [shard 0:main] raft - coordinated_recovery_throttle.cc:52 - Throttler bucket capacity reset to: 104857600, waiting bytes: 0
WARN  2024-05-06 16:26:00,461 [shard 0:main] s3 - s3_client.cc:787 - S3 PUT request failed with error for key "10000000/meta/kafka/test-topic/42_0/manifest.bin.0.256.0.257.0.256": seastar::timed_out_error (timedout)
WARN  2024-05-06 16:26:00,461 [shard 0:main] s3 - util.cc:77 - Connection timeout timedout
DEBUG 2024-05-06 16:26:00,461 [shard 0:main] cloud_storage - [fiber124~0~0|1|0ms] - remote.cc:425 - Uploading manifest "10000000/meta/kafka/test-topic/42_0/manifest.bin.0.256.0.257.0.256" to test-bucket, 176ms backoff required
TRACE 2024-05-06 16:26:00,461 [shard 0:main] http - error connecting to 127.0.0.1:4430 - std::__1::system_error (error system:111, Connection refused)
TRACE 2024-05-06 16:26:00,461 [shard 0:main] http - Connection error: std::__1::system_error (error system:111, Connection refused)
TRACE 2024-05-06 16:26:00,462 [shard 0:main] http - [/?list-type=2&prefix=cluster_metadata/f9e2c112-cf76-4d30-9b3f-43646a81c436/manifests/&delimiter=/] - client.cc:179 - connection refused std::__1::system_error (error system:111, Connection refused)
DEBUG 2024-05-06 16:26:00,462 [shard 0:main] dns_resolver - Query name 127.0.0.1 (INET)
DEBUG 2024-05-06 16:26:00,462 [shard 0:main] dns_resolver - Query success: 127.0.0.1/127.0.0.1
TRACE 2024-05-06 16:26:00,462 [shard 0:main] dns_resolver - Poll sockets
TRACE 2024-05-06 16:26:00,462 [shard 0:main] dns_resolver - ares_fds: 0
TRACE 2024-05-06 16:26:00,462 [shard 0:main] http - error connecting to 127.0.0.1:4430 - std::__1::system_error (error system:111, Connection refused)
TRACE 2024-05-06 16:26:00,462 [shard 0:main] http - Connection error: std::__1::system_error (error system:111, Connection refused)
DEBUG 2024-05-06 16:26:00,462 [shard 0:main] storage - storage_resources.cc:167 - calc_falloc_step: step 33554432 (max 33554432)
TRACE 2024-05-06 16:26:00,462 [shard 0:main] http - [/?list-type=2&prefix=cluster_metadata/f9e2c112-cf76-4d30-9b3f-43646a81c436/manifests/&delimiter=/] - client.cc:179 - connection refused std::__1::system_error (error system:111, Connection refused)
DEBUG 2024-05-06 16:26:00,462 [shard 0:main] dns_resolver - Query name 127.0.0.1 (INET)
DEBUG 2024-05-06 16:26:00,462 [shard 0:main] dns_resolver - Query success: 127.0.0.1/127.0.0.1
TRACE 2024-05-06 16:26:00,463 [shard 0:main] dns_resolver - Poll sockets
TRACE 2024-05-06 16:26:00,463 [shard 0:main] dns_resolver - ares_fds: 0
TRACE 2024-05-06 16:26:00,463 [shard 0:main] http - error connecting to 127.0.0.1:4430 - std::__1::system_error (error system:111, Connection refused)
TRACE 2024-05-06 16:26:00,463 [shard 0:main] http - Connection error: std::__1::system_error (error system:111, Connection refused)
TRACE 2024-05-06 16:26:00,463 [shard 0:main] http - [/?list-type=2&prefix=cluster_metadata/f9e2c112-cf76-4d30-9b3f-43646a81c436/manifests/&delimiter=/] - client.cc:179 - connection refused std::__1::system_error (error system:111, Connection refused)
...
DEBUG 2024-05-06 16:26:00,639 [shard 0:main] dns_resolver - Query name 127.0.0.1 (INET)
DEBUG 2024-05-06 16:26:00,639 [shard 0:main] dns_resolver - Query success: 127.0.0.1/127.0.0.1
TRACE 2024-05-06 16:26:00,639 [shard 0:main] dns_resolver - Poll sockets
TRACE 2024-05-06 16:26:00,639 [shard 0:main] dns_resolver - ares_fds: 0
WARN  2024-05-06 16:26:00,639 [shard 0:main] cloud_storage - [fiber124~0~0|1|0ms] - remote.cc:445 - Uploading manifest "10000000/meta/kafka/test-topic/42_0/manifest.bin.0.256.0.257.0.256" to test-bucket, backoff quota exceded, manifest not uploaded
DEBUG 2024-05-06 16:26:00,639 [shard 0:main] client_pool - client_pool.cc:496 - releasing a client, pool size: 1, capacity: 2
ERROR 2024-05-06 16:26:00,639 [shard 0:main] archival - [fiber124 kafka/test-topic/42] - ntp_archiver_service.cc:2545 - Failed to upload spillover manifest {timed_out}
INFO  2024-05-06 16:26:00,640 [shard 0:main] test - ntp_archiver_test.cc:1774 - new_so: {0}, new_kafka: {0}, archive_so: -9223372036854775808, archive_kafka: -9223372036854775808, archive_clean: -9223372036854775808
/var/lib/buildkite-agent/builds/buildkite-amd64-builders-i-006607dfb56355384-1/redpanda/vtools/src/v/archival/tests/ntp_archiver_test.cc(1791): fatal error: in "test_manifest_spillover": critical check uploaded.size() != 0 has failed

Looks like there was an HTTP client connection timeout when uploading the partition manifest, followed by many failed attempts by the cluster metadata uploader's HTTP client to connect and list, and finally the test fails because the partition manifest upload failed.

Briefly chatted with Rob; he said he'd seen this test pass with the Seastar v24.2.x branch and that he wasn't able to reproduce this locally with the v24.2.x upgrade, so it's not immediately clear that the v24.2.x changes caused this (but they may have... not sure).

@graphcareful
Contributor

@andrwng it's super strange that this cannot be replicated locally though, only in CI.

@nvartolomei
Contributor

To me it looks like our timeout is hit too early somehow. It is lowres_clock::now() + 1s:

return ss::with_timeout(timeout, std::move(f))
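
A minimal sketch of how that pattern behaves (assuming a Seastar environment; slow_operation and guarded are illustrative names, not Redpanda code). ss::with_timeout takes an absolute lowres_clock time point, so "1s from now" really means "1s from whatever lowres_clock currently reads", which is exactly where a stale clock would hurt:

#include <chrono>
#include <seastar/core/future.hh>
#include <seastar/core/lowres_clock.hh>
#include <seastar/core/sleep.hh>
#include <seastar/core/with_timeout.hh>

// Stand-in for the connect future; deliberately slower than the timeout.
seastar::future<> slow_operation() {
    return seastar::sleep(std::chrono::seconds(5));
}

seastar::future<> guarded() {
    // The deadline is an absolute seastar::lowres_clock time point.
    auto deadline = seastar::lowres_clock::now() + std::chrono::seconds(1);
    return seastar::with_timeout(deadline, slow_operation())
      .handle_exception_type([](const seastar::timed_out_error&) {
          // Timed out: the real code shuts the socket down here and
          // propagates the error to the caller.
      });
}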


Reproduced by running stress --cpu 30 and retrying the test a few times (on a ramdisk, but unlikely that's relevant):

RP_FIXTURE_ENV=1 /home/nv/redpanda/vbuild/debug/clang/bin/test_archival_service_rpfixture -t test_manifest_spillover -- -c 1 --default-log-level=trace --logger-log-level=exception=error --logger-log-level=io=info  --unsafe-bypass-fsync 1  --overprovisioned

I introduced a long sleep just before the test teardown and verified that the imposter listener is up after a failed run:

LISTEN       0            100                      127.0.0.1:4441                     0.0.0.0:*           users:(("test_archival_s",pid=4037209,fd=17))

It also accepts connections fine:

 → curl -v 127.0.0.1:4441
*   Trying 127.0.0.1:4441...
* Connected to 127.0.0.1 (127.0.0.1) port 4441 (#0)
> GET / HTTP/1.1
> Host: 127.0.0.1:4441
> User-Agent: curl/7.88.1
> Accept: */*
>
< HTTP/1.1 404 Not Found
< Content-Length: 9
< Content-Type: text/plain
< Date: Thu, 09 May 2024 08:44:53 GMT
< Server: Seastar httpd
<
* Connection #0 to host 127.0.0.1 left intact

Heavy-duty logging:

 → RP_FIXTURE_ENV=1 strace -CfqqrtttTwy /home/nv/redpanda/vbuild/debug/clang/bin/test_archival_service_rpfixture -t test_manifest_spillover ...

# imposter listening
[pid 4038017] 1715244501.781840 (+     0.000034) socket(AF_INET, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_TCP) = 16<socket:[28071095]> <0.000013>
[pid 4038017] 1715244501.781872 (+     0.000031) setsockopt(16<socket:[28071095]>, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 <0.000006>
[pid 4038017] 1715244501.781897 (+     0.000024) bind(16<socket:[28071095]>, {sa_family=AF_INET, sin_port=htons(4441), sin_addr=inet_addr("127.0.0.1")}, 16) = 0 <0.000009>
[pid 4038017] 1715244501.781926 (+     0.000029) listen(16<socket:[28071095]>, 100) = 0 <0.000006>

# waiting for new connections?
[pid 4038017] 1715244501.789101 (+     0.000051) io_submit(0x7f94b7b82000, 2, [{aio_data=0x6130001cd9a8, aio_lio_opcode=IOCB_CMD_POLL, aio_fildes=16<socket:[28071095]>, aio_buf=POLLIN|POLLOUT}, {aio_data=0x6130001cdbd8, aio_lio_opcode=IOCB_CMD_POLL, aio_fildes=17<socket:[28071096]>, aio_buf=POLLOUT}]) = 2 <0.000007>

# client trying to connect
[pid 4038017] 1715244503.728631 (+     0.000081) bind(18<socket:[28074691]>, {sa_family=AF_INET, sin_port=htons(59303), sin_addr=inet_addr("0.0.0.0")}, 16) = 0 <0.000041>
[pid 4038017] 1715244503.728724 (+     0.000093) connect(18<socket:[28074691]>, {sa_family=AF_INET, sin_port=htons(4441), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress) <0.000109>

# client waiting on the previous operation to complete (aio_fildes=18<socket:[28074691]>); it also waits on an unrelated operation, 17<socket:[28074690]>
[pid 4038017] 1715244503.728927 (+     0.000204) io_submit(0x7f94b7b82000, 2, [{aio_data=0x6130003673d8, aio_lio_opcode=IOCB_CMD_POLL, aio_fildes=17<socket:[28074690]>, aio_buf=POLLOUT}, {aio_data=0x6130003a47d8, aio_lio_opcode=IOCB_CMD_POLL, aio_fildes=18<socket:[28074691]>, aio_buf=POLLOUT}]) = 2 <0.000053>

# the imposter server accepts the connection, matching sin_port
[pid 4038017] 1715244503.734359 (+     0.000118) accept4(16<socket:[28071095]>, {sa_family=AF_INET, sin_port=htons(59303), sin_addr=inet_addr("127.0.0.1")}, [128 => 16], SOCK_CLOEXEC|SOCK_NONBLOCK) = 19<socket:[28074692]> <0.000040>

# socket shutdown from the client / timeout?
# 0.0036 seconds = 3.6ms from the io_submit above
[pid 4038017] 1715244503.732541 (+     0.000123) shutdown(18<socket:[28074691]>, SHUT_RDWR) = 0 <0.000051>

# server trying to read data from the client socket 0.0075s = 7.5ms later
[pid 4038017] 1715244503.741905 (+     0.000146) io_submit(0x7f94b7b82000, 1, [{aio_data=0x6130003a51e8, aio_lio_opcode=IOCB_CMD_POLL, aio_fildes=19<socket:[28074692]>, aio_buf=POLLIN}]) = 1 <0.000042>
[pid 4038017] 1715244503.742495 (+     0.000132) recvfrom(19<socket:[28074692]>, "", 8192, MSG_DONTWAIT, NULL, NULL) = 0 <0.000034>

# client closing the socket (it was shut down before)
[pid 4038017] 1715244503.742598 (+     0.000102) close(18<socket:[28074691]>) = 0 <0.000048>

# ???
[pid 4038017] 1715244503.743604 (+     0.000199) accept4(16<socket:[28071095]>, 0x7f94b196e428, [128], SOCK_CLOEXEC|SOCK_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable) <0.000041>

[pid 4038017] 1715244503.745711 (+     0.000100) io_submit(0x7f94b7b82000, 1, [{aio_data=0x6130001cd9a8, aio_lio_opcode=IOCB_CMD_POLL, aio_fildes=16<socket:[28071095]>, aio_buf=POLLIN|POLLOUT}]) = 1 <0.000042>

@nvartolomei
Contributor

nvartolomei commented May 9, 2024

If I change connect_with_timeout to

     return ss::with_timeout(timeout, std::move(f))
-      .handle_exception([socket, address, log](const std::exception_ptr& e) {
-          log->trace("error connecting to {} - {}", address, e);
-          socket->shutdown();
-          return ss::make_exception_future<ss::connected_socket>(e);
-      });
+      .handle_exception(
+        [socket, address, log, timeout](const std::exception_ptr& e) {
+            std::cout << "VVV time in exception handler: "
+                      << (seastar::lowres_clock::now().time_since_epoch())
+                      << "; deadline=" << timeout.time_since_epoch()
+                      << "; port: " << address << std::endl;
+            log->trace("error connecting to {} - {}", address, e);
+            socket->shutdown();
+            return ss::make_exception_future<ss::connected_socket>(e);
+        });

It logs something like:

[pid 4046137] 1715249151.537154 (+     0.000120) write(1</dev/pts/5>, "VVV time in exception handler wi"..., 97VVV time in exception handler: 3170203482; deadline=3170202515; port: 127.0.0.1:4441

This indicates that the current time is 967ms past the deadline (these values are machine uptime in ms).

# Connect
[pid 4046137] 1715249151.521946 (+     0.000077) connect(18<socket:[28081129]>, {sa_family=AF_INET, sin_port=htons(4441), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress) <0.000073>

# Shutdown
[pid 4046137] 1715249151.537787 (+     0.000069) shutdown(18<socket:[28081129]>, SHUT_RDWR) = 0 <0.000037>

A difference of 0.015 seconds again. Oversubscribed reactor?

@nvartolomei
Contributor

Adding extra logging like

            std::cout << "VVV time: " << (current).time_since_epoch()
                      << "chrono"
                      << std::chrono::steady_clock::now().time_since_epoch()
                      << " host with port: " << _host_with_port << std::endl;
VVV time: 3173053176chrono3173054268 host with port: 127.0.0.1:4441

>>> 3173054268-3173053176
1092

The lowres clock drifts too much at times, and the timeouts don't make much sense anymore.
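
One way to observe that drift directly is with a small standalone Seastar program (a hypothetical sketch, not part of the test; it relies on the assumption that lowres_clock is only refreshed when the reactor gets to poll, so a long non-yielding task freezes it):

#include <chrono>
#include <iostream>
#include <seastar/core/app-template.hh>
#include <seastar/core/future.hh>
#include <seastar/core/lowres_clock.hh>

int main(int argc, char** argv) {
    seastar::app_template app;
    return app.run(argc, argv, [] {
        using namespace std::chrono;
        auto steady0 = steady_clock::now();
        auto lowres0 = seastar::lowres_clock::now();
        // Busy-loop on the reactor thread without yielding; lowres_clock
        // cannot be refreshed while this task is running.
        while (steady_clock::now() - steady0 < milliseconds(500)) {
        }
        auto steady_ms
          = duration_cast<milliseconds>(steady_clock::now() - steady0).count();
        auto lowres_ms
          = duration_cast<milliseconds>(seastar::lowres_clock::now() - lowres0)
              .count();
        std::cout << "steady elapsed: " << steady_ms
                  << "ms, lowres elapsed: " << lowres_ms << "ms\n";
        return seastar::make_ready_future<>();
    });
}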


@piyushredpanda
Contributor

@nvartolomei: could you help chase this? It's failing often given the Seastar changes brought in.

nvartolomei added a commit to nvartolomei/redpanda that referenced this issue May 11, 2024
Adding a segment takes ~1ms. Adding multiple of them in a single batch
can stall the reactor for a while. This, together with a recent seastar
change[^1], caused some timeouts to fire very frequently[^2].

Fix the test by splitting the batch passed to add_segments into smaller
batches so that we execute finer-grained tasks and yield more often to
the scheduler.

[^1]: scylladb/seastar#2238
[^2]: redpanda-data#13275
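
The shape of that fix, as a rough sketch (segment and add_one_segment are placeholders rather than the actual Redpanda types; the real patch splits the batch on the test side):

#include <vector>

#include <seastar/core/coroutine.hh>
#include <seastar/core/future.hh>
#include <seastar/coroutine/maybe_yield.hh>

struct segment {};

// Placeholder for the ~1ms of CPU-bound metadata work done per segment.
void add_one_segment(const segment&) {}

seastar::future<> add_segments(std::vector<segment> segments) {
    for (const auto& s : segments) {
        add_one_segment(s);
        // Yield back to the scheduler between segments so a large batch
        // cannot stall the reactor (and starve timer/lowres_clock updates).
        co_await seastar::coroutine::maybe_yield();
    }
}
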
nvartolomei added a commit to nvartolomei/redpanda that referenced this issue May 11, 2024
I spotted that in some rpfixture tests the cluster metadata uploader is
running with an incorrect port[^1]. Besides logging lots of connection
refused errors, I found it was adding non-negligible overhead to some
variations of the test. The overhead is the result of a relatively tight
loop in the `client::get_connected` method, which I'm trying to remove
in this commit.

I don't believe this retry loop is necessary. Returning the error to the
caller is much better. The callers usually have better retry mechanisms
in place with backoff and can log much more useful messages since they
have access to more context about the operation.

[^1]: We need to investigate separately why it doesn't get the updated
configuration when we configure cloud storage (this test,
redpanda-data#13275, is a good example to try) or whether it needs to be
running at all.
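
As a rough illustration of the direction described here (not the actual redpanda http client code; this get_connected is a stand-in with a simplified signature):

#include <seastar/core/future.hh>
#include <seastar/core/seastar.hh>
#include <seastar/net/api.hh>

// A single connect attempt: on failure the exceptional future simply
// reaches the caller, which already has retry/backoff machinery and more
// context for useful log messages, instead of looping here.
seastar::future<seastar::connected_socket>
get_connected(seastar::socket_address address) {
    return seastar::connect(address);
}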
nvartolomei added a commit to nvartolomei/redpanda that referenced this issue May 13, 2024
Adding a segment takes ~1ms. Adding multiple segments in a single batch
can stall the reactor for a while. This, together with a recent seastar
change[^1], caused some timeouts to fire very frequently[^2].

Fix the problematic test by splitting the batch passed to add_segments
so that we execute finer-grained tasks and yield more often to the
scheduler.

[^1]: scylladb/seastar#2238
[^2]: redpanda-data#13275
Lazin pushed a commit to Lazin/redpanda that referenced this issue Jun 1, 2024
Lazin pushed a commit to Lazin/redpanda that referenced this issue Jun 12, 2024