CI Failure (TimeoutError) in KgoRepeaterSelfTest.test_kgo_repeater
#10865
The test expects to write 4 MiB in 75 seconds. This is similar to #10500 in that a very modest throughput expectation is not being met. I suspect we have merged some test that is eating too many resources on the shared test node, consequently causing these existing tests to start failing. #10500 first appeared when we moved to shared lib builds for debug but became much more frequent recently; this issue has only just popped up.
on (amd64, container) in job https://buildkite.com/redpanda/redpanda/builds/29477#01883461-64c4-42e5-b8bf-79844bb04d52
This test succeeds locally for me but fails pretty consistently in CI, for example here: https://buildkite.com/redpanda/redpanda/builds/30044#01885b12-fd93-4abd-9941-9dca0a461938 When the test succeeds, it completes in (seconds):
When it fails in CI, the percentage complete in 75 s is:
Apparently the runtime depends heavily on the environment. To pass reliably in the current CI environment, the timeout needs to be twice as high.
56 KiB/s is indeed a modest expectation; however, the problem with the speed is that the producer sends only 1 message (4 KiB) per request. This way the producer only pushes ~11 messages per second. The average processing time of a produce request in the test is 106 ms (ok-ish); given that there are 3 nodes, even at ~11 msg/s the producer is utilizing cluster capacity at 39%. @jcsp: is 1 msg per produce request intentional in this test? If yes, the timeout needs to be increased.
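To make the numbers in this comment easier to check, here is a rough back-of-the-envelope sketch. It uses only the figures quoted in this thread (4 MiB total, 75 s timeout, 4 KiB per message, ~11 messages per second, 106 ms per produce request, 3 nodes). The utilization formula is an assumption about how the 39% figure was derived (request rate times per-request time, spread evenly over the nodes), not something confirmed by the test code.

```python
# Back-of-the-envelope check of the throughput figures quoted in the thread.
KIB = 1024
MIB = 1024 * KIB

total_bytes = 4 * MIB      # amount the test expects to write
timeout_s = 75             # test timeout
msg_bytes = 4 * KIB        # one message per produce request
observed_msg_per_s = 11    # approximate rate reported in the comment
request_time_s = 0.106     # average produce request processing time
nodes = 3

messages = total_bytes // msg_bytes                   # messages to send
required_msg_per_s = messages / timeout_s             # rate needed to pass
required_bytes_per_s = total_bytes / timeout_s        # same, in bytes/s
time_at_observed_s = messages / observed_msg_per_s    # duration at ~11 msg/s

# Assumed model: busy time per second, divided evenly across the nodes.
utilization = observed_msg_per_s * request_time_s / nodes

print(f"messages to send:      {messages}")
print(f"required rate:         {required_msg_per_s:.2f} msg/s "
      f"({required_bytes_per_s / 1000:.1f} kB/s)")
print(f"time at observed rate: {time_at_observed_s:.1f} s (timeout {timeout_s} s)")
print(f"estimated utilization: {utilization:.0%}")
```

Under these assumptions the test needs roughly 13.7 messages per second to finish in time, so a producer limited to ~11 messages per second would overrun the 75 s timeout.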
https://buildkite.com/redpanda/redpanda/builds/30108#018868e9-8cf5-4149-a046-793962299e58
Not explicitly. I'm going to disable this test in debug mode. It was stable for a long time, but there is some systemic issue in the debug tests that is making things slower and slower -- if I bump the timeout, I may have to come back and do the same thing again later.
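For readers unfamiliar with the idea, disabling a test only for debug builds can be sketched as a decorator. This is a generic illustration, not the project's actual mechanism: the `BUILD_TYPE` environment variable, `is_debug_build`, and `skip_in_debug` are hypothetical names invented for this example.

```python
import functools
import os


def is_debug_build():
    # Hypothetical convention: the build type is exposed via an env var.
    return os.environ.get("BUILD_TYPE", "release") == "debug"


def skip_in_debug(test_fn):
    """Hypothetical decorator: do not run the test body on debug builds."""
    @functools.wraps(test_fn)
    def wrapper(*args, **kwargs):
        if is_debug_build():
            print(f"skipping {test_fn.__name__}: debug build")
            return None
        return test_fn(*args, **kwargs)
    return wrapper


@skip_in_debug
def test_smoke():
    # Stand-in for a traffic-generating smoke test.
    return "ran"
```

The trade-off is the one described above: the test keeps its original timeout on release builds, and debug-mode slowness no longer forces repeated timeout bumps.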
These mostly involve sending some amount of traffic, which tends to be flaky in debug mode. These tests don't exist to test Redpanda, they're just smoke tests for the services themselves, so we don't gain much by running against debug binaries. Fixes redpanda-data#10865
These mostly involve sending some amount of traffic, which tends to be flaky in debug mode. These tests don't exist to test Redpanda, they're just smoke tests for the services themselves, so we don't gain much by running against debug binaries. Fixes redpanda-data#10865 (cherry picked from commit e3af6c2)
https://buildkite.com/redpanda/redpanda/builds/29370#01882e53-41c2-436f-81b6-45a359769d05