Update tests to act as long-running consumers #79

Merged 21 commits on Nov 9, 2017

@solsson solsson commented Oct 16, 2017

Prior to this the test would bootstrap at every run, and thus fail whenever broker 0 was down. Instead of adding more bootstrap addresses, I wanted to test that bootstrap actually works, i.e. the client gets other brokers that it can connect to.

With #78 we might break such a thing, if for example clients get external addresses at bootstrap.
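
To illustrate the difference between bootstrapping on every run and bootstrapping once, here is a toy model in Python. It is only a sketch of the reasoning, not Kafka's actual metadata protocol, and `ToyCluster`, `ToyClient`, and `refresh` are hypothetical names invented for this example.

```python
class ToyCluster:
    """Hypothetical stand-in for a broker cluster; not a real Kafka API."""

    def __init__(self, brokers):
        self.up = set(brokers)

    def kill(self, broker):
        self.up.discard(broker)

    def metadata(self, via):
        # A broker that is down cannot answer a metadata request.
        if via not in self.up:
            raise ConnectionError(f"{via} is down")
        return sorted(self.up)


class ToyClient:
    """Hypothetical client that remembers brokers learned at bootstrap."""

    def __init__(self, bootstrap):
        self.bootstrap = list(bootstrap)
        self.known = []

    def refresh(self, cluster):
        # Prefer brokers learned earlier; fall back to the bootstrap list.
        for broker in self.known or self.bootstrap:
            try:
                self.known = cluster.metadata(broker)
                return True
            except ConnectionError:
                continue
        return False
```

In this model, a client that bootstrapped while kafka-0 was up keeps working after kafka-0 goes down, whereas a client started afterwards with only kafka-0 as its bootstrap address cannot connect at all.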

solsson commented Oct 16, 2017

I don't get entirely positive results. If I kill kafka-0 and kafka-1 the testcase sees old messages for 30-40 seconds. It then recovers, and kubectl exec test-basic-with-kafkacat-[tab] -c testcase -- cat /shared/consumed.tmp indicates that we didn't lose any messages. I've also seen pod restarts.

I fail to get predictable timeouts from the kafkacat -Q in testcase, but I think it is below 10s now.

Update: I get better results, of course, with d522a89, but some brokers deleted with kubectl delete result in:

test-basic-with-kafkacat [0] offset 95
% KC_ERROR: offsets_for_times failed: Broker: Leader not available
% KC_ERROR: offsets_for_times failed: Broker: Leader not available
% KC_ERROR: offsets_for_times failed: Broker: Leader not available
% KC_ERROR: offsets_for_times failed: Broker: Leader not available
%4|1509880228.495|METADATA|rdkafka#producer-1| [thrd:main]: kafka-0.broker.kafka.svc.cluster.local:9092/bootstrap: Metadata request failed: Local: Timed out (2101ms)
% KC_ERROR: offsets_for_times failed: Broker: Leader not available
test-basic-with-kafkacat [0] offset 101

I think I should convert the java client based test also to this online style.

Update 2: I'm quite sure the errors above are from https://github.com/Yolean/kubernetes-kafka/pull/79/files#diff-54bc09f6375afd6bd0397dd27a9748e0R48, not from the producer or consumer.

  • TODO clarify in test output that errors from kafkacat -Q are normal at loss of a single broker, because it does bootstrap every time.

@solsson solsson added the v3.0 label Nov 5, 2017
solsson added a commit that referenced this pull request Nov 5, 2017
for every readiness run, so it's a good test case for the bootstrap service (#79)

solsson commented Nov 6, 2017

I'm now pushing my test nodes to the limit by filling up memory. When one node crashed, both the kafkacat test and the console producer+consumer test indicated that one message couldn't be read back in a timely manner:

OK (Test produce-consume-55cc4b4659-qq6mp@2017-11-06T07:29:44,720211997+00:00 at 2017-11-06T07:29:46)
OK (Test produce-consume-55cc4b4659-qq6mp@2017-11-06T07:29:54,708792545+00:00 at 2017-11-06T07:29:56)
Last message (at 2017-11-06T07:30:06) isn't from this test run (produce-consume-55cc4b4659-qq6mp@2017-11-06T07:30:04,709509658+00:00):
Test produce-consume-55cc4b4659-qq6mp@2017-11-06T07:29:54,708792545+00:00
OK (Test produce-consume-55cc4b4659-qq6mp@2017-11-06T07:30:14,708178686+00:00 at 2017-11-06T07:30:16)
OK (Test produce-consume-55cc4b4659-qq6mp@2017-11-06T07:30:24,708733235+00:00 at 2017-11-06T07:30:26)

test-basic-with-kafkacat [0] offset 6952
test-basic-with-kafkacat [0] offset 6953
Last message is 10.252 old:
Test test-basic-with-kafkacat-6868dd7b78-tsnf9@2017-11-06T07:29:56,160059965+00:00
test-basic-with-kafkacat [0] offset 6955
test-basic-with-kafkacat [0] offset 6956
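
The age check behind a line like "Last message is 10.252 old" can be sketched in Python as follows. This is not the test's actual implementation; the function names and the 10-second default threshold are assumptions made for illustration. Only the message format ("Test pod@timestamp") is taken from the output above.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches "Test <pod>@<timestamp>", where the timestamp has nanosecond
# precision and a comma as the decimal separator, as in the logs above.
MESSAGE_RE = re.compile(
    r"^Test (?P<pod>[^@]+)@"
    r"(?P<base>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})"
    r"[,.](?P<frac>\d+)(?P<offset>[+-]\d{2}:\d{2})$"
)


def parse_message(line):
    """Return (pod name, timezone-aware produce time) for a test message."""
    m = MESSAGE_RE.match(line.strip())
    if m is None:
        raise ValueError(f"not a test message: {line!r}")
    base = datetime.strptime(m["base"] + m["offset"], "%Y-%m-%dT%H:%M:%S%z")
    # datetime only stores microseconds, so truncate the nanosecond fraction.
    micros = int(m["frac"][:6].ljust(6, "0"))
    return m["pod"], base + timedelta(microseconds=micros)


def message_age_seconds(line, now):
    """Seconds elapsed between the produce timestamp and `now`."""
    _, produced = parse_message(line)
    return (now - produced).total_seconds()


def is_fresh(line, now, max_age_seconds=10.0):
    # The 10-second limit is an assumed value, not taken from the test.
    return message_age_seconds(line, now) <= max_age_seconds
```

A long-running consumer could call `is_fresh` on the last consumed line at each check and report the age when it returns False.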

Were they produced at all? kafkacat's debug output shows the offsets, and it looks like they've increased as expected. Using that offset as an indication, I consumed from the other test topic, and it looks like we didn't do "exactly once" :) (the 07:30:04 message appears twice):

$ k-test-kafka exec produce-consume-55cc4b4659-qq6mp -- /bin/bash -c './bin/kafka-console-consumer.sh --bootstrap-server $BOOTSTRAP --topic test-produce-consume --partition 0 --offset 6000' | grep -B 2 -A 2 'produce-consume-55cc4b4659-qq6mp@2017-11-06T07:30:04,709509658+00:00'
Test produce-consume-55cc4b4659-qq6mp@2017-11-06T07:29:44,720211997+00:00
Test produce-consume-55cc4b4659-qq6mp@2017-11-06T07:29:54,708792545+00:00
Test produce-consume-55cc4b4659-qq6mp@2017-11-06T07:30:04,709509658+00:00
Test produce-consume-55cc4b4659-qq6mp@2017-11-06T07:30:04,709509658+00:00
Test produce-consume-55cc4b4659-qq6mp@2017-11-06T07:30:14,708178686+00:00
Test produce-consume-55cc4b4659-qq6mp@2017-11-06T07:30:24,708733235+00:00
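
Spotting such repeats by eye gets tedious on longer output. A minimal Python sketch for finding duplicated messages in consumed output follows; `find_duplicates` is a hypothetical helper, not part of the tests in this pull request.

```python
from collections import Counter


def find_duplicates(lines):
    """Map each message seen more than once to the number of times it was seen."""
    counts = Counter(line.strip() for line in lines if line.strip())
    return {message: n for message, n in counts.items() if n > 1}
```

Applied to the six lines above, it would report only the 07:30:04 message, with a count of 2.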

@solsson solsson added this to the v3.0 milestone Nov 6, 2017
@solsson solsson removed the merge1.8 label Nov 7, 2017
@solsson solsson changed the title from "Use kafkacat to test failover after bootstrap with first broker" to "Update tests to act as long-running consumers" Nov 9, 2017

solsson commented Nov 9, 2017

Would be interesting to have configurable acks behavior, but let's merge this now.
