Uses ES bulk api only when there's more than one span #1146

codefromthecrypt · 2016-06-26T10:10:26Z

During a test where 100 single-span messages are sent to Kafka at the
same time, I noticed only 58-97 of them would end up in storage
even though all messages parsed properly and no operations failed.

After a 100ms/message pause was added, the store rate of this test went
to 100%, so I figured it was some sort of state issue. I noticed the code
was using Bulk operations regardless of input size, so as a wild guess
I changed it to special-case single-span messages. At least in this test,
that raised the success rate to 100% without any pausing needed.

I don't know why this worked, but it seems sensible to not use bulk apis
when there's no bulk action to perform.
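
To illustrate the branching described above, here is a minimal sketch. It
is not the code in this change: `SpanIndexer`, `RecordingIndexer` and
`SpanConsumerSketch` are hypothetical names, spans are plain strings, and
`CompletableFuture` stands in for the Guava futures the real consumer
returns. No Elasticsearch client calls are shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** Hypothetical stand-in for the Elasticsearch client; not the real API. */
interface SpanIndexer {
  CompletableFuture<Void> indexOne(String span);

  CompletableFuture<Void> indexBulk(List<String> spans);
}

/** Records which path was taken so the branch can be observed. */
final class RecordingIndexer implements SpanIndexer {
  final List<String> calls = new ArrayList<>();

  @Override public CompletableFuture<Void> indexOne(String span) {
    calls.add("index");
    return CompletableFuture.completedFuture(null);
  }

  @Override public CompletableFuture<Void> indexBulk(List<String> spans) {
    calls.add("bulk:" + spans.size());
    return CompletableFuture.completedFuture(null);
  }
}

/** Assumed shape of the consumer: only the size-based branch matters here. */
final class SpanConsumerSketch {
  private final SpanIndexer indexer;

  SpanConsumerSketch(SpanIndexer indexer) {
    this.indexer = indexer;
  }

  CompletableFuture<Void> accept(List<String> spans) {
    // Nothing to store, so complete immediately
    if (spans.isEmpty()) return CompletableFuture.completedFuture(null);
    // A single span goes through a plain index request
    if (spans.size() == 1) return indexer.indexOne(spans.get(0));
    // Create a bulk request only when there is more than one span to store
    return indexer.indexBulk(spans);
  }
}
```

With a recording stand-in like this, the single-span path can be checked
without mocking the chained client calls, although it says nothing about
why the bulk path dropped writes.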

I started to write a unit test to validate single-length lists don't use
bulk, but the Mockito involved became too verbose as the Elasticsearch
client uses chaining and other patterns that are tedious to mock.

Instead, we should make a parallel integration test and apply it to
all storage components.
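
A rough sketch of what such a parallel test could look like follows. The
`SpanStore` interface and `InMemorySpanStore` are hypothetical
placeholders, not existing types; a real test would run the same check
against each storage component instead of the in-memory set used here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical storage contract; each real component would supply its own. */
interface SpanStore {
  void accept(List<String> spans);

  int storedCount();
}

/** In-memory placeholder used only to make the sketch runnable. */
final class InMemorySpanStore implements SpanStore {
  private final Set<String> spans = ConcurrentHashMap.newKeySet();

  @Override public void accept(List<String> batch) {
    spans.addAll(batch);
  }

  @Override public int storedCount() {
    return spans.size();
  }
}

final class ParallelWriteCheck {
  /** Sends messageCount single-span messages concurrently; returns how many were stored. */
  static int storeConcurrently(SpanStore store, int messageCount) {
    // One thread per message so the writes overlap as much as possible
    ExecutorService pool = Executors.newFixedThreadPool(messageCount);
    try {
      List<Callable<Void>> tasks = new ArrayList<>();
      for (int i = 0; i < messageCount; i++) {
        final String span = "span-" + i;
        tasks.add(() -> {
          store.accept(Collections.singletonList(span));
          return null;
        });
      }
      // invokeAll blocks until every task has completed
      for (Future<Void> result : pool.invokeAll(tasks)) {
        result.get(); // surfaces any failed write
      }
      return store.storedCount();
    } catch (InterruptedException | ExecutionException e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
  }
}
```

The test would then assert that
`ParallelWriteCheck.storeConcurrently(store, 100)` returns 100 for every
storage implementation, with no pause between messages.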

See #1141

codefromthecrypt · 2016-06-26T10:11:18Z

ping @anuraaga

anuraaga · 2016-06-26T10:31:45Z

...rage/elasticsearch/src/main/java/zipkin/storage/elasticsearch/ElasticsearchSpanConsumer.java

+    if (spans.isEmpty()) return Futures.immediateFuture(null);
+
+    // Create a bulk request when there is more than one span to store
+    ListenableFuture<?> future;


Recommend final since it's a branched initialization

sadly can't unless we refactor further due to the later if (ElasticsearchStorage.FLUSH_ON_WRITES)

oh.. actually that looks not too bad.. will do

nope was too bad.. leaving un-final :P

anuraaga · 2016-06-26T12:53:51Z

LGTM - thanks for fixing (and feel the pain on the difficulty of unit testing ES client...).


codefromthecrypt mentioned this pull request Jun 26, 2016

Why is the data lost in elasticsearch ? #1141

Closed

anuraaga reviewed Jun 26, 2016

codefromthecrypt force-pushed the one-isnt-bu branch from e4b3c89 to 5ea9850 Compare June 27, 2016 00:37

codefromthecrypt force-pushed the one-isnt-bu branch from 5ea9850 to 05d5412 Compare June 27, 2016 01:00

codefromthecrypt merged commit 245e5c0 into master Jun 27, 2016

codefromthecrypt deleted the one-isnt-bu branch June 27, 2016 02:47
