Added batch spans sending #52
Thanks so much for submitting this. You're right about the leaky abstraction of the encoding of the span!
This mostly looks good, but I'm concerned that if a client doesn't call flush, then you could potentially get queued spans that are never emitted. I believe zipkin-reporter-java handles this by using a sending thread with a timeout so that the max sending delay is bounded. I wonder if we can do the same here?
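To make the bounded-delay idea concrete, here is a minimal sketch of a sender that flushes either when the batch is full or when a timer expires. The class name `TimedBatchSender`, its parameters, and the use of `threading.Timer` are assumptions for illustration only; this is neither the PR's code nor how zipkin-reporter-java is implemented.

```python
import threading


class TimedBatchSender(object):
    """Hypothetical sender: flushes on a full batch or after max_delay_s."""

    def __init__(self, transport_handler, max_batch_size=100, max_delay_s=1.0):
        self.transport_handler = transport_handler
        self.max_batch_size = max_batch_size
        self.max_delay_s = max_delay_s
        self._queue = []
        self._lock = threading.Lock()
        self._timer = None

    def add_span(self, span):
        with self._lock:
            self._queue.append(span)
            if len(self._queue) >= self.max_batch_size:
                self._flush_locked()
            elif self._timer is None:
                # First span of a new batch: start the clock so this span
                # waits at most max_delay_s before being sent.
                self._timer = threading.Timer(self.max_delay_s, self.flush)
                self._timer.daemon = True
                self._timer.start()

    def flush(self):
        with self._lock:
            self._flush_locked()

    def _flush_locked(self):
        # Caller must hold self._lock.
        if self._timer is not None:
            self._timer.cancel()
            self._timer = None
        if self._queue:
            batch, self._queue = self._queue, []
            self.transport_handler(batch)
```

With this shape, a client that never calls flush still has its spans emitted once the timer fires. The sketch calls the transport while holding the lock, which keeps it short but would block other callers during a slow send.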
@kaisen We will need to adjust a few internal tools before using this anywhere to check for the type of the thrift message: https://github.com/openzipkin/zipkin/blob/release-1.28.1/zipkin-collector/kafka/src/main/java/zipkin/collector/kafka/KafkaStreamProcessor.java#L62
py_zipkin/thrift/__init__.py
Outdated
@@ -4,7 +4,8 @@
import struct

import thriftpy
from thriftpy.protocol.binary import TBinaryProtocol
from thriftpy.protocol.binary import TBinaryProtocol, write_list_begin
We split imports into one per line. If you run pre-commit, it should fix this. I believe pre-commit should be run as part of tox.
span = create_span(
class ZipkinBatchSender(object):

MAX_PORTION_SIZE = 100
Can you make this configurable?
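One way to make the limit configurable is to keep the constant as a default and accept an override in the constructor. The sketch below assumes a parameter named `max_span_batch_size` and a simple list-based queue; both are hypothetical and not taken from the PR.

```python
class ZipkinBatchSender(object):
    """Sketch only; not the PR's actual implementation."""

    DEFAULT_MAX_BATCH_SIZE = 100

    def __init__(self, transport_handler, max_span_batch_size=None):
        self.transport_handler = transport_handler
        if max_span_batch_size is None:
            max_span_batch_size = self.DEFAULT_MAX_BATCH_SIZE
        self.max_span_batch_size = max_span_batch_size
        self.queue = []

    def add_span(self, span):
        self.queue.append(span)
        # Send as soon as the configured batch size is reached.
        if len(self.queue) >= self.max_span_batch_size:
            self.flush()

    def flush(self):
        if self.queue:
            batch, self.queue = self.queue, []
            self.transport_handler(batch)
```

Callers that do not pass a value keep the current behaviour of 100 spans per batch.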
@bplotnick if I'm reading the code right, it looks like we won't have to worry about spans not being sent, because flush() is always called when the logging context manager exits. Unless you're thinking about another corner case? Still reviewing this PR.
@kaisen My concern was not with |
@bplotnick re: flush(). I don't think the zipkin-reporter-java approach with timeouts is suitable for our case, because here we encode all spans together in zipkin_span.stop(). But we can use ZipkinBatchSender as a context manager and call flush() in __exit__().
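A rough sketch of the context-manager variant follows. The class `ContextManagedBatchSender` is a hypothetical stand-in for ZipkinBatchSender, and the queue is a plain list; the point is only that `__exit__` flushes whether or not the body raised.

```python
class ContextManagedBatchSender(object):
    """Hypothetical stand-in for ZipkinBatchSender; not the PR's actual code."""

    def __init__(self, transport_handler):
        self.transport_handler = transport_handler
        self.queue = []

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Always flush, even if the body raised, so queued spans are not lost.
        self.flush()
        return False  # do not swallow exceptions from the with-block

    def add_span(self, span):
        self.queue.append(span)

    def flush(self):
        if self.queue:
            batch, self.queue = self.queue, []
            self.transport_handler(batch)
```

Used as `with ContextManagedBatchSender(handler) as sender: ...`, the caller never has to remember to call flush().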
Guys, looking forward to your feedback about the latest changes |
looking good!
@@ -142,13 +142,9 @@ your Zipkin collector is running at localhost:9411.
import requests

def http_transport(encoded_span):
A comment describing what type/format encoded_span is expected to be would be useful.
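As an illustration of the kind of comment being asked for, here is a sketch of a transport with a docstring on `encoded_span`. It assumes that, after this PR, the handler receives an already complete thrift-encoded list of spans as bytes, and that the collector accepts thrift at `/api/v1/spans`; it also uses the standard library's `urllib.request` instead of `requests` so that it has no dependencies. The `build_request` helper is hypothetical.

```python
import urllib.request

# Assumed collector endpoint, for a collector running at localhost:9411.
ZIPKIN_SPANS_URL = 'http://localhost:9411/api/v1/spans'


def build_request(encoded_span):
    """Build the HTTP request for one batch of spans.

    :param encoded_span: thrift binary-encoded list of spans (assumption:
        the list header is already included, so no bytes are prepended)
    :type encoded_span: bytes
    """
    return urllib.request.Request(
        ZIPKIN_SPANS_URL,
        data=encoded_span,
        headers={'Content-Type': 'application/x-thrift'},
    )


def http_transport(encoded_span):
    """Send one encoded batch to the collector (see build_request)."""
    urllib.request.urlopen(build_request(encoded_span)).close()
```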
py_zipkin/logging_helper.py
Outdated
timestamp_s,
duration_s,
)
self._add_span_to_queue(thrift_span)
I don't think we need a separate function just to add it to the queue. I'm OK with _add_span_to_queue's logic being inside add_span.
py_zipkin/logging_helper.py
Outdated
message = thrift_obj_in_bytes(span)
transport_handler(message)
):
if not self.transport_handler:
I believe this check is unnecessary. A logging context is only created if perform_logging is True, which is only True if (self.zipkin_attrs or self.sampling_rate is not None), and that already requires a check for self.transport_handler.
py_zipkin/zipkin.py
Outdated
@@ -126,6 +127,9 @@ def __init__(
:param transport_handler: Callback function that takes a message parameter
and handles logging it
:type transport_handler: function
:param max_span_portion_size: Spans in a trace are sent in batches,
doesn't matter too much but i would prefer 'max_span_batch_size'. @bplotnick thoughts?
Yeah I like the idea of having "batch" somewhere in the name, considering the batch sender class is called ZipkinBatchSender
@@ -1,5 +1,5 @@
import pytest
from thriftpy.protocol.binary import TBinaryProtocol
from thriftpy.protocol.binary import TBinaryProtocol, read_list_begin
2 separate import lines for this
@kaisen, @bplotnick Please verify my latest commit. I took into account all your comments.
Guys?
Sorry for the delay @dmitry-prokopchenkov. This looks good to me. @bplotnick ?
Merged. I'll release a 0.9.0 in a bit with these changes. Thanks so much for this work @dmitry-prokopchenkov!!
@bplotnick thanks! I really need this version in pip to complete my current task) When are you going to release this?
@kaisen I released 0.9.0 yesterday, but it didn't get uploaded to the public pypi (for some reason, i thought we had this automated). Can you do this so @dmitry-prokopchenkov can use the release?
@dmitry-prokopchenkov v0.9.0 is now on pypi
@bplotnick, @kaisen Thanks!
This is a pr for #6
Guys, I have a question. I didn't provide backward compatibility in this pull request. After my changes to the thrift encoding code, this example from the docs won't work:
I think here we break the encapsulation of the thrift encoding logic by adding '\x0c\x00\x00\x00\x01'. In the pull request the count of thrift objects is determined automatically and can be hidden from the client. So the modified version of this example would be:
I can provide backward compatibility by adding a 'batch' flag to the zipkin_span constructor, but I prefer not to do it without your feedback. Please advise.
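For context on the five bytes mentioned above: in the thrift binary protocol a list starts with a one-byte element type followed by a four-byte big-endian element count, so '\x0c\x00\x00\x00\x01' reads as "a list of 1 struct". The sketch below rebuilds that header with `struct` and computes the count from the number of spans; the helper names are hypothetical and this is not the PR's encoding code.

```python
import struct

TTYPE_STRUCT = 12  # thrift type id for STRUCT (0x0c)


def thrift_list_header(element_type, count):
    """Binary-protocol list header: 1-byte element type + 4-byte big-endian size."""
    return struct.pack('>bi', element_type, count)


def encode_span_list(encoded_spans):
    """Prefix already-encoded span structs with a list header of the right length."""
    header = thrift_list_header(TTYPE_STRUCT, len(encoded_spans))
    return header + b''.join(encoded_spans)
```

Because the count comes from `len(encoded_spans)`, the client no longer needs to hard-code a header for a list of length 1.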