Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Zipkin firehose #66

Merged
merged 8 commits into from
Feb 8, 2018
Merged

Zipkin firehose #66

merged 8 commits into from
Feb 8, 2018

Conversation

bplotnick
Copy link
Contributor

@bplotnick bplotnick commented Jan 29, 2018

This is a quick and dirty implementation of the "firehose mode" (openzipkin/zipkin#1869). The full specification for firehose mode is to come, but the basic idea is that there is a second handler that emits spans regardless of sampling decision.

@msindwan @drolando

@coveralls
Copy link

coveralls commented Jan 29, 2018

Coverage Status

Coverage remained the same at 100.0% when pulling f3d0c4b on bplotnick:zipkin-firehose into 6199309 on Yelp:master.

if self.firehose_handler:
self._log_spans_with_span_sender(
ZipkinBatchSender(self.firehose_handler,
self.max_span_batch_size)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the 2 max_batch_size don't have to be the same as the zipkin api and collectors don't have the same MTU limitations we have for firehose

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this as a FIXME for this branch, since I am not sure how we want to handle this, so for now we can assume both max sizes are the same.

@@ -113,7 +113,8 @@ def __init__(
add_logging_annotation=False,
report_root_timestamp=False,
use_128bit_trace_id=False,
host=None
host=None,
firehose_handler=None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want people to re-implement the firehose handler then? I thought we had decided to implement it in py_zipkin directly. In that case this could just be a boolean flag.

Copy link
Contributor

@msindwan msindwan Jan 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I think we should keep it as a reference to a function. I'd rather see a default implementation that people can import and pass as an argument. That way, users can define their own handlers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will add a reference implementation here once we have a design doc and spec for the local firehose daemon. But, at least for now, I agree with @msindwan that we can leave it up to the user.

@@ -285,7 +288,7 @@ def start(self):

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you also need to check self.firehose_handler in line 246 (self.perform_logging = bool(...)).

github doesn't let me add a comment there since it's far from your changes :(

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call, though I'm not really sure what this code path would even do. In what case would you have no sample rate, no passed in zipkin attrs, and no ongoing trace? Maybe if you called some function that is usually an "inner span"? I dunno. Anyway, I added a test for it and fixed the issue.

@codefromthecrypt
Copy link

codefromthecrypt commented Jan 30, 2018 via email

Copy link
Contributor

@msindwan msindwan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

@@ -260,7 +264,7 @@ def start(self):
)

if not self.zipkin_attrs:
# This span is inside the context of an existing trace
# Is this span is inside the context of an existing trace?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "Is this span is inside.."

# with sample rate of 0
report_root_timestamp = True
self.zipkin_attrs = create_attrs_for_span(
sample_rate=0.0,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little confused here - wouldn't we want a 100% sampling rate on the span? Or is that not how it works?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I forgot to respond to you. As far as the attrs is concerned, the sample rate is 0%. The firehose is separate from the sample rate.

@bplotnick
Copy link
Contributor Author

@drolando @msindwan I had to pull master. I added documentation in the docstring and the README and parametrized one of the tests that was duplicated. Can you take another look?

retention. It works in tandem with normal operation, however there may
be additional overhead. In order to use this, you add a
`firehose_handler` just like you add a `transport_handler`.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If someone wanted to enable only firehose mode and disable the normal transport, they could simply set the sampling rate to 0% and provide the firehose_handler, correct? Could be worth mentioning that here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like this is implied in "regardless of sampling rate". I'm gonna merge as-is. We'll need to add additional docs about firehose anyway (e.g. example handler and proxy daemon) so if this still isn't clear I'll fix it up later

@bplotnick bplotnick changed the title WIP: Zipkin firehose Zipkin firehose Feb 8, 2018
@bplotnick bplotnick merged commit 3031ab0 into Yelp:master Feb 8, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants