Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable priority sampling by default #654

Merged
merged 3 commits into from
Dec 18, 2018

Conversation

delner
Copy link
Contributor

@delner delner commented Dec 14, 2018

This pull request enables priority sampling by default.

@delner delner added the core Involves Datadog core libraries label Dec 14, 2018
@delner delner added this to the 0.19.0 milestone Dec 14, 2018
@delner delner self-assigned this Dec 14, 2018
max_spans_before_partial_flush = options.fetch(:max_spans_before_partial_flush, nil)
min_spans_before_partial_flush = options.fetch(:min_spans_before_partial_flush, nil)
partial_flush_timeout = options.fetch(:partial_flush_timeout, nil)

@enabled = enabled unless enabled.nil?
@sampler = sampler unless sampler.nil?

if priority_sampling
if priority_sampling && [email protected]_a?(PrioritySampler)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added this check because if you configured the tracer twice, it would wrap the previous PrioritySampler with another PrioritySampler, and that seems problematic...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, we have a weird thing in dd-trace-py where we have priority_sampling default to nil so that we only try to update the sampler if the option has been provided, rather than defaulting to True.

I think this basically does the same thing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it might be good to add a comment here about why we have this check here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, I'll add that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see we updated options.fetch(:priority_sampling, true), does that mean we should have this be if !priority_sampling.nil? (sorry for my bad ruby)

Since nil is falsey, we probably only want to modify what sampler/writer we are using if we are setting priority sampling.

Also, do we need to handle :priority_smapling => false? if we reconfigure and set to false do we need to update the sampler/writer again to use a RateSampler?

Copy link
Contributor Author

@delner delner Dec 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since nil is falsey, we probably only want to modify what sampler/writer we are using if we are setting priority sampling.

Consider the following: {}.fetch(:a, true) #=> true. Users would have to explicitly specify the key with a nil value in order to pass nil.

Given the above, I think your original assertion is already the case here. The if block wouldn't run if given nil, but it should never receive nil unless given nil explicitly.

Also, do we need to handle :priority_smapling => false? if we reconfigure and set to false do we need to update the sampler/writer again to use a RateSampler?

I think if we support this, then we need to add an elsif:

elsif !priority_sampling
  @sampler = sampler || Datadog::AllSampler.new if @sampler.is_a?(PrioritySampler)
  @writer = Writer.new
end

If you want nil to be a no-op, then you'd have to change !priority_sampling to priority_sampling == false.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, sorry I misread the code. In dd-trace-py we default to None (nil), which means the user did not supply the option so we then default to True.

How do we instruct people to use #configure ?

In dd-trace-py we allow people to supply only the arguments they want to modify.

An example:

tracer.configure(priority_sampling=False)

tracer.configure(host='127.0.0.1')

Which will configure the tracer disabling priority sampling, and then configure the host to be localhost while keeping priority sampling disabled.

This might just be how we have .configure() setup for dd-trace-py, curious if this is (or should be) the same case here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, first call would deactivate, then second would reactivate. That's no good.

Changing the logic such that it defaults to nil, activates if != false, and deactivates only if == false.

@@ -366,6 +366,7 @@ def test_tracer_hard_limit_overrides_soft_limit
tracer.configure(min_spans_before_partial_flush: context.max_length,
max_spans_before_partial_flush: context.max_length,
partial_flush_timeout: 3600)
tracer.writer = FauxWriter.new
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nasty side effect of this is that everytime we reconfigure the tracer, it automatically overrides the previous writer with a new one, which puts test tracers in a bad state.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be good to add a comment here about that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately this comment will apply just about everywhere we use tracer.configure in tests so I don't think it'll help. I'd rather try to find some way to get rid of this, because this kind of practice in tests is not sustainable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something dd-trace-py did for tests in create a DummyTracer to use in all tests that will always overwrite the writer to be a DummyWriter, even after calling .configure()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was looking at that, too. Problem is configure applies a bunch of mutations to Writer, so if you have to re-initialize with a FauxWriter, then you have to copy over all the settings, otherwise you'll end up with a misconfigured Writer.

I think ideally, the code should either construct immutable Writer objects that have to be replaced, or mutable ones that are never replaced; right now we're in a kind of unhappy middle ground.

Nonetheless I'm going to try to avoid such refactoring right now, maybe there's something else I can do to work around this in the mean time.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, we have that same issue. since we isolated it into one place in a subclass of Tracer it wasn't too bad... enough to get by for now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For sure, I'll try one of these tricks to make something work for now, but it would be nice if we refactored things to be more testable. A goal for the future, perhaps.

@delner delner added the do-not-merge/WIP Not ready for merge label Dec 14, 2018
@delner
Copy link
Contributor Author

delner commented Dec 14, 2018

Looks like simply changing the option from false to true breaks the test suite in all kinds of ways. This is going to require more work than expected.

@@ -1386,6 +1378,14 @@ span.context.sampling_priority = Datadog::Ext::Priority::USER_REJECT
span.context.sampling_priority = Datadog::Ext::Priority::USER_KEEP
```

You can disable priority sampling via:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In dd-trace-py we removed this part from our docs, since disabling priority sampling has a chance of breaking distributed tracing, better to just leave this out.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point.

max_spans_before_partial_flush = options.fetch(:max_spans_before_partial_flush, nil)
min_spans_before_partial_flush = options.fetch(:min_spans_before_partial_flush, nil)
partial_flush_timeout = options.fetch(:partial_flush_timeout, nil)

@enabled = enabled unless enabled.nil?
@sampler = sampler unless sampler.nil?

if priority_sampling
if priority_sampling && [email protected]_a?(PrioritySampler)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, we have a weird thing in dd-trace-py where we have priority_sampling default to nil so that we only try to update the sampler if the option has been provided, rather than defaulting to True.

I think this basically does the same thing.

@@ -366,6 +366,7 @@ def test_tracer_hard_limit_overrides_soft_limit
tracer.configure(min_spans_before_partial_flush: context.max_length,
max_spans_before_partial_flush: context.max_length,
partial_flush_timeout: 3600)
tracer.writer = FauxWriter.new
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be good to add a comment here about that

max_spans_before_partial_flush = options.fetch(:max_spans_before_partial_flush, nil)
min_spans_before_partial_flush = options.fetch(:min_spans_before_partial_flush, nil)
partial_flush_timeout = options.fetch(:partial_flush_timeout, nil)

@enabled = enabled unless enabled.nil?
@sampler = sampler unless sampler.nil?

if priority_sampling
if priority_sampling && [email protected]_a?(PrioritySampler)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it might be good to add a comment here about why we have this check here.

test/sampler_test.rb Show resolved Hide resolved
@delner delner force-pushed the feature/priority_sampling_by_default branch from f8c2f89 to c33dc35 Compare December 17, 2018 17:35
@delner delner removed the do-not-merge/WIP Not ready for merge label Dec 17, 2018
Copy link
Member

@brettlangdon brettlangdon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking good, just a note about how we are handling #configure but otherwise, looks good

max_spans_before_partial_flush = options.fetch(:max_spans_before_partial_flush, nil)
min_spans_before_partial_flush = options.fetch(:min_spans_before_partial_flush, nil)
partial_flush_timeout = options.fetch(:partial_flush_timeout, nil)

@enabled = enabled unless enabled.nil?
@sampler = sampler unless sampler.nil?

if priority_sampling
if priority_sampling && [email protected]_a?(PrioritySampler)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see we updated options.fetch(:priority_sampling, true), does that mean we should have this be if !priority_sampling.nil? (sorry for my bad ruby)

Since nil is falsey, we probably only want to modify what sampler/writer we are using if we are setting priority sampling.

Also, do we need to handle :priority_smapling => false? if we reconfigure and set to false do we need to update the sampler/writer again to use a RateSampler?

@@ -7,7 +7,7 @@
RSpec.describe Datadog::Contrib::ActiveSupport::Notifications::Subscription do
describe 'instance' do
subject(:subscription) { described_class.new(tracer, span_name, options, &block) }
let(:tracer) { ::Datadog::Tracer.new(writer: FauxWriter.new) }
let(:tracer) { get_test_tracer }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice, 👍 for encapsulating this somewhere.

# TODO: Let's try to get rid of this override, which has too much
# knowledge about the internal workings of the tracer.
# It is done to prevent the activation of priority sampling
# from wiping out the configured test writer, by replacing it.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 agree with this comment, this is great, especially if it helps us get by, but a little hacky/nasty.

test/helper.rb Show resolved Hide resolved
@delner delner force-pushed the feature/priority_sampling_by_default branch from c33dc35 to df01e39 Compare December 18, 2018 19:40
@delner delner force-pushed the feature/priority_sampling_by_default branch from df01e39 to 956577f Compare December 18, 2018 20:38
Copy link
Member

@brettlangdon brettlangdon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good

@delner delner changed the base branch from 0.18-dev to 0.19-dev December 18, 2018 21:05
@delner delner merged commit a57074f into 0.19-dev Dec 18, 2018
@delner delner deleted the feature/priority_sampling_by_default branch December 18, 2018 21:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Involves Datadog core libraries
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants