-
Notifications
You must be signed in to change notification settings - Fork 440
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ddtrace/tracer: introduce priorities PriorityRuleSamplerKeep (3) and PriorityRuleSamplerReject (-3) for rulesSampler choices. #1012
Conversation
Hi @j-rufino, Can we clarify how this change will behave with older agents that don't know about the new priorities? Also would you link to the datadog-agent change that corresponds with this? |
Hi @knusbaum, Thanks for looking into this so quickly and for the excellent point, I should have documented that in the PR in the first place. As far as I can tell, since the first commit of priority sampler, the behaviour of the agent is to reject spans with negative priority and accept those with priority > 0 and so no corresponding changes on the agent are required for this to be the behaviour. See relevant code for this:
First commit of the priority sampler in the agent: https://github.com/DataDog/datadog-agent/blob/5f64165c8c719bdab64e63db3e4d1735f341f610/pkg/trace/sampler/prioritysampler.go#L107 I have verified that the current version of the agent works with this changes by locally setting up the agent and a test application using the modified tracer and I have verified that traces are correctly reported to the backend and visible in the UI. However, until we don't introduce some changes on the agent sampler statistics (and customers update the agent), the counter "datadog.trace_agent.receiver.traces_priority" will still report priority 2. See here. Although the changes in the agent are decoupled from this, I will do create a separate PR for that and link it here as requested ASAP. |
ddtrace/ext/priority.go
Outdated
// PrioritySamplingRuleReject informs the backend that a trace should be rejected and not stored as specified by | ||
// a sampling rule. | ||
// This should be used by the rules sampler based on the rules configured by the user | ||
PrioritySamplingRuleReject = -3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PrioritySamplingRuleReject = -3 | |
PriorityRuleSamplerReject = -3 |
I think this will make it much clearer that this is here to clarify that the "rule sampler" made this choice. WDYT?
ddtrace/ext/priority.go
Outdated
// PrioritySamplingRuleReject informs the backend that a trace should be rejected and not stored as specified by | ||
// a sampling rule. | ||
// This should be used by the rules sampler based on the rules configured by the user |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// PrioritySamplingRuleReject informs the backend that a trace should be rejected and not stored as specified by | |
// a sampling rule. | |
// This should be used by the rules sampler based on the rules configured by the user | |
// PriorityRuleSamplerReject specifies that the rule sampler has decided that this trace should be rejected. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi there! 👋🏻 Thanks for your first PR. I left a few suggestions. Hope it helps!
Hi Gabriel, thanks for the suggestions, I agree is a better naming :) applied and committed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fine by me. One thing that would be nice please, is if you could read our contribution guidelines: https://github.com/DataDog/dd-trace-go/blob/v1/CONTRIBUTING.md. I mean specifically to the PR title here, and the commit message.
For now, I will edit the title, but for the future, in your next PRs 🙂
Thank you Gabriel, I really appreciate the help and flexibility here. I should definitely have paid more attention to that, and will certainly take it into account for the next PRs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. Thanks!
Pleas note that after discussing with other integration teams we will be re-encoding these priority as -2, 3. See up to date list of priority encodings here: https://docs.google.com/document/d/1lVhgeG6uS0C9DVmN3t74tBDL7h44duwynIQqExljxRc/edit# |
Changed priority to 3,-3 for traces sampled/not based on sampling rate configuration
Problem statement and changes
So far there were 3 mechanisms to control sampling decisions in the agent, using the following priority values:
Priorities 0 and 1 being used by both the dynamic agent-tracer feedback loop and by the configuration rules sampling, has two problems, that we intend to resolve by using separate priority values (-3 and 3) when the sampling decision is based on config rules.
Problem 1: The agent will always reject spans/traces with negative priority, but it might decide to keep some with priority 0. This behaviour is not intuitive for users that intend to reject a specific % of spans/traces.
Problem 2: In the backend it's not possible to differentiate and produce stats on which spans/traces were sampled based on one mechanism vs the other.
Backwards compatibility
As far as I can tell, since the first commit of priority sampler, the behaviour of the agent is to reject spans with negative priority and accept those with priority > 0 and so no corresponding changes on the agent are required for this to be the behaviour.
See relevant code for this:
First commit of the priority sampler in the agent: https://github.com/DataDog/datadog-agent/blob/5f64165c8c719bdab64e63db3e4d1735f341f610/pkg/trace/sampler/prioritysampler.go#L107
I have verified that the current version of the agent works with this changes by locally setting up the agent and a test application using the modified tracer and I have verified that traces are correctly reported to the backend and visible in the UI.