feat: add GetSampleRateMulti #53
## Which problem is this PR solving?

Include a Stop() function so that the samplers can be shut down properly. This is currently only useful in tests.

## Short description of the changes

- Add stop function.
A lot of these changes look like they were in #54, did something strange happen with a rebase?
Could you describe an example of how this can help people using Refinery? I...think I get it, but I'm honestly not sure! Would this mean that sample rates are incorrectly reported or inflated today?
Sample rates in Refinery are calculated by asking the samplers to make their calculations once per trace. But there are a variable number of spans per trace, and Honeycomb bills by the span. So when we're sampling to limit throughput, we should be calculating these numbers based on the number of spans in the trace. When we're sampling to keep a distribution of keys, we could do it based on either spans or traces. It's not likely to make a difference unless span length is correlated with key value (possible, but probably not common). For the EMAThroughput sampler I'm currently working on, controlling throughput while still seeing representative data is the primary goal, so it will definitely be looking at span count.
Which problem is this PR solving?
This library was designed to count spans and to sample based on the assumption that every span processed would get an independent sample rate. But that's not how it's being used by Refinery -- Refinery does its calculations based on traces, so each call to GetSampleRate is likely representative of multiple spans.
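To make the mismatch concrete, here is a minimal sketch of the difference between counting one event per call and counting the spans a call represents. `toySampler`, its fixed rate, the `seen` helper, and the key string are all hypothetical, and the `GetSampleRateMulti(key string, count int) int` signature is an assumption made for illustration. This is not the library's implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// toySampler is a hypothetical stand-in for a sampler. It returns a fixed
// rate and only records how many events each key has represented.
type toySampler struct {
	mu     sync.Mutex
	counts map[string]int
	rate   int
}

func newToySampler(rate int) *toySampler {
	return &toySampler{counts: make(map[string]int), rate: rate}
}

// GetSampleRate treats every call as exactly one event.
func (s *toySampler) GetSampleRate(key string) int {
	return s.GetSampleRateMulti(key, 1)
}

// GetSampleRateMulti lets the caller state that this call stands for
// count events, for example the number of spans in a trace.
// (Assumed signature, for illustration only.)
func (s *toySampler) GetSampleRateMulti(key string, count int) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.counts[key] += count
	return s.rate
}

// seen reports how many events have been attributed to key so far.
func (s *toySampler) seen(key string) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.counts[key]
}

func main() {
	perCall := newToySampler(10)
	perSpan := newToySampler(10)

	// Three traces with 1, 50, and 3 spans, all sharing one key.
	for _, spans := range []int{1, 50, 3} {
		perCall.GetSampleRate("GET /checkout")
		perSpan.GetSampleRateMulti("GET /checkout", spans)
	}

	// Prints "3 54": the per-call sampler believes it saw 3 events,
	// while the span-aware one saw the 54 spans that would be billed.
	fmt.Println(perCall.seen("GET /checkout"), perSpan.seen("GET /checkout"))
}
```

A sampler that targets throughput from the first count would be working from 3 events when the traces actually contain 54 spans, which is the gap described above.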
Short description of the changes