-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] [jaeger-v2] Refactor Configurations For Adaptive Sampling Processor #6039
Conversation
Signed-off-by: Mahad Zaryab <[email protected]>
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #6039 +/- ##
==========================================
- Coverage 96.92% 96.91% -0.02%
==========================================
Files 349 349
Lines 16599 16603 +4
==========================================
+ Hits 16088 16090 +2
- Misses 328 329 +1
- Partials 183 184 +1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
// LeaderLeaseRefreshInterval is the duration to sleep if this processor is elected leader before | ||
// attempting to renew the lease on the leader lock. NB. This should be less than FollowerLeaseRefreshInterval | ||
// to reduce lock thrashing. | ||
LeaderLeaseRefreshInterval time.Duration `mapstructure:"leader_lease_refresh_interval"` | ||
|
||
// CalculationInterval determines how often new probabilities are calculated. E.g. if it is 1 minute, | ||
// new sampling probabilities are calculated once a minute and each bucket will contain 1 minute worth | ||
// of aggregated throughput data. | ||
CalculationInterval time.Duration `mapstructure:"calculation_interval"` | ||
// FollowerLeaseRefreshInterval is the duration to sleep if this processor is a follower | ||
// (ie. failed to gain the leader lock). | ||
FollowerLeaseRefreshInterval time.Duration `mapstructure:"follower_lease_refresh_interval"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yurishkuro do you have any thoughts on how to group these? I was thinking grouping them into a type like Synchronization
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not seeing this change as an improvement. The new subgroupings are not logical or better than a flat list. Maybe this config is already good as is.
@@ -38,7 +38,8 @@ extensions: | |||
# path: ./cmd/jaeger/sampling-strategies.json | |||
adaptive: | |||
sampling_store: some_store | |||
initial_sampling_probability: 0.1 | |||
sampling: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The whole struct here is "adaptive sampling", I think repeating "sampling" is redundant, does not add any clarity.
// TODO implement manual overrides per service/operation. | ||
TargetSamplesPerSecond float64 `mapstructure:"target_samples_per_second"` | ||
Calculation Calculation `mapstructure:"calculation"` | ||
Sampling Sampling `mapstructure:"sampling"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this is an improvement. Most flags are related to calculation anyway. Some are related to boundary conditions, like min rate.
@yurishkuro Here is what the current full config would look like remote_sampling:
adaptive:
sampling_store: some_store
target_samples_per_second:
delta_tolerance:
calculation_interval:
aggregation_buckets:
calculation_buckets:
calculation_delay:
initial_sampling_probability:
min_sampling_probability:
min_samples_per_second:
leader_lease_refresh_interval:
follower_lease_refresh_interval:
http:
grpc: I saw two main groupings here. One for the collection of the data (sampling), and one for the computations (calculation). The config from the current PR would look something like: remote_sampling:
adaptive:
calculation:
aggregation_buckets:
buckets:
delay:
delta_tolerance:
interval:
sampling:
store:
initial_probability:
min_probability:
min_rate:
target_rate:
# potentially a better grouping for the following two
leader_lease_refresh_interval:
follower_lease_refresh_interval:
http:
grpc: Let me know what you think. Do you see any other groupings or do you want to keep the current flat list? I can adjust the migration guide accordingly and close this PR out if we don't want any changes here. |
I find the flat list easier to read right now. The calc/sampling separation is artificial, looks like it saves typing one word, but actually makes naming harder to understand. I don't think it's worth changing. |
Closing as per discussion above |
Which problem is this PR solving?
Description of the changes
How was this change tested?
Checklist
jaeger
:make lint test
jaeger-ui
:yarn lint
andyarn test