Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] [jaeger-v2] Refactor Configurations For Adaptive Sampling Processor #6039

Closed

Conversation

mahadzaryab1
Copy link
Collaborator

@mahadzaryab1 mahadzaryab1 commented Oct 3, 2024

Which problem is this PR solving?

Description of the changes

How was this change tested?

  • CI / Unit Tests / Integration Tests

Checklist

Copy link

codecov bot commented Oct 3, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.91%. Comparing base (5598400) to head (f96fe53).
Report is 7 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6039      +/-   ##
==========================================
- Coverage   96.92%   96.91%   -0.02%     
==========================================
  Files         349      349              
  Lines       16599    16603       +4     
==========================================
+ Hits        16088    16090       +2     
- Misses        328      329       +1     
- Partials      183      184       +1     
Flag Coverage Δ
badger_v1 7.98% <ø> (ø)
badger_v2 1.81% <ø> (ø)
cassandra-4.x-v1 15.75% <ø> (ø)
cassandra-4.x-v2 1.74% <ø> (ø)
cassandra-5.x-v1 15.75% <ø> (ø)
cassandra-5.x-v2 1.74% <ø> (ø)
elasticsearch-6.x-v1 18.69% <ø> (ø)
elasticsearch-7.x-v1 18.74% <ø> (-0.02%) ⬇️
elasticsearch-8.x-v1 18.94% <ø> (ø)
elasticsearch-8.x-v2 1.80% <ø> (ø)
grpc_v1 9.51% <ø> (ø)
grpc_v2 7.12% <ø> (+0.01%) ⬆️
kafka-v1 9.69% <ø> (ø)
kafka-v2 1.81% <ø> (ø)
memory_v2 1.81% <ø> (ø)
opensearch-1.x-v1 18.79% <ø> (-0.02%) ⬇️
opensearch-2.x-v1 18.80% <ø> (+0.01%) ⬆️
opensearch-2.x-v2 1.81% <ø> (+0.01%) ⬆️
tailsampling-processor 0.46% <ø> (ø)
unittests 95.70% <100.00%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Comment on lines +46 to +53
// LeaderLeaseRefreshInterval is the duration to sleep if this processor is elected leader before
// attempting to renew the lease on the leader lock. NB. This should be less than FollowerLeaseRefreshInterval
// to reduce lock thrashing.
LeaderLeaseRefreshInterval time.Duration `mapstructure:"leader_lease_refresh_interval"`

// CalculationInterval determines how often new probabilities are calculated. E.g. if it is 1 minute,
// new sampling probabilities are calculated once a minute and each bucket will contain 1 minute worth
// of aggregated throughput data.
CalculationInterval time.Duration `mapstructure:"calculation_interval"`
// FollowerLeaseRefreshInterval is the duration to sleep if this processor is a follower
// (ie. failed to gain the leader lock).
FollowerLeaseRefreshInterval time.Duration `mapstructure:"follower_lease_refresh_interval"`
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yurishkuro do you have any thoughts on how to group these? I was thinking grouping them into a type like Synchronization.

Copy link
Member

@yurishkuro yurishkuro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not seeing this change as an improvement. The new subgroupings are not logical or better than a flat list. Maybe this config is already good as is.

@@ -38,7 +38,8 @@ extensions:
# path: ./cmd/jaeger/sampling-strategies.json
adaptive:
sampling_store: some_store
initial_sampling_probability: 0.1
sampling:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The whole struct here is "adaptive sampling", I think repeating "sampling" is redundant, does not add any clarity.

// TODO implement manual overrides per service/operation.
TargetSamplesPerSecond float64 `mapstructure:"target_samples_per_second"`
Calculation Calculation `mapstructure:"calculation"`
Sampling Sampling `mapstructure:"sampling"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this is an improvement. Most flags are related to calculation anyway. Some are related to boundary conditions, like min rate.

@mahadzaryab1
Copy link
Collaborator Author

mahadzaryab1 commented Oct 3, 2024

I'm not seeing this change as an improvement. The new subgroupings are not logical or better than a flat list. Maybe this config is already good as is.

@yurishkuro Here is what the current full config would look like

  remote_sampling:
    adaptive:
      sampling_store: some_store
      target_samples_per_second: 
      delta_tolerance: 
      calculation_interval: 
      aggregation_buckets: 
      calculation_buckets: 
      calculation_delay: 
      initial_sampling_probability: 
      min_sampling_probability: 
      min_samples_per_second: 
      leader_lease_refresh_interval: 
      follower_lease_refresh_interval: 
    http:
    grpc:

I saw two main groupings here. One for the collection of the data (sampling), and one for the computations (calculation). The config from the current PR would look something like:

  remote_sampling:
    adaptive:
      calculation: 
        aggregation_buckets:
        buckets:
        delay:
        delta_tolerance:
        interval:
      sampling:
        store: 
        initial_probability:
        min_probability: 
        min_rate:
        target_rate: 
      # potentially a better grouping for the following two
      leader_lease_refresh_interval: 
      follower_lease_refresh_interval: 
    http:
    grpc:

Let me know what you think. Do you see any other groupings or do you want to keep the current flat list? I can adjust the migration guide accordingly and close this PR out if we don't want any changes here.

@yurishkuro
Copy link
Member

I find the flat list easier to read right now. The calc/sampling separation is artificial, looks like it saves typing one word, but actually makes naming harder to understand. I don't think it's worth changing.

@mahadzaryab1
Copy link
Collaborator Author

Closing as per discussion above

@mahadzaryab1 mahadzaryab1 deleted the refactor-adaptive branch October 31, 2024 22:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[jaeger-v2] Refactor Adaptive Sampling Processor Configurations
2 participants