[WIP] [jaeger-v2] Refactor Configurations For Adaptive Sampling Processor #6039

mahadzaryab1 · 2024-10-03T00:26:32Z

Which problem is this PR solving?

Resolves [jaeger-v2] Refactor Adaptive Sampling Processor Configurations #6021

Description of the changes

Migration Guide

How was this change tested?

CI / Unit Tests / Integration Tests

Checklist

I have read https://github.com/jaegertracing/jaeger/blob/master/CONTRIBUTING_GUIDELINES.md
I have signed all commits
I have added unit tests for the new functionality
I have run lint and test steps successfully
- for jaeger: make lint test
- for jaeger-ui: yarn lint and yarn test

Signed-off-by: Mahad Zaryab <[email protected]>

codecov · 2024-10-03T00:36:47Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.91%. Comparing base (5598400) to head (f96fe53).
Report is 7 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #6039      +/-   ##
==========================================
- Coverage   96.92%   96.91%   -0.02%     
==========================================
  Files         349      349              
  Lines       16599    16603       +4     
==========================================
+ Hits        16088    16090       +2     
- Misses        328      329       +1     
- Partials      183      184       +1

Flag	Coverage Δ
badger_v1	`7.98% <ø> (ø)`
badger_v2	`1.81% <ø> (ø)`
cassandra-4.x-v1	`15.75% <ø> (ø)`
cassandra-4.x-v2	`1.74% <ø> (ø)`
cassandra-5.x-v1	`15.75% <ø> (ø)`
cassandra-5.x-v2	`1.74% <ø> (ø)`
elasticsearch-6.x-v1	`18.69% <ø> (ø)`
elasticsearch-7.x-v1	`18.74% <ø> (-0.02%)`	⬇️
elasticsearch-8.x-v1	`18.94% <ø> (ø)`
elasticsearch-8.x-v2	`1.80% <ø> (ø)`
grpc_v1	`9.51% <ø> (ø)`
grpc_v2	`7.12% <ø> (+0.01%)`	⬆️
kafka-v1	`9.69% <ø> (ø)`
kafka-v2	`1.81% <ø> (ø)`
memory_v2	`1.81% <ø> (ø)`
opensearch-1.x-v1	`18.79% <ø> (-0.02%)`	⬇️
opensearch-2.x-v1	`18.80% <ø> (+0.01%)`	⬆️
opensearch-2.x-v2	`1.81% <ø> (+0.01%)`	⬆️
tailsampling-processor	`0.46% <ø> (ø)`
unittests	`95.70% <100.00%> (-0.02%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

mahadzaryab1 · 2024-10-03T01:48:50Z

plugin/sampling/strategyprovider/adaptive/options.go

+	// LeaderLeaseRefreshInterval is the duration to sleep if this processor is elected leader before
+	// attempting to renew the lease on the leader lock. NB. This should be less than FollowerLeaseRefreshInterval
+	// to reduce lock thrashing.
+	LeaderLeaseRefreshInterval time.Duration `mapstructure:"leader_lease_refresh_interval"`

-	// CalculationInterval determines how often new probabilities are calculated. E.g. if it is 1 minute,
-	// new sampling probabilities are calculated once a minute and each bucket will contain 1 minute worth
-	// of aggregated throughput data.
-	CalculationInterval time.Duration `mapstructure:"calculation_interval"`
+	// FollowerLeaseRefreshInterval is the duration to sleep if this processor is a follower
+	// (ie. failed to gain the leader lock).
+	FollowerLeaseRefreshInterval time.Duration `mapstructure:"follower_lease_refresh_interval"`


@yurishkuro do you have any thoughts on how to group these? I was thinking grouping them into a type like Synchronization.

yurishkuro

I'm not seeing this change as an improvement. The new subgroupings are not logical or better than a flat list. Maybe this config is already good as is.

yurishkuro · 2024-10-03T05:30:42Z

cmd/jaeger/config.yaml

@@ -38,7 +38,8 @@ extensions:
    #   path: ./cmd/jaeger/sampling-strategies.json
    adaptive:
      sampling_store: some_store
-      initial_sampling_probability: 0.1
+      sampling:


The whole struct here is "adaptive sampling", I think repeating "sampling" is redundant, does not add any clarity.

yurishkuro · 2024-10-03T05:33:07Z

plugin/sampling/strategyprovider/adaptive/options.go

-	// TODO implement manual overrides per service/operation.
-	TargetSamplesPerSecond float64 `mapstructure:"target_samples_per_second"`
+	Calculation Calculation `mapstructure:"calculation"`
+	Sampling    Sampling    `mapstructure:"sampling"`


I'm not sure this is an improvement. Most flags are related to calculation anyway. Some are related to boundary conditions, like min rate.

mahadzaryab1 · 2024-10-03T11:22:16Z

I'm not seeing this change as an improvement. The new subgroupings are not logical or better than a flat list. Maybe this config is already good as is.

@yurishkuro Here is what the current full config would look like

  remote_sampling:
    adaptive:
      sampling_store: some_store
      target_samples_per_second: 
      delta_tolerance: 
      calculation_interval: 
      aggregation_buckets: 
      calculation_buckets: 
      calculation_delay: 
      initial_sampling_probability: 
      min_sampling_probability: 
      min_samples_per_second: 
      leader_lease_refresh_interval: 
      follower_lease_refresh_interval: 
    http:
    grpc:

I saw two main groupings here. One for the collection of the data (sampling), and one for the computations (calculation). The config from the current PR would look something like:

  remote_sampling:
    adaptive:
      calculation: 
        aggregation_buckets:
        buckets:
        delay:
        delta_tolerance:
        interval:
      sampling:
        store: 
        initial_probability:
        min_probability: 
        min_rate:
        target_rate: 
      # potentially a better grouping for the following two
      leader_lease_refresh_interval: 
      follower_lease_refresh_interval: 
    http:
    grpc:

Let me know what you think. Do you see any other groupings or do you want to keep the current flat list? I can adjust the migration guide accordingly and close this PR out if we don't want any changes here.

yurishkuro · 2024-10-03T14:33:26Z

I find the flat list easier to read right now. The calc/sampling separation is artificial, looks like it saves typing one word, but actually makes naming harder to understand. I don't think it's worth changing.

mahadzaryab1 · 2024-10-03T22:40:20Z

Closing as per discussion above

Refactor Configurations For Adaptive Sampling Processor

f96fe53

Signed-off-by: Mahad Zaryab <[email protected]>

mahadzaryab1 commented Oct 3, 2024

View reviewed changes

yurishkuro reviewed Oct 3, 2024

View reviewed changes

mahadzaryab1 closed this Oct 3, 2024

mahadzaryab1 deleted the refactor-adaptive branch October 31, 2024 22:35

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WIP] [jaeger-v2] Refactor Configurations For Adaptive Sampling Processor #6039

[WIP] [jaeger-v2] Refactor Configurations For Adaptive Sampling Processor #6039

mahadzaryab1 commented Oct 3, 2024 •

edited

Loading

codecov bot commented Oct 3, 2024 •

edited

Loading

mahadzaryab1 Oct 3, 2024

yurishkuro left a comment

yurishkuro Oct 3, 2024

yurishkuro Oct 3, 2024

mahadzaryab1 commented Oct 3, 2024 •

edited

Loading

yurishkuro commented Oct 3, 2024

mahadzaryab1 commented Oct 3, 2024

[WIP] [jaeger-v2] Refactor Configurations For Adaptive Sampling Processor #6039

[WIP] [jaeger-v2] Refactor Configurations For Adaptive Sampling Processor #6039

Conversation

mahadzaryab1 commented Oct 3, 2024 • edited Loading

Which problem is this PR solving?

Description of the changes

How was this change tested?

Checklist

codecov bot commented Oct 3, 2024 • edited Loading

Codecov Report

mahadzaryab1 Oct 3, 2024

Choose a reason for hiding this comment

yurishkuro left a comment

Choose a reason for hiding this comment

yurishkuro Oct 3, 2024

Choose a reason for hiding this comment

yurishkuro Oct 3, 2024

Choose a reason for hiding this comment

mahadzaryab1 commented Oct 3, 2024 • edited Loading

yurishkuro commented Oct 3, 2024

mahadzaryab1 commented Oct 3, 2024

mahadzaryab1 commented Oct 3, 2024 •

edited

Loading

codecov bot commented Oct 3, 2024 •

edited

Loading

mahadzaryab1 commented Oct 3, 2024 •

edited

Loading