Tail sampling #263

alexmojaki · 2024-06-14T10:58:51Z

There are some details I'm unsure about, such as defaults, but I'd like to get this through so we can test it out for ourselves. It could be quite useful in our backend, especially in places where we're using random trace sampling. Documentation and maybe some tweaking (including env var configuration) can come after we've used it.

Usage in a nutshell:

import logfire

logfire.configure(
    tail_sampling=logfire.TailSamplingOptions(
        # These are the defaults of TailSamplingOptions; tail_sampling itself is None by default.
        level='notice',  # include traces with at least one span/log at this level or higher
        duration=1.0,  # include traces with at least this duration
    ),
    # Also include 10% of traces randomly from the beginning, before checking the other conditions.
    trace_sample_rate=0.1,
)

Spans are buffered in memory while a trace is still undecided, so improper usage will eat up memory.
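To make the two conditions and the memory concern concrete, here is a minimal, self-contained sketch of the decision logic. It is not the code in this PR: `Span`, `ToyTailSampler`, `LEVEL_ORDER` and the `is_root` flag are hypothetical names for illustration, and the sketch only reacts when spans end, which is a simplification.

```python
from __future__ import annotations

from dataclasses import dataclass

# Illustrative level ordering, lowest to highest (an assumption for this sketch).
LEVEL_ORDER = ['trace', 'debug', 'info', 'notice', 'warn', 'error', 'fatal']


@dataclass
class Span:
    trace_id: int
    level: str
    start: float  # seconds
    end: float  # seconds
    is_root: bool = False


class ToyTailSampler:
    """Hypothetical sampler: buffers spans per trace until the trace is notable."""

    def __init__(self, level: str = 'notice', duration: float = 1.0) -> None:
        self.min_level = LEVEL_ORDER.index(level)
        self.duration = duration
        self.buffers: dict[int, list[Span]] = {}  # undecided traces held in memory
        self.included: set[int] = set()  # traces already accepted and still open
        self.exported: list[Span] = []

    def _notable(self, spans: list[Span]) -> bool:
        high_level = any(LEVEL_ORDER.index(s.level) >= self.min_level for s in spans)
        elapsed = max(s.end for s in spans) - min(s.start for s in spans)
        return high_level or elapsed >= self.duration

    def on_end(self, span: Span) -> None:
        if span.trace_id in self.included:
            # Trace was accepted earlier: pass later spans straight through.
            self.exported.append(span)
            if span.is_root:
                self.included.discard(span.trace_id)
            return
        buffer = self.buffers.setdefault(span.trace_id, [])
        buffer.append(span)
        if self._notable(buffer):
            # Flush everything buffered so far for this trace.
            self.exported.extend(self.buffers.pop(span.trace_id))
            if not span.is_root:
                self.included.add(span.trace_id)
        elif span.is_root:
            # Trace finished without becoming notable: drop it.
            del self.buffers[span.trace_id]
```

In this sketch every undecided trace keeps its spans in `buffers`, so many long-lived, quiet traces would hold memory until their root span ends.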

cloudflare-workers-and-pages · 2024-06-14T10:59:22Z

Deploying logfire-docs with Cloudflare Pages

Latest commit:	`80e8f4e`
Status:	✅ Deploy successful!
Preview URL:	https://c7bf0f28.logfire-docs.pages.dev
Branch Preview URL:	https://alex-tail-sampling.logfire-docs.pages.dev

codecov · 2024-06-14T11:08:40Z

Codecov Report

All modified and coverable lines are covered by tests ✅

samuelcolvin · 2024-06-21T13:51:57Z

logfire/_internal/exporters/processor_wrapper.py

-    def shutdown(self) -> None:
-        self.processor.shutdown()
-
-    def force_flush(self, timeout_millis: int = 30000) -> bool:


Is this the method we were using to get logfire to work with AWS Lambda? If so, I guess we need it.

It's moved to the base class.
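As a rough illustration of what "moved to the base class" means here, the sketch below shows a wrapper base class that delegates lifecycle methods so subclasses only override what they change. The class names are hypothetical stand-ins, not the real classes in this PR or in OpenTelemetry.

```python
from __future__ import annotations


class RecordingProcessor:
    """Hypothetical innermost processor that records what it receives."""

    def __init__(self) -> None:
        self.ended: list[str] = []
        self.flushed = False
        self.closed = False

    def on_end(self, span: str) -> None:
        self.ended.append(span)

    def shutdown(self) -> None:
        self.closed = True

    def force_flush(self, timeout_millis: int = 30000) -> bool:
        self.flushed = True
        return True


class ToyWrapperProcessor:
    """Base wrapper: every method forwards to the wrapped processor."""

    def __init__(self, processor: RecordingProcessor) -> None:
        self.processor = processor

    def on_end(self, span: str) -> None:
        self.processor.on_end(span)

    def shutdown(self) -> None:
        self.processor.shutdown()

    def force_flush(self, timeout_millis: int = 30000) -> bool:
        return self.processor.force_flush(timeout_millis)


class UppercaseWrapper(ToyWrapperProcessor):
    """Subclass overrides only on_end; shutdown and force_flush are inherited."""

    def on_end(self, span: str) -> None:
        super().on_end(span.upper())
```

With this shape, removing `shutdown` and `force_flush` from a subclass does not remove the behaviour, since the base class still forwards both calls.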

adriangb

Welp I guess you merged this already but please address comments

adriangb · 2024-06-21T14:24:11Z

logfire/_internal/config.py

+    tail_sampling: TailSamplingOptions | None
+    """Tail sampling options"""


Could just be a TypedDict?

I tried that, I've also wanted ConsoleOptions and PydanticPlugin to be typed dicts, but they seemed worse. They can't define defaults in the class, and passing in a plain dict is actually not that user friendly.

Makes sense
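The trade-off discussed above can be sketched as follows. `TailSamplingDict`, `TailSamplingConfig` and `resolve` are hypothetical names for illustration and are not the actual `TailSamplingOptions` definition.

```python
from dataclasses import dataclass
from typing import Optional, TypedDict


class TailSamplingDict(TypedDict, total=False):
    """TypedDict variant: keys are typed, but defaults cannot live here."""

    level: str
    duration: Optional[float]


@dataclass
class TailSamplingConfig:
    """Dataclass variant: defaults are declared next to the fields."""

    level: Optional[str] = 'notice'
    duration: Optional[float] = 1.0


def resolve(options: TailSamplingDict) -> TailSamplingConfig:
    # With a TypedDict, defaults have to be applied somewhere else, e.g. here.
    return TailSamplingConfig(
        level=options.get('level', 'notice'),
        duration=options.get('duration', 1.0),
    )
```

Calling `TailSamplingConfig()` yields the defaults directly, whereas an empty `TailSamplingDict()` is just `{}` until something like `resolve` fills in the gaps.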

adriangb · 2024-06-21T14:25:37Z

logfire/_internal/config.py

+            # Avoid using the usual sampler if we're using tail-based sampling.
+            # The TailSamplingProcessor will handle the random sampling part as well.
+            sampler = (
+                ParentBasedTraceIdRatio(self.trace_sample_rate)
+                if self.trace_sample_rate < 1 and self.tail_sampling is None
+                else None
+            )


How do we document / tell users they can't combine these? Is there a world where I want to head sample down to 10% (to reduce overhead in the SDK) and then tail sample down to 1%? There are still advantages to head sampling.

They can combine them, see the PR body or the new test_random_sampling.

> Is there a world where I want to head sample down to 10% (to reduce overhead in the SDK) and then tail sample down to 1%?

I don't know what this means. Tail sample down to a percentage?

It sounds like you want to be able to discard most spans up front randomly regardless of whether tail-sampling would include them, so that a span only gets through if it's 'notable' AND 'lucky'.

Yes exactly
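The difference between the two semantics in this thread can be sketched with a small simulation. The PR body describes random sampling as an additional way to be included ('notable' OR 'lucky'), while the request here is for 'notable' AND 'lucky'. The function names and rates below are assumptions for illustration only.

```python
import random
from typing import Tuple


def include_either(notable: bool, lucky: bool) -> bool:
    """Semantics described in the PR body: random sampling adds extra traces."""
    return notable or lucky


def include_both(notable: bool, lucky: bool) -> bool:
    """Semantics requested in this thread: random sampling filters notable traces."""
    return notable and lucky


def simulate(n_traces: int, sample_rate: float, notable_rate: float, seed: int = 0) -> Tuple[int, int]:
    """Count how many traces each rule would keep for the same random inputs."""
    rng = random.Random(seed)
    either = both = 0
    for _ in range(n_traces):
        lucky = rng.random() < sample_rate
        notable = rng.random() < notable_rate
        either += include_either(notable, lucky)
        both += include_both(notable, lucky)
    return either, both
```

For example, `simulate(10_000, sample_rate=0.1, notable_rate=0.05)` returns a pair of counts in which the first is never smaller than the second, since every trace kept under AND is also kept under OR.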

adriangb · 2024-06-21T14:27:50Z

logfire/_internal/exporters/tail_sampling.py

+        return self.started[0][0]
+
+
+class TailSamplingProcessor(WrapperSpanProcessor):


I'm really starting to feel like we should stop adding layers / wrapping processors like this. I would prefer to have a single processor that handles everything (sampling, batching, retries, etc.). I feel like it could be optimized more and would be easier to understand. It can still have pluggable bits, just more explicit and less abstract.

This layer wraps all processors, including the console and user-defined processors.

I get that. But I feel like we should just make a LogfireSpanProcessor which does all of those things in one place. In particular, we'd avoid double buffering.

adriangb · 2024-06-21T14:29:03Z

logfire/_internal/exporters/tail_sampling.py

+        self.duration: float = (
+            float('inf') if options.duration is None else options.duration * ONE_SECOND_IN_NANOSECONDS
+        )


I'd like these durations to have the unit in their name. So self.duration -> self.duration_ns and TailSamplingOptions.duration -> TailSamplingOptions.duration_sec or something like that.
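A minimal sketch of what unit-suffixed names could look like, assuming the same seconds-to-nanoseconds conversion as the diff above. The helper functions and the `duration_sec` parameter are hypothetical, following the naming suggested in this comment rather than anything merged.

```python
import math
from typing import Optional

ONE_SECOND_IN_NANOSECONDS = 1_000_000_000


def duration_threshold_ns(duration_sec: Optional[float]) -> float:
    """Convert a user-facing threshold in seconds to nanoseconds.

    None means "never include a trace based on duration", modelled as infinity.
    """
    if duration_sec is None:
        return math.inf
    return duration_sec * ONE_SECOND_IN_NANOSECONDS


def is_long_enough(start_ns: int, end_ns: int, duration_sec: Optional[float]) -> bool:
    """Compare an elapsed time in nanoseconds against the threshold."""
    return end_ns - start_ns >= duration_threshold_ns(duration_sec)
```

With the unit in each name, a call such as `is_long_enough(start_ns, end_ns, duration_sec=1.0)` makes it harder to pass seconds where nanoseconds are expected.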

alexmojaki added 3 commits June 13, 2024 14:46

tail sampling

cc7211d

test_level_sampling

fa714d2

test_duration_sampling

6cf0ba1

alexmojaki added 2 commits June 14, 2024 13:02

3.8

f01cdfd

3.8

486d35e

alexmojaki added 9 commits June 14, 2024 17:45

Test (de)serializing tail_sampling config

a251545

pragma

8be336b

no param manager

f5c4de6

docs

5d1491b

random

3ef6d5e

comments

9bf994e

comments

4996d74

comments

789124b

Merge branch 'main' of github.com:pydantic/logfire into alex/tail-sam…

b39de8a

…pling

alexmojaki marked this pull request as ready for review June 18, 2024 18:56

alexmojaki requested review from adriangb, samuelcolvin and Kludex June 18, 2024 18:56

samuelcolvin reviewed Jun 21, 2024

Merge branch 'main' into alex/tail-sampling

80e8f4e

alexmojaki enabled auto-merge (squash) June 21, 2024 14:38

alexmojaki merged commit e811c48 into main Jun 21, 2024
11 checks passed

alexmojaki deleted the alex/tail-sampling branch June 21, 2024 14:40

adriangb reviewed Jun 21, 2024

alexmojaki mentioned this pull request Aug 26, 2024

Remove default_span_processor parameter from configure #400

Merged

alexmojaki mentioned this pull request Sep 26, 2024

Add metrics parameter to logfire.configure() #444

Merged
