Dynamically resolved metadata (normalized_metadata) in core #2048

CAD97 · 2022-04-04T18:08:01Z

(With humble apologies to the CODEOWNERS getting pinged for mechanical changes...)

Best reviewed commit-by-commit. I've gone to lengths to ensure that each commit is a semantically meaningful chunk on its own and builds/passes all tests.

This is a "what if" mainly to scratch my own itch. Feel free to say "we don't want this", as it is a not-insignificant chunk of additional complexity and makes broadly breaking changes to core APIs. However, I do think this is generally beneficial, thus the implementation and PR.

PR #2043 offers a lesser solution which works for [email protected]. This is generally the "principled" version of that hack.

Motivation

Consider the case where I'm binding to an external library that provides C FFI. (For convenience, we'll call it lib.) Because they're trying to provide a nicely usable library, lib performs some diagnostic logging.

By default, lib logs directly to stderr. However, because lib wants to integrate well into your application, they provide a callback API for you to handle lib's logs. The interface looks something like this:

extern "C" {
    /// Configure delivery of debug messages.
    fn Lib_Configure_Debug(
        flags: Lib_Debug_Flags,
        callback: Option<extern "C" fn(
            // The flags (e.g. severity) for this debug message.
            flags: Lib_Debug_Flags,
            // The file name that this debug message is from.
            file: *const c_char,
            // The line number that this debug message is from.
            line: c_int,
            // The semantic location that this debug message is from.
            func: *const c_char,
            // The actual debug message to be e.g. logged.
            message: *const c_char,
        ) -> Lib_Result>,
    ) -> Lib_Result;
}

We can then hook up lib to tracing like this:

let enable_flags = determine_enabled();
ffi::Lib_Configure_Debug(enable_flags, Some(lib_debug_callback))?;

extern "C" fn lib_debug_callback(
    flags: ffi::Lib_Debug_Flags,
    file: *const ffi::c_char,
    line: ffi::c_int,
    func: *const ffi::c_char,
    message: *const ffi::c_char,
) -> ffi::Lib_Result {
    let level = debug_flags_to_level(flags)?;
    let file: Option<&str> = cstr(file)?;
    let line: u32 = line.try_into()?;
    let func: Option<&str> = cstr(func)?;
    let message: Option<&str> = cstr(message)?;
    
    tracing::event!(
        level: level,
        // wait... how do we specify file/line?
        message
    );
    
    ffi::Lib_Result_Ok
}

In current tracing, it's impossible to hook this debug callback into tracing properly, because Metadata is 'static, and the metadata is where file/line/module_path live.
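As an aside, the callback above relies on a cstr helper that is never shown. A minimal sketch of what it might look like follows; the signature, the NULL-to-None mapping, and the error type are assumptions, and it is written as an unsafe fn (the snippet above elides the unsafe needed to call it).

```rust
use std::ffi::{c_char, CStr};
use std::str::Utf8Error;

/// Hypothetical helper: borrow a C string as `&str`, mapping NULL to `None`.
///
/// # Safety
/// `ptr` must be NULL or point to a NUL-terminated string that stays valid
/// and unmodified for the caller-chosen lifetime `'a`.
unsafe fn cstr<'a>(ptr: *const c_char) -> Result<Option<&'a str>, Utf8Error> {
    if ptr.is_null() {
        return Ok(None);
    }
    // SAFETY: the caller guarantees `ptr` is a valid NUL-terminated string.
    let s = unsafe { CStr::from_ptr(ptr) };
    // Fails if the bytes are not valid UTF-8; no data is copied.
    s.to_str().map(Some)
}
```

Note that the strings only live as long as lib keeps them alive, which is exactly why they cannot be placed into a 'static Metadata.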

I want it to be possible to do so. And hey, why doesn't tracing-log have this same problem?..

Solution

tracing-log solves the issue by conspiring with approximately the entire tracing ecosystem. Specifically, tracing-log contains the NormalizeEvent trait, whose purpose is to take whatever static metadata the event thinks it has and provide the real, normalized metadata instead.

Event normalization is done in tracing-log by identifying that an event came from a tracing-log callsite, and then using the fields of that event to smuggle in the dynamic metadata that log provides.

That's where the conspiring comes into play. Either code knows about tracing-log, uses NormalizeEvent, and filters out the event fields log.target, log.module_path, log.file, and log.line for is_log() events¹, or it treats events emitted from log's macros incorrectly, potentially making incorrect choices based on the metadata and logging the metadata as semantic log fields.
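To make the field-smuggling idea concrete, here is a simplified model of normalization. Meta, Value, and normalize are hypothetical stand-ins written for this sketch, not the types or functions in tracing or tracing-log; only the log.* field names come from the description above.

```rust
/// Simplified stand-in for `tracing`'s metadata; not the real type.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Meta<'a> {
    target: &'a str,
    file: Option<&'a str>,
    line: Option<u64>,
}

/// Simplified field value, standing in for what a visitor would record.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Value<'a> {
    Str(&'a str),
    U64(u64),
}

/// Build the "real" metadata from the callsite's static metadata plus the
/// `log.*` fields smuggled in on the event. Unknown fields are ignored.
fn normalize<'a>(base: Meta<'a>, fields: &[(&str, Value<'a>)]) -> Meta<'a> {
    let mut out = base;
    for &(name, value) in fields {
        match (name, value) {
            ("log.target", Value::Str(s)) => out.target = s,
            ("log.file", Value::Str(s)) => out.file = Some(s),
            ("log.line", Value::U64(n)) => out.line = Some(n),
            _ => {}
        }
    }
    out
}
```

A consumer that skips this step sees only the placeholder metadata of the shared callsite, which is the "treats events incorrectly" case.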

This PR takes much the same approach, but by doing so directly in tracing-core, it can be mostly transparent to consumers. However, we additionally support dynamic metadata holding any number of custom fields, so checking for a known callsite is no longer sufficient.

To identify a piece of dynamic metadata, we use a magic name. The name is constructed of two randomly chosen noncharacters, so it should be sufficiently unique to not occur accidentally. Each metadata field has a magic name prefixed with the same magic, and these fields are used to normalize the metadata as in tracing-log.
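A rough illustration of the magic-name check follows. The two code points used here (U+FDD0 and U+FDEF) are placeholders picked for this sketch; the PR's actual noncharacters are not stated in this description, and the function names are hypothetical.

```rust
/// Hypothetical magic prefix: two Unicode noncharacters. The PR's actual
/// code points are not given here; U+FDD0 and U+FDEF are placeholders.
const MAGIC: &str = "\u{FDD0}\u{FDEF}";

/// True if `name` marks dynamic metadata or one of its injection fields.
fn is_magic(name: &str) -> bool {
    name.starts_with(MAGIC)
}

/// Which piece of metadata a magic field overrides (e.g. "file"), if any.
fn magic_kind(name: &str) -> Option<&str> {
    name.strip_prefix(MAGIC)
}
```

Because noncharacters are reserved for internal use and not meant for interchange, an ordinary field name is very unlikely to start with this prefix.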

To hide the magic fields when not requested, we require them to be a strict (and ordered) prefix² of the metadata's FieldSet. As such, we can slice off the prefix and only leave the nonmagic fields in the set. For events' ValueSet, we require by construction³ that events with dynamic metadata use the normalized FieldSet. For spans' attributes... a solution has yet to be designed³, but will likely take the same approach.
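The prefix requirement is what makes hiding cheap: the visible field set is just a subslice. A minimal sketch over plain field names, assuming a placeholder MAGIC prefix and a hypothetical split_magic helper (the real FieldSet is more involved):

```rust
/// Placeholder magic prefix (two noncharacters); an assumption of this sketch.
const MAGIC: &str = "\u{FDD0}\u{FDEF}";

/// Split a field list into its leading run of magic fields and the
/// remaining, publicly visible fields. Relies on the magic fields being a
/// strict prefix, so no allocation or filtering is needed.
fn split_magic<'s, 'n>(fields: &'s [&'n str]) -> (&'s [&'n str], &'s [&'n str]) {
    let magic_len = fields
        .iter()
        .take_while(|name| name.starts_with(MAGIC))
        .count();
    fields.split_at(magic_len)
}
```

The offset introduced by the visible slice's start is the bookkeeping that the second footnote suggests a postfix layout would avoid.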

Each commit is a semantic step towards this goal, and compiles and passes tests individually. For more information, see each commit, which has additional context attached as well. To accomplish dynamic metadata, we:

1. Add a lifetime to field::Visit, which communicates the lifetime of any visited string. Existing visitors can do impl Visit<'_> for the existing semantics.
2. Use the lifetime on field::Visit to remove the unsafety in tracing-log. The lifetime will allow doing so generically in the uplifted solution for spans as well as events.
3. Uplift the RecordFields trait into tracing-core.
4. Remove 'static from Metadata::name, so it can be injected as well, to replace the magic name for spans. (The metadata name is typically unused for events, but used for spans. Spans using dynamic metadata are expected to be less common, but will certainly want to inject a name for post-normalization.)
5. Make places that don't need &'static Metadata<'static> use a shorter lifetime, so they can work with dynamic metadata.
6. The big meat: implement the actual support for dynamic metadata. The mere act of providing the patch data is simple enough, but presenting a normalized view to the outside world is where things get interesting. Event and tracing-subscriber's SpanRef should present consistent post-normalization views unless prenormal data is requested.
7. Payoff: use the system we've implemented to implement tracing-log.
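The first two steps are easier to see with a toy model of the visitor. The trait below is a simplified stand-in, not the real field::Visit (which has more methods and different signatures); FileVisitor, CountVisitor, and record_all are hypothetical names for this sketch.

```rust
/// Simplified model of a field visitor. `'a` is the lifetime of any string
/// handed to the visitor.
trait Visit<'a> {
    fn record_str(&mut self, name: &str, value: &'a str);
}

/// Keeps a borrowed file name. Without `'a` on the trait, the borrow could
/// not safely outlive the `record_str` call.
struct FileVisitor<'a> {
    file: Option<&'a str>,
}

impl<'a> Visit<'a> for FileVisitor<'a> {
    fn record_str(&mut self, name: &str, value: &'a str) {
        if name == "log.file" {
            self.file = Some(value);
        }
    }
}

/// A visitor that retains nothing works for every `'a`, which corresponds
/// to the pre-existing semantics.
struct CountVisitor {
    strings: usize,
}

impl<'a> Visit<'a> for CountVisitor {
    fn record_str(&mut self, _name: &str, _value: &'a str) {
        self.strings += 1;
    }
}

/// Stand-in for a value set: hands each stored string to the visitor.
fn record_all<'a, V: Visit<'a>>(fields: &[(&str, &'a str)], visitor: &mut V) {
    for &(name, value) in fields {
        visitor.record_str(name, value);
    }
}
```

The borrow checker now ties the extracted &str to the data being visited, so no unsafe lifetime extension is needed to pull the file name out.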

Remaining work

- Tests, tests, tests. tracing-log acts as a simple proof that the system works, but we need more tests ensuring that magic fields don't leak into the public API except where explicitly requested.
- APIs to actually emit spans/events with dynamic metadata.
- Did I mention more tests?

Footnotes

¹ And here's an interesting note: currently, they do so by just always filtering out fields starting with "log." This means potential data loss for events which don't come from the log crate; a tracing::error!(log.data, "oh no"); will not record the log.data field for most subscribers. Even more annoying, consider putting log.file; it's not enough to filter out just the magic fields, they have to be filtered only when NormalizeEvent::is_log().
² When writing this, I realized that having them as a postfix may be simpler, as FieldSet::contains(Field) can continue to be a simple bounds check rather than having to account for an offset slice start.
³ While there is a macro for creating the dynamic metadata itself, there is no public API for constructing/dispatching the event or span with said metadata. tracing-log continues to use its bespoke dispatch code.
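The data-loss concern in the first footnote can be shown with plain field names. Both functions below are hypothetical models of subscriber-side filtering, not code from any existing subscriber; the is_log flag stands in for NormalizeEvent::is_log().

```rust
/// The field names tracing-log uses to smuggle metadata.
const LOG_FIELDS: [&str; 4] = ["log.target", "log.module_path", "log.file", "log.line"];

/// Naive filtering: drop anything starting with "log.", regardless of
/// where the event came from. This loses user fields such as `log.data`.
fn naive_visible<'a>(fields: &[&'a str]) -> Vec<&'a str> {
    fields
        .iter()
        .copied()
        .filter(|name| !name.starts_with("log."))
        .collect()
}

/// Gated filtering: only hide the known smuggling fields, and only for
/// events that actually came from the log crate.
fn gated_visible<'a>(fields: &[&'a str], is_log: bool) -> Vec<&'a str> {
    fields
        .iter()
        .copied()
        .filter(|name| !(is_log && LOG_FIELDS.iter().any(|known| known == name)))
        .collect()
}
```

With the gated version, a native tracing event carrying log.data or even log.file keeps those fields, while a real log event still has its smuggled metadata hidden.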

CAD97 · 2022-04-28T13:45:13Z

Technically closes #372, though supplying a dynamic level still isn't advised per the added documentation.

CAD97 · 2022-06-14T19:17:28Z

(Sorry for flooding the actions queue with the rebase, but for this large PR I feel showing/proving each commit independently passes CI is important)

gferon · 2023-02-02T13:31:25Z

@CAD97 thanks for this change; we were going down this exact same road before learning that this PR existed. Is it something that you're willing to drive to the end?

CAD97 · 2023-02-05T00:08:17Z

@gferon I'm still excited to see something along these lines happen, and I've rebased the PR. Unfortunately, I don't have much time I can spend on refining this at this exact moment. I did my best to make sure each commit is a single unit and fairly easily understood in isolation; if y'all want to take this and go from here, or take it as a draft and redo it, I'm fine either way.

I might end up with some time to spend, in which case I'd love to spend that driving this change, but I can't promise anything at the moment; it's my final semester at $gradschool currently and that takes priority 🙂

Partially an exercise in "how bad would it be?" This allows the removal of the unsafety in tracing-log's LogVisitor (see next commit), as well as generalization of NormalizeEvent to more RecordFields containers.

Changes like this are the purpose of the previous commit. Now that Visit has a lifetime parameter, it can be used to extract &str data in a safe and general manner. While tracing-log is already sound (as indicated by this safe impl being valid), the lifetime unlocks more generic uses.

For the time being, RecordFields has also been unsealed. There doesn't seem to be a specific reason to seal it, other than it just existing to generalize over the provided types, and not being an input generalization. It can be resealed without issue; however, it being in tracing-core is necessary for what's coming up.

Yes, I'm as surprised as you are that this causes 0 issues.

Now that places are using non-'static metadata, we can overlay a patched view of the metadata. Using the strategy seen in tracing-log, we recognize specifically named metadata/fields and use the values they provide to create metadata with the resolved runtime values. Importantly, we also strive to only expose dynamic injection fields when explicitly asked for. This means that most code can continue on without knowing that these fields even exist.

Applying these patches does have a cost; notably, things which used to deal in references like &Metadata or &FieldSet now often use an owned, patched version instead. These types are primarily bags of references themselves, and no copying of actual data is involved, but the fact that the bags of references (which aren't exactly tiny) are copied does have an impact. More places which don't have to deal with metadata patching (e.g. the macro support code) should be audited for what can use the _prenormal versions of methods.

Additionally, there are likely places where the magic fields leak out without being explicitly asked for. Ideally, these are all bugs which can be fixed. There aren't really any tests that effectively check for this, so new tests will need to be written.

Finally, there's the question of actually dispatching such a dynamic event. tracing-log (implemented in the next commit) is an example of doing so, but it uses touchy, hidden API to construct the ValueSet; a generally applicable API is likely desirable. This will probably take the shape of dynamic_span! and dynamic_event! macros in tracing proper.
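The "owned bag of references" trade-off can be sketched as follows. Meta, Patch, and patched are hypothetical simplifications for illustration, not the types this commit introduces.

```rust
/// Simplified metadata: a small, copyable bag of references.
#[derive(Clone, Copy, Debug)]
struct Meta<'a> {
    name: &'a str,
    target: &'a str,
    file: Option<&'a str>,
    line: Option<u32>,
}

/// Runtime-resolved overrides; `None` means "keep the static value".
#[derive(Clone, Copy, Debug)]
struct Patch<'a> {
    target: Option<&'a str>,
    file: Option<&'a str>,
    line: Option<u32>,
}

/// Overlay `patch` on `base`. The result is an owned `Meta`, but only the
/// references are copied; the string data itself is shared.
fn patched<'a>(base: &Meta<'a>, patch: &Patch<'a>) -> Meta<'a> {
    Meta {
        name: base.name,
        target: patch.target.unwrap_or(base.target),
        file: patch.file.or(base.file),
        line: patch.line.or(base.line),
    }
}
```

Callers that used to hold a &Meta now hold a Meta by value, which is the copying cost described above; the shorter lifetime 'a is what lets a 'static base be combined with strings that only live for the duration of a callback.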

And now we reach the point of this commit series. tracing-log is now capable of implementing its dynamic metadata resolution solely on publicly available APIs*, and thus so can anyone else.

*Dispatch still uses #[doc(hidden)] APIs to create the value set. Figuring out how best to expose this is still an open question.

emersonford · 2023-02-07T19:11:01Z

just a note, this would make #2457 way cleaner to implement and would allow it to be an independent crate :)

gferon · 2023-02-07T19:15:05Z

> just a note, this would make #2457 way cleaner to implement and would allow it to be an independent crate :)

This is exactly what I'm trying to pursue as well, but with C++ code using spdlog, and this PR would indeed help a lot.

CAD97 force-pushed the metadata-injection branch from 8fd1166 to 5588714 Compare April 4, 2022 18:08

CAD97 marked this pull request as ready for review April 4, 2022 21:11

CAD97 requested review from jtescher, a team, yaahc, hawkw, carllerche and davidbarsky as code owners April 4, 2022 21:11

CAD97 mentioned this pull request Apr 28, 2022

Dynamic level support #372

Open

This was referenced Jun 11, 2022

Introduce a lifetime generic to Visit in order to allow borrowing out specific parts of the valuable::Value during visits tokio-rs/valuable#97

Open

Expose tracing-log's metadata normalization system #2043

Closed

CAD97 force-pushed the metadata-injection branch from 8fd1166 to 557103e Compare June 14, 2022 19:14

CAD97 force-pushed the metadata-injection branch 3 times, most recently from 61a1379 to 161f938 Compare June 14, 2022 20:04

This was referenced Jul 25, 2022

log: Support structured values #341

Open

subscriber: rewrite Registry to use static arrays #2230

Open

CAD97 force-pushed the metadata-injection branch 2 times, most recently from df959a2 to 9763bdc Compare February 5, 2023 00:05

CAD97 added 7 commits February 4, 2023 18:09

Add lifetime to field::Visit

d4c8525

Remove 'static requirement from Metadata::name

c001fea

Make more things work with non-'static metadata

6afe420

CAD97 force-pushed the metadata-injection branch from 9763bdc to 6d8bce1 Compare February 5, 2023 00:10

KodrAus mentioned this pull request Feb 27, 2024

Release a build with kv support rust-lang/log#616

Closed

3 tasks
