-
Notifications
You must be signed in to change notification settings - Fork 651
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add log-formatting to the otel-LogHandler #3673
base: main
Are you sure you want to change the base?
Conversation
|
…before emitting it as the body, and retain all meaningful attributes of the original logrecord.
I think this is against the semantic conventions. To understand what I mean, see #3675 in otel we have standard semantic conventions for things like |
@bogdandrutu Yeah it is. Scanning over |
It is ok to create an issue, but would be worried merging this PR since then it becomes a breaking change. |
In that case I'll try to find all semantic conventions that exist for the standard logrecord attributes and add them with the right names at least. It might actually be just the |
Ok, I've updated my fork with your changes and suggestions. |
opentelemetry-sdk/src/opentelemetry/sdk/_logs/_internal/__init__.py
Outdated
Show resolved
Hide resolved
opentelemetry-sdk/src/opentelemetry/sdk/_logs/_internal/__init__.py
Outdated
Show resolved
Hide resolved
@bogdandrutu @mariojonke if it's not a bother, I'd like to ask if the PR in its current state could be approved, or if there is something left that I should improve on. Thank you for your time. |
Discussed in 07/11/2024 SIG. Today we have an exclude list of attributes (which essentially includes all of python log record attributes) that are ignored when mapping to Otel LogRecord attributes. We do recognize that the transformation of python logging record to Otel LogRecord is confusing for Python users (especially ones that are coming in to use formatters expecting certain attributes to exist). We believe this issue of discovery and transparency is a separate issue from the inability to allow formatting of logrecords. We can address the discovery issue separately (whether it be part of this PR or not is up to @a-recknagel :) or you can open up a separate tracking issue to maybe improve our documentation). The PR as is is valuable in itself without introducing any attribute changes. @a-recknagel if you agree with this guidance, please update the pr with any changes you think are necessary to reflect the conclusion and let people know when it is ready to review. Thanks for your hard work! |
@lzchen now that I read the pr again, I guess we should have also discussed the matter of the breaking change, my bad. Should it be picked up in the next SIG again? That would give me a bit of time to write the docs, too. |
Yes of course! Feel free to add your topic to the agenda for the Python. Thanks a lot for your patience and hard work on this issue :) |
|
I don't know why it's there, it's not part of the issue text so I can't remove it. |
# Conflicts: # CHANGELOG.md
@lzchen @jeremydvoss I have removed the code which changed the attributes on log-records in order to keep this PR more in line with its intended purpose of adding log-formatting, and only that. I hope this will make it easier to add this change to the next release. |
self.assertTrue("division by zero" in stack_trace) | ||
self.assertTrue(__file__ in stack_trace) | ||
self.assertIsNotNone(log_record) | ||
self.assertIn("Zero Division Error", log_record.body) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This hides a bad breaking change. Exception messages now weirdly add a newline and NoneType parameter.
|
||
self.assertIsNotNone(log_record) | ||
self.assertEqual(len(log_record.attributes), 4) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why this was removed. Attributes should stay the same even when formatted.
I recommend closing this PR since #4166 has been merged. The other changes that were in the earlier iterations of this PR can be separated into a new PR. |
Description
Added support for log message formatters to
opentelemetry.sdk._logs._internal.LogHandler
.Fixes #3664
Fixes #3353
Type of change
self.format
call on the originalLogRecord
to the code in the LogHandler'semit
method, the result is added as thebody
field of the otel-LogRecord
.LogRecord
fields from getting added to the otel-LogRecord
. I identified three sections, 1) those that are considered "private" by theLogRecord
API (see a note "You shouldn’t need to format this yourself" in https://docs.python.org/3/library/logging.html#logrecord-attributes) or only get used to cache results, 2) those that have a semantic convention and have their key translated accordingly, and 3) those that do not have a semantic convention (yet) and just get plain dropped.The documentation should include an example which shows how exactly adding a formatter changes the result of what gets emitted to the receiver. I think there is some element of surprise here, because the standard procedure is that, if a formatter was set, the handler emits a string whose content is wholly dictated by said formatter.
The breaking change is due to me removing a check where a non-string
msg
attribute in the originalLogRecord
will be attached to the otel-LogRecord
as-is. Now, theformat
call will always return a string, even if no formatter was set andargs
is empty. I believe this is the right behavior and makes the codebase easier to maintain. Code that looked like thiscould, before this change, be interpreted as a complex data structure by some log-renderer. I'd argue that the "right way of logging" (with this example here illustrating a migration-path) should be the following:
How Has This Been Tested?
I wrote three new unit tests, and updated a few other ones that broke due to the change.
Does This PR Require a Contrib Repo Change?
Checklist:
Breaking Change for the log-record's body-field
Type change
Important context here is the formal definition of the body-field in the otel data-model docs. Summed up, the key points are:
any
, so exporters need to support it as suchThe proposed patch would change the possible type of the emitted data to effectively be just
string
. This is not in conflict with the data-model, sincestring
is a subtype ofany
. Exporters should still supportany
to be compliant with the spec, even if it would not occur in the reference implementation (i.e. users can of course choose to subclass otel'sLoggingHandler
and overwrite itsemit
method to write something else than astring
into the emitted log-record's body-field).Content Change
Given a logging-call like
Formerly, the content of the
body
field of the resulting record would always be just"foo"
. After this change, runninglogging.basicConfig
would attach a formatter usingBASIC_FORMAT
which transform that message into"INFO:foo:bar"
on the body field. If you'd want to keep the former behavior, you'd have to avoid usinglogging.basicConfig
or explicitly switch its formatter withlogging.Formatter("%(message)s")
afterwards.Excursion: What are the
args
in a log-record for?If you take a look at the signature of
Logger.log
(which is whatLogger.debug
,Logger.info
, etc. call with their respectivelevel
), there is an*args
in there, which is also persisted in theLogRecord
as an attribute.Given a complex object such as
its purpose is to safe time and effort on log-messages with expensive serialization operations to avoid constructs like
where you don't want to spend 5 seconds waiting for
complex_object
to be turned into a string just to not end up emitting the result at all because your loglevel was higher than debug.The same can be achieved with just
And that's it, that's all that the
*args
inLogger.log
should reasonably be used for. In practice, this rarely ever comes up.Examples
We are going to use this script to test the behavior of the logging for the current main-branch (original), and what it would look like after applying this patch (patched):
otel_logging.py
The output is quite verbose, so I'll only show the actual differences here, namely the content of the
body
-field.original
Note that the original type of the first call-argument is retained, string-formatting of the
args
took place, but the specified logging-formatter was not called. As can be seen in the last call,None
has also been transformed intonull
by the otel processor/exporter, which will be picked up on in the next section.patched
Note that all of these bodies are strings.
Justification of the change and discarded alternatives
Leave as-is
Python developers expect log-formatting to work. I acknowledge that as far as otel is concerned it is in a bit of an awkward space, since it could be considered a kind of "rendering" that should be handled in the frontend, and not within the client. But it's a staple of python's standard logging, there is an always-on default, and its absence caused a fair bit of confusion already (see linked issues).
Also, the current behavior is buggy for non-jsonable objects. If there were an additional logging event
logging.info(set())
, it would not appear in the output, and instead be quietly discarded.Only apply log-formatting if the first argument is a string
This would keep the option open to emit
any
-type arguments, but would in my personal opinion be inconsistent behavior and cause confusion. It would share the bug of the current implementation with respect to non-jsonable objects, and the output would look like this:mixed
A possible issue here could be that the bodies lacking the format-information might fall through if dashboard-users are not aware of this discrepancy, e.g. "I know something went wrong in app.py, so I'll search for all bodies with
*.\[app\.\d+\].*
". Following the least astonishment principle and being consistent with how the body looks in general would be preferable in such a case.Merge this PR
Log-formatting would work in a predictable, consistent, expected, and robust manner. We would lose the option to log structured data though, but at least for dictionaries there is a migration path. While sparingly documented (in the docs of
Logger.debug
, if you scroll a bit), you can pass dictionaries during logging with theextra
-keyword and have its values persisted as log-record attributes:This is not new functionality, it was supported for a long time and would, as far as I can tell, provide a clean migration-path for users currently relying on passing structured data passed in log-events retaining a parseable shape:
Note that
qux
was filtered out though, sinceNone
is not a valid attribute-value as decided by theAttributeValue
type and the_clean_attribute
function applying it.