LogRecord Resource Serialization #3346

sernst · 2023-06-13T13:46:37Z

Description

Currently, the LogRecord.to_json method serializes the resource object using repr of its attributes. This differs from how the serialization process is handled in ReadableSpan.to_json and MetricsData.to_json, which utilize the
Resource.to_json functionality directly. Using repr does not produce a json-parseable output and doesn't follow the same depth of serialization as the other two signal types. Therefore, this change carries over the serialization process from the spans and metrics signal types to the logs type.

Fixes #3345

Type of change

Please delete options that are not relevant.

[ X ] Bug fix (non-breaking change which fixes an issue)
[ X ] Breaking change (fix or feature that would cause existing functionality to not work as expected)

How Has This Been Tested?

Ran tox as specified in CONTRIBUTING.md and tested within a simple test application using the ConsoleLogExporter to see the changes echoed in the terminal.

Does This PR Require a Contrib Repo Change?

Yes. - Link to PR:
[ X ] No.

Checklist:

[ X ] Followed the style guidelines of this project
[ X ] Changelogs have been updated
[ X ] Unit tests have been added - modified in this case
[ X ] Documentation has been updated - Not Needed

lzchen · 2023-06-15T16:37:48Z

opentelemetry-sdk/src/opentelemetry/sdk/_logs/_internal/__init__.py

@@ -211,9 +211,9 @@ def to_json(self, indent=4) -> str:
                if self.span_id is not None
                else "",
                "trace_flags": self.trace_flags,
-                "resource": repr(self.resource.attributes)
+                "resource": json.loads(self.resource.to_json())


Please pass in the indent to to_json()

@lzchen The serialized JSON is immediately deserialized before being reserialized again as part of the larger data structure. The indent should only matter in terms of consistency for that final serialization process, right?

See comparable ReadableSpan that behaves identically:

opentelemetry-python/opentelemetry-sdk/src/opentelemetry/sdk/trace/__init__.py

Line 493 in 6b9f389

f_span["resource"] = json.loads(self.resource.to_json())

To add some context, this to_json was originally added for debugging purposes to be used in ConsoleExporter. That's why there is a repr.

@lzchen 🎗️

Hi @sernst, what do you think about making a change to -- instead of converting the Resource to JSON then immediately converting that JSON to an object -- refactor the code a bit to add a method to Resource, e.g. to_dictionary that returns a dictionary representation that both to_json can use and that can be used here, e.g.

"resource": self.resource.to_dictionary()

Totally understand that your code here is consistent with the rest of the codebase, and no worries if you feel my suggestion would be out of scope -- we can definitely make these changes in a separate PR.

@pmcollins Yeah, I agree that avoiding the conversion to JSON and back with a direct dictionary serialization is a much better way to go. I would suggest that this should go beyond just the resource object. I noticed the pattern heavily used in within the sub-objects of ReadableSpan, MetricsData and LogRecord conversion. I think it would make sense to make all of the objects involved in the serialization process capable of dictionary serialization without intermediate JSON serialization.

I've just added a second commit to this PR, which does just that:

64946b4

These new changes introduce to_dict to the objects included in the existing JSON serialization process for ReadableSpan, MetricsData, LogRecord, and Resource objects. This includes adding to_dict to objects that are included within the serialized data structures of these objects. In places where repr() serialization was used, it has been replaced by a JSON-compatible serialization instead. Inconsistencies between null and empty string values were preserved, but in cases where attributes are optional, an empty dictionary is provided as well to be more consistent with cases where attributes are not optional and an empty dictionary represents no attributes were specified on the containing object.

These changes also included:

Dictionary typing was included for all the to_dict methods for clarity in subsequent usage.

DataT and DataPointT were did not include the exponential histogram types in point.py, and so those were added with new to_json and to_dict methods as well for consistency. It appears that the exponential types were added later and including them in the types might have been overlooked. Please let me know if that is a misunderstanding on my part.

OrderedDict was removed in a number of places associated with the existing to_json functionality given its redundancy for Python 3.7+ compatibility. I was assuming this was legacy code for previous compatibility, but please let me know if that's not the case as well.

to_dict was added to objects like SpanContext, Link, and Event that were previously being serialized by static methods within the ReadableSpan class and accessing private/protected members. This simplified the serialization in the ReadableSpan class and those methods were removed. However, once again, let me know if there was a larger purpose to those I could not find.

Finally, I used to_dict as the method names here to be consistent with other related usages. For example, dataclasses.asdict(). But, mostly because that was by far the most popular usage within the larger community:

328k files found on GitHub that define to_dict functions, which include some of the most popular Python libraries to date:

https://github.com/search?q=%22def+to_dict%28%22+language%3APython&type=code&p=1&l=Python

versus

3.3k files found on GitHub that define to_dictionary functions:
https://github.com/search?q=%22def+to_dictionary%28%22+language%3APython&type=code&l=Python

However, if there is a preference for this library to use to_dictionary instead let me know and I will adjust.

Wow awesome solution @sernst! Yes, it appears that to_dict is a better choice 😄

Do you think your second commit warrants its own PR? (I'm not sure what the convention is in this repo). Thanks.

Thanks! Yeah, I'm not sure what the preference would be here either. I'm happy to separate it out into another PR if that is preferred.

Changes from the second commit discussed here have been moved into a separate PR:

#3365

tracking against #3364

@srikanthccv In response to yesterday's comment, please see previous on this thread for those topics already being discussed and actions taken. Would have been helpful to have feedback in context rather than a bifurcated thread. Thanks!

pmcollins · 2023-06-20T14:52:22Z

opentelemetry-sdk/tests/logs/test_log_record.py

@@ -31,7 +31,7 @@ def test_log_record_to_json(self):
                "trace_id": "",
                "span_id": "",
                "trace_flags": None,
-                "resource": "",
+                "resource": None,


Might be nice to add a test to verify that a non-empty Resource also converts to json correctly.

Yep, certainly agree that coverage was missing. As a first time contributor here I was trying to things slim, but I'm happy to add it:

https://github.com/open-telemetry/opentelemetry-python/pull/3346/files#diff-c202e3816c113727d0d32d83fa4becda1632ae14c6bf62b7421e96cb936b2c85R45

pmcollins

Changes LGTM and seem like a nice improvement.

ocelotl · 2023-06-28T14:35:35Z

opentelemetry-sdk/src/opentelemetry/sdk/_logs/_internal/__init__.py

+    def to_dict(self) -> LogRecordDict:
+        return {
+            "body": self.body,
+            "severity_number": getattr(self.severity_number, "value", None),


There is an UNSPECIFIED severity value, should we used it here instead of None?

Yep, that seems reasonable to me. Made the change and updated the LogRecordDict to reflect that severity_number will be an int instead of an Optional[int].

srikanthccv · 2023-07-01T15:55:46Z

This added a lot of public symbols and changes that are not related to the original issue in the last commit. Please keep the changes limited to the linked issue. I wouldn't add all the to_dict public API classes or any new symbols. Create separate issues and discuss them with maintainers/approvers because we try to keep the public API small.

Currently, the `LogRecord.to_json` method serializes the resource object using `repr` of its attributes. This differs from how the serialization process is handled in `ReadableSpan.to_json` and `MetricsData.to_json`, which utilize the `Resource.to_json` functionality directly. Using `repr` does not produce a json-parseable output and doesn't follow the same depth of serialization as the other two signal types. Therefore, this change carries over the serialization process from the spans and metrics signal types to the logs type. Fixes open-telemetry#3345

ocelotl · 2023-07-12T16:51:14Z

@sernst there are failing tests. Should the expected output include 0 dropped attributes?

lzchen · 2023-08-16T22:05:49Z

@sernst

bump. Are you still working on this?

sernst · 2023-08-18T12:54:14Z

@lzchen Yep, will re-engage here soon.

ocelotl · 2024-03-20T17:58:57Z

This PR needs fixes, marking it as draft.

xrmx · 2024-06-26T08:13:54Z

Closing since #3972 did the roughly the same change

related to fix in open-telemetry/opentelemetry-python#3346

sernst requested a review from a team June 13, 2023 13:46

lzchen reviewed Jun 15, 2023

View reviewed changes

lzchen approved these changes Jun 15, 2023

View reviewed changes

pmcollins reviewed Jun 20, 2023

View reviewed changes

sernst force-pushed the fix-log-redord-json-resource-export branch from a63486a to 587581e Compare June 20, 2023 19:56

pmcollins approved these changes Jun 23, 2023

View reviewed changes

ocelotl reviewed Jun 28, 2023

View reviewed changes

sernst force-pushed the fix-log-redord-json-resource-export branch from 64946b4 to 5e47fcd Compare June 28, 2023 21:38

sernst force-pushed the fix-log-redord-json-resource-export branch from 5e47fcd to 0dff7dd Compare July 2, 2023 13:31

sernst mentioned this pull request Jul 2, 2023

Inefficient and undertyped to_json public data structures #3364

Open

sernst force-pushed the fix-log-redord-json-resource-export branch 2 times, most recently from cb1a3ce to 37de27a Compare July 2, 2023 13:55

ocelotl marked this pull request as draft March 20, 2024 17:58

harelmo-lumigo mentioned this pull request Apr 7, 2024

Support logging lumigo-io/opentelemetry-python-distro#725

Merged

xrmx closed this Jun 26, 2024

eugene-lumigo pushed a commit to lumigo-io/opentelemetry-python-distro that referenced this pull request Aug 29, 2024

fixing test_logging

13e6efe

related to fix in open-telemetry/opentelemetry-python#3346

eugene-lumigo mentioned this pull request Aug 30, 2024

TRAC-1-Support-python-3.12-in-python-otel-distro lumigo-io/opentelemetry-python-distro#873

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

LogRecord Resource Serialization #3346

LogRecord Resource Serialization #3346

sernst commented Jun 13, 2023

lzchen Jun 15, 2023

sernst Jun 16, 2023

srikanthccv Jun 16, 2023

sernst Jun 20, 2023

pmcollins Jun 21, 2023

sernst Jun 22, 2023

pmcollins Jun 22, 2023

sernst Jun 22, 2023

sernst Jul 2, 2023

sernst Jul 2, 2023

pmcollins Jun 20, 2023

sernst Jun 20, 2023

pmcollins left a comment

ocelotl Jun 28, 2023

sernst Jun 28, 2023

srikanthccv commented Jul 1, 2023 •

edited

Loading

ocelotl commented Jul 12, 2023

lzchen commented Aug 16, 2023

sernst commented Aug 18, 2023

ocelotl commented Mar 20, 2024

xrmx commented Jun 26, 2024

LogRecord Resource Serialization #3346

LogRecord Resource Serialization #3346

Conversation

sernst commented Jun 13, 2023

Description

Type of change

How Has This Been Tested?

Does This PR Require a Contrib Repo Change?

Checklist:

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

pmcollins left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

srikanthccv commented Jul 1, 2023 • edited Loading

ocelotl commented Jul 12, 2023

lzchen commented Aug 16, 2023

sernst commented Aug 18, 2023

ocelotl commented Mar 20, 2024

xrmx commented Jun 26, 2024

srikanthccv commented Jul 1, 2023 •

edited

Loading