Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ExceptionDictTransformer incorrectly formatting strings #638

Open
wochinge opened this issue Jul 30, 2024 · 8 comments
Open

ExceptionDictTransformer incorrectly formatting strings #638

wochinge opened this issue Jul 30, 2024 · 8 comments

Comments

@wochinge
Copy link

Version
24.4.0

Code to reproduce

def test_structlog():
    structlog.configure(processors=[ExceptionRenderer(ExceptionDictTransformer(show_locals=True)), JSONRenderer()])
    logger = structlog.get_logger(__name__)
    try:
        my_variable = "number"
        raise ValueError()
    except ValueError as e:
        logger.exception("An error occurred")

Problem
The local string variable my_variable is escaped as "'number'" where it should be just `"number"

{"event": "An error occurred", "exception": [{"exc_type": "ValueError", "exc_value": "", "syntax_error": null, "is_cause": false, "frames": [{"filename": "...", "lineno": 11, "name": "test_structlog", "locals": {"logger": "<BoundLoggerLazyProxy(logger=None, wrapper_class=None, processors=None, context_class=None, initial_values={}, logger_factory_args=('test_tobi',))>", "my_variable": "'number'", "e": "ValueError()"}}]}]}

Root Cause

This is caused by this line. Instead of repr(object) it should be just object for strings

@hynek
Copy link
Owner

hynek commented Jul 30, 2024

@sscherfke 😇

@sscherfke
Copy link
Contributor

This is an expected change. repr() is now used for every object, including strings. This is needed so that Pydantic Secret Types and Typed Settings SecretStr do not leak.

For example:

>>> import json
>>> from typed_settings.types import SecretStr
>>>
>>> json.dumps({"a": repr("spam"), "b": repr(SecretStr("eggs"))})
'{"a": "\'spam\'", "b": "\'*******\'"}'
>>>
>>> json.dumps({"a": "spam", "b": SecretStr("eggs")})
'{"a": "spam", "b": "eggs"}'

I’m sure we could find a way to get sth. like {"a": "spam", "b": "*******"} but this would unnecessarily(?) bloat the code.

@sscherfke
Copy link
Contributor

Okay, Pydantic is not affected – their Secretstr does not inherit str. Either they changed behavior in v2 or I didn't remember it correctly.

@sscherfke
Copy link
Contributor

sscherfke commented Jul 31, 2024

Rich's "to_repr" implementation does the same thing and when printing "'spam'" it becomes just 'spam' which is the (new) expected result.

So imho, this issue should be closed as wontfix/works-as-expected/... :)

@wochinge
Copy link
Author

Thanks for your swift responses 🙌🏻

So if I understand that correctly, you suggest this is not an issue because the Renderer is taking care of representing the data correctly?

In my above example code, the JSONRenderer doesn't do that which results in the extra single quotes "'number'".

Given your logic this means that structlog ties the correct logging of log data to the specific Renderer implementation. Doesn't this break the concept of having isolated components which you plug together?

@sscherfke
Copy link
Contributor

sscherfke commented Aug 1, 2024

The dict of locals which is part of the stacktrace is now a dict mapping names (as strings) to reprs (as strings).

In the older versions, there was an exceptions for strings, so the mapping was not a "mapping of names to reprs" but a "mapping of names to reprs but also maybe strings".

Now it works like this: The repr of a numer is, for example, 42, the repr of an arbitrary object is <object object at 0x1003c47e0> and the repr of a string is 'spam'. So a dict with these three locals values will be {"a": "42", "b": "<object object at 0x1003c47e0>", "c": "'spam'"}.

What the renderers do with this mapping is up to them but the json renderer happens to render it as shown in the example above.

@hynek
Copy link
Owner

hynek commented Aug 2, 2024

Hmmm, that seems bad though? Firstly, it's a breaking change and secondly we'd have to modify all renderers and our users would have to modify all of their renderers?

@sscherfke
Copy link
Contributor

The to_repr() function is only used by the dict tracebacks and that is only (usually) only used with a JSON renderer. And then only the looks of strings are changed in the "locals" dictionary:

>>> import structlog
>>> structlog.configure([structlog.processors.dict_tracebacks, structlog.processors.JSONRenderer(indent=2)])
>>> log = structlog.get_logger()
>>> a = 42
>>> b = object()
>>> c = "spam"
>>> try: 1 / 0
... except: log.exception("onoes")
... 
{
  "event": "onoes",
  "exception": [
    {
      "exc_type": "ZeroDivisionError",
      "exc_value": "division by zero",
      "syntax_error": null,
      "is_cause": false,
      "frames": [
        {
          "filename": "<stdin>",
          "lineno": 1,
          "name": "<module>",
          "locals": {
            "structlog": "<module 'structlog' from '/Users/stefan/Projects/structlog/src/structlog/__init__.py'>",
            "rich": "<module 'rich' from '/Users/stefan/Projects/structlog/.venv/lib/python3.12/site-packages/rich/__init__.py'>",
            "a": "42",
            "b": "<object object at 0x100d313d0>",
            "c": "'spam'",  # This is the change in question
            "log": "<BoundLoggerLazyProxy(logger=None, wrapper_class=None, processors=None, context_class=None, initial_values={}, logger_factory_args=())>"
          }
        }
      ]
    }
  ]
}

And again, the aim of this change was to safe people using simple secret classes with a def __repr__(self): return "'*******'" from accidentally leaking credentials and other secrets.

@wochinge Is this just a minor annoyance for you or does it really break some log processing in a way?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants