
Test failure in opentelemetry-sdk on Python 3.12 #3370

Closed
musicinmybrain opened this issue Jul 6, 2023 · 12 comments · Fixed by #3557
Labels: bug (Something isn't working)

musicinmybrain (Contributor) commented Jul 6, 2023

Describe your environment

Running in a fresh checkout of main, 3f459d3, on Fedora Linux 38, x86_64, with

  • python3.11 -VV = Python 3.11.4 (main, Jun 7 2023, 00:00:00) [GCC 13.1.1 20230511 (Red Hat 13.1.1-2)]
  • python3.12 -VV = Python 3.12.0b3 (main, Jun 21 2023, 00:00:00) [GCC 13.1.1 20230614 (Red Hat 13.1.1-4)]

This should be platform-independent.

Steps to reproduce

The version pins in dev-requirements.txt cause a lot of problems when trying to test with Python 3.12. We will bypass them all and test without tox.

gh repo clone open-telemetry/opentelemetry-python
cd opentelemetry-python
python3.12 -m venv _e
. _e/bin/activate
cd opentelemetry-semantic-conventions
pip install -e .
cd ../opentelemetry-api
pip install -e .
cd ../opentelemetry-sdk
pip install -e .
cd ../tests/opentelemetry-test-utils/
pip install -e .
cd ../../opentelemetry-sdk
pip install pytest pytest-benchmark flaky
python -m pytest

What is the expected behavior?

If you repeat the above with python3.11 instead of python3.12, or run tox -e py311-opentelemetry-sdk:

(lots of output, DeprecationWarnings, so on)

======================= 377 passed, 9 warnings in 16.09s ========================

What is the actual behavior?

=================================== FAILURES ====================================
______________ TestLoggingHandler.test_log_record_user_attributes _______________

self = <tests.logs.test_handler.TestLoggingHandler testMethod=test_log_record_user_attributes>

    def test_log_record_user_attributes(self):
        """Attributes can be injected into logs by adding them to the LogRecord"""
        emitter_provider_mock = Mock(spec=LoggerProvider)
        emitter_mock = APIGetLogger(
            __name__, logger_provider=emitter_provider_mock
        )
        logger = get_logger(logger_provider=emitter_provider_mock)
        # Assert emit gets called for warning message
        logger.warning("Warning message", extra={"http.status_code": 200})
        args, _ = emitter_mock.emit.call_args_list[0]
        log_record = args[0]

        self.assertIsNotNone(log_record)
>       self.assertEqual(log_record.attributes, {"http.status_code": 200})
E       AssertionError: {'taskName': None, 'http.status_code': 200} != {'http.status_code': 200}
E       - {'http.status_code': 200, 'taskName': None}
E       + {'http.status_code': 200}

tests/logs/test_handler.py:93: AssertionError
------------------------------- Captured log call -------------------------------
WARNING  tests.logs.test_handler:test_handler.py:88 Warning message
================== 1 failed, 376 passed, 17 warnings in 16.26s ==================

Additional context

We first encountered this in the Python 3.12 mass rebuild in Fedora Linux in preparation for the release of Fedora 39 this fall. Downstream issue: https://bugzilla.redhat.com/show_bug.cgi?id=2220378

I plan to skip this test in the Fedora Linux package for now; I don’t expect to spend more time looking for the root cause.

musicinmybrain added the bug (Something isn't working) label Jul 6, 2023
musicinmybrain (Contributor, Author) commented Jul 13, 2023

I have confirmed a similar error with 93992ce, except that the failing test is TestLoggingInit.test_logging_init_exporter, and it’s clear that unbounded recursion is occurring.

lzchen (Contributor) commented Aug 16, 2023

I have followed the above steps exactly on Py3.11.4 on a Windows machine and cannot reproduce the issue. @musicinmybrain are you still getting the same errors?

musicinmybrain (Contributor, Author) commented Aug 23, 2023

> I have followed the above steps exactly on Py3.11.4 on a Windows machine and cannot reproduce the issue. @musicinmybrain are you still getting the same errors?

I just repeated the steps (still on Fedora 38, x86_64, now with Python 3.12.0rc1 and d8490c5) and still reproduced the single test failure.

Note that this issue is specific to Python 3.12, so failure to reproduce on 3.11.4 is consistent with what I see. (I used 3.11.4 as a “control” to confirm that this appeared to be a 3.12 regression rather than a pre-existing problem.)

lzchen (Contributor) commented Sep 7, 2023

Interesting. We don't officially support Python 3.12 yet, but I am curious what sort of behavior is causing this error.

musicinmybrain (Contributor, Author) commented:

> I have confirmed a similar error with 93992ce, except that the failing test is TestLoggingInit.test_logging_init_exporter, and it’s clear that unbounded recursion is occurring.

This still reproduces with unbounded recursion using https://github.com/open-telemetry/opentelemetry-python/releases/tag/v1.20.0 and Python 3.12.0rc2.

In addition, TestBatchSpanProcessor.test_batch_span_processor_many_spans sometimes fails (flakily) with AssertionError: 974 != 1024. I’m guessing this is just a matter of marginal runtime performance in the test environment, but I haven’t attempted to really understand the test.

phillipuniverse commented:

@lzchen is there a target version for Python 3.12 support?

lzchen (Contributor) commented Nov 6, 2023

@phillipuniverse

Not as of today, no. We will discuss adding support in the next SIG meeting.

lzchen (Contributor) commented Nov 16, 2023

ocelotl (Contributor) commented Nov 16, 2023

@musicinmybrain we'll be dropping support for 3.7 in 5 weeks when 3.7 is out of support. When that happens we'll begin working on supporting 3.12.

mariojonke (Contributor) commented:

The reason for the failing test is the newly added taskName attribute on the (Python) LogRecord in Python 3.12. From what I saw, the OTel LogHandler has a hardcoded set of these standard attributes so that they are excluded when the OTel LogRecord attributes are collected.
Since the newly introduced taskName attribute is not in this set, it gets collected as an OTel LogRecord attribute. By default its value is None, and when the OTel LogRecord creates its BoundedAttributes object, this triggers another log (something like: Invalid type NoneType for attribute 'taskName' value. ...), which itself creates another OTel LogRecord, and so on.
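The stdlib side of this analysis is easy to verify in isolation: on Python 3.12 and later, every logging.LogRecord carries a taskName attribute (the current asyncio task name, or None outside a task), while on 3.11 and earlier the attribute does not exist at all. A minimal check:

```python
# Demonstrate the Python 3.12 change behind this failure: LogRecord
# instances gain a `taskName` attribute whose default value is None.
import logging
import sys

record = logging.LogRecord(
    "demo", logging.WARNING, "demo.py", 1, "Warning message", None, None
)

has_task_name = hasattr(record, "taskName")
print(sys.version_info[:2], has_task_name)

# True on 3.12+, False on 3.11 and earlier.
assert has_task_name == (sys.version_info >= (3, 12))
```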

Adding taskName to the set of excluded attributes should fix the test, I think. (Currently, on my system, only TestLoggingInit.test_logging_init_exporter is failing, but from the taskName in the traceback I assume it has the same cause as this issue.)
There should probably also be an attribute validity check (one that does not log), or some kind of suppression, to avoid creating further log records, since attributes on a (Python) LogRecord could potentially have any type, which would then most likely also trigger unbounded recursion.

musicinmybrain (Contributor, Author) commented:

> The reason for the failing test is …

Thank you for this excellent analysis. I’ve used your suggestion to open a PR, #3557, to fix the immediate issue. I didn’t try to add any kind of attribute validity check.

jeremydvoss (Contributor) commented:

It seems this new attribute can completely break logging: #3560
Since this issue is just about the test failure, is there another issue to track this attribute more broadly? I want to make sure I can close my ticket.

ocelotl pushed a commit to musicinmybrain/opentelemetry-python that referenced this issue Dec 7, 2023
6 participants