gh-106238: Handle KeyboardInterrupt during logging._acquireLock() #106239

arieleiz · 2023-06-29T12:32:14Z

We've come across a concurrency bug in logging/__init__.py which involves the handling of asynchronous exceptions, such as KeyboardInterrupt, during the execution of logging._acquireLock().

In the current implementation, when threading.RLock.acquire() is executed, there is a possibility for an asynchronous exception to occur during the transition back from native code, even if the lock acquisition is successful.

The typical use of _acquireLock() in the logging library is as follows:

def _loggingMethod(handler):
    """
    Add a handler to the internal cleanup list using a weak reference.
    """
    _acquireLock()
    try:
        pass  # doSomething
    finally:
        _releaseLock()

In this pattern, if a KeyboardInterrupt is raised during the lock acquisition, the lock ends up getting abandoned.
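The failure mode can be reproduced deterministically with a small sketch. The helper names here (`acquire_then_interrupt`, `lock_is_free`) are hypothetical and not part of the logging module; the explicit `raise KeyboardInterrupt` stands in for a signal delivered just after the native acquire has succeeded.

```python
import threading

_lock = threading.RLock()  # stand-in for logging's module-level lock


def acquire_then_interrupt():
    """Simulate the race: the lock is taken, then KeyboardInterrupt fires."""
    _lock.acquire()
    raise KeyboardInterrupt  # stands in for a signal arriving on return from native code


def logging_method():
    acquire_then_interrupt()  # plays the role of _acquireLock()
    try:
        pass  # never reached
    finally:
        _lock.release()  # never reached either: the exception escaped before `try`


def lock_is_free(timeout=0.2):
    """Return True if another thread can take the lock within `timeout` seconds."""
    result = []

    def probe():
        got = _lock.acquire(timeout=timeout)
        result.append(got)
        if got:
            _lock.release()

    t = threading.Thread(target=probe)
    t.start()
    t.join()
    return result[0]


try:
    logging_method()
except KeyboardInterrupt:
    pass

abandoned = not lock_is_free()
print("lock abandoned:", abandoned)
```

Because the lock is an RLock, the interrupted thread itself could still re-acquire it; it is other threads that block, which is why the probe runs in a separate thread.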

When can this happen? One example is during forks. logging/__init__.py registers an at-fork hook, with

os.register_at_fork(before=_acquireLock,
                    after_in_child=_after_at_fork_child_reinit_locks,
                    after_in_parent=_releaseLock)

A scenario occurring in our production environment involves a slow fork operation (when the server is under heavy load and performing a multitude of forks). The lock could be held for up to a minute. If this is happening in a secondary thread, and a SIGINT signal is received in the main thread while it is waiting to acquire the lock for logging, the lock will be abandoned. This will cause the process to hang during the next _acquireLock() call.
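To make the at-fork window concrete, here is a minimal sketch of the same hook arrangement on a private lock. The names `lock`, `events`, and the three hook functions are hypothetical stand-ins, and the sketch only runs the fork on platforms that provide `os.register_at_fork`. It shows that the lock is held from the `before` hook until `after_in_parent` runs, which is the interval during which other threads have to wait.

```python
import os
import threading

lock = threading.RLock()  # hypothetical stand-in for logging's module-level lock
events = []               # records which hooks ran in this (parent) process


def before_fork():
    lock.acquire()        # held for the whole duration of the fork
    events.append("before")


def after_in_parent():
    events.append("after_in_parent")
    lock.release()


def after_in_child():
    # The child starts with a fresh lock rather than inheriting a held one.
    global lock
    lock = threading.RLock()


if hasattr(os, "register_at_fork"):
    os.register_at_fork(before=before_fork,
                        after_in_child=after_in_child,
                        after_in_parent=after_in_parent)
    pid = os.fork()
    if pid == 0:
        os._exit(0)       # child: exit immediately without running cleanup
    os.waitpid(pid, 0)

print(events)
```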

To address this issue, we provide a simple pull request to add a try-except block within _acquireLock(), e.g.:

def _acquireLock():
    if _lock:
        try:
            _lock.acquire()
        except BaseException:
            _lock.release()
            raise

This way, if an exception arises during the lock acquisition, the lock will be released, preventing the lock from being abandoned and the process from potentially hanging.
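The effect of the change can be checked with a simulated interrupt. In this sketch, `FlakyLock` is a hypothetical wrapper (not part of the standard library) whose first `acquire()` succeeds and then raises KeyboardInterrupt; `_acquireLock()` is the version proposed above.

```python
import threading


class FlakyLock:
    """Hypothetical wrapper: the first acquire() succeeds, then raises KeyboardInterrupt."""

    def __init__(self):
        self._inner = threading.RLock()
        self._interrupt_next = True

    def acquire(self, *args, **kwargs):
        got = self._inner.acquire(*args, **kwargs)
        if got and self._interrupt_next:
            self._interrupt_next = False
            raise KeyboardInterrupt  # simulated async exception after a successful acquire
        return got

    def release(self):
        self._inner.release()


_lock = FlakyLock()


def _acquireLock():
    if _lock:
        try:
            _lock.acquire()
        except BaseException:
            _lock.release()
            raise


def other_thread_can_acquire(timeout=0.2):
    """Probe the underlying lock from a second thread."""
    result = []

    def probe():
        got = _lock._inner.acquire(timeout=timeout)
        result.append(got)
        if got:
            _lock._inner.release()

    t = threading.Thread(target=probe)
    t.start()
    t.join()
    return result[0]


interrupted = False
try:
    _acquireLock()
except KeyboardInterrupt:
    interrupted = True

free_after_interrupt = other_thread_can_acquire()
print("interrupted:", interrupted, "lock free afterwards:", free_after_interrupt)
```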

Issue: Handle KeyboardInterrupt during logging._acquireLock() #106238


bedevere-bot · 2023-06-29T12:32:18Z

Most changes to Python require a NEWS entry.

Please add it using the blurb_it web app or the blurb command-line tool.

cpython-cla-bot · 2023-06-29T12:32:41Z

All commit authors signed the Contributor License Agreement.

vsajip · 2023-06-30T09:55:54Z

I'm not sure this is the right place to fix this. See the long discussion on #50970. Anyway, wouldn't your proposed fix be better arranged to move the lock.acquire() to inside the try:? That way, the finally: would release the lock.
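For comparison, the alternative arrangement suggested here can be sketched as follows. This is an illustration under the same simulated interrupt, with hypothetical helper names; it covers only call sites that use the try/finally pattern, not callers that invoke the acquire function directly.

```python
import threading

lock = threading.RLock()


def acquire_then_interrupt():
    """Simulate an interrupt arriving just after the lock was taken."""
    lock.acquire()
    raise KeyboardInterrupt


def logging_method():
    try:
        acquire_then_interrupt()  # acquisition now happens inside the try
    finally:
        lock.release()            # runs even though the acquire call raised


def other_thread_can_acquire(timeout=0.2):
    result = []

    def probe():
        got = lock.acquire(timeout=timeout)
        result.append(got)
        if got:
            lock.release()

    t = threading.Thread(target=probe)
    t.start()
    t.join()
    return result[0]


interrupted = False
try:
    logging_method()
except KeyboardInterrupt:
    interrupted = True

released = other_thread_can_acquire()
print("interrupted:", interrupted, "lock released:", released)
```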

arieleiz · 2023-06-30T10:02:41Z

Thank you very much for pointing me to #50970. That issue is similar but not the same: we hit the problem in the original process, not the forked one, and the problem is not the state of the lock but the lock being abandoned because of an async exception.
Another possible fix is in the native lock-acquire code itself: detect that an exception has occurred and unlock. However, that would have a much greater scope, as it would affect all locks.
Regarding your suggestion to move _acquireLock() inside the try: block, this would not fix the lock acquisition in the at-fork hook, and might also be more prone to errors in future changes to this code.
With our proposed fix we maintain the library's invariant and underlying assumption that when an exception is thrown from within _acquireLock(), the lock is never abandoned.

vsajip · 2023-07-04T16:42:45Z

@arieleiz You need to sign the CLA before this PR can be progressed.

arieleiz · 2023-07-04T16:50:28Z

Hi @vsajip, I've tried signing the CLA multiple times, but I keep getting an "Internal Server Error" from CLAbot after "Sign here with github to agree". Any idea what I am doing wrong?

vsajip · 2023-07-05T06:13:00Z

I'm afraid not. Might be worth asking on https://discuss.python.org/ to see if anyone there can help.

bedevere-bot · 2023-07-05T16:01:05Z

🤖 New build scheduled with the buildbot fleet by @vsajip for commit 99b651d 🤖

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

vsajip · 2023-07-05T21:55:49Z

CLA signing status appears to be stuck on "not signed". Will convert to draft and back to see if it unsticks.

vsajip · 2023-07-06T05:33:21Z

> CLA signing status appears to be stuck on "not signed". Will convert to draft and back to see if it unsticks.

That didn't work - closing and reopening to see if that does the trick.

arieleiz · 2023-07-06T05:36:14Z

Thanks @vsajip! It shows up as CLA signed for me.

Misc/NEWS.d/next/Library/2023-06-29-12-40-52.gh-issue-106238.VulKb9.rst

arieleiz requested a review from vsajip as a code owner June 29, 2023 12:32

bedevere-bot added the awaiting review label Jun 29, 2023

AlexWaygood changed the title ~~issue-106238: Handle KeyboardInterrupt during logging._acquireLock()~~ gh-106238: Handle KeyboardInterrupt during logging._acquireLock() Jun 29, 2023

bedevere-bot mentioned this pull request Jun 29, 2023

Handle KeyboardInterrupt during logging._acquireLock() #106238

Closed

📜🤖 Added by blurb_it.

99b651d

vsajip added the 🔨 test-with-buildbots label Jul 5, 2023

bedevere-bot removed the 🔨 test-with-buildbots label Jul 5, 2023

vsajip marked this pull request as draft July 5, 2023 21:55

bedevere-bot removed the awaiting review label Jul 5, 2023

vsajip marked this pull request as ready for review July 5, 2023 21:56

bedevere-bot added the awaiting review label Jul 5, 2023

vsajip closed this Jul 6, 2023

vsajip reopened this Jul 6, 2023

vsajip reviewed Jul 6, 2023

Misc/NEWS.d/next/Library/2023-06-29-12-40-52.gh-issue-106238.VulKb9.rst

Grammar changes.

3fe910b

vsajip approved these changes Jul 6, 2023


bedevere-bot added awaiting merge and removed awaiting review labels Jul 6, 2023

vsajip merged commit 99b00ef into python:main Jul 6, 2023

bedevere-bot removed the awaiting merge label Jul 6, 2023
