Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Popen.__del__() crashes with RuntimeError('maximum recursion depth exceeded while calling a Python object') #927

Closed
dimp-gh opened this issue Oct 24, 2016 · 13 comments

Comments

@dimp-gh
Copy link

dimp-gh commented Oct 24, 2016

Hi,

First of all, thanks for maintaining psutil. It's a wonderful cross-platform library and I'm pretty sure that every Python programmer used it at least once, if not daily.

Back to the point. It looks like bdcde7e introduced a regression which causes nasty 'maximum recursion depth exceeded' crashes on both Linux and Windows.

As crashing code seems to be in Popen.__del__(), I'm having trouble tracing it or creating a simple script to reproduce crash, but hopefully i would be able to craft one later today or tomorrow.

I'm pretty sure the issue is in Popen.__del__(), though, because I tried to inherit a CustomPopen class from psutil.Popen, created a dummy __del__() implementation (that doesn't call parent's __del__()) — and my program doesn't crash anymore. Also, if I downgrade psutil to 4.3.1 — there are no crashes too.

@giampaolo
Copy link
Owner

Hi, thanks and I'm glad psutil is useful to you. Do you have a stack trace? I understand you can't reproduce the issue but the initial part of the traceback showing what call caused the recursion error would be nice to see.

@dimp-gh
Copy link
Author

dimp-gh commented Oct 24, 2016

It looks like it's an internal Python crash, as all I'm getting is the wall of the following messages:

Exception RuntimeError: RuntimeError('maximum recursion depth exceeded while calling a Python object',) in  ignored
Exception RuntimeError: RuntimeError('maximum recursion depth exceeded while calling a Python object',) in  ignored
Exception RuntimeError: RuntimeError('maximum recursion depth exceeded while calling a Python object',) in  ignored

So there's no stacktrace at all.

@giampaolo
Copy link
Owner

Mm weird... can you paste the script you're using (or part of it)?

@dimp-gh
Copy link
Author

dimp-gh commented Oct 24, 2016

Okay, i was finally able to craft a script.

import gc
from time import sleep

from psutil import Popen


# NOTE: there's no binary with name `1` in my $PATH so OSError is expected
def reproduce1():
    proc = None
    try:
        proc = Popen(['1'])
    except OSError:
        pass
    proc.__del__()


def reproduce2():
    try:
        Popen(['1'])
    except OSError:
        pass

    # a few iterations are needed for GC to recognize that Popen instance has no references
    for _ in range(3):
        sleep(1)
        gc.collect()  # GC will call Popen.__del__ when collecting it


if __name__ == '__main__':
    reproduce1()

Both reproduce1() and reproduce2() cause the crash on Python 2.7.6. It looks like with Python 3.4.3 everything works fine, though.

@giampaolo
Copy link
Owner

giampaolo commented Oct 24, 2016

Wow! That is weird! It looks like a Python bug. I also verified the issue goes way if I remove __del__ from Popen class but I really don't understand why.

@dimp-gh
Copy link
Author

dimp-gh commented Oct 24, 2016

It looks like Python has fancy semantics for exceptions raised from __del__: https://docs.python.org/2/reference/datamodel.html#object.__del__

Normally, exceptions from __del__ are just ignored, but maybe stack overflow is the kind of exception that can't be simply ignored.

But yeah, it's starting to look like Python bug to me, too.

@dimp-gh
Copy link
Author

dimp-gh commented Oct 24, 2016

Little side note: I'm not sure it's necessary to explicitly call self.__subproc.__del__(*args, **kwargs) in Popen.__del__(). I mean, GC will sooner or later collect __subproc anyway when no references to it are left. Assuming the only reference to __subproc is in Popen instance itself, it should be collected automatically pretty soon.

@giampaolo
Copy link
Owner

I'm also not sure about the introduction of __del__. I will try to investigate this a little bit further but it's likely I'm gonna release a new version without __del__ pretty soon as it looks like the most obvious solution.

@giampaolo
Copy link
Owner

One more question: how did you bump into this? Did you were explicitly calling __del__? Because you're not supposed to.

@dimp-gh
Copy link
Author

dimp-gh commented Oct 25, 2016

Yesterday our CI stack for Taurus (TravisCI and Appveyor) started failing with this exact problem (1, 2). I quickly examined it and got to a conclusion that the new psutil release was the main source of the issue.

Then i started fiddling around with the test suite locally and the only two ways to reliably reproduce the issue i found were:

  1. Invoke __del__ on Popen directly to cause the crash (function reproduce1 in the above code sample).
  2. Force-run gc.collect() a few times to cause garbage collector to collect free Popen instance and invoke its __del__, therefore causing the crash (function reproduce2 in the above code sample).

And yeah, I can assure you that there are no explicit calls to Popen.__del__ in our project.

@giampaolo
Copy link
Owner

OK, we definitively need to release a new version then. Will do it soon, hopefully today.
Thanks for your support.

@giampaolo
Copy link
Owner

OK, a new version is out.

@dimp-gh
Copy link
Author

dimp-gh commented Oct 25, 2016

Yep, can verify. Our test suite passes all tests with 4.4.1.

Thanks a lot for the quick response and the fix.

nlevitt added a commit to nlevitt/psutil that referenced this issue Apr 9, 2019
* master: (375 commits)
  update psutil
  fix procsmem script which was not printing processes
  try to fix tests on travis
  try to fix tests on travis
  OSX: fix compilation warning
  giampaolo#936: give credits to Max Bélanger
  giampaolo#811: move DLL check logic in _pswindows.py
  winmake: use the right win slashes
  winmake: do not try to install GIT commit hook if this is not a GIT cloned dir
  giampaolo#811: on Win XP let the possibility to install psutil from sources as it still (kind of) works)
  giampaolo#811: add a Q&A section in the doc; tell what Win versions are supported
  giampaolo#811: raise a meaningful error message if on Windows XP
  update doc; bump up version
  giampaolo#939: update MANIFEST to include only src files and not much else
  update HISTORY
  travis: execute mem leaks and flake8 tests only on py 2.7 and 3.5; no need to test all python versions
  bump up version
  update version in doc
  add simple test case for oneshot() ctx manager
  add simple test case for oneshot() ctx manager
  speedup fetch all process test by using oneshot
  giampaolo#799 - oneshot / linux: speedup memory_full_info and memory_maps
  fix flake8
  first pass
  giampaolo#943: better error message in case of version conflict on import.
  update doc
  799 onshot / win: no longer store the handle in python; I am now sure this is slower than using OpenProcess/CloseHandle in C
  update doc
  (win) add memleak test for proc_info()
  move stuff around
  memleak: fix false positive on windows
  giampaolo#933 (win) fix memory leak in WindowsService.description()
  giampaolo#933 (win) fix memory leak in cpu_stats() (missing free())
  refactoring
  giampaolo#799 / win: pass handle also to memory_maps() and username() functions
  fix numbers
  mem leak script: provide better error output in case of failure
  refactor memleak script
  refactor memleak script
  refactor memleak script
  refactor memleak script
  refactor memleak script: get rid of no longer used logic to deal with Process properties
  memleak script refactoring
  doc styling
  giampaolo#799 / win: use oneshot() around num_threads() and num_ctx_switches(); speedup from 1.2x to 1.8x
  refactor windows tests
  win: enable dueal process impl tests
  win / C: refactor memory_info_2 code() and return it along side other proc_info() metrics
  windows c refactor proc_info() code
  update windmake script
  winmake clean: make it an order of magnitude faster; also update Makefile
  update doc
  bench script: add psutil ver
  winmake: more aggressive logic to uninstall psutil
  adjust bench2 script to new perf API
  try to adjust perf
  upgrade perf code
  memory leak script: humanize memory difference in case of failure
  style changes
  fix giampaolo#932 / netbsd: check connections return value and raise exception
  netbsd / connections: refactoring
  netbsd / connections: refactoring
  netbsd / connections: refactoring
  netbsd / connections: refactoring
  netbsd / connections: refactoring
  testing make clean with unittests was a bad idea after all
  make 'make clean' 4x faster!
  add test for make clean
  adjust winmake script
  fix netbsd/openvsd compilation failure
  bsd: fix mem leak
  osx: fix memory leak
  pre-release
  refactoring
  update IDEAS
  add mtu test for osx and bsd
  osx: separate IFFLAGS function
  osx/bsd: separate IFFLAGS function
  linux: separate IFFLAGS function
  share C function to retrieve MTU across all UNIXes
  HISTORY: make anchors more easily referenceable
  fix giampaolo#927: Popen.__del__ may cause maximum recursion depth error.
  fix Popen test which is occasionally failing
  more releases timeline from README to doc
  ignore failing tests on OSX + TRAVIS
  update INSTALL instructions
  update print_announce.py script
  update HISTORY
  HISTORY: provide links to issues on the bug tracker
  update IDEAS
  giampaolo#910: [OSX / BSD] in case of error, psutil.pids() raised RuntimeError instead of the original OSError exception.
  fix unicode tests on windows / py3
  small test refactoring
  fix giampaolo#926: [OSX] Process.environ() on Python 3 can crash interpreter if process environ has an invalid unicode string.
  osx: fix compiler warnings
  refactor unicode tests
  fix unicode test
  giampaolo#783: fix some unicode related test failures on osx
  test refactoring
  test refactroring
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants