Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Click 7.0 ctx.exit() hangs. #1134

Closed
msapiro opened this issue Oct 5, 2018 · 11 comments
Closed

Click 7.0 ctx.exit() hangs. #1134

msapiro opened this issue Oct 5, 2018 · 11 comments
Labels

Comments

@msapiro
Copy link

msapiro commented Oct 5, 2018

This is a mysterious issue. I an a GNU Mailman developer and more detail is at https://gitlab.com/mailman/mailman/issues/520. I have narrowed the cause to the commit at 1602436.

We use click for our mailman command interface. The mailman command has a number of sub-commands. The issue is two of these sub-commands under some conditions call ctx.exit() and instead of exiting at that point, the process just hangs and never exits until it is manually killed. Further, the sub-commands all have their own --help and that too prints the help and hangs, but the main mailman --help does exit normally after printing the help.

This may have something to do with how we implement the sub-commands and I will try to provide more detail upon request.

The mystery is this hanging behavior only occurs in certain environments. I have observed it on three different Ubuntu 16.04 machines with Python 3.5.2 and bash and also on one of those with tcsh. I am unable to duplicate it on Mac OS 10.13.6 with Python 3.7.2. Another developer has tried and can't reproduce it with combinations of Python 3.6 and 3.7 under bash and zsh. It also doesn't hang under Python 3.5, 3.6 or 3.7 on our GitLab CI.

Also, I have run tox against the HEAD of the click master branch on one of the Ubuntu 16.04 machines and both py35 and py27 pass 196 tests with one expected failure.

@jcrotts jcrotts added the bug label Oct 6, 2018
@thechief389
Copy link

Use sys.exit instead.

@davidism
Copy link
Member

davidism commented Nov 4, 2018

@thechief389 that's not a replacement for what's being reported here.

@thechief389
Copy link

What I was saying was that it could be a work around until it's fixed.

@msapiro
Copy link
Author

msapiro commented Nov 5, 2018

@thechief389 using sys.exit() in the two places where we explicitly call ctx.exit() is an acceptable work around for those two places, but it doesn't address the case where a sub-command has its own --help function and the process hangs after printing the sub-command's help. In those cases the hang is somewhere within click's processing of the sub-command's --help and there is no explicit ctx.exit() in our code to replace.

@davidism
Copy link
Member

davidism commented Nov 5, 2018

cc @sirosen, looks like this was a change you added in #1098.

@sirosen
Copy link
Contributor

sirosen commented Nov 5, 2018

Sorry, I haven't kept up very well with open issues lately. Thanks for tagging me.

I just fired up an Ubuntu 16.04 machine to try to test this out right now, but I haven't been able to reproduce yet.
@msapiro, am I right in understanding that this issue only comes up when running your testsuite, when standalone_mode=False? It's not immediately obvious from the thread if you're able to get this behavior outside of the tests.

I'm starting to look at the relevant mailman test to see if I can figure this out.

@msapiro
Copy link
Author

msapiro commented Nov 6, 2018

@sirosen The issue occurs outside of the test suite. It occurs with various mailman sub-commands that have a --help option. I.e. mailman aliases --help will print the help and then hang. The same is true of essentially all the other subcommands, although mailman --help
exits normally after printing the help. There are also two subcommands that explicitly call ctx.exit() under some circumstances and this hangs, but that can be worked around by replacing ctx.exit() with sys.exit().

@sirosen
Copy link
Contributor

sirosen commented Nov 6, 2018

Okay, I've got a lot to unpack here, but let's start with the most important bit of good news: this code reproduces the bug.

I am not sure if the bad interaction with contextlib.ExitStack is a click bug or a weird, but known or well-understood, behavior of ExitStack.
I'm going to keep looking at this to see if I can figure out whether or not we have a bug we need to fix.

Tinkering with mailman itself to see if I could work out a fix there, I tried replacing ExitStack with an arguably-less-elegant, but undeniably-working version:

    def invoke(self, ctx):
        if config.db is not None:
            with transaction():
                return super().invoke(ctx)
        return super().invoke(ctx)

Another bit of relevant info: I can't seem to inspect the call stack when the click.exceptions.Exit handler is running.
I tried setting up PDB with a breakpoint in that handler, and found that hangs on the pdb set_trace call.
Picking apart a bit more, here's what I ended up putting in there:

        except Exit as e:
            if standalone_mode:
                print('! checkpoint1')
                ty, val, trace = sys.exc_info()
                print('! checkpoint2')
                stack = traceback.extract_tb(trace, limit=10)
                print('! checkpoint3')
                print(traceback.format_list(stack))
                print('! checkpoint4')
                sys.exit(e.exit_code)

and the output I see is

! checkpoint1
! checkpoint2

with the process spinning at 100% CPU. So I feel like something badly weird is happening to the state of the interpreter.
os._exit() works from that handler, but raise BaseException('kthxbai') doesn't.

It's pretty clear that ExitStack is putting things in a weird state, but I haven't ever used it before, so I can't pretend to know or understand it that well.

@sirosen
Copy link
Contributor

sirosen commented Nov 6, 2018

I'm finding that a context manager defined as a class with __enter__ and __exit__ behaves fine, but one defined from a generator does not.
I think what this comes down to is that the generator context manager will swallow RuntimeErrors. click.exceptions.Exit is a RuntimeError and SystemExit is not.

@davidism, is it too much to suggest that maybe click.exceptions.Exit should inherit from SystemExit?
That's the only way in which I think click can reasonably change the behavior in this scenario.

@jcrotts jcrotts added this to the 7.1 milestone May 6, 2019
@jcrotts jcrotts modified the milestones: 7.1, 7.2 Oct 12, 2019
@RazerM
Copy link
Contributor

RazerM commented Apr 27, 2020

Generator context managers swallowed RuntimeErrors in 3.5.2 (BPO-27122), which was fixed in 3.5.3.

@davidism
Copy link
Member

Closing as it sounds like this was a bug that was fixed in Python 3.5.2, not an issue with Click.

@davidism davidism removed this from the 7.2 milestone Apr 27, 2020
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 13, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

6 participants