Flushing a queue on shutdown in case of reconfiguration results in an unresponsive app #141

Closed
Flauschbaellchen opened this issue Feb 25, 2021 · 6 comments


@Flauschbaellchen

Flauschbaellchen commented Feb 25, 2021

I'm currently facing an issue with watchtower 1.0.6 hanging on shutdown/flush, which renders the whole application unresponsive.

I'm using Django (3.1) with celery (5.0.5) and django-celery-beat (2.2.0).
I mention celery and django-celery-beat because this is the environment in which I can currently reproduce this bug.

The whole "exception" stack is a little bit complicated, so I'm going from the inside out.

The application hangs on the following line:

    def flush(self):
        self.addFilter(_boto_filter)
        if self.shutting_down:  # This is False
            return
        for q in self.queues.values():
            q.put(self.FLUSH)
        for q in self.queues.values():
            q.join()  # <-- This line.
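
For background, queue.Queue.join() only returns once task_done() has been called for every item that was put(); if nothing is draining the queue, it blocks forever. A minimal standard-library sketch of that behaviour (not watchtower code):

    import queue
    import threading

    q = queue.Queue()
    q.put("message")              # unfinished_tasks is now 1

    def drain():
        q.get()
        q.task_done()             # this is what unblocks join()

    # With a consumer thread calling task_done(), join() returns promptly:
    threading.Thread(target=drain, daemon=True).start()
    q.join()
    print("drained")

    # With nothing left to call task_done(), join() would block forever:
    q.put("another message")
    # q.join()                    # would hang, just like the quoted flush() above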

flush() is reached when logging.config.dictConfig() is called a second time. When that happens, all already existing handlers are flushed and shut down.

Here are the lines that are called, keeping the order from the inside out:

I'm very grateful for any help.
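
For reference, a hypothetical minimal reproduction of the reconfiguration path. The log group name is made up, the handler keyword arguments assume watchtower 1.x, and AWS credentials are assumed to be configured:

    import logging
    import logging.config

    LOGGING = {
        "version": 1,
        "handlers": {
            "cloudwatch": {
                "class": "watchtower.CloudWatchLogHandler",
                "log_group": "my-app",          # hypothetical log group
            },
        },
        "root": {"handlers": ["cloudwatch"], "level": "INFO"},
    }

    logging.config.dictConfig(LOGGING)         # first configuration: handler and worker queue created
    logging.getLogger(__name__).info("hello")  # a record is queued for delivery

    # Configuring a second time makes logging flush and close the existing
    # handlers first, which is where flush() can block on q.join() forever.
    logging.config.dictConfig(LOGGING)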

@Flauschbaellchen
Author

Hi @kislyuk,
May I bump this issue and get some feedback on whether it belongs in this repository, or whether it needs to be solved in celery or django-celery-beat?
I don't want to open an issue in all of those repositories at once, but would like to get an initial assessment here first.

@kislyuk
Owner

kislyuk commented Mar 19, 2021

Hi, I don't have any specific comments on this, unfortunately. Since I don't use Django, I'm having trouble guessing what might be going wrong.

@TRManderson

TRManderson commented Apr 14, 2021

Hi, I'm having a similar issue over in Airflow land - here's the extent of the Python backtrace I was able to grab using GDB:

  File "/usr/local/lib/python3.8/queue.py", line 89, in join
    self.all_tasks_done.wait()
  File "/usr/local/lib/python3.8/site-packages/watchtower/__init__.py", line 250, in flush
    q.join()
  File "/usr/local/lib/python3.8/logging/__init__.py", line 2122, in shutdown
    h.flush()
  File "/usr/local/lib/python3.8/site-packages/airflow/task/task_runner/standard_task_runner.py", line 92, in _start_by_fork
    logging.shutdown()
  File "/usr/local/lib/python3.8/site-packages/airflow/task/task_runner/standard_task_runner.py", line 41, in start
    self.process = self._start_by_fork()
  (frame information optimized out)
  File "/usr/local/lib/python3.8/site-packages/airflow/jobs/base_job.py", line 237, in run
    self._execute()
  File "/usr/local/lib/python3.8/site-packages/airflow/cli/commands/task_command.py", line 376, in _run_task_by_local_task_job
    if args.task_params:
  File "/usr/local/lib/python3.8/site-packages/airflow/cli/commands/task_command.py", line 320, in _run_task_by_selected_method
    """Get the status of all task instances in a DagRun"""
  (frame information optimized out)
  (frame information optimized out)
  (frame information optimized out)
  File "/usr/local/lib/python3.8/site-packages/airflow/__main__.py", line 40, in main
    args.func(args)
  (frame information optimized out)

I thought it might be a fork-related issue, but this error doesn't occur consistently, whereas the fork happens every time.
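
For what it's worth, here's a standard-library sketch of how a fork can leave a process in exactly this state: the child inherits the queue's unfinished-task count but not the worker thread that would drain it, so a later join() has nothing left to wake it up (illustrative only; os.fork() is POSIX-only):

    import os
    import queue
    import threading
    import time

    q = queue.Queue()

    def worker():
        while True:
            q.get()
            time.sleep(1)          # simulate a slow delivery to CloudWatch
            q.task_done()

    threading.Thread(target=worker, daemon=True).start()
    q.put("queued log record")     # unfinished_tasks is now 1

    pid = os.fork()
    if pid == 0:
        # Child: the worker thread was not copied by fork(), but the queue's
        # unfinished-task count was, so q.join() here would never return.
        print("child sees unfinished tasks:", q.unfinished_tasks)
        os._exit(0)

    q.join()                       # the parent still has the worker, so this returns
    os.waitpid(pid, 0)
    print("parent drained the queue")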

@terencehonles
Contributor

This is very likely related to #139 and is what I was running into when I opened that PR.

@TRManderson

I reckon you're absolutely correct @terencehonles, good find!

@Flauschbaellchen
Author

Seems to be fixed in version 2.0.0.
