Multiple celery fork pool workers don't work #1785
Comments
@lzgabel do you have an example to repro the issue?
Hi @mbierma. After our investigation, we found that the problem is caused by the following code in our system:

```python
@celeryd_init.connect
def recover_job(sender=None, conf=None, **kwargs) -> None:
    i = app.control.inspect()
    running_list = []
    for worker_list in i.active().values():
        for task_item in worker_list:
            name = task_item.get('name')
            # do something
            args = task_item.get('args')
            job_id = args[1]
            if job_id:
                running_list.append(job_id)
    logger.info(f'lost job : {running_list}')
```

It will be blocked here.
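For comparison (this sketch is not from the thread): one way to keep such a startup hook from blocking the worker indefinitely is to defer it to the `worker_ready` signal and bound the reply wait with `inspect(timeout=...)`. Whether this actually avoids the hang on the affected kombu versions is untested; `app` and `logger` are assumed to be the ones defined in the snippet above.

```python
from celery.signals import worker_ready

@worker_ready.connect
def recover_job(sender=None, **kwargs) -> None:
    # Bound the wait for broadcast replies (Celery's default is 1.0s)
    # so the hook cannot wait forever for answers that never arrive.
    i = app.control.inspect(timeout=2.0)
    active = i.active() or {}  # active() may return None if no worker replies
    running_list = []
    for worker_list in active.values():
        for task_item in worker_list:
            args = task_item.get('args') or []
            job_id = args[1] if len(args) > 1 else None
            if job_id:
                running_list.append(job_id)
    logger.info(f'lost job : {running_list}')
```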
Hi @lzgabel, I think this is different from my issue. My issue is not with fork pool workers but with the Celery workers themselves. I can have two Celery workers when I'm using RabbitMQ as a broker, but when I use Kafka as a broker, only one Celery worker is working.
We have exactly the same problem. Downgrading kombu to 5.3.1 solves it.
@mfaw is this still the case? If so, then maybe I'll put it as a limitation in my celery documentation ticket here: celery/celery#8935
It seems that if you specify another queue, like …
I wonder if we can use partitions to allow for multiple workers.
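For context on that idea (this sketch is not from the thread): a Kafka topic with a single partition can only be consumed by one member of a consumer group at a time, which would match the "only one worker does work" symptom. A minimal sketch of pre-creating the topic with more partitions, assuming the kombu Kafka transport consumes a topic named `celery` and that the workers share a consumer group (both assumptions, not confirmed here):

```python
from confluent_kafka.admin import AdminClient, NewTopic

# Hypothetical broker address and topic name.
admin = AdminClient({'bootstrap.servers': 'localhost:9092'})

# Create the topic with enough partitions for the number of workers.
futures = admin.create_topics(
    [NewTopic('celery', num_partitions=4, replication_factor=1)]
)
for topic, future in futures.items():
    future.result()  # raises if the topic could not be created
    print(f'created topic {topic!r} with 4 partitions')
```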
Is there any plan to fix the issue reported by @lzgabel, or is there any known workaround? It's still preventing us from upgrading kombu and celery beyond 5.3.1.
I don't think that there is a maintainer for Kafka at the moment.
The original issue reported by @lzgabel is not about Kafka but about using Redis as the broker, which we are also using.
@lzgabel @ojnas I am unable to reproduce the issue using Celery …
Additionally, have you tried the latest versions of Kombu and Celery?
@thuibr I just tried with the latest versions, Celery 5.4.0 and Kombu 5.4.2, and the issue still persists for us. Redis version is 7.4.0.
@thuibr If you want to reproduce the issue, as @lzgabel mentioned in their comment here #1785 (comment), it seems related to …
These issues seem to be related:
@ojnas I was able to produce a different issue, but I would think that it has the same root cause. A warm shutdown hangs with the following code:

```python
import logging

from celery import Celery
from celery.signals import celeryd_init

app = Celery('tasks', broker='redis://localhost:6379/0')
logger = logging.getLogger(__name__)


@app.task
def add(x, y):
    return x + y


@celeryd_init.connect
def recover_job(sender=None, conf=None, **kwargs) -> None:
    i = app.control.inspect()
    running_list = []
    if i.active():
        for worker_list in i.active().values():
            for task_item in worker_list:
                name = task_item.get('name')
                args = task_item.get('args')
                job_id = args[1]
                if job_id:
                    running_list.append(job_id)
    logger.info(f'lost job : {running_list}')
```

Adding a breakpoint here, in kombu's redis transport `Channel.close`:

```python
def close(self):
    self._closing = True
    if self._in_poll:
        try:
            breakpoint()
            self._brpop_read()
        except Empty:
            pass
    if not self.closed:
        # remove from channel poller.
        self.connection.cycle.discard(self)

        # delete fanout bindings
        client = self.__dict__.get('client')  # only if property cached
        if client is not None:
            for queue in self._fanout_queues:
                if queue in self.auto_delete_queues:
                    self.queue_delete(queue, client=client)
        self._disconnect_pools()
        self._close_clients()
    super().close()
```
I also notice that we get stuck here, but we make it past that with the breakpoint.
I am wondering if disconnecting all redis connections after an inspect operation will mitigate this issue, but for some reason the redis connections are not disconnecting when I call `connection_pool.disconnect()`:

```python
@celeryd_init.connect
def recover_job(sender=None, conf=None, **kwargs) -> None:
    # breakpoint()
    i = app.control.inspect()
    running_list = []
    # breakpoint()

    # Get number of active_connections from redis
    client = app.connection().channel().client
    info = client.info()
    active_connections = info.get("connected_clients", 0)
    # breakpoint()

    i.ping()

    # Disconnect all redis connections
    client.connection_pool.disconnect()
    info = client.info()
    active_connections = info.get("connected_clients", 0)
    breakpoint()
```
It appears that more than one of the … `app.conf.broker_transport_options = {'socket_timeout': 5}`
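For reference (not from the thread): a minimal sketch of where that transport option would be set. Whether a `socket_timeout` actually unblocks a hung `BRPOP` read on the affected kombu versions is an assumption, not something verified here, and the `socket_connect_timeout` line is an additional assumption.

```python
from celery import Celery

app = Celery('tasks', broker='redis://localhost:6379/0')

# Bound blocking redis socket operations so a stuck read eventually
# raises instead of hanging the worker forever.
app.conf.broker_transport_options = {
    'socket_timeout': 5,          # seconds before a blocking read gives up
    'socket_connect_timeout': 5,  # assumption: also bound connection setup
}
```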
👋 Hi all. FYI: since 5.3.2 was released, we found that multiple celery fork pool workers don't work, so we rolled back to 5.3.1 and everything returned to normal.
Version: 5.3.2
Version: 5.3.1
🤔 We compared the kombu version changes, and when we reverted this PR (#1733) in version 5.3.2, all workers worked normally. cc @auvipy @Nusnus @mfaw @mbierma