AiiDA will no longer work with rabbitmq>3.7 by default #5105
Comments
I feel maybe we can put this in the
Two questions:
thoughts @sphuber? |
trying it out in #5106 |
I remember looking into the default timeouts a long time ago and I think it is not a value that can be configured from the client. This has to be configured on the server itself. There even was a maximum defined that could not be surpassed, so even if you put a value above it in the config, it would be capped at the hardcoded value. This may have been for older versions of RabbitMQ (around 3.5) and I am not sure if that is still there. Their reasoning is that the main use case for RabbitMQ is "quick" jobs on the order of seconds. |
yeh cheers, #5106 does not appear to fail rabbitmq, but obviously no idea yet if it is actually having any effect |
Hmm, yeh no joy yet; trying to set consumer_timeout to 1 in #5106, but that doesn't seem to fail anything |
Yeh no I guess it is not part of https://www.rabbitmq.com/uri-query-parameters.html#tls 😒 I asked about adding it: rabbitmq/rabbitmq-server#2990 (comment), or maybe I should open an actual issue if they don't respond |
Ok opened: rabbitmq/rabbitmq-server#3344 🤞 |
Well that was a dead end (we kinda use rabbitmq in a way it is not designed for). So why don't we just remove it entirely 😉 chrisjsewell/aiida-process-coordinator#4 |
I just had the same issue - Channel closed error for something running > 30 minutes. I checked and indeed I have rabbitmq 3.8.16. Can we make this requirement more obvious? |
Adding link to another project encountering the same issue: celery/celery#6760 |
After accidentally getting my rabbitmq updated to 3.9.x I also faced this same issue. I would like to point out that the simplest way to downgrade rabbitmq is to use conda instead of the Debian package. Otherwise one needs to manually downgrade all dependencies, like Erlang, which has its own dependencies, and it creates a big mess. So for anyone stumbling here, running the following is all that's required.
Maybe @giovannipizzi @chrisjsewell we can add this in the wiki where you discuss this issue? |
yeh, as we have just been discussing, I think it is a nicer solution in terms of dependency management (as opposed to apt or homebrew), but the downside is that there is no automated setup of a background service using e.g. launchctl (osx) or systemd (linux). Out of interest, I have just posted here to ask about such a feature: https://groups.google.com/a/anaconda.com/g/anaconda/c/z36jZTlJG8g |
I've just had the issue with the channel closed error while running RabbitMQ v3.9.13. I have increased the consumer_timeout as per the documentation, but the jobs crashed after about 5 hours. I have some even older jobs running now, so I'm not sure if this is related to the timeout. Going through the RabbitMQ documentation, I have noticed a possible mistake in the AiiDA documentation. It suggests:
however this appears to actually correspond to 1 hour, which is also what the RabbitMQ documentation says. |
Thanks for the report @Zeleznyj . Indeed, our wiki is incorrect and that is one hour, which would explain the error. Could you try to up it to lets say I will update the wiki now. |
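For anyone double-checking the units: RabbitMQ reads `consumer_timeout` in milliseconds, so a quick conversion helps avoid the off-by-a-factor mistake discussed above. The following is only a minimal sketch; the helper names (`ms_to_hours`, `hours_to_ms`, `conf_line`) are made up for illustration, and the numbers used are example values, not the ones quoted in the wiki or in this thread.

```python
# RabbitMQ expresses consumer_timeout in milliseconds.
MS_PER_HOUR = 60 * 60 * 1000


def ms_to_hours(timeout_ms: int) -> float:
    """Convert a consumer_timeout value in milliseconds to hours."""
    return timeout_ms / MS_PER_HOUR


def hours_to_ms(hours: float) -> int:
    """Convert a desired timeout in hours to the millisecond value RabbitMQ expects."""
    return int(round(hours * MS_PER_HOUR))


def conf_line(hours: float) -> str:
    """Render a `consumer_timeout` line in rabbitmq.conf key = value style (hypothetical helper)."""
    return f"consumer_timeout = {hours_to_ms(hours)}"


if __name__ == "__main__":
    # Illustrative values only.
    print(ms_to_hours(3_600_000))
    print(conf_line(10))
```

Checking a candidate value with `ms_to_hours` before pasting it into the server configuration makes it obvious whether it really covers the longest job you expect to run.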
I have tried increasing it, let's see if that helps, but the error is clearly somewhat random. I have encountered the error before and thought it was related to the fact that I'm running AiiDA on a laptop, but this time the computer was on the whole time the jobs were running. |
Has anyone ever tried using the
|
@ahkole I tried RabbitMQ 3.11.4 with the advanced config:
and everything worked as expected |
Ok, right now ✔ version: AiiDA v2.6.2.post0 I don't know in which PR this was solved, but I think we can close this. |
Alright, just found it. Adding here for the record: |
Well, it's up to you, but... I would say that is the solution to the "symptom", not the underlying problem (that rabbitmq is really not intended to be used this way) 😅 |
In rabbitmq/rabbitmq-server#2990 a consumer_timeout has been introduced and set to 15 minutes, meaning that any process task that takes longer than 15 minutes will be cancelled 😬 (there are people in that PR none too happy that this was introduced in a minor version).
The quick fix for this for users is either (a) use rabbitmq 3.7 or lower, or (b) configure consumer_timeout to false (see also https://www.rabbitmq.com/consumers.html#acknowledgement-timeout).
As is literally the last comment in that PR, at the time of writing, it is unclear to me off-hand if this can be done using the API (i.e. something aiida-core can handle automatically)?
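As a rough illustration of option (a), the rule of thumb "3.7 or lower is unaffected" could be checked from a version string like this. It is a sketch under assumptions: the function names are hypothetical, the threshold simply mirrors the wording of this issue (the exact patch release that introduced the timeout is not stated here), and how the version string is obtained from the broker is left out.

```python
import re
from typing import Tuple


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '3.8.16'."""
    numbers = re.findall(r"\d+", version)
    if len(numbers) < 2:
        raise ValueError(f"cannot parse version: {version!r}")
    return int(numbers[0]), int(numbers[1])


def may_hit_consumer_timeout(version: str) -> bool:
    """Return True if the version is newer than 3.7.

    Assumption: follows the rule of thumb from this issue, i.e. anything
    above the 3.7 series may enforce a consumer_timeout by default.
    """
    return parse_major_minor(version) > (3, 7)


if __name__ == "__main__":
    # Versions mentioned in this thread, plus one 3.7.x example.
    for v in ("3.7.28", "3.8.16", "3.9.13", "3.11.4"):
        print(v, may_hit_consumer_timeout(v))
```

Comparing integer tuples rather than strings matters here: as strings, "3.11" would sort before "3.7" and the check would silently give the wrong answer.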