Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RetryLater AutoDelete queue is causing problems #238

Closed
gingertez opened this issue Jul 5, 2017 · 8 comments
Closed

RetryLater AutoDelete queue is causing problems #238

gingertez opened this issue Jul 5, 2017 · 8 comments

Comments

@gingertez
Copy link

We have a high-throughput system for which we expect some messages to fail and be re-queued and so we are using the AdvancedMessageContext to retry in a couple of seconds.

The creation of the rety_in_2000_ms queue happens automatically when we call context.RetryLater(timespan) and then the queue is automatically deleted when the queued messages have completed.

However, the next time we try to call context.RetryLater(timespan) we get an exception in our stack trace:
The AMQP operation was interrupted: AMQP close-reason, initiated by Peer, code=404, text="NOT_FOUND - no exchange 'rety_in_2000_ms' in vhost '/'", classId=60, methodId=40, cause=

Doing a Google search turns up this which indicates that the issue might be related to the AutoDelete setting on the retry queue: http://rabbitmq.1065348.n5.nabble.com/Send-problem-receiver-closes-exchange-td25702.html

Is my guess correct that it is down to the AutoDelete queue and if so is there any way to override this?

@pardahlman
Copy link
Owner

Hello, @gingertez - thanks for reporting this 👋 .

All topology related calls (declaring queue/exchange and binding queue/exchange) is handled by the registered ITopologyProvider, which has a default implementation TopologyProvider. Each time a is published, the exchange is declared via a call to DeclareExchangeAsync.

It would be costly to declare the exchange each time, so the client tries to evaluate if the exchange has already been declared, and if so does not declare it. However, looking at that particular code, it looks like it does not take AutoDelete into consideration.

So my guess at what's happening is that it works the first time, then the exchange is removed but is still in the list of known declared exchanges and therefor not declared again, which is why the publish fails.

The retry later feature has been re-implemented in the next version of the client (2.x), which is my main focus now. If you want to, you could try to register a custom implementation of the topology provider (based on the default one) where you check for auto delete and see if it helps. If it does, feel free to create a PR with the fix.

PS. There is a PR for the typo in the retry queue name 😉 .

@pardahlman
Copy link
Owner

Hang on... looks like the the exchange is not added to the list of known exchanges if it uses autodelete: https://github.com/pardahlman/RawRabbit/blob/stable/src/RawRabbit/Common/TopologyProvider.cs#L217

@gingertez
Copy link
Author

Thanks Pär

I'm not 100% sure how to go about what you suggested in your first reply. Any pointers?

FYI we're currently on version 1.10.3.

Cheers,
Terry.

@gingertez
Copy link
Author

Based on your second comment - does that mean that amending/declaring my own TopologyProvider won't resolve the issue?

It looks to me like the problem is not that RawRabbit isn't re-creating the queue the second time, the problem is that when RawRabbit tries to create the queue RabbitMQ won't allow it because it's already been deleted.

Does that sound plausible?

If so, what can be done about it?

@pardahlman
Copy link
Owner

Hello again, @gingertez! My first hypothesis was that the problem was due to the client "caching" of known declared topology features and the fact that some of these features can have AutoDelete. However, looking at the code again I realized that this was not the case. So, as for solution proposal you can scrap what I said in my previous comment 😉

I've looked into the problem some more and was occasionally able to reproduce it. I believe that the key is this line of code, where the retry exchange is declared with AutoDelete set to true. This means that the exchange will be removed if no queues are bound to it. As you can see a few lines below the retry queues are un-bound from the exchange once the message is received (in order to not get any more messages routed to that queue that might have been published to a different exchange). However, if the retry later method is called concurrently multiple times, the following sequence may occur:

  1. Thread A declares exchange
  2. Thread B declares same exchange (no problem)
  3. Thread A binds queue to exchange, publishes message and unbounds. This will remove the exchange
  4. Exception thrown, as the exchange has been removed (due to AutoDelete)

You should be able to get around this by implementing a custom IContextEnhancer that declares the exchange with AutoDelete set to false.

@pardahlman
Copy link
Owner

Closing this ticket for now, feel free to re-open if you have any more input on the topic!

@gingertez
Copy link
Author

Your last message indicated that you found the bug in your code... is there any chance you could actually fix the bug for future consumers?

Cheers,
Terry.

@pardahlman
Copy link
Owner

I'm focusing my development efforts in making 2.0 ready for launch. Up until recently, there has been no planed release of the 1.x branch of RawRabbit. However, #257 is something that I want fixed so it's possible that this issue will be fixed in the same release. I can not commit to dates, sorry! Hope this helps!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants