Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial commit to support Azure Service Bus #1438

Merged
merged 4 commits into from
Mar 18, 2021

Conversation

damonsiusta
Copy link
Contributor

Submitted on behalf of Spektrix to support the feature described in #28 (Support Azure Service Bus)

@holytshirt
Copy link
Member

@penderi You maybe interested in this too

@holytshirt
Copy link
Member

Amazing and thank you @damonsiusta going to try review this afternoon!

@iancooper
Copy link
Member

Wow,thanks. Will try to review ASAP

@penderi
Copy link
Contributor

penderi commented Mar 16, 2021

Thanks @holytshirt & hello mate! This truly is amazing.
Yes I knew - I responded to the callout for brighter support esp ASB and asked @mustodomenico if we could help with getting his stuff in and adding to the ASB support! He contacted @damonsiusta and here we are!

We're (Flagstone) primed to help you guys with the PR and from next sprint (2 weeks time) can commit a couple of stories.

We'll take a look and see if we can come up with some focused stories to help!! Exciting and thanks so much @damonsiusta

@mustodomenico
Copy link

So @damonsiusta correct if I am wrong, this has been running in production for around 5/6 months ?
I think it would be helpful to share some of the azure service bus specific bugs we had and the why of some of the design decision.
One thing that jumps to mind is the fact that there are no integration tests as there is no (easy) way of running azure service bus locally

@damonsiusta
Copy link
Contributor Author

Glad to help, as @mustodomenico mentioned it may not be 100% feature-complete but we've been using it in production since last year with no issues with the features we have implemented.

The most significant change from the version we use (Brighter 8.1.1036) is our current version of AzureServiceBusChannelFactory.CreateChannel takes a Connection parameter but I can see that's now changed to be a Subscription type (had to also create a new AzureServiceBusSubscription class). I've updated this in the PR and also updated the corresponding test so that should hopefully be all good.

You'll also notice we've used a lot of wrappers around Service Bus components, there wasn't an easy way for us to mock these classes so hence the reason for this approach.

I know the preference from the contribution guidelines is to use MSpec but noticed other projects use xUnit so we converted the existing tests from nUnit instead. As I'm sure you'll see our tests do add a dependency onto Moq.

@damonsiusta
Copy link
Contributor Author

Also just to say that it wasn't just me that worked on developing this, it was very much a team effort from some very talented engineers at Spektrix (but especially @mustodomenico!)

Copy link
Member

@iancooper iancooper left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First, thanks for this. it's a great contribution and really helps us. Most of the comments below are items that we should fix, at some point, but don't block taking this PR. Thanks also for putting in the effort to bring this up to V9.

Required) is something we need to fix. N(ice to have) is something we can fix at some point

Some notes:

1: R. Can you sign the CLA? If you do, feel free to put your name on the copyright for the boilerplate MIT info for a file.
2: N. AzueServiceBusConsumer.cs Usually we end up with a factory to create a Brighter message from a platform message (*Creator). This helps us implement a Tolerant Reader that copes with a variety of failures around the contract by the sender.
3: N. AzueServiceBusConsumer.cs If Acknowledge does ReceiveAndDelete the risk is that work that we don't process successfully will be deleted. This results in message loss if the consumer crashes for example. Hence our usual strategy is to only ack once processed. This would mean you also need to implement Reject as delete, to kill a poison message. As is, it works, it just makes different guarantees to our normal ones.
4: N. AsureServiceBusMessagePrroducer.cs Again, idiomatically, we tend to have a factory factory (*Publisher) to create platform messages from a Brighter message rather than inline. We are strict about what we send.
5: N. AsureServiceBusMessagePrroducer.cs If checking for a topic is expensive, as in with cloud services, we may choose to do that once, in the constructor, over on each call, as the latency of topic existence checking will apply to every message otherwise. For code running in a datacenter with RMQ this is less important so we check each time there (although increasingly we are seeing cloud deployments where this goes over a WAN and is expensive). It is also a no-op in code that tends to check each time
6: R ThreadSafeInMemoryMessageStore.cs. Not sure where this gets used. We have a thread safe in memory outbox and inbox in v9. Is this doing the same thing? Might be useful to understand if we need this?

@iancooper
Copy link
Member

We need to update our guidance on MSpec as we moved away from it. Thanks for spotting that.

@damonsiusta
Copy link
Contributor Author

Missed Dom's comment earlier, so looking in the history of our component the last commit to this was in September 2020 so we've been running it at least in one service since then. Other's may have already come across this but one significant issue we had (and continue to have) is that there is no local emulator for Azure Service Bus. We've been forced into creating individual developer Service Bus instances in Azure (along with corresponding scripts to support the creation and teardown of these). You could write integration tests following this approach (takes around 5 minutes for a new instance).

Just having a look through some of the commits -

  • I think early on we failed to take into account transient connection failures you can get with Service Bus, this was resolved by adding appropriate Polly policies.
  • There was a lot (a lot!) of debate about the correct usage of async methods, in particular when to use GetAwaiter().GetResult(). Before we changed this to its current form, the service would appear to hang and not process any messages until we restarted it.
  • As mentioned in my earlier comment, the reason for all the wrappers is because the Service Bus library we use doesn't expose any interfaces for us to easily Mock (e.g. ManagementClient, TopicClient etc).

Also just remembered that we only use ReceiveMode.ReceiveAndDelete and don't give the option to use ReceiveMode.PeekLock. This is something that will probably need to be implemented as other consumers may want to use this pattern instead.

@iancooper
Copy link
Member

@mustodomenico With the SNS/SQS we call out to AWS for tests (we put the secrets into GitHub's secret manager). I can probably set up an Azure account if we want to have some tests, and pop the relevant secrets into GitHub. If we just use environment variables in the tests then anyone can run locally against their own account by setting up the environment variables, and we can run the tests as part of CI using our own creds.

@iancooper
Copy link
Member

iancooper commented Mar 16, 2021

@damonsiusta @mustodomenico And now I see that CLAHub is bust and has cyber-squatters . Alright let me figure out what to do :-D

clahub/clahub#174

It looks as as though: https://github.com/cla-assistant/cla-assistant may be an alternative

@mustodomenico
Copy link

@iancooper @damonsiusta in regards to point number 3 (ReceiveMode.PeekLock)
The first implementation was using ReceiveMode.PeekLock and appropriately acknowledging (or rejecting) the message once processed. We moved to ReceiveMode.ReceiveAndDelete in order to minimise network communication between the clients and azure service bus. This also results in costs saving as azure service bus charges you based on the amount of API requests. In the particular scenario at Spektrix potential loss of messages was not an issue.
An improvement could be to make this configurable

@iancooper
Copy link
Member

iancooper commented Mar 16, 2021

An improvement could be to make this configurable

Yeah. That is really what the new subscription derived classes are for, so that we have more control over platform specific behavior. So we could have a flag to let the consumer choose whether they want to take the risk of ack on read in return for reduced costs or the safety of ack on complete, but document the cost

@penderi
Copy link
Contributor

penderi commented Mar 16, 2021

We can look to implement a similar integration test approach to SNS - we have azure subs available and we'd follow a similar pattern with secrets.
We could look to add a configurable strategy for ack.
We could look at MSI connections.
We could also look to make creation of topics/subscriptions configurable (for reduced rights deployment).

@iancooper
Copy link
Member

We can look to implement a similar integration test approach to SNS - we have azure subs available and we'd follow a similar pattern with secrets.
We could look to add a configurable strategy for ack.
We could look at MSI connections.
We could also look to make creation of topics/subscriptions configurable (for reduced rights deployment).

That would certainly be a valuable set of things to knock off the shopping list :-)

@iancooper
Copy link
Member

@penderi It would also be valuable to get it to run as part of the CI build in GitHub Actions

@damonsiusta
Copy link
Contributor Author

Just pushed a small update to remove ConcurrentInMemoryMessageStore and ThreadSafeInMemoryMessageStoreTests, looking through the project these weren't used and were probably left behind from an earlier implementation.

@damonsiusta
Copy link
Contributor Author

@iancooper I'm not sure what to do about the CLA? I went to cla-assistant.io but I think this needs configuring on the repo side?

@damonsiusta
Copy link
Contributor Author

On point 5 (checking for a topic), we only check this once on the first call to Send after which _topicCreated field is set to true. We did have an early bug with this where it ended up checking on every message and costs ending up being significantly more.

@penderi
Copy link
Contributor

penderi commented Mar 16, 2021

On point 5 (checking for a topic), we only check this once on the first call to Send after which _topicCreated field is set to true. We did have an early bug with this where it ended up checking on every message and costs ending up being significantly more.

Not a problem. I'm selfishly looking at our deployment scenario for optionally altering the topic/subscription default behaviour.
Also, for context, I'm far from a core maintainer or contributor on brighter. Over the ages I've been involved but very much an eager and willing helper for the maintainers! My comments are merely suggestions for the team!

@iancooper
Copy link
Member

@iancooper I'm not sure what to do about the CLA? I went to cla-assistant.io but I think this needs configuring on the repo side?

I need to fix it up. Just spotted that the old tooling we used is bust. It will take me a day or so then I will ping you. Bit if a formality, but we need to get it signed before we merge.

@iancooper
Copy link
Member

Just pushed a small update to remove ConcurrentInMemoryMessageStore and ThreadSafeInMemoryMessageStoreTests, looking through the project these weren't used and were probably left behind from an earlier implementation.

Thanks. They were probably needed at one point, when we forced you to have a message store, if you wanted one that was in-memory. We have newer in-memory inboxes and outboxes in v9, with TTL and reclaiming of space if folks need that

@iancooper
Copy link
Member

On point 5 (checking for a topic), we only check this once on the first call to Send after which _topicCreated field is set to true. We did have an early bug with this where it ended up checking on every message and costs ending up being significantly more.

Yeah, the support for: assume its there; validate its there; make it if missing is new in v9.

When we started with data-center based message-oriented middleware you tended to declare what you needed and it was a no-op if it was there (RMQ frx). Increasingly as we deploy to cloud we discover that folks are building their infra with Terraform, Pulumi etc and don't want the client to do it.

So V9 now has a flag on a subscription and a pubication to say: what should we do. It would be good to fold this into the ASB transport so that it is equivalent. (We still don't do this for MSSQL or Redis, but slightly different worlds)

@iancooper
Copy link
Member

PS @damonsiusta and @penderi we are always excited to get new contributors or old contributors coming back to help

@CLAassistant
Copy link

CLAassistant commented Mar 16, 2021

CLA assistant check
All committers have signed the CLA.

@iancooper
Copy link
Member

@iancooper I'm not sure what to do about the CLA? I went to cla-assistant.io but I think this needs configuring on the repo side?

I need to fix it up. Just spotted that the old tooling we used is bust. It will take me a day or so then I will ping you. Bit if a formality, but we need to get it signed before we merge.

@damonsiusta Should be fixed, you can sign now

Copy link
Member

@iancooper iancooper left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going to suggest we merge and fix

@iancooper
Copy link
Member

Going to suggest we merge and fix @holytshirt I have suggested to @penderi that he creates a new branch and PR for the nice-to-haves. Are you good with that?

@iancooper
Copy link
Member

PS We need to add @damonsiusta to our contributor list, but is there anyone else to credit?

@damonsiusta
Copy link
Contributor Author

@iancooper So for contributors besides myself we have:
@alesgn
@sagarSahay
@FBryant87
and
@mustodomenico

@penderi
Copy link
Contributor

penderi commented Mar 17, 2021

If you can merge ASAP we can look to create at least 1 PR, and possibly 4 - 1 for each feature mentioned above:

  • at least 1 integration test with ability to run against azure config info in secrets/env variables as per SQS tests
  • add ability to specify MSI connection on ASB Publication/Config
  • config to allow creation of topics/subscriptions to be optional
  • configurable behaviour of ack on read/write (in Channels? I'm going to need a little more info on that)

2/4 seem straightforward. The last 2 will need more thought, so we may need a little guidance. Separating the above and opening separate discussions/draft PRs seem sensible.

@holytshirt holytshirt merged commit 79002eb into BrighterCommand:master Mar 18, 2021
@holytshirt
Copy link
Member

Merged to master.
Amazing thank @damonsiusta and team so much for this, and thanks for all the background information!

@penderi it's on master now

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support Azure Service Bus
6 participants