Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support idempotent writes #26

Open
asosMikeGore opened this issue Nov 10, 2017 · 7 comments
Open

Support idempotent writes #26

asosMikeGore opened this issue Nov 10, 2017 · 7 comments

Comments

@asosMikeGore
Copy link

When writing streams, we currently do no checks to determine if an event already exists in a stream. This means the application logic has to implement this functionality which generally ends up as lots of boilerplate code.

The way we can improve this is to allow writes to have an idempotent check to determine if the write has already occurred and ignore it if it has. The current thinking for how to do this is

  • For writes with a single event, query the stream for an event with matching id and ignore it
  • For writes with multiple events, query the stream based only on the 1st event. The benefits of doing it this way are that it'll reduce RU consumption, plus if the write is truly idempotent all the other events should be the same.
  • Make the option configurable so people can opt-in/out to the check. Current thinking is the feature should be off by default as it'll increase RU consumption.

Clients would have to use deterministic GUIDs when appending events to the stream, otherwise the check would never work.

@PeterStephenson
Copy link

I think there's an issue using the ID to handle idempotency...

Say we have a message handler subject to "at least once delivery" if we receive the same message twice then the processing will run twice, but typically we would assign a new guid to any resultant events. This would mean the ids always differ.
I actually can't think of a scenario where 2 events created by running a code section twice would result in the same ID :s

@asosMikeGore
Copy link
Author

That's why we'd have to use deterministic GUIDs for this to work. As you say doing Guid.NewGuid() is always going to generate a new value.

Check this Stack Overflow article for an example:
https://stackoverflow.com/questions/2642141/how-to-create-deterministic-guids

You end up creating a GUID based on a string hash - that way you can get predictable results each time an input command/even is processed.

@DaveAurionix
Copy link

DaveAurionix commented Nov 10, 2017

We would definitely look at using this if it was added, but the toggle on/off config would be essential.

I'm a little worried about:

  1. The guid generation being open to flawed implementations (e.g. think of the standard things to watch around GetHashCode implementations). Generally I feel that deterministic hash codes allow a bucketing optimisation (e.g. in dictionaries) but are always backed up with a detailed property-level comparison between objects as a sanity check. As long as the guid was generated based on something simple then the risk of bugs and the resulting collision would be very low. If we decide that the "duplicate event" decision needs to examine several properties then we're reliant on mathematically sound ways to generate that guid uniquely, including allowing for nulls, empty-strings, and other default values without introducing bugs. Collision begins unlikely, but becomes more likely as bugs are introduced - and bugs that may be hard to spot.
  2. Different versions of an event may need to change the data fed into the guid generation / hashing mechanism which may inadvertently break the idempotency checks. This would be mitigated by good unit test coverage around the guid generation, with a very careful approach to backwards compatibility and refactoring unit tests very carefully to cover all combinations of data from historical scenarios (events now modified ... but exist in the stream in an older form). Are we too dependent on very careful code reviews and pairing, rather than being safe-by-default?
  3. Although we do have the boilerplate code problem, it at least makes it really obvious to people what the idempotency check is, and that there is one. It's very easy to unit test. It makes people think of idempotency needs constantly. There's no magic.
  4. The current boilerplate approach applies the idempotency check at the beginning of event handlers. The proposed new approach would apply the check at the end, when a stream was saved. It would perhaps not be easily readable from the code but developers would need to constantly remember that code will always execute twice before one route fails a check. This solution therefore wouldn't remove boilerplate code for methods that have any side-effects, or those side-effects would need to be buffered and "acted out / applied" if the Save operation succeeded ... which then raises worse rollback/compensation questions if the Save succeeds but side-effects fail ... so wouldn't be an answer. Whilst this weakness still exists with the boilerplate code, it is very very unlikely to happen as two handlers have to fire at exactly the same time with exactly the same data.

Regardless of the above ... this next idea looks a little naff in storage, but is it worth making the event id a string not a guid and allowing the consumer to build up a string that is easier to debug, e.g. "{EventName}" might be a default implementation, or "{EventName}-{DateTime}-{Amount}" might represent some sort of payment event. I'm assuming that event-id only needs to be unique within a stream, not across streams. It would remove the guid determinism and collision concerns and it would be easier to debug and support multiple versions. It makes it obvious what's going on. On the other hand, it looks horrible in storage and it duplicates data found elsewhere on the event. It couldn't be compressed if the storage engine is responsible for the idempotency check.

Alternatively, I'm not sure if the boilerplate code issue could be minimised avoiding all of the above with attributes rather than as an event-store concern, e.g.

[IfNotAlreadyApplied("PaymentRequested-{Customer.Id}-{DateTime}-{Payment.Amount}")]
public void AuthorisePayment(PaymentRequested paymentRequestedEvent)
{
  // Do stuff, then ...
  Apply(paymentRequestedEvent);
}

that still needs "boilerplate" code ... but actually is very easy to read, to see what's happening, to manage event versions and it can provide the idempotency check before the method executes. The elements between braces are properties on PaymentRequested. The above might be implemented using reflective code in a library which wouldn't look that pretty, but the API to the consumer seems attractive and simple? The business logic is easy to read? Or it could be implemented using post-processing and Roslyn code generation (but I'd consider that road carefully before taking it).

@DaveAurionix
Copy link

Actually this would be better, where PaymentRequested implements an interface returning a string EventId ... but that means all persisted events would need to implement an interface from a library, which sucks - at the moment all ours are POCO):

[IfNotAlreadyApplied(typeof(PaymentRequested))]
public void AuthorisePayment(PaymentRequested paymentRequestedEvent)
{
  // Do stuff, then ...
  Apply(paymentRequestedEvent);
}

@PeterStephenson
Copy link

@asosMikeGore I think I'd be worried about GUID collisions if we were using a deterministic guid as they are no longer cryptographic ally random, correct? Do you have any testing to show that if we were to do this with our throughput we would not see any collisions? It seems like a collision on this ID would be very bad?

@asosMikeGore
Copy link
Author

@DaveAurionix @PeterStephenson
Current thinking would be that opt-in/out would be on a per-append basis. Worth mentioning a few things around this:

  • The idempotency check would only ever check within a stream, never across streams.
  • If you set the GUID namespace for a name based GUID to be the GUID of the input command/event being processed this reduces the risk of collision within a stream even more. I do think it's worth spiking this out and researching collision rates as I've not really had a use case to use deterministic GUIDs before.
  • I don't see the value in comparing fields or worrying about versions of events. Either it's the same GUID or not.
  • The idempotency feature can't be used in isolation if your command/event handler has side-effects that are themselves not idempotent e.g. refunding a customer. You'd still need to do an upfront check. In that scenario you may as well not use the check, but it could be very helpful where you are taking a message and producing events from it e.g. process managers.
  • I know other people are doing this with event sourcing (it's how it works with Event Store), but I'm not too sure to what scale.
  • If command/handler logic is side effect free, you just append the corresponding event to the stream after reading the latest revision of the stream. If we had a feature in to disable optimistic concurrency, but have idempotency check in place, then you are literally just appending an event, no read required.
  • While an app-centric idempotency check is easy to code & unit test, there is an associated cost, so really with this I'm just trying to see what we can do to make things easier for people.

@DaveAurionix
Copy link

DaveAurionix commented Nov 15, 2017

Thank you Mike.

I think the event-version question is relevant. "Same GUID" needs defining and that creates the issue. If an event (say V1 for ease) is represented by a certain deterministic guid, and later a new property is introduced (V2 of the same event, conceptually but with versionless events it's more of a label) then people would need to think carefully about how that new property affects idempotency. If its value is included in the definition of that specific business event then the algorithm used to generate the deterministic guid suddenly needs very careful attention so that it generates the same value for older events without that new property whilst at the same time generating a guid that takes into account the new property for newer events. If it broke backwards compatibility for older events then the idempotency check in the storage layer would fail. A stream with the V1 event, but consumed by a software version where handlers now use a newer guid generation algorithm, would not flag up a V2 as being a duplicate, because the generated guid would be different. Worse - if they received another "V1" deserialised as a V2 (because: versionless), they still might not flag it as duplicate if the default value for that field doesn't contribute to producing the same guid value as the V1 algorithm did without the property factored in at all.

Unit testing and very careful thinking would I'm sure come up with a solution - but I think that even the need to think so carefully about determinism is non-obvious, let alone the unit test cases and subsequent implementation required.

Underpinning the above is the assumption that an idempotency check on some events may need to examine more than just the event name. If we're able to operate solely in the simpler problem space then I agree wholeheartedly that the guid determinism complexity goes away / is irrelevant.

Doing away with the initial read sounds powerful, I very much like the sound of that. That said, I struggled to find many examples where we could. I'm certain that it's my lack of imagination, but it seems that it would only be possible in handlers that needed no input besides the triggering event, plus perhaps used web api GETs (which we discourage in our micro-services) or that accessed an internal database (which negates the benefit of removing the read, and the event store would typically be that source of data anyway), plus were side-effect free ... so in my mind I arrive at a place where skipping the read and instead checking-on-save is a combination that is only possible for simple side-effect-free single-event-in handlers that only publish one or more events out using no additional data (or perhaps just GET from a 3rd party). What am I missing here? There's an elephant and I can't see it even when it's treading on my foot.

To me, the storage-layer idempotency check sounds great and I want to find ways to use it more, but it would initially be a "nice-to-have" to create new options for us when designing solutions. I doubt we'd launch into using it all over the place, but again that might just be my initial lack of imagination. Would be very interested to see applications that use it well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants