Skip to content
This repository has been archived by the owner on Jun 1, 2021. It is now read-only.

Causal stream merge stage #342

Open
krasserm opened this issue Oct 27, 2016 · 0 comments
Open

Causal stream merge stage #342

krasserm opened this issue Oct 27, 2016 · 0 comments
Labels

Comments

@krasserm
Copy link
Contributor

Provide an Akka Stream stage that can merge n input streams (produced by DurableEventSources) by preserving causality i.e. the merged stream should still be a valid linear extension of the potential causality relation of DurableEvents. Merging can be done with linear complexity by scanning the input streams only once.

Causal stream merge is relevant when merging events from two or more sources that causally depend on each other. For example, if events from log A have been processed into log B, B causally depends on A. A plain merge of A and B would ignore this causal relationship and produce a stream in which effects may appear before their causes. Causal stream merge preserves causality and guarantees that effects only appear after their causes in the merged stream.

A use case is processing a stream, merged from A and B, with a single stateful DurableEventProcessor that writes processed events to target log C. Without doing a causal stream merge, the stream processing setup would be equivalent to two parallel processing pipelines, one from A to C and the other B to C each having its own processor instance. This however breaks state computation as state cannot be shared between processor instances.

An application could still decide to ignore the causal relationship of events from logs A and B by using a plain merge. The generated vector timestamps would just indicate their concurrent processing which is completely fine for some use cases. However, applications that want to do sequential processing of merged streams with a single processor should use causal stream merge, especially if the processor is a stateful processor whose processing logic relies on causal event delivery.

Causal stream merge works for arbitrary processing networks including cyclic ones. The processing network may overlap with replication networks if they are not filtered. Filtered replication connections may break the causal merge algorithm.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

1 participant