Implement /restate/invocation/{id}/attach and /restate/invocation/{id}/output #1503
Conversation
The important bits of this PR are the new …
Tested with restatedev/e2e#334 locally and it works
Thanks for creating this PR @slinkydeveloper. The changes look good to me; I mainly left minor comments. The one comment we should probably discuss is about linearizable reads and the need to complete requests by request type. I think with this approach we will not be able to provide linearizable reads of the result, because the result might originate from before a read request was sent. What was the reasoning behind batch-completing requests (not having to store multiple request ids on the PP side)?
waiting_responses: DashMap<
    (InvocationIdOrIdempotencyId, RequestType),
    HashMap<IngressResponseWaiterId, IngressInvocationResponseSender>,
>,
Grouping and fulfilling requests by request type instead of by IngressResponseWaiterId can have some undesired side effects. Assume, for example, the following scenario: we have a service invocation that, on completion, calls another service which writes to an external DB. The user observes this effect and then sends a get output request to the system, anticipating the fulfilled result of the first invocation. Now assume there was an earlier get output request from before the service invocation completed. If the not-yet-completed get output response is slow, it might arrive after the second get output request has been created. Consequently, the not-completed get output response would also complete the second get output request, even though one would expect to see a result.
So, long story short: with this implementation we won't be able to achieve linearizable reads, because the result of a read might originate from before the read request was sent.
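A minimal sketch of the alternative discussed here, keying pending responses per waiter so a response can only complete the waiters it was actually produced for. All names and types below are hypothetical stand-ins (a plain `HashMap` and `String` senders instead of `DashMap` and the real ingress sender types), not the PR's actual code:

```rust
use std::collections::HashMap;

// Stand-in types for illustration; the real code uses invocation ids,
// waiter ids, and response senders from the ingress dispatcher.
type InvocationId = u64;
type WaiterId = u64;
type ResponseSender = String;

#[derive(Default)]
struct PendingResponses {
    // Keyed per (invocation, waiter) rather than batch-completed by
    // request type: completing a response names the exact waiters
    // it belongs to.
    waiters: HashMap<InvocationId, HashMap<WaiterId, ResponseSender>>,
}

impl PendingResponses {
    fn register(&mut self, inv: InvocationId, waiter: WaiterId, sender: ResponseSender) {
        self.waiters.entry(inv).or_default().insert(waiter, sender);
    }

    // Complete only the listed waiters; waiters registered after the
    // response was produced stay pending, which is what preserves
    // linearizable reads in the scenario above.
    fn complete(&mut self, inv: InvocationId, targets: &[WaiterId]) -> Vec<ResponseSender> {
        let mut completed = Vec::new();
        if let Some(map) = self.waiters.get_mut(&inv) {
            for w in targets {
                if let Some(s) = map.remove(w) {
                    completed.push(s);
                }
            }
        }
        completed
    }
}
```

With this shape, a slow response for the early get output request would only ever complete the waiter that registered it, never a later one.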
Force-pushed from c807cce to bfcecc5
@tillrohrmann this is ready for another review pass. To make the rebase simpler, I had to squash everything together, I'm sorry. Among the changes I've made:
Thanks for updating this PR @slinkydeveloper. The changes make sense to me. There is one thing I would like to understand better and one comment we might want to address:
- Why is it not possible to model the workflow calls as idempotent calls where the idempotency key is derived from the workflow name and key? If it were possible, then we might be able to simplify some special cases in the partition processor state machine.
- Giving the ingress direct access to the PartitionStoreManager is not ideal, since we want to make the ingress independent of the worker (sooner rather than later). Instead, it would be better to use the network to let the ingress communicate with the worker to retrieve the output results.
impl InvocationStorageReader for InvocationStorageReaderImpl {
I think it would be great if the ingress didn't get direct access to the PartitionStoreManager, because we want to make the ingress independent of the worker (being able to start it as a separate role). Instead, it would be great if the ingress talked to the respective node via our network to ask for the output data. Otherwise, we somewhat cement the co-location of the ingress and the worker role.
As said below, this is purely an implementation detail of the InvocationStorageReaderImpl and we can change this later.
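A hedged sketch of what hiding that choice behind the reader abstraction could look like. The trait and both implementations below are hypothetical names, not the PR's actual code; the point is that a local, store-backed reader could later be swapped for a networked one without touching the ingress handlers:

```rust
// Hypothetical trait hiding where the output actually comes from.
trait OutputReader {
    fn get_output(&self, invocation_id: &str) -> Option<Vec<u8>>;
}

// Today: reads via local partition store access (ingress co-located
// with the worker).
struct LocalStoreReader;

impl OutputReader for LocalStoreReader {
    fn get_output(&self, _invocation_id: &str) -> Option<Vec<u8>> {
        // Would query the local partition store here.
        Some(b"local-result".to_vec())
    }
}

// Later: asks the node owning the partition over the network instead.
struct NetworkedReader;

impl OutputReader for NetworkedReader {
    fn get_output(&self, _invocation_id: &str) -> Option<Vec<u8>> {
        // Would issue a network request to the worker here.
        Some(b"remote-result".to_vec())
    }
}

// The ingress handler only depends on the trait, so swapping the
// implementation does not change the handler.
fn handle_get_output(reader: &dyn OutputReader, id: &str) -> Option<Vec<u8>> {
    reader.get_output(id)
}
```

This is the sense in which the direct PartitionStoreManager access stays an implementation detail that can be replaced later.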
if let Some(idempotency_id) = &idempotency_id {
    if service_invocation.invocation_target.invocation_target_ty()
        == InvocationTargetType::Workflow(WorkflowHandlerType::Workflow)
    {
        warn!("The idempotency key for workflow methods is ignored!");
Can you help me understand again why we cannot model workflow calls as idempotent invocations where the idempotency key is derived from the workflow name and key? Maybe this is also something we can document somewhere (the differences between workflow calls and normal invocations).
See my comment below. Where to document this, IDK; feel free to suggest a place.
Because:
I also don't want to over-generalize now: I think we're not completely sure about the workflow semantics yet, and while keeping it specialized is admittedly not elegant and adds special cases, it makes it easier to tune for now.
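For illustration only, the alternative raised in the question could look like the hypothetical derivation below. The function name and key format are invented for this sketch; the PR deliberately does not do this and keeps workflows specialized, for the reasons given above:

```rust
// Hypothetical: derive an idempotency key from workflow name + key, so a
// workflow call could in principle be treated as a plain idempotent
// invocation. The PR keeps workflow semantics specialized instead.
fn derive_workflow_idempotency_key(workflow_name: &str, workflow_key: &str) -> String {
    format!("{}/{}", workflow_name, workflow_key)
}
```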
Agree, but this can be hidden behind that "reader" abstraction. If that's OK for you, I would like to go ahead with the current implementation and iterate on it later.
Force-pushed from b9a7d0e to b1ecf03
* Minor stuff
* We now correctly implement the distinction between get output and attach responses. This makes sure we don't complete pending attaches with get output responses.
* Also deduplicated the ingress response types by adding a wrapping type for the target node.
* Another test, just in case, plus make sure the ingress dispatcher will try to unblock waiting responses both with idempotency id and invocation id
* Implement /restate/invocation/{id}/attach and /restate/invocation/{id}/output.
* Now the invocation id returned on /send with idempotency id is the one of the original invocation.
* Implement the new commands and the invocation id notification in the PP
* New ingress response variant to notify the invocation_id of an existing idempotent invocation. This is required to make sure that on /send idempotent requests, we return the invocation_id of the invocation we "attach to" for a given idempotent invocation.
* New state machine command to attach to an existing invocation

Second part of the PR:
* Add InvocationStorageReader to implement GetOutput, remove the GetOutput command
* Send attach notification for workflow calls.
* Feedback

Add /workflow/.../output and /workflow/.../attach
Force-pushed from b1ecf03 to 9e10002
Thanks for the clarification. The different behavior with respect to state makes sense.
Alrighty. Let's tackle this step after the release.
LGTM. +1 for merging :-)
Fix #1495 and #1494