
Fixing hasNext behaviour #1745

Merged: 14 commits merged into main from geal/fix-hasnext on Sep 14, 2022

Conversation


@Geal Geal commented Sep 12, 2022

We previously returned an empty GraphQL response at the end of the
response stream to set the `hasNext` field to false, to indicate that no
more responses will come.

That empty response is causing issues in some clients, so 24a00e6 was a
fix that set `hasNext` on a deferred response from inside query
planner execution. But that approach does not account for parallel deferred
response executions: one response might arrive with `hasNext` set to false,
and then another response could follow.

This commit uses another approach, where we go through an
intermediate task that checks whether the response stream is closed.

Fixes #1687
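The old behaviour can be sketched as follows; the `Response` type and `old_terminate` helper here are hypothetical stand-ins for illustration, not the router's actual types:

```rust
// Minimal illustration of the behaviour described above. `Response` and
// `old_terminate` are hypothetical, not the router's real code.
#[derive(Debug, Clone, PartialEq)]
struct Response {
    data: Option<&'static str>,
    has_next: bool,
}

// Old approach: every real response claims `has_next: true`, and a final
// empty response is appended just to flip `has_next` to false.
fn old_terminate(mut responses: Vec<Response>) -> Vec<Response> {
    for r in &mut responses {
        r.has_next = true;
    }
    responses.push(Response { data: None, has_next: false });
    responses
}

fn main() {
    let out = old_terminate(vec![
        Response { data: Some("primary"), has_next: true },
        Response { data: Some("deferred"), has_next: true },
    ]);
    // The trailing element is the empty response that confused some clients.
    assert_eq!(out.last().unwrap().data, None);
    assert!(!out.last().unwrap().has_next);
}
```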

we previously returned an empty GraphQL response at the end of the
response stream to set the `hasNext` field to false, to indicate that no
more responses will come.

That empty response is causing issues in some clients, so 24a00e6 was a
fix to set `hasNext` on a deferred response from inside query
planner execution, but it does not account for parallel deferred
response executions, so one response might come with `hasNext` set to false
and then another one could follow.

This commit attempts another solution, where we go through an
intermediate task that checks if the response stream is closed (it is a
channel, so it implements `FusedStream`). Unfortunately, right now it
fails to recognize when the stream is closed.
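The intermediate-task idea can be sketched synchronously. This is a minimal analogy using `std::sync::mpsc` in place of futures' channels and streams; all names are illustrative, not the router's code:

```rust
use std::sync::mpsc;
use std::thread;

// Synchronous sketch of the "intermediate task": it sits between the
// execution side and the client, and it learns that the stream is closed
// when every sender has been dropped (iter() then ends).
fn forward_until_closed(input: mpsc::Receiver<&'static str>) -> Vec<&'static str> {
    let (client_tx, client_rx) = mpsc::channel();
    for msg in input.iter() {
        client_tx.send(msg).unwrap();
    }
    drop(client_tx);
    client_rx.iter().collect()
}

fn main() {
    let (exec_tx, exec_rx) = mpsc::channel();
    // The "execution" side produces responses, then drops its sender,
    // which is what closes the stream.
    thread::spawn(move || {
        exec_tx.send("primary").unwrap();
        exec_tx.send("deferred").unwrap();
    });
    let received = forward_until_closed(exec_rx);
    assert_eq!(received, vec!["primary", "deferred"]);
}
```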

let stream = once(ready(first)).chain(rest).boxed();
let (mut sender2, receiver2) = futures::channel::mpsc::channel(10);
Contributor
Can we find a better name ? :p

Contributor Author
that's "WiP"

Contributor Author
I'll find a better name once I get it working :D

since the stream is marked as terminated from inside poll_next, we need
to call it a second time after getting a message, to check whether it is
closed. We cannot do that with an async method, because we need to
send the current message as soon as possible. So we call `try_next` and,
depending on its result, we either send the current message; or, if the
channel is closed, set `hasNext` on the current message and send it; or
send the current message, get the next one, and try again to see if
there is another one.
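The look-ahead described in this commit message can be sketched synchronously with std's `try_recv`, which plays a role similar to `try_next`. All names here are illustrative, not the router's actual code:

```rust
use std::sync::mpsc::{self, TryRecvError};

#[derive(Debug, PartialEq)]
struct Response {
    data: &'static str,
    has_next: bool,
}

// Sketch of the look-ahead, mapped onto std's try_recv:
//  - Ok(next)          -> current is not last; emit it, continue with next
//  - Err(Disconnected) -> stream closed; current is the last response,
//                         so flip has_next to false before emitting
//  - Err(Empty)        -> stream still open but no message yet; the async
//                         version would send `current` now and await the
//                         next message, here we simply block on recv()
fn forward(rx: mpsc::Receiver<&'static str>) -> Vec<Response> {
    let mut out = Vec::new();
    let mut current = match rx.recv() {
        Ok(m) => m,
        Err(_) => return out,
    };
    loop {
        match rx.try_recv() {
            Ok(next) => {
                out.push(Response { data: current, has_next: true });
                current = next;
            }
            Err(TryRecvError::Disconnected) => {
                out.push(Response { data: current, has_next: false });
                return out;
            }
            Err(TryRecvError::Empty) => match rx.recv() {
                Ok(next) => {
                    out.push(Response { data: current, has_next: true });
                    current = next;
                }
                Err(_) => {
                    out.push(Response { data: current, has_next: false });
                    return out;
                }
            },
        }
    }
}

fn main() {
    let (tx, rx) = mpsc::channel();
    tx.send("primary").unwrap();
    tx.send("deferred").unwrap();
    drop(tx);
    let out = forward(rx);
    assert!(out[0].has_next);
    assert!(!out[1].has_next);
}
```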
@jpvajda jpvajda added the bug label Sep 12, 2022
@abernix abernix removed the bug label Sep 13, 2022
abernix commented Sep 13, 2022

@jpvajda I removed the bug label as we don't put bug labels on PRs but rather the issues that PRs close with fixes. Since #1687 is the issue this PR aims to close, the bug label is over there now 🪰 .

Geal commented Sep 13, 2022

so this solution with channels appears to be working when testing manually, but not in integration tests. I am testing another solution using an atomic counter, but it is currently very racy.

Some issues I'm encountering here:

  • some deferred responses can be entirely created from the primary query, but their actual content is generated in the supergraph service, in the format_response call. In some cases, that can result in an empty deferred response that should not be sent. But at this point we might have already decided that this was the last response and set has_next on it
  • some deferred responses can be generated in parallel, and decrease the counter faster than the other end can keep up. That leads to cases where we're waiting for 2 deferred responses (counter = 2), both finish at the same time (counter = 0), now the response filter sees the counter at 0 and sets has_next to false on the first one, then it receives the second deferred response
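The second bullet's interleaving can be replayed deterministically; this is a sketch of the (abandoned) counter-based design, with hypothetical names, not the router's final code:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical counter-based design: the response filter believes more
// responses are coming as long as the pending counter is non-zero.
fn has_next_according_to_counter(pending: &AtomicUsize) -> bool {
    pending.load(Ordering::SeqCst) != 0
}

fn main() {
    let pending = AtomicUsize::new(2); // waiting for 2 deferred responses

    // Both deferred executions complete before the filter gets to run:
    pending.fetch_sub(1, Ordering::SeqCst); // response A finishes
    pending.fetch_sub(1, Ordering::SeqCst); // response B finishes

    // The filter now handles response A: the counter already reads 0, so it
    // wrongly marks A as the last response...
    assert!(!has_next_according_to_counter(&pending));
    // ...yet response B still arrives after a response that said
    // `has_next: false`.
}
```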

Geal commented Sep 13, 2022

I will need #1640 to land first: to make the behaviour more coherent, I will move the response formatting step to the execution service, so the execution service will always return correct responses

@Geal Geal changed the title WIP: attempt at fixing hasNext Fixing hasNext behaviour Sep 13, 2022
@Geal Geal marked this pull request as ready for review September 13, 2022 15:43
Geal commented Sep 13, 2022

This is now working and can be reviewed (I'll clean up the remaining println calls tomorrow)

receiver
}

async fn consume_responses(
@BrynCooke BrynCooke Sep 14, 2022
I think there may be a race condition here.

In the case where `stream.try_next()` is called and returns an error because there may be more items in the stream, if the stream is then closed before the call to `next` in `filter_stream`, there will be no final empty response with `has_next: false`.

Contributor Author

`try_next` is like `next` but without awaiting: if there is a message in the stream, it is returned by `try_next`; if not, `try_next` returns an error, and the next call to `next` would wait for one.

In the case you are describing, `try_next` would not return an error if there are in-flight messages. In the case where `try_next` does return an error, and somehow between returning from `consume_responses` and the call to `next` the stream gets new messages and is then closed, `next` would return a message, some calls to `try_next` would return messages, and when nothing remains `try_next` would return `Ok(None)`.

The one possible race I worry about is if messages are received and re-sent, then we await on `next`, and then for whatever reason the stream is disconnected (maybe all the senders are dropped). Then we would need to add a final `has_next = false` response. But I don't see how this could play out.

Contributor Author

@BrynCooke that last case should be addressed by 5210765

Contributor

LGTM!

there could be a race condition where we consume and send all messages,
then await the next one, then the stream is closed and we don't have
any message on which to set `has_next`. So we detect that case and
add a last message
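A minimal sketch of that fallback, assuming responses are collected into a vector; the `close_stream` helper and `Response` type are hypothetical:

```rust
#[derive(Debug, PartialEq)]
struct Response {
    data: Option<&'static str>,
    has_next: bool,
}

// Hypothetical helper for the race described above: if the last forwarded
// response still says has_next: true when the stream closes (or nothing was
// sent at all), append one last empty response so the client learns the
// stream is finished.
fn close_stream(sent: &mut Vec<Response>) {
    if sent.last().map_or(true, |r| r.has_next) {
        sent.push(Response { data: None, has_next: false });
    }
}

fn main() {
    // Normal case: the last real response already carries has_next: false.
    let mut done = vec![Response { data: Some("d"), has_next: false }];
    close_stream(&mut done);
    assert_eq!(done.len(), 1);

    // Race case: we awaited the next message and the stream closed instead,
    // so a final empty response is synthesized.
    let mut racy = vec![Response { data: Some("d"), has_next: true }];
    close_stream(&mut racy);
    assert_eq!(racy.len(), 2);
    assert!(!racy.last().unwrap().has_next);
}
```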
@Geal Geal enabled auto-merge (squash) September 14, 2022 09:57
@Geal Geal merged commit be96131 into main Sep 14, 2022
@Geal Geal deleted the geal/fix-hasnext branch September 14, 2022 10:10
@abernix abernix mentioned this pull request Sep 14, 2022
Successfully merging this pull request may close these issues.

eager hasNext: false in @defer payloads