Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

quiche: handle stream blockage during decodeHeaders and decodeTrailers #16128

Merged
merged 10 commits into from
Apr 29, 2021

Conversation

danzh2010
Copy link
Contributor

@danzh2010 danzh2010 commented Apr 22, 2021

Commit Message: use posted callback to block/unblock quic stream in EnvoyQuicStream::readDisable() with weak_ptr to stream to handle stream life time issue.
Additional Description: currently if readDisabled() is called in decodeHeaders|Trailers(), the stream will be blocked right away which breaks the assumption QUICHE has that a stream shouldn't be blocked during OnInitialHeadersComplete() and OnTrailingHeadersComplete().
This change makes the stream state change completely outside of QUICHE call stack. (The unblocking is already outside of QUICHE call stack in existing implementation.) Also simplify the blockage state change logic.

Risk Level: low
Testing: added more unit tests and enabled Http2UpstreamIntegrationTest::ManyLargeSimultaneousRequestWithBufferLimits which was flaky.

Part of #2557 #14829

@danzh2010
Copy link
Contributor Author

/assign @antoniovicente

Signed-off-by: Dan Zhang <[email protected]>
Signed-off-by: Dan Zhang <[email protected]>
@danzh2010
Copy link
Contributor Author

/assign @alyssawilk

Copy link
Contributor

@alyssawilk alyssawilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for tracking this one down!

// stream can trigger another decoding inside the callstack which messes up stream state, and
// blocking a quic stream in a QUICHE callstack is not expected by QUICHE.
*unblock_posted_ = true;
connection()->dispatcher().post(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optional: If this is happening with regularity would it make more sense to have an alarm registered at now? It'd handle the deletion of the session by unregistering on destruction so be a bit cleaner and you could check alarm registration rather than have a boolean here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Please consider use of a SchedulableCallback owned by the stream. Call scheduleCallbackNextIteration() to schedule the callback; call cancel() or delete the SchedulableCallback to avoid getting called once you have entered the stream termination sequence.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switched to use SchedulableCallback. Instead of scheduleCallbackNextIteration(), I used scheduleCallbackInCurrentIteration() because all we need is to not call stream SetUnblocked() in place.

sequencer()->SetBlockedUntilFlush();
} else {
ASSERT(read_disable_counter_ == 0, "readDisable called in between.");
sequencer()->SetUnblocked();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking at SetBlockedUntilFlush it's pretty clear calling it extra times won't waste CPU or be unsafe. If we read disable then read enable, we may end up calling some spurious SetUnblocked. That looks ok but can you confirm, and comment in code if it's fine?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, an extra OnDataAvailable() won't cause another push as data should have already been pushed in QUICHE stack or remain blocked by other condition.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SG, can you please comment that in code then? The other Envoy read enable/disable need parity (one unblock per block) so I think it's worth calling out that this is different.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commented at scheduling async_stream_blockage_change_ in readDisable() about the transient block and unblock.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, that's not quite what I was looking for.
Can we do something like comment in switchStreamBlockState that unlike the envoy readDisable calls, that if the underlying QUIC code has received N block calls, it will still unblock on one block call (so parity between block and unblock calls is not necessary)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

// stream will be spuriously unblocked and call OnDataAvailable(). This call shouldn't take any
// effect because any available data should have been processed already upon arrival or they
// were blocked by some condition other than flow control, i.e. Qpack decoding.
async_stream_blockage_change_->scheduleCallbackCurrentIteration();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use scheduleCallbackNextIteration

Use of scheduleCallbackCurrentIteration can result in an infinite chain of callbacks which starve every other connection in the proxy.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

DavidSchinazi
DavidSchinazi previously approved these changes Apr 28, 2021
Copy link
Contributor

@DavidSchinazi DavidSchinazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Feel free to ignore the nit if that's the only change before submission to avoid having to rerun CI just for that.

EXPECT_EQ("200", headers->getStatusValue());
quic_stream_->readDisable(true);
}));
if (quic::VersionUsesHttp3(quic_version_.transport_version)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: if (quic_version_.UsesHttp3()) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor

@alyssawilk alyssawilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM modulo one comment nit

sequencer()->SetBlockedUntilFlush();
} else {
ASSERT(read_disable_counter_ == 0, "readDisable called in between.");
sequencer()->SetUnblocked();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SG, can you please comment that in code then? The other Envoy read enable/disable need parity (one unblock per block) so I think it's worth calling out that this is different.

Signed-off-by: Dan Zhang <[email protected]>
Signed-off-by: Dan Zhang <[email protected]>
Signed-off-by: Dan Zhang <[email protected]>
Copy link
Contributor

@alyssawilk alyssawilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks for your patience here :-)

@alyssawilk alyssawilk merged commit 9e263d8 into envoyproxy:main Apr 29, 2021
gokulnair pushed a commit to gokulnair/envoy that referenced this pull request May 6, 2021
envoyproxy#16128)

Commit Message: use posted callback to block/unblock quic stream in EnvoyQuicStream::readDisable() with weak_ptr to stream to handle stream life time issue.
Additional Description: currently if readDisabled() is called in decodeHeaders|Trailers(), the stream will be blocked right away which breaks the assumption QUICHE has that a stream shouldn't be blocked during OnInitialHeadersComplete() and OnTrailingHeadersComplete().
This change makes the stream state change completely outside of QUICHE call stack. (The unblocking is already outside of QUICHE call stack in existing implementation.) Also simplify the blockage state change logic.

Risk Level: low
Testing: added more unit tests and enabled Http2UpstreamIntegrationTest::ManyLargeSimultaneousRequestWithBufferLimits which was flaky.

Part of envoyproxy#2557 envoyproxy#14829

Signed-off-by: Dan Zhang <[email protected]>
Co-authored-by: Dan Zhang <[email protected]>
Signed-off-by: Gokul Nair <[email protected]>
gokulnair pushed a commit to gokulnair/envoy that referenced this pull request May 6, 2021
envoyproxy#16128)

Commit Message: use posted callback to block/unblock quic stream in EnvoyQuicStream::readDisable() with weak_ptr to stream to handle stream life time issue.
Additional Description: currently if readDisabled() is called in decodeHeaders|Trailers(), the stream will be blocked right away which breaks the assumption QUICHE has that a stream shouldn't be blocked during OnInitialHeadersComplete() and OnTrailingHeadersComplete().
This change makes the stream state change completely outside of QUICHE call stack. (The unblocking is already outside of QUICHE call stack in existing implementation.) Also simplify the blockage state change logic.

Risk Level: low
Testing: added more unit tests and enabled Http2UpstreamIntegrationTest::ManyLargeSimultaneousRequestWithBufferLimits which was flaky.

Part of envoyproxy#2557 envoyproxy#14829

Signed-off-by: Dan Zhang <[email protected]>
Co-authored-by: Dan Zhang <[email protected]>
Signed-off-by: Gokul Nair <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants