op-batcher: adjust error handling on pending-channels after close #7683
fixed semgrep todo issues in base PR, semgrep comment is stale
Dismissing my review since I took over this PR.
Codecov Report
Additional details and impacted files
@@ Coverage Diff @@
## develop #7683 +/- ##
============================================
- Coverage 34.61% 20.18% -14.44%
============================================
Files 167 88 -79
Lines 7162 2101 -5061
Branches 1212 481 -731
============================================
- Hits 2479 424 -2055
+ Misses 4532 1649 -2883
+ Partials 151 28 -123
Co-authored-by: Adrian Sutton <[email protected]>
Test added that validates that in rare circumstances this is needed. This happens in scenarios where a block is written to the compressor, but not flushed yet to the output buffer. If we don't call outputFrames in channelManager.Close, this test fails.
LGTM. Just a couple of suggestions to avoid log messages sending me down a rabbit hole in the future. :)
- clarify that pending channels will be submitted
- use same key "id" for channel ids everywhere
if err != nil {
	l.Log.Error("error closing the channel manager", "err", err)
	if errors.Is(err, ErrPendingAfterClose) {
		l.Log.Warn("Closed channel manager on shutdown with pending channel(s) remaining - submitting")
	} else {
		l.Log.Error("Error closing the channel manager on shutdown", "err", err)
	}
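A self-contained sketch of the sentinel-error pattern in this hunk may help when reading it. Only `ErrPendingAfterClose` and the two log messages come from the PR; `closeManager`, `shutdownMessage` and `errOther` are hypothetical stand-ins, and messages are returned as strings instead of being logged.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrPendingAfterClose mirrors the sentinel error introduced in this PR.
var ErrPendingAfterClose = errors.New("pending channels remain after closing channel-manager")

// errOther is a hypothetical unrelated failure, used only for illustration.
var errOther = errors.New("some other failure")

// closeManager is a hypothetical stand-in for closing the channel manager.
// It wraps the sentinel with %w, so callers can still match it with errors.Is.
func closeManager(pending int) error {
	if pending > 0 {
		return fmt.Errorf("%d channel(s) left: %w", pending, ErrPendingAfterClose)
	}
	return nil
}

// shutdownMessage picks the level and message the way the hunk above does.
// It returns an empty string when there is nothing to log.
func shutdownMessage(err error) string {
	switch {
	case err == nil:
		return ""
	case errors.Is(err, ErrPendingAfterClose):
		return "WARN: Closed channel manager on shutdown with pending channel(s) remaining - submitting"
	default:
		return "ERROR: Error closing the channel manager on shutdown: " + err.Error()
	}
}

func main() {
	fmt.Println(shutdownMessage(closeManager(2)))
	fmt.Println(shutdownMessage(errOther))
}
```

Because `errors.Is` unwraps errors created with `%w`, the warning path keeps working even if the close error gains extra context later.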
This duplicate error handling seems to indicate that closing the channel manager with pending work isn't really an error at all, right?
Can `Close` return how much remaining work it has along with the error? That way its signal can be handled without these `errors.Is` calls.
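One way to read this suggestion, as a rough sketch only: `Close` returns the number of pending channels as a value and reserves the error for real failures. The `manager` type, its `pending` field, the `shutdown` helper and the log strings below are all hypothetical and not taken from the op-batcher code.

```go
package main

import "fmt"

// manager is a hypothetical, heavily simplified channel manager.
type manager struct {
	pending int // channels that still have data to submit
}

// Close reports the remaining work as a value, so pending channels no longer
// have to be signalled through a sentinel error. The error is reserved for
// real failures (none are modelled in this sketch).
func (m *manager) Close() (pending int, err error) {
	return m.pending, nil
}

// shutdown shows the caller side: no errors.Is check is needed.
func shutdown(m *manager) string {
	pending, err := m.Close()
	switch {
	case err != nil:
		return fmt.Sprintf("ERROR: Error closing the channel manager on shutdown: %v", err)
	case pending > 0:
		return fmt.Sprintf("WARN: %d pending channel(s) remaining - submitting", pending)
	default:
		return "INFO: channel manager closed cleanly"
	}
}

func main() {
	fmt.Println(shutdown(&manager{pending: 2}))
	fmt.Println(shutdown(&manager{}))
}
```

The trade-off is a wider signature for every caller of `Close`, in exchange for keeping "pending work" out of the error path.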
var ErrPendingAfterClose = errors.New("pending channels remain after closing channel-manager")

// Close clears any pending channels that are not in-flight already, to leave a clean derivation state.
// Close then marks the remaining current open channel, if any, as "full" so it can be submitted as well.
It seems like we're marking non-full channels as full specifically so they'll be handled in a certain way. Is that an appropriate use of the marking? I see elsewhere that we emit errors about an "error while closing full channel", and worry that will be confusing if the channel isn't actually full.
…hereum-optimism#7683)
* op-batcher: adjust error handling on pending-channels after close
* op-batcher: fix comment
* Capitalize start of log messages
  Co-authored-by: Adrian Sutton <[email protected]>
* op-batcher: Add NonCompressor for testing purposes
* op-node/rollup/derive: Return ErrChannelOutAlreadyClosed in SpanChannelOut
* op-batcher: Add back outputFrames call to channelManager.Close
  Test added that validates that in rare circumstances this is needed. This happens in scenarios where a block is written to the compressor, but not flushed yet to the output buffer. If we don't call outputFrames in channelManager.Close, this test fails.
* op-batcher: Improve logging
  - clarify that pending channels will be submitted
  - use same key "id" for channel ids everywhere
---------
Co-authored-by: Sebastian Stammler <[email protected]>
Co-authored-by: Adrian Sutton <[email protected]>
Description
This PR depends on #7682
This fixes a CI flake by adjusting the error reporting of pending-channels.
CI would previously fail when there is remaining pending channel-data that does not fit all in one channel, due to the compressor-full status.
Here's the rabbit-hole I went down to find this issue:
`ChannelOut.Close()` is being called twice on a closed state, which should never happen.
CI Log snippet:
Batcher:
- `ChannelOut.Close()` is called as part of `channelBuilder.closeAndOutputAllFrames()`; `closing channel out` is logged.
- `channelBuilder.closeAndOutputAllFrames()` is called as part of `outputFrames()`; `creating frames with channel builder` is logged.
- `outputFrames()` is called both as part of `channelManager.Close()` and `channelManager.TxData()`.
- `channelManager.Close()` is only called as `BatchSubmitter.state.Close()`, in mutually exclusive driver-loop events; `error closing the channel manager` is logged.
- `channelManager.TxData()` is only called from `publishTxToL1()`.
- `publishTxToL1` is called in a go-routine by `publishStateToL1`, but the caller always blocks until it is done publishing.
- `publishStateToL1` is called after closing the state and logging the closing error on shutdown, in the driver loop.
- `publishStateToL1` is also called in more driver-loop places.
Given that `publishStateToL1` waits until it is done, there seems to be no concurrency issue, and it's called many times over in regular operation.
But `closeAndOutputAllFrames` followed by `Close` seems to be the issue: only then can we hit the log from the CI failure.
On `channelManager.Close()`, the `channelBuilder.Close` happens, but this only really marks it as "full", with `ErrTerminated`.
However, the `ChannelOut` of the `channelBuilder` should have been removed if it were completely unutilized and had nothing to submit.
The test cases, when not dropping the error, then indicate the exact issue: the compressor is "full", and when an existing pending channel is "fully" submitted, it actually does not get fully submitted, and will appear already closed.
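The double close can be reproduced in a toy model. This is a sketch under simplifying assumptions, not the op-batcher code: `channelOut`, `builder`, `closeNaive`, `closeGuarded`, `fullBuilder` and `errAlreadyClosed` are hypothetical, and frame output is left out. The guard in `closeGuarded` follows the fix described under Changes.

```go
package main

import (
	"errors"
	"fmt"
)

// errAlreadyClosed is a stand-in for the "already closed" error seen in CI.
var errAlreadyClosed = errors.New("channel out already closed")

// channelOut is a hypothetical model: closing it twice is an error.
type channelOut struct{ closed bool }

func (c *channelOut) Close() error {
	if c.closed {
		return errAlreadyClosed
	}
	c.closed = true
	return nil
}

// builder models a channel builder that owns one channelOut.
type builder struct {
	out  channelOut
	full bool // set once the builder has been marked "full"
}

// closeAndOutputAllFrames closes the channel out; frame output is omitted.
func (b *builder) closeAndOutputAllFrames() error { return b.out.Close() }

// outputFrames only acts on full builders, as in the call chain above.
func (b *builder) outputFrames() error {
	if !b.full {
		return nil
	}
	return b.closeAndOutputAllFrames()
}

// closeNaive marks the builder as full and outputs frames unconditionally.
func closeNaive(b *builder) error {
	b.full = true
	return b.outputFrames()
}

// closeGuarded skips builders that were already full, and therefore already closed.
func closeGuarded(b *builder) error {
	if b.full {
		return nil
	}
	b.full = true
	return b.outputFrames()
}

// fullBuilder returns a builder that filled up during normal operation,
// so its channel out has already been closed once.
func fullBuilder() *builder {
	b := &builder{full: true}
	_ = b.outputFrames()
	return b
}

func main() {
	fmt.Println("naive:  ", closeNaive(fullBuilder()))
	fmt.Println("guarded:", closeGuarded(fullBuilder()))
}
```

In this model the naive path returns the already-closed error for a builder that was full before shutdown, while the guarded path returns no error.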
Changes
- Do not make the channel-manager `Close` func try to output frames: this happens as part of the later `TxData` calls anyway.
  Edit(seb): Only `Close` a channel and `outputFrames` in `channelManager.Close` if the channel wasn't already full. This was causing the double-closed error, when things didn't fit in a single channel.
Tests
- `Close` errors in the channel-manager tests are no longer dropped.
- `ChannelManagerClosePendingChannel` tested the exact case of remaining pending-channels work after channel-manager close, but ignored the error.
- Added new `NonCompressor` that can be used in testing. When writing to it, it first flushes data from any previous write. This makes it behave more predictably, which is better suited for tests.
- Using this one, added a new test `ChannelManager_Close_PartiallyPendingChannel` that shows that the `outputFrames` call is necessary in `channelManager.Close` in case there are unflushed blocks in the compressor.
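To illustrate the "flush the previous write first" behaviour described for `NonCompressor`, here is a minimal sketch. It is an assumption about how such a writer could look, built on the standard `compress/zlib` package at `zlib.NoCompression`; `flushFirstWriter` and its methods are hypothetical names, not the PR's implementation.

```go
package main

import (
	"bytes"
	"compress/zlib"
	"fmt"
)

// flushFirstWriter is a hypothetical sketch, not the PR's NonCompressor.
// It emits uncompressed zlib output and flushes before every write, so
// only the most recent write can still be sitting in the compressor.
type flushFirstWriter struct {
	buf bytes.Buffer
	zw  *zlib.Writer
}

func newFlushFirstWriter() (*flushFirstWriter, error) {
	w := &flushFirstWriter{}
	zw, err := zlib.NewWriterLevel(&w.buf, zlib.NoCompression)
	if err != nil {
		return nil, err
	}
	w.zw = zw
	return w, nil
}

// Write first flushes data from any previous write, then buffers p.
func (w *flushFirstWriter) Write(p []byte) (int, error) {
	if err := w.zw.Flush(); err != nil {
		return 0, err
	}
	return w.zw.Write(p)
}

// Flush pushes the last buffered write to the output buffer.
func (w *flushFirstWriter) Flush() error { return w.zw.Flush() }

// flushed reports whether p has reached the output buffer. This relies on
// NoCompression storing the payload bytes verbatim.
func (w *flushFirstWriter) flushed(p []byte) bool {
	return bytes.Contains(w.buf.Bytes(), p)
}

func main() {
	w, err := newFlushFirstWriter()
	if err != nil {
		panic(err)
	}
	first, second := []byte("first-block"), []byte("second-block")
	if _, err := w.Write(first); err != nil {
		panic(err)
	}
	fmt.Println("first flushed after 1st write: ", w.flushed(first))
	if _, err := w.Write(second); err != nil {
		panic(err)
	}
	fmt.Println("first flushed after 2nd write: ", w.flushed(first))
	fmt.Println("second flushed after 2nd write:", w.flushed(second))
}
```

The last write staying in the compressor until an explicit flush is the kind of unflushed state that, per the description, makes the `outputFrames` call in `channelManager.Close` necessary.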