Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

op-service, op-batcher, op-proposer: Active sequencer follow mode #8585

Merged
merged 44 commits into from
Dec 19, 2023

Conversation

EvanJRichard
Copy link
Contributor

@EvanJRichard EvanJRichard commented Dec 12, 2023

Description

This PR proposes to add dial.ActiveL2EndpointProvider and dial.ActiveL2RollupProvider. These endpoint providers will check that they are pointing to the active sequencer before providing rollup clients and l2 ethClients to callers. They are enabled by passing a comma-separated list as the rollup/l2 ethclient URL string. Additionally, this PR adds infrastructure for unit testing these structs, such as testutils.MockRollupClient.

Tests
See new unit test file, op-service/dial/active_l2_provider_test.go, about half the changeset is that file. Most unit tests follow the pattern where an ActiveL2RollupProvider/ActiveL2EndpointProvider is pointing at an active sequencer, "notices" that the sequencer goes inactive, and then dials a new sequencer that reports it is active.

This also successfully failed over in a devnet-like test run locally on my machine. It has also been tested to work on our internal cloud devnet during a ZDD.

Metadata

@EvanJRichard EvanJRichard self-assigned this Dec 12, 2023
Copy link
Contributor

semgrep-app bot commented Dec 13, 2023

Semgrep found 1 nil-check-after-call finding:

Potential p2pmetrics nil dereference when NewBandwidthCounter is called

Ignore this finding from nil-check-after-call.

@EvanJRichard EvanJRichard force-pushed the evan/batcher-active-seq branch from 4aad2e9 to 3fa961b Compare December 13, 2023 21:23
@EvanJRichard EvanJRichard force-pushed the evan/batcher-active-seq branch from 3fa961b to 9fb14eb Compare December 13, 2023 21:29
@EvanJRichard EvanJRichard marked this pull request as ready for review December 13, 2023 21:33
@EvanJRichard EvanJRichard requested a review from a team as a code owner December 13, 2023 21:33
Copy link
Contributor

coderabbitai bot commented Dec 13, 2023

Walkthrough

Walkthrough

An update has been made across several files to support multiple URL configurations for L2 Ethereum RPC and Rollup RPC endpoints. This includes handling comma-separated lists and ensuring URL counts match. A new ActiveL2EndpointProvider and ActiveL2RollupProvider have been introduced to manage multiple endpoints and facilitate failover. The changes also include new validation logic, additional test cases, and the import of the strings package to handle string operations.

Changes

File Path Summary
.../batcher/config.go
.../proposer/service.go
Updated CLIConfig to allow comma-separated lists for L2EthRpc and RollupRpc, added validation for URL counts, and imported strings. Also, conditionally used NewActiveL2RollupProvider based on comma presence in cfg.RollupRpc.
.../batcher/service.go Modified initRPCClients to handle multiple URLs for RPC clients.
.../dial/active_l2_provider.go
.../dial/active_rollup_provider.go
Introduced new types and functions for managing multiple L2 and Rollup endpoints, including failover logic.
.../dial/active_l2_provider_test.go Added new test cases for failover scenarios, active sequencer selection, and long check durations; modified test harness setup.

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on X ?


Tips

Chat with CodeRabbit Bot (@coderabbitai)

  • You can reply to a review comment made by CodeRabbit.
  • You can tag CodeRabbit on specific lines of code or files in the PR by tagging @coderabbitai in a comment.
  • You can tag @coderabbitai in a PR comment and ask one-off questions about the PR and the codebase. Use quoted replies to pass the context for follow-up questions.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger a review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • The JSON schema for the configuration file is available here.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/coderabbit-overrides.v2.json

@EvanJRichard EvanJRichard changed the title [DRAFT] op-service, op-batcher, op-proposer: Active sequencer follow mode op-service, op-batcher, op-proposer: Active sequencer follow mode Dec 13, 2023
op-service/dial/rollupclient_interface.go Show resolved Hide resolved
op-batcher/batcher/config.go Show resolved Hide resolved
op-service/testutils/mock_eth_client.go Outdated Show resolved Hide resolved
op-service/dial/dial.go Outdated Show resolved Hide resolved
op-service/dial/active_l2_provider_test.go Outdated Show resolved Hide resolved
op-service/dial/active_rollup_provider.go Outdated Show resolved Hide resolved
op-service/testutils/mock_rollup_client.go Outdated Show resolved Hide resolved
op-service/testutils/mock_rollup_client.go Outdated Show resolved Hide resolved
op-service/dial/active_rollup_provider.go Outdated Show resolved Hide resolved
op-proposer/proposer/service.go Outdated Show resolved Hide resolved
Copy link
Member

@sebastianst sebastianst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going in a good direction. Maybe we want to simplify further by not doing retries after having iterated over the whole list once. We then offload the retry driving to the callers. Might work well, should check.

op-service/dial/active_l2_provider.go Show resolved Hide resolved
op-service/testutils/mock_rollup_client.go Outdated Show resolved Hide resolved
op-service/dial/active_rollup_provider.go Outdated Show resolved Hide resolved
op-service/dial/active_rollup_provider.go Show resolved Hide resolved
op-service/dial/active_l2_provider.go Outdated Show resolved Hide resolved
op-service/dial/active_l2_provider_test.go Outdated Show resolved Hide resolved
op-service/dial/active_l2_provider_test.go Outdated Show resolved Hide resolved
op-service/dial/active_l2_provider_test.go Outdated Show resolved Hide resolved
op-service/dial/active_l2_provider_test.go Outdated Show resolved Hide resolved
op-batcher/batcher/service.go Outdated Show resolved Hide resolved
Copy link
Contributor

semgrep-app bot commented Dec 14, 2023

Semgrep found 1 import-text-template finding:

  • op-bindings/gen/main.go: L14

When working with web applications that involve rendering user-generated content, it's important to properly escape any HTML content to prevent Cross-Site Scripting (XSS) attacks. In Go, the text/template package does not automatically escape HTML content, which can leave your application vulnerable to these types of attacks. To mitigate this risk, it's recommended to use the html/template package instead, which provides built-in functionality for HTML escaping. By using html/template to render your HTML content, you can help to ensure that your web application is more secure and less susceptible to XSS vulnerabilities.

Ignore this finding from import-text-template.

Semgrep found 1 ifs-tampering finding:

  • op-challenger/scripts/list_claims.sh: L17

The special variable IFS affects how splitting takes place when expanding unquoted variables. Don't set it globally. Prefer a dedicated utility such as 'cut' or 'awk' if you need to split input data. If you must use 'read', set IFS locally using e.g. 'IFS="," read -a my_array'.

Ignore this finding from ifs-tampering.

EvanJRichard and others added 2 commits December 18, 2023 14:35
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Copy link
Member

@sebastianst sebastianst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may be confused, don't we want to retry a new sequencer if dialing one fails?

We might need to add functionality to the test setup to let the dialers error on dial.

op-service/dial/active_rollup_provider.go Outdated Show resolved Hide resolved
op-service/dial/active_rollup_provider.go Outdated Show resolved Hide resolved
op-service/dial/active_l2_provider_test.go Outdated Show resolved Hide resolved
EvanJRichard and others added 5 commits December 19, 2023 10:07
- Preserve the invariant that the index and current rollup/eth
  client match.
- Dial at the start of the loop instead of at the end.
Copy link
Member

@sebastianst sebastianst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be good to go! There are some comments to fix in the regression test. Also proposed another 1-2 tests that we might have missed.

op-service/dial/active_l2_provider_test.go Outdated Show resolved Hide resolved
op-service/dial/active_l2_provider_test.go Outdated Show resolved Hide resolved
op-service/dial/active_l2_provider_test.go Outdated Show resolved Hide resolved
op-service/dial/active_rollup_provider.go Show resolved Hide resolved
op-service/dial/active_l2_provider.go Show resolved Hide resolved
Copy link
Member

@sebastianst sebastianst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM ✨

@sebastianst sebastianst added this pull request to the merge queue Dec 19, 2023
Merged via the queue into develop with commit c04cefe Dec 19, 2023
59 checks passed
@sebastianst sebastianst deleted the evan/batcher-active-seq branch December 19, 2023 22:47
roberto-bayardo pushed a commit to roberto-bayardo/optimism that referenced this pull request Dec 21, 2023
…hereum-optimism#8585)

* op-service: Add ActiveL2EndpointProvider.

* Fix bug in initialization, and handle case where no ethUrls are provided.

* Split active L2 provider into active rollup and active L2 provider.

* Re-duplicate some code until tests are passing.

* op-proposer: Add ability to enable active provider.

* op-batcher: Add ability to enable active provider.

* Add an empty test skeleton.

* Add an empty test skeleton.

* op-service: add, but do not yet use, RollupClientInterface and EthClientInterface.

* op-service: update mocks and interfaces for endpoint provider testing.

* op-service - WIP on Active L2 Providers: unit tests pass, design and impl contains TODOs.

* op-service: restore design in Active Endpoint Providers that only keeps one client open at a time.

* op-service: when dialing a new sequencer, close() the old connection.

* op-service: obey coderabbit suggestion around safer handling of p.currentIndex in Active L2 Providers.

* op-service, op-batcher, op-proposer: address review comments in PR#8585.

* op-service: Active L2 Provider - add test case for a sequencer returning an error.

* op-service: Active L2/Rollup Providers: improve unit testing and logging.

* op-service, op-batcher: address review comments in 8585 regarding first-startup behavior and testing.

* op-service: address review comments through adding more tests, and moving "nil client" behavior from client getter to constructor.

* op-service: minor error message change in active endpoint providers.

* Update op-service/dial/active_l2_provider.go

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* op-service: obey linter in rabbit-provided error message change.

* Update op-service/dial/active_l2_provider.go

Co-authored-by: Sebastian Stammler <[email protected]>

* op-service active L2 provider tests: assertAllExpectations after most tests.

* op-service: more elegantly handle startup in active l2 providers, and improve testing.

* Change remaining longDurationTests to be able to use ept.assertAllExpectations.

* use new error errSeqUnset.

* Add test for scenario where many sequencers are inactive, and only the last is active.

* Readability change: move the on-creation initialization to its own function.

* Move extra one-time dial to constructor.

* Update op-service/dial/active_rollup_provider.go

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* Add nil check to active l2 provider.

* Update op-service/dial/active_rollup_provider.go

Co-authored-by: Sebastian Stammler <[email protected]>

* Address review comment: change many-inactive tests to many-undialable tests.

* Add test that reproduces internal state corruption.

* op-service: Improve active seq provider

- Preserve the invariant that the index and current rollup/eth
  client match.
- Dial at the start of the loop instead of at the end.

* Fix some tests.

* Move usage of ExpectClose to MaybeClose, we don't want to enforce a particular close behavior in these tests.

* add a missing call to assertAllExpectations.

* Test even the case where the active providers are managing a list of 1 element.

* Revert experimental hunk in active_l2_provider.

---------

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: Sebastian Stammler <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants