
rpc: add limit for batch request and response size #26681

Merged
merged 32 commits into ethereum:master on Jun 13, 2023

Conversation

Contributor

@mmsqe mmsqe commented Feb 14, 2023

This PR adds server-side limits for JSON-RPC batch requests. Before this change, batches
were limited only by processing time. The server would pick calls from the batch and
answer them until the response timeout occurred, then stop processing the remaining batch
items.

Here, we are adding two additional limits which can be configured:

  • the 'item limit': batches can have at most N items
  • the 'response size limit': batches can contain at most X response bytes

These limits are optional in package rpc. In Geth, we set a default limit of 1000 items
and 25MB response size.
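
A minimal sketch of configuring these limits (the SetBatchLimits signature here follows the suggestion in the review below; treat it as an assumption rather than a quote from the diff):

```go
package main

import "github.com/ethereum/go-ethereum/rpc"

func main() {
	// Server side: cap batches at 1000 items and 25 MB of response data,
	// matching the Geth defaults described above. Without this call,
	// package rpc leaves batches unlimited.
	srv := rpc.NewServer()
	defer srv.Stop()
	srv.SetBatchLimits(1000, 25*1000*1000)

	// Client side: the same method exists on Client, since a client can
	// also receive requests from a server (see the discussion below).
	c, err := rpc.Dial("http://localhost:8545") // placeholder endpoint
	if err != nil {
		panic(err)
	}
	defer c.Close()
	c.SetBatchLimits(1000, 25*1000*1000)
}
```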

When a batch goes over a limit, an error response is returned to the client. However,
doing this correctly isn't always possible. In JSON-RPC, only method calls with a valid
id can be responded to. Since batches may also contain non-call messages or
notifications, the best we can do to report an error for the batch itself is to attach
the limit violation as an error to the first method call in the batch. If a batch is
too large but contains only notifications and responses, the error is reported with
a null id.
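
For illustration, an oversized batch containing only notifications might be answered like this on the wire (the error code and message below are illustrative placeholders, not quoted from the implementation):

```json
[
  { "jsonrpc": "2.0", "id": null, "error": { "code": -32600, "message": "batch too large" } }
]
```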

The RPC client was also changed so it can deal with errors resulting from oversized
batches. An older client connected to the server code in this PR could get stuck
until the request timeout occurred when a batch is too large. Upgrading to a version
of the RPC client containing this change is strongly recommended to avoid timeout issues.

For some weird reason, when writing the original client implementation, @fjl worked off of
the assumption that responses could be distributed across batches arbitrarily. So for a
batch request containing requests [A B C], the server could respond with [A B C] but
also with [A B] [C] or even [A] [B] [C] and it wouldn't make a difference to the
client.

So in the implementation of BatchCallContext, the client waited for all requests in the
batch individually. If the server didn't respond to some of the requests in the batch, the
client would eventually just time out (if a context was used).

With the addition of batch limits into the server, we anticipate that people will hit this
kind of error way more often. To handle this properly, the client now waits for a single
response batch and expects it to contain all responses to the requests.
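
A minimal sketch of the resulting client-side pattern (the endpoint URL and method choices are placeholders; the point is that per-request errors now surface on each BatchElem instead of the client hanging until the context times out):

```go
package main

import (
	"context"
	"fmt"
	"time"

	"github.com/ethereum/go-ethereum/rpc"
)

func main() {
	client, err := rpc.Dial("http://localhost:8545") // placeholder endpoint
	if err != nil {
		panic(err)
	}
	defer client.Close()

	batch := []rpc.BatchElem{
		{Method: "eth_blockNumber", Result: new(string)},
		{Method: "eth_chainId", Result: new(string)},
	}

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	// BatchCallContext returns an error only for batch-level failures.
	// Per-request failures (including a limit violation reported on the
	// first call) land in each element's Error field.
	if err := client.BatchCallContext(ctx, batch); err != nil {
		fmt.Println("batch failed:", err)
		return
	}
	for _, elem := range batch {
		if elem.Error != nil {
			fmt.Printf("%s failed: %v\n", elem.Method, elem.Error)
			continue
		}
		fmt.Printf("%s => %s\n", elem.Method, *elem.Result.(*string))
	}
}
```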

@mmsqe mmsqe marked this pull request as ready for review February 14, 2023 02:34
@mmsqe mmsqe marked this pull request as draft February 14, 2023 02:37
@mmsqe mmsqe marked this pull request as ready for review February 14, 2023 02:52
Contributor

@fjl fjl left a comment


This looks very good!

I think it would be better to make these limits configurable somehow. My suggestion for that would be a method like SetBatchLimits(length int, size int) on Server and Client.

rpc/errors.go (review thread resolved)
@mmsqe
Contributor Author

mmsqe commented Feb 14, 2023

This looks very good!

I think it would be better to make these limits configurable somehow. My suggestion for that would be a method like SetBatchLimits(length int, size int) on Server and Client.

Since the limits make more sense on the server side, I only moved the batch limit changes into Server and kept Client unchanged.

@fjl
Contributor

fjl commented Feb 14, 2023

The client can also accept requests from the server (and the server is implemented using Client) :)

@mmsqe
Contributor Author

mmsqe commented Feb 15, 2023

The client can also accept requests from the server (and the server is implemented using Client) :)

May I ask how to add this config to Client, which doesn't have access to it right now? Also, adding new params to newClient would be a breaking change.

@@ -786,6 +786,18 @@ var (
		Usage:    "Allow for unprotected (non EIP155 signed) transactions to be submitted via RPC",
		Category: flags.APICategory,
	}
	BatchRequestLimit = &cli.IntFlag{
		Name: "batch.request-limit",
Contributor


IMO these flags should be somewhere in the rpc. namespace
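
For reference, the flags did end up under the rpc. namespace. A hedged example invocation with the renamed flags (names as merged, to the best of my knowledge; verify against your geth build):

```sh
geth --rpc.batch-request-limit 1000 --rpc.batch-response-max-size 25000000
```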

@mmsqe mmsqe requested a review from fjl February 16, 2023 16:01
@holiman holiman added this to the 1.11.2 milestone Feb 17, 2023
@holiman holiman added this to the 1.12.1 milestone May 25, 2023
holiman and others added 4 commits May 31, 2023 03:32
This changes how we handle batch responses in order to be able to react to situations
where the server does not provide a response for every request batch element.

For some weird reason, when writing the original client implementation, I worked off of
the assumption that responses could be distributed across batches arbitrarily. So for a
batch request containing requests [A, B, C], the server could respond with [A B C] but
also with [A B] [C] or even [A] [B] [C] and it wouldn't make a difference to the client.

So in the implementation of BatchCallContext, the client waited for all requests in the
batch individually. If the server didn't respond to all requests in the batch, the client
would eventually just time out.

With the addition of batch limits into the server, I anticipate that people will hit this
kind of error way more often. To handle this properly, the client now waits for a single
response batch and expects it to contain all responses to the requests.
@fjl
Contributor

fjl commented Jun 8, 2023

I was finally able to fix this up properly. With the latest changes, the client will now properly handle wrong-sized batch responses and return early without relying on the timeout.

In JSON-RPC, only method calls with a non-null "id" can be responded to. Since batches can
contain non-call messages or notifications, the best we can do to report an
error with the batch itself is to use the first method call message.
@fjl
Contributor

fjl commented Jun 8, 2023

@holiman PTAL

fjl added 4 commits June 8, 2023 15:49
I think adding these limits might break people's setups, so best
to avoid configuring them by default in package rpc.
Contributor

@holiman holiman left a comment


Still LGTM

@fjl fjl merged commit f3314bb into ethereum:master Jun 13, 2023
mmsqe added a commit to mmsqe/go-ethereum that referenced this pull request Jun 13, 2023
mmsqe added a commit to mmsqe/go-ethereum that referenced this pull request Jun 14, 2023
Tristan-Wilson added a commit to OffchainLabs/go-ethereum that referenced this pull request Aug 3, 2023
This reverts commit 5061259.

Timeout handling for batches was improved upstream in PR
ethereum/go-ethereum#26681, which will be merged
in the next commit. Reverting this commit to make the merge cleaner.
Tristan-Wilson added a commit to OffchainLabs/go-ethereum that referenced this pull request Aug 3, 2023
rpc/handler.go: conflict with upstream's PR
(ethereum/go-ethereum#26681) to add batch
request and response size limits, and our own implementation of it.
Removed our implementation
(#198, which was moved
inside batchCallBuffer.pushResponse in the v1.11.2 merge
#205) in favor of
upstream.
github-merge-queue bot pushed a commit to dogechain-lab/dbsc that referenced this pull request Oct 31, 2023
### Description

replacing `rpc` module with the upstream version

### Changes

Focus PR: 

*  ethereum/go-ethereum#26681
*  ethereum/go-ethereum#27447

---------

Co-authored-by: Brandon Liu <[email protected]>
Co-authored-by: Adrian Sutton <[email protected]>
devopsbo3 pushed a commit to HorizenOfficial/go-ethereum that referenced this pull request Nov 10, 2023
devopsbo3 added a commit to HorizenOfficial/go-ethereum that referenced this pull request Nov 10, 2023
devopsbo3 added a commit to HorizenOfficial/go-ethereum that referenced this pull request Nov 10, 2023