SIMD-0186: Transaction Data Size Specification
2501babe committed Oct 21, 2024
1 parent 58ab7ca commit a364ce7
Showing 1 changed file with 121 additions and 0 deletions.
121 changes: 121 additions & 0 deletions proposals/0186-transaction-data-size-specification.md
@@ -0,0 +1,121 @@
---
simd: '0186'
title: Transaction Data Size Specification
authors:
- Hanako Mumei
category: Standard
type: Core
status: Review
created: 2024-10-20
feature: (fill in with feature tracking issues once accepted)
---

## Summary

Before a transaction can be executed, every account it may read from or write to
must be loaded, including any programs it may call. The amount of data a
transaction is allowed to load is capped, and if it exceeds that limit, loading
is aborted. This functionality is already implemented in the validator. The
purpose of this SIMD is to explicitly define how transaction size is calculated.

## Motivation

Transaction data size accounting is currently unspecified, and the
implementation-defined algorithm used in the Agave client exhibits some
surprising behaviors:

* BPF loaders required by top-level invoked programs are counted against
transaction data size. BPF loaders required by CPI invoked programs are not. If
a required BPF loader is also invoked or included in the accounts list, it is
counted twice.
* The size of a program owned by the upgradeable BPF loader (henceforth
LoaderV3) may or may not include the size of its programdata, depending on how
the program is used in the transaction. Programdata is also counted if it is
itself included in the transaction. This means programdata may be counted zero,
one, or two times.

All validator clients must arrive at precisely the same transaction data size
for all transactions, because a difference of one byte can determine whether a
transaction executes or fails. We also want the calculated transaction data
size to correspond closely to the actual amount of data the transaction
requests.

Therefore, this SIMD seeks to specify an algorithm that is straightforward to
implement in a client-agnostic way, while also accurately accounting for the
total data required by the transaction.

## New Terminology

N/A

## Detailed Design

The proposed algorithm is as follows:

1. Every account explicitly included on the transaction accounts list is counted
once and only once.
2. A program owned by LoaderV3 also includes the size of its programdata.
3. Other than point 2, no accounts are implicitly added to the total data size.
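
The three rules above can be sketched as a single pass over the transaction's account keys. This is a minimal illustration and not the Agave implementation: `Address`, `Account`, `loaderv3_programdata`, and `transaction_data_size` are hypothetical names, and the sketch assumes that missing accounts contribute zero bytes and that the programdata address of a LoaderV3 program has already been resolved.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical 32-byte account address; a stand-in for a real pubkey type.
type Address = [u8; 32];

/// Hypothetical, simplified view of a loaded account.
struct Account {
    /// Length of the account data in bytes.
    data_len: usize,
    /// For a LoaderV3 program account, the address of its programdata account.
    loaderv3_programdata: Option<Address>,
}

/// Sums the data size of a transaction following rules 1 to 3.
/// Assumption: accounts missing from `accounts` count as zero bytes.
fn transaction_data_size(
    account_keys: &[Address],
    accounts: &HashMap<Address, Account>,
) -> usize {
    let mut seen = HashSet::new();
    let mut total = 0usize;
    for key in account_keys {
        // Rule 1: each explicitly listed account is counted once and only once.
        if !seen.insert(*key) {
            continue;
        }
        if let Some(account) = accounts.get(key) {
            total = total.saturating_add(account.data_len);
            // Rule 2: a LoaderV3 program also includes its programdata size.
            if let Some(programdata_key) = account.loaderv3_programdata {
                if let Some(programdata) = accounts.get(&programdata_key) {
                    total = total.saturating_add(programdata.data_len);
                }
            }
        }
        // Rule 3: nothing else is added implicitly (for example, no loaders).
    }
    total
}
```

Because rule 2 is applied independently of rule 1, listing a programdata account next to its program adds the programdata size a second time, which matches the behavior described later in this section.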

Transactions may include a
`ComputeBudgetInstruction::SetLoadedAccountsDataSizeLimit` instruction to define
a data size limit for the transaction. Otherwise, the default limit is 64MiB
(`64 * 1024 * 1024` bytes). In the future, this default may be changed by
amending this SIMD.

If a transaction exceeds its data size limit, account loading is aborted and the
transaction fails. Fees will be charged once
`enable_transaction_loading_failure_fees` is enabled.
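
The limit check can be sketched as follows. This is an illustration under stated assumptions, not the validator's code: `effective_limit`, `check_data_size`, and `LoadError` are hypothetical names, the requested limit is assumed to arrive as an optional `u32` taken from the compute budget instruction, and "exceeds" is read as strictly greater than the limit. Any additional validation of the requested value is out of scope here.

```rust
/// Default limit from this proposal: 64MiB.
const DEFAULT_LOADED_ACCOUNTS_DATA_SIZE_LIMIT: usize = 64 * 1024 * 1024;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum LoadError {
    DataSizeLimitExceeded,
}

/// Returns the limit requested by the transaction, or the default if none was set.
fn effective_limit(requested: Option<u32>) -> usize {
    requested
        .map(|bytes| bytes as usize)
        .unwrap_or(DEFAULT_LOADED_ACCOUNTS_DATA_SIZE_LIMIT)
}

/// Fails loading when the total loaded data size is strictly above the limit.
fn check_data_size(total: usize, requested: Option<u32>) -> Result<(), LoadError> {
    if total > effective_limit(requested) {
        Err(LoadError::DataSizeLimitExceeded)
    } else {
        Ok(())
    }
}
```

Under this reading, a transaction that loads exactly as many bytes as its limit still passes, and one additional byte causes it to fail.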

Read-only and writable accounts are treated the same. In the future, when direct
mapping is enabled, this SIMD may be amended to count them differently.

As a consequence of 1 and 2, programdata is counted twice if a transaction
includes both programdata and the program account itself in the accounts list.
This is partly done for ease of implementation: we always want to count
programdata when the program is included, and there is no reason for any
transaction to include both accounts except during initial deployment.
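
To make the double counting concrete, the sketch below computes the bytes charged for a single LoaderV3 program and programdata pair, depending on which of the two appears in the accounts list. The function name, its parameters, and the sizes used in the example are hypothetical and chosen only for illustration.

```rust
/// Bytes charged for one LoaderV3 program/programdata pair under rules 1 and 2.
/// All inputs are hypothetical illustration values.
fn charged_bytes(
    program_len: usize,
    programdata_len: usize,
    lists_program: bool,
    lists_programdata: bool,
) -> usize {
    let mut total = 0;
    if lists_program {
        // Rule 1 counts the program account; rule 2 adds its programdata.
        total += program_len + programdata_len;
    }
    if lists_programdata {
        // Rule 1 counts programdata again as an explicitly listed account.
        total += programdata_len;
    }
    total
}
```

For a 36-byte program account with 1000 bytes of programdata, listing only the program charges 1036 bytes, while listing both charges 2036 bytes.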

There is no special handling for programs owned by the native loader or the
non-upgradeable BPF loaders.

Account size for programs owned by LoaderV4 is left undefined. This SIMD should
be amended before LoaderV4 is enabled.

## Alternatives Considered

* Transaction data size accounting is already enabled, so the null option is to
enshrine the current Agave behavior in the protocol. This is undesirable because
the current behavior is highly idiosyncratic, and LoaderV3 program sizes are
routinely undercounted.
* Builtin programs are backed by accounts that only contain the program name as
a string, typically making them 15-40 bytes. We could make them free when not
instruction accounts, since they're part of the validator. However, this
adds complexity for no real benefit.
* We include LoaderV3 programdata size in program size because almost all
transactions will use the program account, which forces a load of programdata,
and not use programdata directly. To be truly consistent, we might want to count
LoaderV1 and LoaderV2 programs twice if they're instruction accounts, since the
data does have to be loaded twice. However, this adds complexity for what may be
an Agave-specific implementation detail, and these programs are rarely used.

## Impact

The primary impact is that this SIMD makes correctly implementing transaction
data size accounting much easier for other validator clients.

It makes transactions which include program accounts for CPI somewhat larger,
but given the generous 64MiB limit, it is unlikely that any existing users will
be affected.

## Security Considerations

Security impact is minimal because this SIMD merely simplifies an existing
feature.

This SIMD requires a feature gate.

## Backwards Compatibility

Transactions that call LoaderV3 programs via CPI and are extremely close to the
64MiB limit may now exceed it.
