From a364ce762ca5f6a6c5d4f20d79f1e01f4211cb60 Mon Sep 17 00:00:00 2001
From: hanako mumei <81144685+2501babe@users.noreply.github.com>
Date: Sun, 20 Oct 2024 21:48:54 -0700
Subject: [PATCH] SIMD-0186: Transaction Data Size Specification

---
 ...186-transaction-data-size-specification.md | 121 ++++++++++++++++++
 1 file changed, 121 insertions(+)
 create mode 100644 proposals/0186-transaction-data-size-specification.md

diff --git a/proposals/0186-transaction-data-size-specification.md b/proposals/0186-transaction-data-size-specification.md
new file mode 100644
index 000000000..b3b1ed334
--- /dev/null
+++ b/proposals/0186-transaction-data-size-specification.md
@@ -0,0 +1,121 @@
+---
+simd: '0186'
+title: Transaction Data Size Specification
+authors:
+  - Hanako Mumei
+category: Standard
+type: Core
+status: Review
+created: 2024-10-20
+feature: (fill in with feature tracking issues once accepted)
+---
+
+## Summary
+
+Before a transaction can be executed, every account it may read from or write to
+must be loaded, including any programs it may call. The amount of data a
+transaction is allowed to load is capped, and if it exceeds that limit, loading
+is aborted. This functionality is already implemented in the validator. The
+purpose of this SIMD is to explicitly define how transaction size is calculated.
+
+## Motivation
+
+Transaction data size accounting is currently unspecified, and the
+implementation-defined algorithm used in the Agave client exhibits some
+surprising behaviors:
+
+* BPF loaders required by top-level invoked programs are counted against
+transaction data size. BPF loaders required by CPI invoked programs are not. If
+a required BPF loader is also invoked or included in the accounts list, it is
+counted twice.
+* The size of a program owned by the upgradeable BPF loader (henceforth
+LoaderV3) may or may not include the size of its programdata depending on how it
+is used on the transaction, in addition to counting programdata if it itself is
+included on the transaction. This means programdata may be counted zero, one, or
+two times.
+
+All validator clients must arrive at precisely the same transaction data size
+for all transactions because a difference of one byte can determine whether a
+transaction is executed or failed. Also, we want the calculated transaction data
+size to correspond closely to the actual amount of data the transaction
+requests.
+
+Therefore, this SIMD seeks to specify an algorithm that is straightforward to
+implement in a client-agnostic way, while also accurately accounting for the
+total data required by the transaction.
+
+## New Terminology
+
+N/A
+
+## Detailed Design
+
+The proposed algorithm is as follows:
+
+1. Every account explicitly included on the transaction accounts list is counted
+once and only once.
+2. A program owned by LoaderV3 also includes the size of its programdata.
+3. Other than point 2, no accounts are implicitly added to the total data size.
+
+Transactions may include a
+`ComputeBudgetInstruction::SetLoadedAccountsDataSizeLimit` instruction to define
+a data size limit for the transaction. Otherwise, the default limit is 64MiB
+(`64 * 1024 * 1024` bytes). In the future, this default may be changed by
+amending this SIMD.
+
+If a transaction exceeds its data size limit, account loading is aborted and the
+transaction is failed. Fees will be charged once
+`enable_transaction_loading_failure_fees` is enabled.
+
+Read-only and writable accounts are treated the same. In the future, when direct
+mapping is enabled, this SIMD may be amended to count them differently.
+
+As a consequence of 1 and 2, programdata is counted twice if a transaction
+includes both programdata and the program account itself in the accounts list.
+This is partly done for ease of implementation: we always want to count
+programdata when the program is included, and there is no reason for any
+transaction to include both accounts except during initial deployment.
+
+There is no special handling for programs owned by the native loader or the
+non-upgradeable BPF loaders.
+
+Account size for programs owned by LoaderV4 is left undefined. This SIMD should
+be amended before LoaderV4 is enabled.
+
+## Alternatives Considered
+
+* Transaction data size accounting is already enabled, so the null option is to
+enshrine the current Agave behavior in the protocol. This is undesirable because
+the current behavior is highly idiosyncratic, and LoaderV3 program sizes are
+routinely undercounted.
+* Builtin programs are backed by accounts that only contain the program name as
+a string, typically making them 15-40 bytes. We could make them free when not
+instruction accounts, since they're part of the validator. However, this
+adds complexity for no real benefit.
+* We include LoaderV3 programdata size in program size because almost all
+transactions will use the program account, which forces a load of programdata,
+and not use programdata directly. To be truly consistent, we might want to count
+LoaderV1 and LoaderV2 programs twice if they're instruction accounts, since the
+data does have to be loaded twice. However, this adds complexity for what may be
+an Agave-specific implementation detail, and these programs are rarely used.
+
+## Impact
+
+The primary impact is this SIMD makes correctly implementing transaction data
+size accounting much easier for other validator clients.
+
+It makes transactions which include program accounts for CPI somewhat larger,
+but given the generous 64MiB limit, it is unlikely that any existing users will
+be affected.
+
+## Security Considerations
+
+Security impact is minimal because this SIMD merely simplifies an existing
+feature.
+
+This SIMD requires a feature gate.
+
+## Backwards Compatibility
+
+Transactions that call LoaderV3 programs via CPI and are extremely close to the
+64MiB limit may now exceed it.
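
The three accounting rules in the Detailed Design section can be sketched in Rust as follows. This is a minimal illustration under stated assumptions, not the Agave implementation: `TxAccount`, `LimitExceeded`, `loaded_data_size`, and `DEFAULT_LOADED_DATA_LIMIT` are hypothetical names, account keys are modeled as plain integers, and the caller is assumed to already know whether an account is a LoaderV3 program and how large its programdata is.

```rust
use std::collections::HashSet;

/// Default limit stated in the proposal: 64MiB. (Hypothetical constant name.)
const DEFAULT_LOADED_DATA_LIMIT: usize = 64 * 1024 * 1024;

/// Hypothetical, simplified view of one account on the transaction.
struct TxAccount {
    /// Stand-in for the account's public key.
    key: u64,
    /// Length of the account's own data in bytes.
    data_len: usize,
    /// `Some(len)` only when this is a program account owned by LoaderV3,
    /// where `len` is the data length of its programdata account.
    programdata_len: Option<usize>,
}

/// Hypothetical error for a transaction that exceeds its data size limit.
#[derive(Debug, PartialEq)]
struct LimitExceeded;

/// Sums the loaded data size of the listed accounts and aborts as soon as
/// the running total exceeds `limit`.
fn loaded_data_size(accounts: &[TxAccount], limit: usize) -> Result<usize, LimitExceeded> {
    let mut seen = HashSet::new();
    let mut total: usize = 0;
    for account in accounts {
        // Rule 1: each listed account is counted once and only once.
        if !seen.insert(account.key) {
            continue;
        }
        total = total.saturating_add(account.data_len);
        // Rule 2: a LoaderV3 program also counts its programdata.
        if let Some(len) = account.programdata_len {
            total = total.saturating_add(len);
        }
        // Rule 3: nothing else is added implicitly, so the check happens here.
        if total > limit {
            return Err(LimitExceeded);
        }
    }
    Ok(total)
}
```

With this model, a transaction listing a 36-byte LoaderV3 program account whose programdata is 1000 bytes, plus the programdata account itself, is charged 36 + 1000 + 1000 bytes, which matches the double counting the proposal describes as a consequence of rules 1 and 2.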