Add proposal for transactions v2 and address map program #17103
Merged
ae45cd4
Add proposal for supporting big transactions
jstarry 9a5dabf
account index program
jstarry 006608e
fix formatting
jstarry 94d9a74
review feedback
jstarry b77af6b
Add cost changes section
jstarry d29d11b
Add cost section and more attack details
jstarry bdbc2c4
fix lint
jstarry 54d8fdf
document metadata changes
jstarry ad42eb5
nit
jstarry 50dd231
rpc details
jstarry 03aaf4c
add index meta struct
jstarry 39e2cbd
add additional proposal and chagne title
jstarry 70de74c
rename proposal file
jstarry 8a858f6
rename to address map and rewrite tx format
jstarry f76b243
no more appends, limit mapping size to 256
jstarry d5db732
update dos section
jstarry 31d96c0
add note about readonly
jstarry 1a2432d
restructure message to use enum
jstarry abada96
cleanup
jstarry
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,267 @@ | ||
# Transactions v2 - Compressed account inputs | ||
|
||
## Problem | ||
|
||
Messages transmitted to Solana validators must not exceed the IPv6 MTU size to | ||
ensure fast and reliable network transmission of cluster info over UDP. | ||
Solana's networking stack uses a conservative MTU size of 1280 bytes which, | ||
after accounting for headers, leaves 1232 bytes for packet data like serialized | ||
transactions. | ||
|
||
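As a quick check of the arithmetic above, here is a minimal sketch. It assumes the headers in question are a 40-byte IPv6 header and an 8-byte UDP header; the constant and function names are illustrative.

```rust
/// Minimum MTU that every IPv6 link must support.
const IPV6_MIN_MTU: usize = 1280;
/// Fixed IPv6 header size (assumption: no extension headers).
const IPV6_HEADER_SIZE: usize = 40;
/// UDP header size.
const UDP_HEADER_SIZE: usize = 8;

/// Bytes left in a packet for data such as a serialized transaction.
pub const fn packet_data_size() -> usize {
    IPV6_MIN_MTU - IPV6_HEADER_SIZE - UDP_HEADER_SIZE
}
```

With these assumptions, `packet_data_size()` evaluates to the 1232 bytes quoted above.
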
Developers building applications on Solana must design their on-chain program | ||
interfaces within this transaction size limit. One common workaround is to | ||
store state temporarily on-chain and consume that state in later transactions. | ||
This is the approach used by the BPF loader program for deploying Solana | ||
programs. | ||
|
||
However, this workaround doesn't work well when developers compose many on-chain | ||
programs in a single atomic transaction. With more composition comes more | ||
account inputs, each of which takes up 32 bytes. There is currently no available | ||
workaround for increasing the number of accounts used in a single transaction | ||
since each transaction must list all accounts that it needs to properly lock | ||
accounts for parallel execution. Therefore the current cap is about 35 accounts | ||
after accounting for signatures and other transaction metadata. | ||
|
||
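The "about 35 accounts" figure can be reproduced with a rough upper-bound estimate. This sketch assumes the current transaction layout (a length-prefixed signature list, a 3-byte message header, length-prefixed account keys, a 32-byte blockhash, and length-prefixed instructions) and deliberately ignores the bytes used by the instructions themselves, so real transactions fit slightly fewer accounts. All names are illustrative.

```rust
const PACKET_DATA_SIZE: usize = 1232;
const SIGNATURE_SIZE: usize = 64;
const PUBKEY_SIZE: usize = 32;
const HASH_SIZE: usize = 32;
const MESSAGE_HEADER_SIZE: usize = 3;
/// A compact length prefix takes one byte for lengths below 128.
const LEN_PREFIX_SIZE: usize = 1;

/// Upper bound on the number of 32-byte account keys that fit in one packet,
/// ignoring the space taken by instruction data and instruction account lists.
pub fn max_account_keys(num_signatures: usize) -> usize {
    let fixed = LEN_PREFIX_SIZE + num_signatures * SIGNATURE_SIZE // signatures
        + MESSAGE_HEADER_SIZE
        + LEN_PREFIX_SIZE // account key count
        + HASH_SIZE // recent blockhash
        + LEN_PREFIX_SIZE; // instruction count
    PACKET_DATA_SIZE.saturating_sub(fixed) / PUBKEY_SIZE
}
```

For a single-signer transaction this gives `max_account_keys(1) == 35`; each additional signature costs two more account slots.
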
## Proposed Solution | ||
|
||
Introduce a new on-chain account indexing program which stores account address | ||
mappings and add a new transaction format which supports concise account | ||
references through on-chain account indexes. | ||
|
||
### Account Indexing Program | ||
|
||
Here we describe a contract-based solution to the problem, whereby a protocol | ||
developer or end-user can create collections of related accounts on-chain for | ||
concise use in a transaction's account inputs. This approach is similar to page | ||
tables used in operating systems to succinctly map virtual addresses to physical | ||
memory. | ||
|
||
After addresses are stored on-chain in an index account, they may be succinctly | ||
referenced from a transaction using an index rather than a full 32 byte address. | ||
This will require a new transaction format to make use of these succinct indexes | ||
as well as runtime handling for looking up and loading accounts from the | ||
on-chain indexes. | ||
|
||
#### State | ||
|
||
Index accounts must be rent-exempt and may not be deleted. Stored addresses | ||
should be append only so that once an address is stored in an index account, it | ||
may not be removed later. Index accounts must be pre-allocated since Solana does | ||
not support reallocation for accounts yet. | ||
|
||
Since transactions use a u16 offset to look up addresses, index accounts can | ||
store up to 2^16 addresses each. Anyone may create an index account of any size | ||
as long as it's big enough to store the necessary metadata. In addition to | ||
stored addresses, index accounts must also track the latest count of stored | ||
addresses and an authority that must sign every index addition. | ||
|
||
Index additions take one slot to activate, so the on-chain index data should | ||
track how many additions are still pending activation. | ||
|
||
```rust | ||
struct IndexMeta { | ||
    // authority must sign for each addition | ||
    authority: Pubkey, | ||
    // number of stored addresses; incremented on each addition | ||
    len: u16, | ||
    // set to the current slot on each addition | ||
    last_update_slot: Slot, | ||
    // number of additions made in `last_update_slot`, which are still pending | ||
    // activation; incremented if current slot is equal to `last_update_slot` | ||
    last_update_slot_additions: u16, | ||
} | ||
``` | ||
|
||
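The bookkeeping described above could work roughly as follows. This is a sketch, not the program's implementation: `Pubkey` and `Slot` are stand-in type aliases, the method names are hypothetical, and resetting the pending counter to 1 when an addition lands in a new slot is an assumption inferred from the field comments.

```rust
type Pubkey = [u8; 32]; // stand-in for the real address type
type Slot = u64; // stand-in for the real slot type

pub struct IndexMeta {
    pub authority: Pubkey,
    pub len: u16,
    pub last_update_slot: Slot,
    pub last_update_slot_additions: u16,
}

impl IndexMeta {
    pub fn new(authority: Pubkey) -> Self {
        IndexMeta { authority, len: 0, last_update_slot: 0, last_update_slot_additions: 0 }
    }

    /// Records one appended address and returns its offset,
    /// or `None` if the u16 length counter would overflow.
    pub fn record_addition(&mut self, current_slot: Slot) -> Option<u16> {
        let new_len = self.len.checked_add(1)?;
        if self.last_update_slot == current_slot {
            self.last_update_slot_additions += 1;
        } else {
            // Assumption: additions from earlier slots are already active.
            self.last_update_slot = current_slot;
            self.last_update_slot_additions = 1;
        }
        self.len = new_len;
        Some(new_len - 1)
    }

    /// Number of addresses usable by transactions processed in `current_slot`.
    pub fn active_len(&self, current_slot: Slot) -> u16 {
        if current_slot == self.last_update_slot {
            self.len - self.last_update_slot_additions
        } else {
            self.len
        }
    }
}
```

Two additions in slot 5 leave `active_len(5)` at 0, and both entries become visible from slot 6 onward.
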
#### Cost | ||
|
||
Since index accounts require caching and special handling in the runtime, they should incur | ||
higher costs for storage. Cost structure design will be added later. | ||
|
||
#### Program controlled indexes | ||
|
||
If the authority of an index account is controlled by a program, more | ||
sophisticated indexes could be built with governance features or price curves | ||
for new index addresses. | ||
|
||
### Versioned Transactions | ||
|
||
In order to allow accounts to be referenced more succinctly, the structure of | ||
serialized transactions must be modified. The new transaction format should not | ||
affect transaction processing in the Solana VM beyond the increased capacity for | ||
accounts and instruction invocations. Invoked programs will be unaware of which | ||
transaction format was used. | ||
|
||
The new transaction format must be distinguished from the current transaction | ||
format. Current transactions can fit at most 19 signatures (64 bytes each), but | ||
the message header encodes `num_required_signatures` as a `u8`. Since the upper | ||
bit of the `u8` will never be set for a valid transaction, we can set it to | ||
denote that a transaction should be decoded with the versioned format. | ||
|
||
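A sketch of how a decoder might apply this rule. It assumes the flag lives in the first byte of the serialized message (where `num_required_signatures` sits today) and that the remaining seven bits carry the version; the enum and function names are hypothetical.

```rust
const PACKET_DATA_SIZE: usize = 1232;
const SIGNATURE_SIZE: usize = 64;
/// Upper bit of the first message byte (assumed flag position).
const VERSION_FLAG: u8 = 0x80;

/// Most signatures that could possibly fit in one packet.
pub fn max_signatures_per_packet() -> usize {
    PACKET_DATA_SIZE / SIGNATURE_SIZE
}

#[derive(Debug, PartialEq)]
pub enum MessageFormat {
    /// Current format: the byte is `num_required_signatures`.
    Legacy { num_required_signatures: u8 },
    /// Versioned format: the lower 7 bits are the version.
    Versioned { version: u8 },
}

/// Decides how to decode a message from its first byte.
pub fn detect_format(first_byte: u8) -> MessageFormat {
    if first_byte & VERSION_FLAG != 0 {
        MessageFormat::Versioned { version: first_byte & !VERSION_FLAG }
    } else {
        MessageFormat::Legacy { num_required_signatures: first_byte }
    }
}
```

Because at most 19 signatures fit in a packet, every valid current-format message starts with a byte below 128, so the two cases cannot collide.
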
#### New Transaction Format | ||
|
||
```rust | ||
pub struct VersionedMessage { | ||
    /// Version of encoded message. | ||
    /// The max encoded version is 2^7 - 1 due to the ignored upper disambiguation bit | ||
    pub version: u8, | ||
    pub header: MessageHeader, | ||
    /// Number of read-only account inputs specified through indexes | ||
    pub num_readonly_indexed_accounts: u8, | ||
    #[serde(with = "short_vec")] | ||
    pub account_keys: Vec<Pubkey>, | ||
    /// All the account indexes used by this transaction | ||
    #[serde(with = "short_vec")] | ||
    pub account_indexes: Vec<AccountIndex>, | ||
    pub recent_blockhash: Hash, | ||
    /// Compiled instructions stay the same, account indexes continue to be stored | ||
    /// as a u8 which means the max number of account_indexes + account_keys is 256. | ||
    #[serde(with = "short_vec")] | ||
    pub instructions: Vec<CompiledInstruction>, | ||
} | ||
|
||
pub struct AccountIndex { | ||
    pub account_key_offset: u8, | ||
    // 1-3 bytes used to look up an address in the index account | ||
    pub index_account_offset: CompactU16, | ||
} | ||
``` | ||
|
||
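The "1-3 bytes" comment on `index_account_offset` refers to a variable-length encoding. The sketch below assumes the usual scheme of 7 data bits per byte, least significant group first, with the high bit marking continuation. The function names are illustrative, and the decoder does not reject non-canonical encodings, which a production decoder would likely need to do.

```rust
/// Encodes a u16 into 1-3 bytes, 7 bits at a time, low bits first.
pub fn encode_compact_u16(mut value: u16) -> Vec<u8> {
    let mut out = Vec::with_capacity(3);
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte); // last byte: continuation bit clear
            return out;
        }
        out.push(byte | 0x80); // more bytes follow
    }
}

/// Decodes a compact u16, returning the value and the number of bytes read.
/// Returns `None` on truncated input or a value that does not fit in a u16.
pub fn decode_compact_u16(bytes: &[u8]) -> Option<(u16, usize)> {
    let mut value: u32 = 0;
    for (i, &byte) in bytes.iter().take(3).enumerate() {
        value |= ((byte & 0x7f) as u32) << (7 * i);
        if byte & 0x80 == 0 {
            if value > u16::MAX as u32 {
                return None;
            }
            return Some((value as u16, i + 1));
        }
    }
    None
}
```

Offsets below 128 take a single byte, which together with the one-byte `account_key_offset` gives the two bytes per indexed account mentioned under size changes.
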
#### Size changes | ||
|
||
- Extra byte for version field | ||
- Extra byte for number of total account index inputs | ||
- Extra byte for number of readonly account index inputs | ||
- Most indexes will be compact and use 2 bytes + index address | ||
- Cost of each additional index account is ~2 bytes | ||
|
||
#### Cost changes | ||
|
||
Accessing an index account in a transaction should incur an extra cost due to | ||
the extra work validators need to do to load and cache index accounts. | ||
|
||
#### Metadata changes | ||
|
||
Each account accessed via an index should be stored in the transaction metadata | ||
for quick reference. This will avoid the need for clients to make multiple RPC | ||
round trips to fetch all accounts referenced in an index-based transaction. It | ||
will also make it easier to use the ledger tool to analyze account access | ||
patterns. | ||
|
||
#### RPC changes | ||
|
||
Fetched transaction responses will likely require a new version field to | ||
indicate to clients which transaction structure to use for deserialization. | ||
Clients using pre-existing RPC methods will receive error responses when | ||
attempting to fetch a versioned transaction which will indicate that they | ||
must upgrade. | ||
|
||
The RPC API should also support an option for returning fully decompressed | ||
transactions to abstract away the indexing details from downstream clients. | ||
|
||
### Limitations | ||
|
||
- Max of 256 accounts may be specified in a transaction because compiled | ||
instructions use a u8 to index into transaction message account keys. | ||
- Index accounts can hold up to 2^16 addresses, though smaller index accounts | ||
are fine. Each offset into an index account therefore fits in a u16. | ||
- Transaction signers may not be specified using an on-chain account index; the | ||
full address of each signer must be serialized in the transaction. This ensures | ||
that the performance of transaction signature checks is not affected. | ||
- Hardware wallets will probably not be able to display details about accounts | ||
referenced with an index because they cannot verify on-chain data. | ||
- Only single level indexes can be used. Recursive indexes will not be supported. | ||
|
||
## Security Concerns | ||
|
||
### Resource consumption | ||
|
||
Enabling more account inputs in a transaction allows for more program | ||
invocations, write-locks, and data reads / writes. Before indexes are live, we | ||
need transaction-wide compute limits and increased costs for write locks and | ||
data reads. | ||
|
||
### Front running | ||
|
||
If the addresses listed within an index account are modifiable, front running | ||
attacks could change which accounts a later transaction accesses through that | ||
index account. For this reason, we propose that any stored address is immutable | ||
and that index accounts themselves may not be removed. | ||
|
||
Additionally, a malicious actor could try to fork the chain immediately after a | ||
new index account is added to a block. If successful, they could add a different | ||
unexpected index account in the fork. In order to deter this attack, clients | ||
should wait for indexes to be finalized before using them in a transaction. | ||
Clients may also append integrity check instructions to the transaction which | ||
verify that the correct accounts are used. | ||
|
||
### Denial of service | ||
|
||
Index accounts will be read very frequently and will therefore be a | ||
higher-profile target for denial of service attacks through write locks, similar | ||
to sysvar accounts. | ||
|
||
Since addresses stored inside index accounts are immutable, reads and writes | ||
to index accounts could be parallelized as long as every referenced offset | ||
is less than the current number of stored addresses. | ||
|
||
### Duplicate accounts | ||
|
||
If the same account is referenced in a transaction by address as well as through | ||
an index, the transaction should be rejected to avoid conflicts when determining | ||
if the account is a signer or writeable. | ||
|
||
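To make the duplicate check and the 256-account limit concrete, here is a sketch of how a validator might expand a versioned message's account list. It assumes `account_key_offset` points at the index account within `account_keys`, uses a plain `u16` in place of `CompactU16`, and takes a hypothetical `load_index` callback that returns the addresses stored in an index account.

```rust
use std::collections::HashSet;

type Pubkey = [u8; 32]; // stand-in for the real address type
const MAX_ACCOUNTS: usize = 256;

pub struct AccountIndex {
    pub account_key_offset: u8,
    pub index_account_offset: u16,
}

#[derive(Debug, PartialEq)]
pub enum ResolveError {
    TooManyAccounts,
    MissingIndexAccount,
    OffsetOutOfRange,
    DuplicateAccount,
}

/// Expands `account_indexes` into full addresses appended after `account_keys`.
pub fn resolve_accounts<F>(
    account_keys: &[Pubkey],
    account_indexes: &[AccountIndex],
    load_index: F,
) -> Result<Vec<Pubkey>, ResolveError>
where
    F: Fn(&Pubkey) -> Option<Vec<Pubkey>>,
{
    if account_keys.len() + account_indexes.len() > MAX_ACCOUNTS {
        return Err(ResolveError::TooManyAccounts);
    }
    let mut resolved = account_keys.to_vec();
    for index in account_indexes {
        let index_account = account_keys
            .get(index.account_key_offset as usize)
            .ok_or(ResolveError::MissingIndexAccount)?;
        let stored = load_index(index_account).ok_or(ResolveError::MissingIndexAccount)?;
        let address = *stored
            .get(index.index_account_offset as usize)
            .ok_or(ResolveError::OffsetOutOfRange)?;
        resolved.push(address);
    }
    // Reject any address that appears twice, whether listed directly or via an index.
    let mut seen = HashSet::with_capacity(resolved.len());
    for key in &resolved {
        if !seen.insert(*key) {
            return Err(ResolveError::DuplicateAccount);
        }
    }
    Ok(resolved)
}
```

A transaction that lists an address directly and also reaches it through an index fails with `DuplicateAccount`, so signer and writable flags never have to be reconciled.
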
## Other Proposals | ||
|
||
1) Account prefixes | ||
|
||
Needing to pre-register accounts in an on-chain index is cumbersome because it | ||
adds an extra step for transaction processing. Instead, Solana transactions | ||
could use variable length address prefixes to specify accounts. These prefix | ||
shortcuts can save on data usage without needing to set up on-chain state. | ||
|
||
However, this model requires nodes to keep a mapping of prefixes to active account | ||
addresses. Attackers can create accounts with the same prefix as a popular account | ||
to disrupt transactions. | ||
|
||
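The collision problem can be illustrated with a small sketch. The lookup below is a naive linear scan over a hypothetical set of active addresses; the names are illustrative, and a real node would need a more efficient prefix structure.

```rust
use std::collections::BTreeSet;

type Pubkey = [u8; 32]; // stand-in for the real address type

#[derive(Debug, PartialEq)]
pub enum PrefixLookup {
    NotFound,
    Unique(Pubkey),
    Ambiguous,
}

/// Resolves an address prefix against the set of active account addresses.
pub fn resolve_prefix(active: &BTreeSet<Pubkey>, prefix: &[u8]) -> PrefixLookup {
    let mut matches = active.iter().filter(|key| key.starts_with(prefix));
    match (matches.next(), matches.next()) {
        (None, _) => PrefixLookup::NotFound,
        (Some(key), None) => PrefixLookup::Unique(*key),
        (Some(_), Some(_)) => PrefixLookup::Ambiguous,
    }
}
```

A prefix that resolves uniquely today becomes ambiguous as soon as someone creates another account sharing it, which is how an attacker could disrupt transactions that rely on a popular account's prefix.
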
2) Transaction builder program | ||
|
||
Solana can provide a new on-chain program which allows "Big" transactions to be | ||
constructed on-chain by normal transactions. Once the transaction is | ||
constructed, a final "Execute" transaction can trigger a node to process the big | ||
transaction as a normal transaction without needing to fit it into an MTU sized | ||
packet. | ||
|
||
The UX of this approach is tricky. A user could in theory sign a big transaction, | ||
but it would be a poor experience if they then had to sign several more | ||
transactions with their wallet just to build the transaction they already signed | ||
and approved. This could be a use-case for transaction relay services, though. A | ||
user could pay a relayer to construct the large pre-signed transaction on-chain | ||
for them. | ||
|
||
In order to prevent the large transaction from being reconstructed and replayed, | ||
its message hash will need to be added to the status cache when executed. | ||
|
||
3) Epoch account indexes | ||
|
||
Similarly to leader schedule calculation, validators could create a global index | ||
of the most accessed accounts in the previous epoch and make that index | ||
available to transactions in the following epoch. | ||
|
||
This approach has the downside of only updating the index at epoch boundaries, | ||
which means there would be a delay of a few days before popular new accounts | ||
could be referenced. The index also needs to be generated consistently by all | ||
validators using some criteria like adding accounts in order by access count. | ||
|
||
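One way to get a consistent result on every validator is to sort with a total order. The sketch below orders by access count, highest first, and breaks ties by address; the tie-break rule, the function name, and the `max_len` cap are assumptions added for illustration.

```rust
use std::collections::HashMap;

type Pubkey = [u8; 32]; // stand-in for the real address type

/// Builds a deterministic index of the most accessed accounts.
/// Ties in access count are broken by address so that every validator
/// produces the same ordering regardless of map iteration order.
pub fn build_epoch_index(access_counts: &HashMap<Pubkey, u64>, max_len: usize) -> Vec<Pubkey> {
    let mut entries: Vec<(Pubkey, u64)> =
        access_counts.iter().map(|(key, count)| (*key, *count)).collect();
    entries.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    entries.into_iter().take(max_len).map(|(key, _)| key).collect()
}
```

Without the tie-break, two accounts with equal counts could land at different offsets on different validators.
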
4) Address lists | ||
|
||
Extend the transaction structure to support addresses that, when loaded, expand | ||
to a list of addresses. After expansion, all account inputs are concatenated to | ||
form a single list of account keys which can be indexed into by instructions. | ||
Address lists would likely need to be immutable to prevent attacks. They would | ||
also need to be limited in length to limit resource consumption. | ||
|
||
This proposal can be thought of as a special case of the proposed index account | ||
approach. Since the full account list would be expanded, there's no need to add | ||
additional offsets that use up the limited space in a serialized | ||
transaction. However, the expected size of an address list may need to be | ||
encoded into the transaction to aid the sanitization of account indexes. | ||
Additionally, special attention must be given to watch out for accounts that | ||
exist in multiple account lists. |
The internal format of a transaction does not have to match the OTA format. Incoming transactions could be converted to a format that is consolidated (no extra map account data needed) and more naturally handled and stored by the runtime. One option to minimize runtime changes could be to convert the v2 tx + maps to a v1 tx. The runtime would probably need to change very little if this conversion could be done early enough. Or maybe create a new internal evolving trait based tx format that v1, v2, vx gets converted into.
Very true. With multiple formats there's the question about which bytes are actually going to be signed and verified from the ledger:
1) If the bytes of the OTA format are signed to produce the tx signature, then we need to make sure that validators also receive that OTA format in the ledger blocks so that they can verify the integrity of transactions. The consolidated transaction info isn't broadcasted, it's always derived.
2) If the bytes of the consolidated format (all maps resolved) are signed, then the block producers have to do an extra step of transaction expansion before doing signature verification. But then the consolidated transaction bytes can be stored in the ledger directly and validated more easily.
The current proposal is to do 1) and then both block producers and validators need to expand the OTA format themselves to get the consolidated format which the runtime can handle. By storing the extra metadata, RPC nodes will also know how to expand those transactions before sending to clients. We already do this for logs, inner instructions, and balances.
If we go with 2) instead, we don't need any of that extra metadata and validators don't need to expand transactions but this would also introduce an extra step before signature verification for expanding incoming transactions. This might be ok since all address maps would be cached. It would also allow us to reference signer addresses with address maps since they would be expanded before signature verification.
@sakridge does this framing make sense and do you prefer one approach over the other?