
Measure accounts loading and storing cost #31411

Closed
tao-stones opened this issue Apr 28, 2023 · 18 comments
@tao-stones
Contributor

tao-stones commented Apr 28, 2023

Problem

Need to measure accounts loading and storing performance, both in microseconds (us) and in CU.

The motivation is to set transaction account read/write lock costs, which can be used as input for base fee calculation.

A few parameters to measure against: type of account (e.g. program account, data account, PDA) and size of account.

Proposed Solution

Use ledger-tool to replay the mainnet-beta ledger and collect measure_us for account loads and stores under different circumstances.
Note: loading program accounts should be more expensive than loading normal data and PDA accounts.

@tao-stones tao-stones self-assigned this Apr 28, 2023
@tao-stones tao-stones changed the title measure different accounts loading and storing performance measure different type of accounts' loading and storing performance Apr 28, 2023
@tao-stones tao-stones changed the title measure different type of accounts' loading and storing performance measure loading and storing performance for different account types Apr 28, 2023
@tao-stones tao-stones changed the title measure loading and storing performance for different account types Measure accounts loading and storing costa Oct 24, 2023
@KirillLykov
Contributor

KirillLykov commented Oct 24, 2023

Regarding the way CU per tx could be computed, I'm thinking about using an empirical approach.
In this particular case, as input we have:

  • number of accounts n,
  • size of each account s.

The observable function of choice is time per tx, call it t(s, n).

This function is different from the cost per transaction, which we denote c(s, n).
But if the idea of 48M CU per block is to fit into 400ms block construction time, we can say that c(s, n) = 48M/0.4 [CU/sec] * t(s, n), where in the [] I specify units.

We can do simulations to find triples (s_i, n_i) -> t_i and, with this data, estimate the parameters of the function t, which I think can be described by t(s, n) = a[sec]*n + b[sec/MB]*n*s.

  • a is "time cost" of write lock of an account (assuming it doesn't depend on the size),
  • b is "time cost" of writing 1MB of data.

So all we need is to find estimations for a and b.


UPD: c(s, n) = 48M/0.4/4 [CU/sec] * t(s, n) = 30 [CU/us] * t(s, n), because there are 4 threads.
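As a concrete (hypothetical) sketch of that estimation step: given simulated triples (s_i, n_i, t_i), fit a and b by ordinary least squares. The helper and the sample numbers below are made up for illustration, not part of the codebase.

```rust
/// Fit t(s, n) = a*n + b*n*s by least squares over simulated samples.
/// Each sample is (account size s in MB, number of accounts n, time t in sec).
fn fit_a_b(samples: &[(f64, f64, f64)]) -> (f64, f64) {
    // With x = n and y = n*s, accumulate sums and solve the 2x2 normal equations for (a, b).
    let (mut sxx, mut sxy, mut syy, mut sxt, mut syt) = (0.0, 0.0, 0.0, 0.0, 0.0);
    for &(s, n, t) in samples {
        let (x, y) = (n, n * s);
        sxx += x * x;
        sxy += x * y;
        syy += y * y;
        sxt += x * t;
        syt += y * t;
    }
    let det = sxx * syy - sxy * sxy;
    let a = (sxt * syy - syt * sxy) / det;
    let b = (syt * sxx - sxt * sxy) / det;
    (a, b)
}

fn main() {
    // Made-up sample points (s [MB], n, t [sec]) standing in for simulation output.
    let samples = [(1.0, 2.0, 0.0004), (10.0, 2.0, 0.0022), (10.0, 4.0, 0.0045)];
    let (a, b) = fit_a_b(&samples);
    println!("a ~ {a:.6} sec per account, b ~ {b:.6} sec per MB");
}
```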

@tao-stones
Contributor Author

But if the idea of 48M CU per block is to fit into 400ms block construction time, we can say that c(s, n) = 48M/0.4 [CU/sec] * t(s, n), where in the [] I specify units.

Assuming there are 4 parallel processes during replay, currently the CU per micro-sec is defined as

/// Cluster averaged compute unit to micro-sec conversion rate
pub const COMPUTE_UNIT_TO_US_RATIO: u64 = 30;

so the equation can be simplified to c(s, n) = 30 [CU/us] * t(s, n)
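A minimal sketch of this conversion, reusing the constant quoted above; us_to_cu is a hypothetical helper, not an existing function in the codebase.

```rust
/// Cluster averaged compute unit to micro-sec conversion rate (quoted above).
pub const COMPUTE_UNIT_TO_US_RATIO: u64 = 30;

/// Hypothetical helper: convert a measured wall-clock time in micro-seconds
/// into an equivalent CU cost, i.e. c = 30 [CU/us] * t.
fn us_to_cu(measured_us: u64) -> u64 {
    measured_us * COMPUTE_UNIT_TO_US_RATIO
}

fn main() {
    assert_eq!(us_to_cu(100), 3_000); // e.g. 100us of load time ~ 3,000 CU
}
```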

@tao-stones tao-stones changed the title Measure accounts loading and storing costa Measure accounts loading and storing cost Oct 24, 2023
@tao-stones
Contributor Author

function t which I think can be described by t(s, n) = a[sec]*n + b[sec/mb]*n*s.

  • a is "time cost" of write lock of an account (assuming it doesn't depend on the size),
  • b is "time cost" of writing 1MB of data.

A few points I am not entirely clear on:

  • is there a "time cost" of loading accounts? Especially loading cold accounts?
  • what does "time cost of write lock" mean exactly: is it the wall-clock time of acquiring a write lock on an account (which should be cheap), or does it include the waiting time until the account becomes available for write locking?
  • will there be a similar "time cost" for a read lock?

I guess I'm thinking along the lines of:
cost of read-only account = cost of loading
cost of write-lock account = cost of loading + cost of write-back (i.e. storing)
Not clear how to define the "cost of read/write lock" yet, but if we do, it would be added to the above accordingly.
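A purely illustrative sketch of that decomposition (the names and the per-component numbers are placeholders, to be filled in once measured):

```rust
/// Illustrative only: per-account cost split into load, write-back (store) and
/// lock components, following the decomposition above.
struct AccountAccessCost {
    load_cu: u64,
    store_cu: u64,
    lock_cu: u64, // read or write lock; still to be defined/measured
}

fn account_cost(c: &AccountAccessCost, writable: bool) -> u64 {
    if writable {
        // write-locked account: load + write-back (+ lock, once defined)
        c.lock_cu + c.load_cu + c.store_cu
    } else {
        // read-only account: load only (+ lock, once defined)
        c.lock_cu + c.load_cu
    }
}

fn main() {
    // Made-up numbers for illustration.
    let c = AccountAccessCost { load_cu: 40, store_cu: 60, lock_cu: 0 };
    assert_eq!(account_cost(&c, false), 40);
    assert_eq!(account_cost(&c, true), 100);
}
```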

@KirillLykov
Contributor

  • is there a "time cost" of loading accounts? Especially loading cold accounts?

To be honest, I didn't dig much into this. What I observe is that the loading time reported here is quite high (about half of the execute time). I didn't consider that there is a cache for loaded accounts; I need to check the size of the cache to say whether my accounts are hot or not.

  • what does "time cost of write lock" mean exactly: is it the wall-clock time of acquiring a write lock on an account (which should be cheap), or does it include the waiting time until the account becomes available for write locking?
  • will there be a similar "time cost" for a read lock?

I thought about an artificial scenario with minimal lock contention, so it is just acquiring the lock.

@KirillLykov
Contributor

Regarding this, I'm not sure that my mental model of account loading is accurate. I have in mind something like:

  1. for each account participating in the transaction, try locking it
  2. copy the account's data from accounts-db to some cache
  3. during execution of the tx, fetch the account from this cache if requested and copy it
  4. write to the account and write it back through the cache to accounts-db

So from (3), if we compare a program that only reads accounts with one that writes, the load phase of both should be the same, but execution will be longer for the writing program because, I think, the time to copy modified account data is part of execution.

@alessandrod could you tell me if it fits the reality?

@KirillLykov
Contributor

I guess I'm thinking along the lines of:
cost of read-only account = cost of loading
cost of write-lock account = cost of loading + cost of write-back (i.e. storing)
Not clear how to define the "cost of read/write lock" yet, but if we do, it would be added to the above accordingly.

Well, if we say that lock time is negligible, cost of write-lock account = cost of loading + cost of write-back makes sense. From the simulation point of view, the cost of a read-only account and the cost of a write-lock account might both be estimated empirically.

@tao-stones
Contributor Author

From the simulation point of view, the cost of a read-only account and the cost of a write-lock account might both be estimated empirically.

I played a bit with timing the storing of accounts. I was somewhat surprised to notice that the same transaction took very different times to store accounts between ledger replays: the first replay took about 14ms to store accounts for a transaction, while the second (re)run took only 1ms. Not sure why; @brooksprumo do you have a pointer to where I should look? Also, I thought you did some accounts loading/storing performance measurements, is there something we can use here?

@brooksprumo
Contributor

I played a bit with timing the storing of accounts. I was somewhat surprised to notice that the same transaction took very different times to store accounts between ledger replays: the first replay took about 14ms to store accounts for a transaction, while the second (re)run took only 1ms. Not sure why; @brooksprumo do you have a pointer to where I should look?

Let me take a look. The first transaction of a slot does some extra stuff to initialize various data structures. For example, the accounts write cache has a map from slot to accounts. So the first store to a new slot will add a new entry to the map for that new slot with an initially-empty set of accounts. But I don't think this alone would account for over 10 milliseconds of difference...
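A simplified illustration of that first-store-initializes-the-slot-entry behavior (stand-in types for the sketch, not the real accounts-db structures):

```rust
use std::collections::HashMap;

// Simplified stand-ins for illustration only.
type Slot = u64;
type Pubkey = [u8; 32];
type AccountData = Vec<u8>;

/// Store an account into a slot -> accounts write cache. The first store to a
/// new slot pays for creating the (initially empty) inner map; later stores to
/// the same slot just insert into the existing entry.
fn store_in_write_cache(
    cache: &mut HashMap<Slot, HashMap<Pubkey, AccountData>>,
    slot: Slot,
    pubkey: Pubkey,
    data: AccountData,
) {
    cache.entry(slot).or_default().insert(pubkey, data);
}

fn main() {
    let mut cache = HashMap::new();
    store_in_write_cache(&mut cache, 1, [0u8; 32], vec![1, 2, 3]); // first store: creates the slot entry
    store_in_write_cache(&mut cache, 1, [1u8; 32], vec![4, 5, 6]); // later store: reuses it
    assert_eq!(cache[&1].len(), 2);
}
```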

Also, I thought you did some accounts loading/storing performance measurements, is there something we can use here?

I did some histograms for loading accounts, but not for storing accounts. So I don't think that'll apply here (based on this and other threads, it seems that writes/stores are the issue).

@brooksprumo
Contributor

Let me take a look.

Nothing jumps out in the code. I'll look at metrics to see if that elucidates anything.

@tao-stones
Contributor Author

tao-stones commented Oct 27, 2023

I did some histograms for loading accounts, but not for storing accounts. So I don't think that'll apply here (based on this and other threads, it seems that writes/stores are the issue).

Data on loading is helpful too.

The overarching idea is that loading/storing/locking accounts, as part of execution, is underpriced, which leads either to fewer transactions being packed into a block, or to elevated replay time. @KirillLykov correct me if I'm wrong.

This issue aims to identify the following (a small sketch of the rate calculation follows the list):

  • micro-seconds per 10k-bytes for loading, on average
  • micro-seconds per 10k-bytes for storing, on average
  • micro-seconds per lock, both read & write
  • and whether the above metrics should differ based on account type (e.g. program account, data account, PDA)
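A minimal sketch of the intended rate calculation, assuming we have aggregate measurements to divide; the helper and the numbers are made up for illustration.

```rust
/// Hypothetical helper: turn an aggregate measurement (total micro-seconds and
/// total bytes touched) into a micro-seconds-per-10k-bytes rate.
fn us_per_10k_bytes(total_us: f64, total_bytes: f64) -> f64 {
    total_us / (total_bytes / 10_000.0)
}

fn main() {
    // Made-up example: 5,000 us spent loading 2,000,000 bytes -> 25 us per 10k bytes.
    println!("{}", us_per_10k_bytes(5_000.0, 2_000_000.0));
}
```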

@brooksprumo
Contributor

Data on loading is helpful too.

Gotcha. Maybe this PR can be helpful to start with: #33220

@tao-stones
Contributor Author

@KirillLykov to follow up on our call, my understanding of "loading account", which could be wrong, consists of two steps:

  1. load the account from the db into the read-only cache.
  2. serialize and memcpy it into the VM, so it's available to the program.

The first step happens at accounts_db.load(); Brooks' PR above reports this operation's time, and it is also folded into the metric replay-slot-end-to-end-stats.load_us. I am not clear about the second step, maybe it is part of "execution"?

As for the cost of "storing account", I assume it is the reverse process: deserialize and memcpy from the VM into the accounts' write-cache, then store to accounts-db. The metric replay-slot-end-to-end-stats.store_us captures it.

@alessandrod
Contributor

@KirillLykov to follow up on our call, my understanding of "loading account", which could be wrong, consists of two steps:

  1. load the account from the db into the read-only cache.
  2. serialize and memcpy it into the VM, so it's available to the program.

I am not clear about the second step, maybe it is part of "execution"?

As for the cost of "storing account", I assume it is the reverse process: deserialize and memcpy from the VM into the accounts' write-cache, then store to accounts-db. The metric replay-slot-end-to-end-stats.store_us captures it.

Yep, this is correct: the serialize* and deserialize* execution metrics track the time to write accounts into the VM and then parse them back, respectively.

@KirillLykov
Contributor

KirillLykov commented Nov 28, 2023

@taozhu-chicago my understanding of the metrics is the following:

Cost of load:

  1. Cost of read from AccountsDB
  • checks the accounts write cache and then the read cache, so we should observe a spike in this metric when accounts are first used. Need to find the size of the caches and whether there is a metric for cache hits.
  • metric is replay-slot-end-to-end-stats.load_us
  2. Cost of serialization
  • happens before the VM executes the program
  • includes the cost of copying account data into the buffer used by the VM, see write_data
  • metric is replay-slot-end-to-end-stats.execute_details_serialize_us
  3. Cost of execution
  • the metric for the time to execute the instruction without serialization/deserialization is replay-slot-end-to-end-stats.execute_details_execute_inner_us
  • but there are also other metrics which include this one, in order of the call stack:
    execute_accessories_process_message -> execute_accessories_process_instruction_total_us -> execute_accessories_process_instruction_executable_chain_us -> execute_details_execute_inner_us
  4. Cost of deserialization
  • happens after execution
  • includes the cost of copying account data back
  • metric is replay-slot-end-to-end-stats.execute_details_deserialize_us
  5. Cost of storing into AccountsDB
  • happens in banks.rs::commit_transaction
  • metric is replay-slot-end-to-end-stats.store_us
  • there are more detailed metrics in accounts_db_store_timings: store_accounts, update_index and others

NOTE: replay-slot-end-to-end-stats metrics are for the longest-running replay thread, while replay-slot-stats metrics are the sum over all executions.
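To tie the phases above together, a hedged sketch (not real validator code): sum the per-phase timings into one per-transaction figure and convert to CU with the 30 CU/us ratio discussed earlier. The struct, its field names (which merely mirror the metric names) and the numbers are illustrative only.

```rust
/// Per-transaction timings, one field per phase listed above (all in micro-seconds).
struct TxTimingUs {
    load_us: u64,          // replay-slot-end-to-end-stats.load_us
    serialize_us: u64,     // ...execute_details_serialize_us
    execute_inner_us: u64, // ...execute_details_execute_inner_us
    deserialize_us: u64,   // ...execute_details_deserialize_us
    store_us: u64,         // ...store_us
}

const COMPUTE_UNIT_TO_US_RATIO: u64 = 30;

/// Sum the phases into an end-to-end time and convert to an equivalent CU figure.
fn total_cu(t: &TxTimingUs) -> u64 {
    let total_us =
        t.load_us + t.serialize_us + t.execute_inner_us + t.deserialize_us + t.store_us;
    total_us * COMPUTE_UNIT_TO_US_RATIO
}

fn main() {
    // Made-up numbers for illustration.
    let t = TxTimingUs { load_us: 80, serialize_us: 18, execute_inner_us: 150, deserialize_us: 6, store_us: 100 };
    println!("{} CU", total_cu(&t));
}
```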

@tao-stones
Contributor Author

Amazing! Perhaps you can also add replay-slot-end-to-end-stats.store_us as the cost of storing?

@KirillLykov
Contributor

KirillLykov commented Nov 30, 2023

Some preliminary results for a run where I create blocks of transactions such that each transaction modifies one 10MB account. There are 256 accounts in total, modified in a round-robin manner. The unit is us everywhere; I double-checked the code and it is really always as_us().

Metric name Value
replay-slot-end-to-end.load_us 79.5
replay-slot-end-to-end.execute_us 69.64K
execute_accessories_process_message 69.59K
process_instructions_total_us 69.57K
replay-slot-end-to-end.execute_accessories_process_instructions_process_executable_chain_us 69.55K
replay-slot-end-to-end.execute_details_execute_inner_us 69.17K
replay-slot-end-to-end.execute_details_serialize_us/deserialize_us 18.23/5.9
replay-slot-end-to-end.store_us 102.7

Preliminary because I just picked a 2s window when the validator was the leader; it is not averaged over a long enough period of time (because for that I need to develop some tools).

What is weird about it:

  1. the load/store time is so small. I also expected accounts_db_store_timings.store_accounts (the time to store accounts in accounts_db) to be part of replay-slot-end-to-end.store_us, but it is ~600K, so it seems I don't quite understand what this metric is for.
  2. I would have expected load+execute+store to be around 400ms, because I thought we are bounded here by the time to execute transactions. Otherwise, why would we form blocks of only 35 txs?

@alessandrod
Contributor

alessandrod commented Nov 30, 2023

Otherwise, why would we form blocks of only 35 txs?

Because the scheduler will estimate CU costs, see high costs, and pack the block with only enough transactions so as not to exceed the max block CU cost. (EDIT: I'm assuming that the CU cost for these txs is high.)

@tao-stones
Contributor Author

Otherwise, why would we form blocks of only 35 txs?

Because the scheduler will estimate CU costs, see high costs, and pack the block with only enough transactions so as not to exceed the max block CU cost. (EDIT: I'm assuming that the CU cost for these txs is high.)

Mmm, possible but not likely. The scheduler uses actual consumed CU when packing, not the amount the TX requests. Given only 35 TXs/block, each tx would need to have total actual CU = 48,000,000/35 ≈ 1.4M CU/tx, which is the upper limit. I'll look into the ledger once it's available.
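A quick check of that arithmetic (purely illustrative):

```rust
fn main() {
    // 48M block CU limit spread over 35 transactions per block.
    let per_tx_cu = 48_000_000u64 / 35;
    println!("{per_tx_cu} CU per tx"); // ~1,371,428, i.e. roughly the 1.4M per-tx figure quoted above
}
```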

@github-actions github-actions bot added the stale label Dec 2, 2024
@github-actions github-actions bot closed this as not planned Dec 10, 2024