
Network Stability rollout for quic/qos/fee markets #23211

Closed
13 of 15 tasks
aeyakovenko opened this issue Feb 17, 2022 · 47 comments
Labels
stale [bot only] Added to stale content; results in auto-close after a week.

Comments

@aeyakovenko
Member

aeyakovenko commented Feb 17, 2022

Problem

Users need to land TXs into a block even when the leader is loaded. This is a summary of the solution so everyone is on the same page.

Proposed Solution

  • implement quic so bots can't send unbounded traffic (1.10)

  • pull txs proportionally from staked nodes, both the TPU and TPUfwd ports. The goal is to prevent anyone from starving the pipeline for proposing txs to the leader. There should be some mechanism for a validator to set a preferred RPC node that overrides the staked list. (v0 in 1.10.15)

  • sort txs by (additional_fee)/(requested compute units). CUs should be deterministic since they are completely specified by the ComputeBudget instruction. Users who are willing to pay more, and devs who can make their TXs more efficient, should be prioritized first. (v0 in 1.10.15; a rough sketch of this computation follows the list)

  • fill the account buckets up to the account write CU limits and the block CU limits. (1.9)

  • keep filling account buckets and forward each bucket to the next leader to avoid starving accounts on forwards. As the account buckets are filled with CUs, forwards should start sending txs from multiple accounts, not just the one with the highest fees. A naive implementation would forward the highest-paid TXs first, which are likely to be for only a single hot account; a single congested market with a ton of demand would then saturate the forward queues. (forward packets by prioritization in desc order #25406)

  • design, https://github.com/solana-labs/solana/blob/master/docs/src/proposals/fee_transaction_priority.md

  • testnet release with 1.10.x

  • mainnet-beta upgrade to 1.10.x

  • backport of quic fix (Backport ghsa-x236-qc46-7v8j #26671) released

  • quic is turned on on mb

  • Priority fee rollout. All major features activated! cli and rpc apis are left (Prioritization Fee function enabling #26017)

  • rpc providers switched to quic, booting with --tpu-use-quic

  • wallets rolled out with compute budget additional fee

  • UDP is disabled on mb
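
As a rough illustration of the fee-per-CU sort referenced in the sort bullet above (a sketch only; the types and the integer scaling are mine, not the actual banking-stage code):

```rust
// Sketch only: illustrative types, not the actual banking-stage code.
#[derive(Debug)]
struct PendingTx {
    additional_fee: u64, // lamports offered via the ComputeBudget instruction
    requested_cus: u64,  // CUs requested via the ComputeBudget instruction
}

impl PendingTx {
    /// additional_fee / requested CUs, scaled to stay in integer math
    /// (floats would risk non-determinism across nodes).
    fn fee_per_cu(&self) -> u64 {
        const SCALE: u64 = 1_000_000;
        self.additional_fee
            .saturating_mul(SCALE)
            .checked_div(self.requested_cus)
            .unwrap_or(0)
    }
}

fn sort_by_priority(txs: &mut [PendingTx]) {
    // Highest fee-per-CU first.
    txs.sort_by(|a, b| b.fee_per_cu().cmp(&a.fee_per_cu()));
}
```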

TBD: optimizations

tag @jackcmay @taozhu-chicago @carllin @t-nelson @sakridge

@t-nelson
Contributor

Can you elaborate on "loaded"? AFAIK we have several notions:

  1. Physical network link
  2. Block "CU" cap
  3. Per-writer "CU" cap
  4. Banking throughput
  5. Replay throughput
  6. ...

It seems unlikely that these all have the same solution. I'm not necessarily convinced that these are root causes or even have a single root cause at this point

@aeyakovenko
Member Author

aeyakovenko commented Feb 17, 2022

@t-nelson

Any place in the pipeline between the TPU and where banking stage puts an entry of TXs into PoH could be saturated to the point that it has more stuff to process than time remaining before PoH ends the block. We want TXs to be prioritized as early as possible by 'fee/cu'. If forwarders are doing the prioritization right, and leaders QoS by forwarder stake weight, then valid txs with a verified fee payer that can afford the TX should get to the head of the queue on the leader most of the time.

@t-nelson
Contributor

  1. pull txs proportionally from staked nodes - the goal is to prevent anyone from starving the pipeline for proposing txs to the leader. There should be some mechanism for a validator to set a preferred RPC node that overrides the staked list.

This will require some changes to TX submission since validators only send to TPUFwd. They shouldn't be a major contributor if at all on the TPU port today, so stake-weighted priority will have little effect

  1. sort txs by (additional_fee + base fee = total fee)/compute_units. CU's should be deterministic since they are completely specified by the ComputeBudget instruction. Users that are willing to pay more, and devs that can make their TXs more efficient, should be prioritized first.

I think we need to be deliberate with wording here that these are requested CUs. The effectiveness of this treatment will hinge on both the coverage and assigned-cost accuracy of our measure of true CU usage. Given that "CU" has become a fairly nebulous term, it's hard to be confident that that's the case or even that we're always talking about the same thing.

  1. fill the account buckets up to the account write CU limits and the block CU limits.

  2. keep filling account buckets and forward each bucket to avoid starving accounts on forwards. Naive implementation would forward highest paid TXs first, which are likely to be only for a single hot account.

"Forward" here is to the next leader or down the pipeline?

Any thoughts on how we'll balance fee-based priority with parallelization maximization?

  1. Need an RPC api to estimate the fee user should pay. Sample last 150 slots of MAX(lowest fee for the block, lowest fee of the writable accounts in the TX), show user the top 5%, 50%, 95% paid per CU across those 150 slots.

Since fees are no longer fully deterministic, it seems like there are a lot of games both validators and users can play to influence this. I'll have to think on it before elaborating
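
To make the sampling in the quoted RPC item concrete, here is a minimal sketch (names are illustrative, not the actual RPC implementation): take one observation per recent slot, sort, and read off the percentiles.

```rust
// Sketch only: one observation per recent slot (e.g. the last 150),
// sorted so percentiles can be read off. Not the actual RPC code.
fn fee_percentiles(mut per_slot_fee_per_cu: Vec<u64>) -> Option<(u64, u64, u64)> {
    if per_slot_fee_per_cu.is_empty() {
        return None;
    }
    per_slot_fee_per_cu.sort_unstable();
    let pick = |p: f64| {
        let idx = ((per_slot_fee_per_cu.len() - 1) as f64 * p).round() as usize;
        per_slot_fee_per_cu[idx]
    };
    // 5th, 50th, and 95th percentile paid per CU across the sample.
    Some((pick(0.05), pick(0.50), pick(0.95)))
}
```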

@aeyakovenko
Member Author

This will require some changes to TX submission since validators only send to TPUFwd. They shouldn't be a major contributor if at all on the TPU port today, so stake-weighted priority will have little effect

both TPU and fwd ports

I think we need to be deliberate with wording here that these are requested CUs. The effectiveness of this treatment will hinge on both the coverage and assigned-cost accuracy of our measure of true CU usage. Given that "CU" has become a fairly nebulous term, it's hard to be confident that that's the case or even that we're always talking about the same thing.

yep, CUs need to be as accurate as gas is in eth.

"Forward" here is to the next leader or down the pipeline?

forward to the next leader

Any thoughts on how we'll balance fee-based priority with parallelization maximization?

keep filling account buckets and forward each bucket to the next leader to avoid starving accounts on forwards.

The idea here is to keep sorting txs into account buckets and then pull from N queues.
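
A rough sketch of those account buckets (illustrative types only; a tx writing several accounts is cloned into each account's queue here, which a real implementation would need to dedupe when draining):

```rust
use std::collections::HashMap;

// Sketch only: not the actual banking-stage data structures.
type Pubkey = [u8; 32];

#[derive(Clone)]
struct Tx {
    writable_accounts: Vec<Pubkey>,
    fee_per_cu: u64,
}

fn bucket_by_writable_account(txs: &[Tx]) -> HashMap<Pubkey, Vec<Tx>> {
    let mut buckets: HashMap<Pubkey, Vec<Tx>> = HashMap::new();
    for tx in txs {
        for acct in &tx.writable_accounts {
            buckets.entry(*acct).or_default().push(tx.clone());
        }
    }
    // Each per-account queue is kept sorted, highest fee-per-cu first.
    for queue in buckets.values_mut() {
        queue.sort_by(|a, b| b.fee_per_cu.cmp(&a.fee_per_cu));
    }
    buckets
}
```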

Since fees are no longer fully deterministic, it seems like there are a lot of games both validators and users can play to influence this. I'll have to think on it before elaborating

They are deterministic; the user signs exactly what they pay for. CUs are specified in the TX; what they request is what they get charged for.

@tao-stones
Contributor

sort txs by (additional_fee + base fee = total fee)/(requested compute units).

where does base fee come from? @aeyakovenko

@t-nelson
Contributor

sort txs by (additional_fee + base fee = total fee)/(requested compute units).

where does base fee come from? @aeyakovenko

Should be the CU bin they bought

@t-nelson
Contributor

Any thoughts on how we'll balance fee-based priority with parallelization maximization?

keep filling account buckets and forward each bucket to the next leader to avoid starving accounts on forwards.

The idea here is to keep sorting txs into account buckets and then pull from N queues.

Yeah, the sorting is what I'm interested in details about. Seems like we need to sort by multiple things that may not have the same ideal weights in all conditions

Since fees are no longer fully deterministic, it seems like there are a lot of games both validators and users can play to influence this. I'll have to think on it before elaborating

They are deterministic; the user signs exactly what they pay for. CUs are specified in the TX; what they request is what they get charged for.

Right. What they're going to spend is still deterministic, but whether the fee they specify is sufficient for block inclusion is not

@t-nelson t-nelson mentioned this issue Feb 22, 2022
@tao-stones
Contributor

Yeah, the sorting is what I'm interested in details about. Seems like we need to sort by multiple things that may not have the same ideal weights in all conditions

In #23257, assuming one packet has one transaction, packets with the same fee-per-cu are put into the same bucket, keeping the same order as the stake-weighted shuffle. So if two transactions have the same fee-per-cu, the one from the higher-staked validator has a better chance of being picked first.

@tao-stones
Contributor

sort txs by (additional_fee + base fee = total fee)/(requested compute units).

where does base fee come from? @aeyakovenko

Should be the CU bin they bought

does https://github.com/solana-labs/solana/blob/master/runtime/src/bank.rs#L3257-L3274 return a total_fee that includes additional_fee and base_fee?

@aeyakovenko
Member Author

@jackcmay ^

@t-nelson
Contributor

In #23257, assuming one packet has one transaction, packets with the same fee-per-cu are put into the same bucket, keeping the same order as the stake-weighted shuffle. So if two transactions have the same fee-per-cu, the one from the higher-staked validator has a better chance of being picked first.

This doesn't address my concern. I'm wondering how parallelization will contribute to the sort. We shouldn't find ourselves in a situation where, whether through malice or incentive-motivated usage, users are buying sub-optimal parallelization

@jackcmay
Contributor

sort txs by (additional_fee + base fee = total fee)/(requested compute units).

where does base fee come from? @aeyakovenko

Should be the CU bin they bought

does https://github.com/solana-labs/solana/blob/master/runtime/src/bank.rs#L3257-L3274 return a total_fee that includes additional_fee and base_fee?

Total fee, aka, includes additional fee

@tao-stones
Contributor

sort txs by (additional_fee + base fee = total fee)/(requested compute units).

where does base fee come from? @aeyakovenko

Should be the CU bin they bought

does https://github.com/solana-labs/solana/blob/master/runtime/src/bank.rs#L3257-L3274 return a total_fee that includes additional_fee and base_fee?

Total fee, aka, includes additional fee

It takes a bank to calculate total_fee, mainly because it gets fee_structure from the bank. This works fine except the leader won't have secured a bank yet when it needs fee-per-cu to prioritize buffered packets. Is it OK to use FeeStructure::default() instead of bank.fee_structure for fee calculation?

@aeyakovenko
Member Author

aeyakovenko commented Feb 22, 2022

@taozhu-chicago @t-nelson I was thinking we sort by fee, then bucket by account/CU limit, then forward by bucket. So if there are N buckets, we round-robin between each bucket up to the block limit and take some X CUs to forward.

@tao-stones
Contributor

tao-stones commented Feb 22, 2022

@taozhu-chicago @t-nelson I was thinking we sort by fee, then bucket by account/CU limit, then forward by bucket. So if there are N buckets, we round-robin between each bucket up to the block limit and take some X CUs to forward.

Right, after sorting the transactions in the buffer by fee-per-cu, the leader starts to fill the block from the highest fee-per-cu TXs, ensuring each write-account is always filled with higher-paid TXs first. We don't really need to create an account bucket per se.

wrt forwarding, I am thinking of doing the following:
create a temp cost_tracker, fill it with sorted txs from the buffer, same as filling an actual block. When it hits block_limit, or the buffer is exhausted, forward the txs in that cost_tracker in the same order. Essentially forwarding a block of TXs prioritized by fee and bucketed by write-accounts.
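
Something like the following sketch, where the cost tracker is a stand-in for the real cost-tracking logic, not its actual API:

```rust
// Sketch only: input CUs must already be sorted by fee-per-cu, highest
// first; the returned indices are forwarded in the same order.
struct TempCostTracker {
    used: u64,
    block_limit: u64,
}

impl TempCostTracker {
    fn try_add(&mut self, cus: u64) -> bool {
        if self.used + cus <= self.block_limit {
            self.used += cus;
            true
        } else {
            false
        }
    }
}

fn select_for_forwarding(sorted_tx_cus: &[u64], block_limit: u64) -> Vec<usize> {
    let mut tracker = TempCostTracker { used: 0, block_limit };
    let mut selected = Vec::new();
    for (idx, &cus) in sorted_tx_cus.iter().enumerate() {
        // Fill a "virtual block"; anything that fits gets forwarded.
        if tracker.try_add(cus) {
            selected.push(idx);
        }
    }
    selected
}
```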

@jackcmay
Contributor

Is it OK to use FeeStructure::default() instead of bank.fee_structure for fee calculation?

FeeStructure is not fixed and will be modified at epoch boundaries over time. It probably falls into a similar category as the compute_budget and congestion where both of those will also be bank-specific based on various conditions (feature activation, congestion, etc...)

@tao-stones
Contributor

Is it OK to use FeeStructure::default() instead of bank.fee_structure for fee calculation?

FeeStructure is not fixed and will be modified at epoch boundaries over time. It probably falls into a similar category as the compute_budget and congestion where both of those will also be bank-specific based on various conditions (feature activation, congestion, etc...)

Understood. Was hoping to calculate fee-per-cu for each packet upfront when it is received, so the results can be reused. It'd be easier to do without requiring a bank.
Guess I can cache the "last working_bank", and use it to calculate the fee.

@jackcmay
Contributor

Or just cache the things you need and yes that is probably the best predictor without looking at current state

@tao-stones
Contributor

Yea, so just cache fee_structure and lamports_per_signature
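
A rough sketch of that caching approach (field and method names are illustrative, not the real structs; the per-CU tier field is a stand-in for whatever the fee_structure provides):

```rust
// Sketch only: cached values may be slightly stale across epoch
// boundaries, which is fine for prioritization (not for charging).
#[derive(Clone)]
struct CachedFeeInputs {
    lamports_per_signature: u64,
    lamports_per_cu_tier: u64, // stand-in for the fee_structure's CU pricing
}

impl CachedFeeInputs {
    /// Refresh from the last working bank whenever one is available.
    fn update(&mut self, lamports_per_signature: u64, lamports_per_cu_tier: u64) {
        self.lamports_per_signature = lamports_per_signature;
        self.lamports_per_cu_tier = lamports_per_cu_tier;
    }

    /// Upfront fee estimate computed at packet receipt, no bank needed.
    fn estimate_total_fee(&self, num_signatures: u64, requested_cus: u64, additional_fee: u64) -> u64 {
        num_signatures * self.lamports_per_signature
            + requested_cus * self.lamports_per_cu_tier
            + additional_fee
    }
}
```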

@carllin
Contributor

carllin commented Feb 23, 2022

In #23257, assuming one packet has one transaction, packets with the same fee-per-cu are put into the same bucket, keeping the same order as the stake-weighted shuffle. So if two transactions have the same fee-per-cu, the one from the higher-staked validator has a better chance of being picked first.

I was thinking we sort by fee, then bucket by account/CU limit, then forward by bucket. So if there are N buckets, we round-robin between each bucket up to the block limit and take some X CUs to forward.

@taozhu-chicago @aeyakovenko if this round robin is done naively, does this open up a way for low-fee transactions to starve higher-fee transactions?

Imagine I had two buckets, with pending transactions of the form (Write Accounts, Fee/CU)

Bucket 1: [(A, 1000000), (AB, 1000000)]
Bucket 2: [(B, 1), (A, 1), (B, 1), (A, 1), repeats, these are long running computations where Task B takes much longer than task A]

If we are doing round robin by first nonconflicting transaction, then we could get a schedule like

  1. Schedule (A, 1000000)
  2. Can't schedule (AB, 1000000), conflicts on A
  3. Schedule (B, 1),
  4. Task 1. finishes running, can't schedule (AB, 1000000), conflicts on B from Task 3. Instead
    schedule first non-conflicting transaction (A, 1)
  5. Task 3. finishes running. Because Task B takes much longer than Task A, then task 4 is still running. This means we can't schedule (AB, 1000000), because it conflicts on A from Task 4. Instead
    schedule first non-conflicting transaction (B, 1)
  6. Repeat steps 4 and 5 over and over, starving (AB, 1000000)

Seems like to prevent this, you should also have blocked higher-fee transactions prevent lower-fee transactions from grabbing the same accounts they need, even if those lower-fee transactions could run without conflict at this moment

@tao-stones
Contributor

the bucket is by writable-account.
In your example above, after sorting, we'd have a view of the buffer like this:
[(A, 1000000), (AB, 1000000), (A, 1), (B, 1), (A, 1), (B, 1), (A, 1), (B, 1) ...]
assume we have chunk_size=4, it'd submit the following chunks for processing, in sequence:

  1. [(A, 1000000), (AB, 1000000), (A, 1), (B, 1)] --> result (B, 1) to be retried due to account_in_use
  2. [(A, 1), (B, 1), (A, 1), (B, 1)]. --> result [(A, 1), (B, 1)] to be retried due to account_in_use

This assumes account_limit >= 2000000.
If account_limit = 1000000, with the same chunk_size, then:

  1. [(A, 1000000), (AB, 1000000), (A, 1), (B, 1)] --> result [(AB, 1000000), (A, 1)] to be retried due to account_limit and account_in_use
  2. [(A, 1), (B, 1), (A, 1), (B, 1)]. --> result [(A, 1), (B, 1)] to be retried due to account_in_use
    at the end of buffer iteration, what's remaining in buffer will be:
    [(AB, 1000000), (A, 1), (A, 1), (B,1)]
    the second iteration of buffer will first pick up (AB, 1000000)

@carllin
Contributor

carllin commented Feb 23, 2022

the bucket is by writable-account.

Hmmm, does this mean every transaction that tries to grab a write lock on A will fall into the same bucket? What if transactions grab write locks on multiple accounts, like A and B? For example if we had three transactions that each grabbed two write locks:

AB, BC, CD

How would these be organized into buckets?

assume we have chunk_size=4, it'd submit the following chunks for processing, in sequence:

  1. [(A, 1000000), (AB, 1000000), (A, 1), (B, 1)] --> result (B, 1) to be retried due to account_in_use

Hmm in this example, why is only B retried due to account in use, won't (AB, 1000000), (A, 1) be retried because they both try to grab the lock on account A, but A is being locked by the first transaction (A, 1000000)?

@tao-stones
Contributor

the bucket is by writable-account.

Hmmm, does this mean every transaction that tries to grab a write lock on A will fall into the same bucket? What if transactions grab write locks on multiple accounts, like A and B? For example if we had three transactions that each grabbed two write locks:

AB, BC, CD

How would these be organized into buckets?

Right now, if (AB, 1000) is included in a block, it will be counted into both A and B buckets. So both A and B buckets will increase by 1000 CUs in this example.

assume we have chunk_size=4, it'd submit the following chunks for processing, in sequence:

  1. [(A, 1000000), (AB, 1000000), (A, 1), (B, 1)] --> result (B, 1) to be retried due to account_in_use

Hmm in this example, why is only B retried due to account in use, won't (AB, 1000000), (A, 1) be retried because they both try to grab the lock on account A, but A is being locked by the first transaction (A, 1000000)?

Oops, mistakenly flipped the outcome. Sorry ... you are right, (B, 1) will be included, but any TXs with A will be retried due to AccountInUse

@carllin
Contributor

carllin commented Feb 24, 2022

Right now, if (AB, 1000) is included in a block, it will be counted into both A and B buckets. So both A and B buckets will increase by 1000 CUs in this example.

In your example above, after sorting, we'd have a view of the buffer like this:
[(A, 1000000), (AB, 1000000), (A, 1), (B, 1), (A, 1), (B, 1), (A, 1), (B, 1) ...]

So from the above statement, the buckets would actually look like this?
Bucket 1: [(A, 1000000), (AB, 1000000), (A, 1), (A, 1), (A, 1) ...]
Bucket 2: [(B, 1), (B, 1), (B, 1)]

  1. How do we pick which buckets to schedule from? Do we use any information about the fees of the individual transactions in each of the buckets to prioritize which bucket to choose from?

  2. Does this open up the starvation of (AB, 1000000) scenario I described earlier? What happens if (A, 1000000) is scheduled, then (B, 1), then (A, 1) ((AB, 1000000) is skipped because it conflicts with the ongoing transaction (B, 1)), and this keeps repeating?

@tao-stones
Contributor

  1. Bucket by write lock is to gate TXs so they don't exceed max_account_limit. (A minor correction: AB will be in bucket 2 as well, as it will also take that amount of CU from B.) However, TXs are submitted to the bank for execution in order of fee-per-cu, the higher-paid TXs first.
  2. AB will be starved, same as in the current version. However, with fee-per-cu, it can be prioritized to guarantee a spot in a block if it pays enough in additional-fee.

Go through the example in steps:

  1. transactions are sorted by fee-per-cu in the following order, where each entry is:
(Write Accounts, Fee/CU, TX_CUs)
(A, 1000000, 100), 
(AB, 1000000, 100), 
(A, 1, 1), 
(B, 1, 2), 
(A, 1, 3), 
(B, 1, 4), 
... 
  2. Assume account_max_limit=101, block_limit=202; TXs will be fitted/filtered by these limits:
(A, 1000000, 100), // scheduled in block, bucket_A is filled by 100cu, 1cu left;
(AB, 1000000, 100), // not scheduled for this block, would exceed bucket_A limit, will retry;
(A, 1, 1), // scheduled in block, bucket_A is filled by 1cu, 0 left;
(B, 1, 2), // scheduled in block, bucket_B is filled by 2cu, 99 left;
(A, 1, 3), // not scheduled for this block, would exceed bucket_A limit, will retry;
(B, 1, 4), // scheduled in block, bucket_B is filled by 4cu, 95 left;
... 

this continues until either block_limit is met or we've gone through the entire buffer. At the end of the iteration, these transactions are submitted to the bank for execution, and their outcomes:

(A, 1000000, 100),  // commit to bank;
(A, 1, 1), // account-in-use, retry;
(B, 1, 2), // commit to bank;
(B, 1, 4), // account-in-use, retry;
... 

And the remaining buffer (after sorting by fee-per-cu) looks like this:

(AB, 1000000, 100), 
(A, 1, 1), 
(A, 1, 3), 
(B, 1, 4), 
... 
  3. for the next slot:
(AB, 1000000, 100),  // will be scheduled for execution, and will be committed to new block;
(A, 1, 1),  // will be scheduled, but not executed due to account-in-use, therefore retry
(A, 1, 3), // will not be scheduled due to exceed bucket_A limit
(B, 1, 4), // will be scheduled for execution, and will be committed to new block;
... 

Note, if AB had paid more so that its fee/cu > 1000000, then it would be at the top of the sorted list (in step 1), and therefore would land in the first block (in step 2).

Also note, this same sorting/fitting/filtering process applies to both block production in the current leader and packet forwarding.

@t-nelson
Contributor

It's starting to feel like this needs a full-fledged proposal rather than a bunch of ad hoc discussions

@carllin
Contributor

carllin commented Feb 24, 2022

  1. Bucket by write lock is to gate TXs so they don't exceed max_account_limit. (A minor correction: AB will be in bucket 2 as well, as it will also take that amount of CU from B.) However, TXs are submitted to the bank for execution in order of fee-per-cu, the higher-paid TXs first.

Got it, thanks!

  1. AB will be starved, same as in the current version.

Hmm, I think with the higher fees, it's no longer sufficient to just uphold current guarantees. We have to guarantee that higher-fee transactions land with much higher priority than lower-fee transactions whenever possible.

  2. Assume account_max_limit=101, block_limit=202; TXs will be fitted/filtered by these limits:
(A, 1000000, 100), // scheduled in block, bucket_A is filled by 100cu, 1cu left;
(AB, 1000000, 100), // not scheduled for this block, would exceed bucket_A limit, will retry;
(A, 1, 1), // scheduled in block, bucket_A is filled by 1cu, 0 left;
(B, 1, 2), // scheduled in block, bucket_B is filled by 2cu, 99 left;
(A, 1, 3), // not scheduled for this block, would exceed bucket_A limit, will retry;
(B, 1, 4), // scheduled in block, bucket_B is filled by 4cu, 95 left;
... 

this continues until either block_limit is met or we've gone through the entire buffer. At the end of the iteration, these transactions are submitted to the bank for execution, and their outcomes:

I think in your example where the account limit is hit, the delay on the (AB, 1000000, 100) transaction is reasonable. But what if the issue is not the block/account limits being hit, but one of pure contention on the write locks?

In this example, suppose the account_max_limit and block_limit are infinitely large, so we can just ignore them.

Say now we have the same queue:

(A, 1000000, 100), // scheduled
(AB, 1000000, 100), // not scheduled because contention on `A`
(A, 1, 1),  // not scheduled because contention on `A`
(B, 1, 2), // scheduled
(Task A ends, (AB, 1000000, 100) once again not scheduled because contention on `B`)
(A, 1, 3), // scheduled
(Task B ends, (AB, 1000000, 100) once again not scheduled because contention on `A`)
(B, 1, 2), // scheduled
... repeats

This seems like a starvation scenario we should actively avoid.

@tao-stones
Contributor

Hmm, isn't this why we have block-limit and account-limit, to help in such a scenario? Plus allowing additional-fee to give AB a better chance?

I feel I am missing something. Can you describe what you'd do to actively avoid this starvation scenario?

@carllin
Contributor

carllin commented Feb 25, 2022

Hmm, isn't this why we have block-limit and account-limit, to help in such a scenario? Plus allowing additional-fee to give AB a better chance?

Based on my understanding thus far, it doesn't seem these limits will help.

Can you describe what you'd do to actively avoid this starvation scenario?

I was thinking maybe you would need to reserve accounts that are touched by higher paying transactions, even if you could schedule another non-conflicting transaction.

So going back to the example above:

(A, 1000000, 100), // scheduled
(AB, 1000000, 100), // not scheduled because contention on `A`
(Reserve A and B)
(A, 1, 1),  // not scheduled because `A` is reserved
(B, 1, 2), // not scheduled because `B` is reserved, even though there is nobody else currently using `B`

It may also be good to factor in estimated compute. For instance it might be fine to schedule B if we're sure the time to execute is less than the remaining time on (A, 1000000, 100)
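
A sketch of that reservation idea (illustrative types; one scheduling pass over txs already sorted by fee-per-cu descending):

```rust
use std::collections::HashSet;

// Sketch only: a blocked high-fee tx reserves its accounts so later
// (lower-fee) txs can't keep leapfrogging it.
type Account = u64;

struct Tx {
    writes: Vec<Account>,
    fee_per_cu: u64, // documents the required sort order of the input
}

fn schedule_pass(sorted_desc: &[Tx], currently_locked: &HashSet<Account>) -> Vec<usize> {
    let mut reserved: HashSet<Account> = HashSet::new();
    let mut scheduled = Vec::new();
    for (i, tx) in sorted_desc.iter().enumerate() {
        let blocked = tx
            .writes
            .iter()
            .any(|a| currently_locked.contains(a) || reserved.contains(a));
        if blocked {
            // Reserve so nothing cheaper grabs these accounts first.
            reserved.extend(tx.writes.iter().copied());
        } else {
            scheduled.push(i);
            // Treat scheduled accounts as taken for the rest of the pass.
            reserved.extend(tx.writes.iter().copied());
        }
    }
    scheduled
}
```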

@carllin
Contributor

carllin commented Feb 25, 2022

@behzadnouri do you have any input on this?

This seems like a variation of the job-shop scheduling problem with n tasks and m machines, where:

  1. We have an estimate of time needed for each task
  2. Each task has a weight w (the fee)
  3. The only constraint is that tasks needing a write lock on an account cannot run at the same time as any other task needing a lock on the same account

And we need to find a valid schedule that maximizes the total weight of the tasks executed. Are you familiar with any of the latest heuristic based algorithms for this type of problem?
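
For concreteness, that objective could be written as follows (notation is mine, not from the thread): task $j$ has weight $w_j$ (the fee), duration $d_j$ (estimated time), start time $s_j$, and write-account set $W_j$; choose a subset $S$ and start times maximizing total weight while tasks sharing a write account never overlap:

```latex
\max_{S,\,\{s_j\}} \sum_{j \in S} w_j
\quad \text{s.t.} \quad
[s_i,\, s_i + d_i) \cap [s_j,\, s_j + d_j) = \emptyset
\;\;\text{for all } i \ne j \in S \text{ with } W_i \cap W_j \ne \emptyset
```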

@tao-stones
Contributor

I am taking it as a simple scheduling issue.

I interpret the issue as: within banking stage, 1. there are many packets wanting to be scheduled for processing by the bank, 2. there are way more packets than block capacity.

The proposed solution is simple: prioritize buffered packets by their fee/cu. Higher-paid packets get to "see" the bank first.

Want to clarify that the prioritization, e.g. sorting by fee/cu, is done before loading and locking accounts, which is before bank execution.

@tao-stones
Contributor

Hmm, isn't this why we have block-limit and account-limit, to help in such a scenario? Plus allowing additional-fee to give AB a better chance?

Based on my understanding thus far, it doesn't seem these limits will help.

With block and account limits, AB would have a chance to be attempted for processing again; as the earlier example showed, it'll be processed. So that'd solve the starvation issue, no?

Can you describe what you'd do to actively avoid this starvation scenario?

I was thinking maybe you would need to reserve accounts that are touched by higher paying transactions, even if you could schedule another non-conflicting transaction.

So going back to the example above:

(A, 1000000, 100), // scheduled
(AB, 1000000, 100), // not scheduled because contention on `A`
(Reserve A and B)
(A, 1, 1),  // not scheduled because `A` is reserved
(B, 1, 2), // not scheduled because `B` is reserved, even though there is nobody else currently using `B`

It may also be good to factor in estimated compute. For instance it might be fine to schedule B if we're sure the time to execute is less than the remaining time on (A, 1000000, 100)

"reserve" isn't really a status for lock. I mean what does "reserved" different from "read locked"?

@aeyakovenko
Member Author

aeyakovenko commented Feb 25, 2022

@taozhu-chicago @carllin @t-nelson

This shouldn't be that complicated. The sorter creates N queues; each queue represents a writable account. The forwarder should forward in RR from the N queues, taking some amount of CUs up to the block limit

On the leader side we should get a mix of the highest paying TXs from the top accounts from all the validators so we shouldn't see anyone starved.

  1. high paying fees shouldn't be starved by any other txs
  2. high paying saturated accounts shouldn't starve every one else
  3. small staked nodes should be able to propose their txs so leader can observe high paying fees (1 and 2) from everyone on the network

Basically, however we are packing the block now, we can continue to do so, except instead of moving TXs into banking stage they are just forwarded, and the current node observes the same block limits, etc...
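
A sketch of that RR-from-N-queues drain (illustrative types only; each per-account queue is assumed already sorted by fee-per-cu, highest first):

```rust
use std::collections::VecDeque;

// Sketch only: take up to the per-account CU limit from each queue in
// turn, stopping when nothing else fits under the block limit.
struct QueuedTx {
    cus: u64,
}

fn round_robin_drain(
    queues: &mut [VecDeque<QueuedTx>],
    per_account_limit: u64,
    block_limit: u64,
) -> Vec<QueuedTx> {
    let mut out = Vec::new();
    let mut block_used = 0u64;
    let mut taken = vec![0u64; queues.len()];
    let mut progress = true;
    while progress {
        progress = false;
        for (i, q) in queues.iter_mut().enumerate() {
            let fits = match q.front() {
                Some(tx) => {
                    taken[i] + tx.cus <= per_account_limit
                        && block_used + tx.cus <= block_limit
                }
                None => false,
            };
            if fits {
                let tx = q.pop_front().unwrap();
                taken[i] += tx.cus;
                block_used += tx.cus;
                out.push(tx);
                progress = true;
            }
        }
    }
    out
}
```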

@carllin
Contributor

carllin commented Feb 25, 2022

@aeyakovenko

high paying fees shouldn't be starved by any other txs

For this to be true, in this round robin, for each queue Q we have to ensure that we never replay a later transaction in Q if an earlier one has not been replayed yet, right?

For instance if you have two queues:

Queue A: {A, 10000}, {AB, 10000}
Queue B: {AB, 10000} (same transaction as in Queue A), {B, 1}

If the order of events is doing round robin:

  1. {A, 10000} from Queue A scheduled
  2. {AB, 10000} from Queue B not scheduled because there's contention on account A
  3. We then do not schedule {B, 1} in Queue B, even if it could be replayed without contention, because an earlier transaction in Queue B has not been played yet.

@carllin
Contributor

carllin commented Feb 25, 2022

The sorter creates N queues; each queue represents a writable account. The forwarder should forward in RR from the N queues, taking some amount of CUs up to the block limit

Seems like you also need top-level prioritization of which bucket to look at first based on fees; it shouldn't be a blind round robin. You could easily have 100,000 buckets for different write accounts, the majority of them low-fee, and only a couple containing high-fee transactions that we need to prioritize
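
One way to get that top-level prioritization without scanning every bucket is a max-heap keyed by each bucket's head fee, sketched here with illustrative types:

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

// Sketch only: the scheduler always visits the most valuable bucket
// next instead of blindly round-robining over, say, 100,000 of them.
#[derive(PartialEq, Eq)]
struct BucketHead {
    head_fee_per_cu: u64,
    bucket_idx: usize,
}

impl Ord for BucketHead {
    fn cmp(&self, other: &Self) -> Ordering {
        // Max-heap ordered by the fee of the bucket's best pending tx.
        self.head_fee_per_cu.cmp(&other.head_fee_per_cu)
    }
}

impl PartialOrd for BucketHead {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

/// Pop the bucket whose head tx pays the most; after taking one tx the
/// caller pushes the bucket back with its new head fee.
fn next_bucket(heads: &mut BinaryHeap<BucketHead>) -> Option<usize> {
    heads.pop().map(|h| h.bucket_idx)
}
```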

@tao-stones
Contributor

Thinking it might be simpler to select TXs from the sorted list that can fill up a block (with account limits), without worrying about account contention yet. Then submit the selected TXs to the bank for execution; from this point on, it works the same way as it does today - a TX that can't get its account(s) locked will be put back in the buffer as retryable, to be reconsidered for the next bank.

@jstarry
Member

jstarry commented Feb 26, 2022

@taozhu-chicago @carllin @t-nelson

This shouldn't be that complicated. The sorter creates N queues; each queue represents a writable account. The forwarder should forward in RR from the N queues, taking some amount of CUs up to the block limit

In my opinion, this problem is inherently complicated because packing conflicting transactions efficiently by fee weight is not trivial. We are going to have to make some tradeoffs here and I think we just need to get aligned on what those tradeoffs are (including attack vectors) in a proposal.

@t-nelson
Contributor

t-nelson commented Feb 26, 2022

The fact that we're dealing with an NP-hard problem, for which we want an online, time-constrained solution, should preclude "simple"

@ryoqun
Member

ryoqun commented Feb 27, 2022

(hi, let's make the party even merrier. lol)

here's another rando idea: on-chain per-account auctioned mempool with deterministic scheduling (this is like a pre-proposal; did we consider something like this?)

pros

  • no off-chain mechanism for tx ordering decision. (personally, i want to avoid those gaming factors even if the power is weighted by stakes)
  • tx with less contentious accounts will be executed cheaply, as before.

cons

  • yes, mempool! we're the same as ETH in a sense. (but the tx should be very cheap and congestion should be local to specific programs)
  • introduces a bit of single-threaded code
  • rather a big change

the current num-of-sigs tx fee is re-interpreted as a block-inclusion-fee (bi-fee). this fixed fee is now strictly modeled as the sig-verify cost plus the tx blob propagation cost across the cluster. now, no execution is guaranteed; older contentious txes will be pruned from the mempool according to recent_blockhash without execution at all, while the bi-fee (also the ee-fee; described below) is debited.

additionally, users can now tip more as an expedite-execution-fee (ee-fee) with the bidding mechanism detailed below.

background/thoughts:

  • imo, the underlying problem is that clients can't know whether their txes are observed or not by the cluster in the first place. so, they blindly send a ton.
    • so, make tx inclusion very reliable and observable.
    • finally shift the scheduling problem to the original tx submitters (after all, they should know the value of tx most accurately)
  • maybe leader's outgoing bandwidth can be better utilized for turbine not for (possibly redundant) tx forwarding. so pipe all the possibly unused txes into the downstream as much as possible (while collecting tx fees from greedy bots)
    • there should be little security concern for so many txes in mempools; memory should be cheap and ample, compared to other resources.
  • no matter what, people will compete for tx prioritization on-/off-chain, (c)overtly. so let's introduce efficient, transparent, and permissionless markets for it.
  • (bonus) the hard-to-debug my-(dropped)-tx-not-found-on-explorer problem will also be solved.

(for simplicity, assume write accounts only, and that tx runtime is rather uniform)

pending txes:

tx1 acc_a, acc_b (ee-fee: 300)
tx2 acc_a, acc_c (ee-fee: 250)
tx3 acc_a (ee-fee: 20)
tx4 acc_b (ee-fee: 400)
tx5 acc_e (ee-fee: 0)

(informative only) mempools for each account (ordered by ee-fee)

mempool acc_a
300 tx1
250 tx2
20 tx3

mempool acc_b
400 tx4
300 tx1

mempool acc_c
250 tx2

mempool acc_e
0 tx5

so, a single execution batch packing algorithm is simple:

  1. maintain all pending txes sorted by ee-fee
  2. pick highest paying transaction and lock the accounts and put it into the batch
  3. pick the next tx
  4. if the accounts are already locked, skip
  5. if the accounts not locked, lock them and put it into the batch
  6. continue until some kind of maximum concurrent execution limit (like batch max size) is reached.

lastly, the scheduled batch would be like this:

batch1: tx4, tx2, tx5
batch2: tx1
batch3: tx3


finally, the tx submitters can plan ee-fee based on their urgency and prevalent bidding info, and re-adjust ee-fee if their tx sits in the mempool for too long.
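
A sketch of the batch-packing steps above, checked against the tx1..tx5 example (illustrative types only):

```rust
use std::collections::HashSet;

// Sketch only: steps 1-6 of the packing algorithm above.
#[derive(Clone)]
struct MempoolTx {
    name: &'static str,
    writes: Vec<&'static str>,
    ee_fee: u64,
}

fn pack_batches(mut pending: Vec<MempoolTx>, max_batch: usize) -> Vec<Vec<&'static str>> {
    // step 1: keep all pending txes sorted by ee-fee, highest first
    pending.sort_by(|a, b| b.ee_fee.cmp(&a.ee_fee));
    let mut batches = Vec::new();
    while !pending.is_empty() {
        let mut locked: HashSet<&str> = HashSet::new();
        let mut batch = Vec::new();
        let mut rest = Vec::new();
        for tx in pending {
            // steps 2-5: lock the accounts if free, otherwise skip the tx
            if batch.len() < max_batch && tx.writes.iter().all(|a| !locked.contains(a)) {
                locked.extend(tx.writes.iter().copied());
                batch.push(tx.name);
            } else {
                rest.push(tx);
            }
        }
        batches.push(batch);
        pending = rest;
    }
    batches
}

// With tx1..tx5 from the example and max_batch >= 3 this yields
// [["tx4", "tx2", "tx5"], ["tx1"], ["tx3"]], matching the batches above.
```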

@tao-stones
Contributor

Proposal to review Leader QoS #23369
@jstarry @t-nelson @carllin @aeyakovenko

@t-nelson
Contributor

(hi, let's make the party even merrier. lol)

here's another rando idea: on-chain per-account auctioned mempool with deterministic scheduling (this is like a pre-proposal; did we consider something like this?)

We're looking for less complex solutions, ATM. This sounds cool though! Maybe a future effort 😂

@carllin
Contributor

carllin commented Mar 2, 2022

Hmm, isn't this why we have block-limit and account-limit, to help in such a scenario? Plus allowing additional-fee to give AB a better chance?

I feel I am missing something. Can you describe what you'd do to actively avoid this starvation scenario?

Here's a proposal @t-nelson and I were brainstorming on: https://github.com/solana-labs/solana/blob/539cb5c857982444d4d5b238638e8028471e3c70/docs/src/proposals/fee_transaction_priority.md, please take a look as it goes into more detail

@tao-stones
Contributor

tao-stones commented Mar 2, 2022

Here's a proposal @t-nelson and I were brainstorming on: https://github.com/solana-labs/solana/blob/539cb5c857982444d4d5b238638e8028471e3c70/docs/src/proposals/fee_transaction_priority.md, please take a look as it goes into more detail

The proposal looks great! 👍🏼 Is there a PR to put commits?

  • +1 for separation of responsibility; putting the prioritization algo into a self-contained logical model is neat. Potential concern is if it requires an additional copy.
  • +1 for moving the account contention check up in the pipeline; I still have a question wrt the No. 2 point in the proposal tho, mainly on benefit/cost. The proposed approach seems to guarantee fee prioritization within each block, with additional gatekeeping logic; vs. what I did, which lets lower-priority TXs into the current block even if they require accounts needed by a higher-priority TX, but ensures the higher TX has its place in the next block (i.e., fee prioritization across slots).
  • One thing to point out: it should use (additional_fee + base_fee) / requested_cu as the fee-per-cu to prioritize transactions, instead of just using additional_fee, so 100 additional lamports for a 1,000 CU transaction should have lower priority compared to 10 additional lamports for a 10 CU transaction. The thing is the base_fee (eg signature, write lock etc) is bank-dependent, as the fee_schedule changes over epochs.
  • One question: are we doing away with packet_batches?

@carllin
Contributor

carllin commented Mar 2, 2022

Let's move this discussion to the proposal! 😄

@gshum

gshum commented May 17, 2022

What about txs involving the same address that need to be run serially? Is there some way to up fees on them or something like that? Seems like it creates a bottleneck, but I've never analyzed Solana traffic so I can't say 100%

@solana-labs solana-labs deleted a comment Jun 13, 2022
@aeyakovenko aeyakovenko changed the title compute budget/qos/fee priority change summary Network Stability rollout for quic/qos/fee markets Jun 15, 2022
@kasoutsuuka

How do we GUARANTEE a leader will process transactions through QoS?

@hydrogenbond007
Contributor

Hey, had a question. Been going through this report and saw the test results in quic-go/quic-go#2586 (comment); was a bit confused. Isn't quic supposed to be more performant than TCP? And what are the effects of quic on QoS?
Sorry if the questions sound dumb, still learning

@github-actions github-actions bot added the stale [bot only] Added to stale content; results in auto-close after a week. label Apr 22, 2024
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Apr 29, 2024