Consider increasing fees for writable accounts #21883

Closed
aeyakovenko opened this issue Dec 14, 2021 · 71 comments
Labels
stale [bot only] Added to stale content; results in auto-close after a week.

Comments

@aeyakovenko
Member

aeyakovenko commented Dec 14, 2021

Problem

Bots spam the network, often with failed transactions. Well-behaving senders are able to avoid failing txs if they are signing a transaction that is simulated against recent state. Malicious senders simply flood the chain with pre-formed transactions as fast as possible.

The flood of transactions can occur when there is a known scheduled opportunity, like a Raydium IDO or an NFT mint, or opportunistically during high volatility in markets. Programs can't just charge a large flat fee for small trades or all transactions, because an attacker can write a custom program to check if there is liquidity and only then execute, but send the TX flood anyway. The flood will take write locks on state, and all other users will be starved. The program can't defend itself against being simulated.

Proposed Solution

This proposal is in addition to the market driven additional_fee signed by users to prioritize access to state. #23211

  • get_active_account_write_lock_fee system call: return the current activated write lock fee for this account

  • update_account_write_lock_fee system call: A system call that allows a program to set a write lock fee for accounts that it owns. The fee is applied at the start of TX processing to the original write-locked accounts such that the write-locked account data remains immutable, but the additional lamports are committed even when the TX fails. Reducing the fee can be activated immediately; increasing the fee has to wait for at least 240 blockhashes, or the current system-wide blockhash timeout. (A rough sketch of both calls follows below.)
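
To make the two calls concrete, here is a minimal Rust sketch of what their signatures might look like. Both functions are hypothetical: neither syscall exists in the SDK today, and the exact shapes are assumptions for illustration only.

```rust
// Hypothetical signatures for the two proposed system calls; these are
// illustrative stubs only, not existing SDK functions.
use solana_program::pubkey::Pubkey;

/// Returns the currently activated write-lock fee (in lamports) for `account`.
fn get_active_account_write_lock_fee(account: &Pubkey) -> u64 {
    unimplemented!("proposed syscall, not implemented")
}

/// Called by the owning program to schedule a new write-lock fee for `account`.
/// Decreases would take effect immediately; increases would wait out the
/// blockhash timeout (~240 blockhashes) so signers know the fee they agreed to.
fn update_account_write_lock_fee(account: &Pubkey, fee_lamports: u64) {
    unimplemented!("proposed syscall, not implemented")
}
```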

It is up to the program to distribute the lamports back to successful callers at the end of the call. The program would need to guard against being called twice within the same transaction, and only refund on the first call. A re-entry-safe helper function should be provided to the program so it can refund the current fee to the transaction fee payer; a sketch of this bookkeeping follows the list below.

  1. total_failed_lamports += max(0, current_lamports - (rent_exempt_lamports + write_lock_fee))
  2. refund = current_lamports - total_failed_lamports
  • total_failed_lamports would need to be tracked by the program's state.
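
A minimal Rust sketch of the refund bookkeeping above, mirroring the two formulas. `ProgramState`, the parameter names, and the helper itself are hypothetical, and a real implementation would also need the re-entry guard mentioned earlier.

```rust
// Sketch of the refund bookkeeping above; all names are hypothetical.
struct ProgramState {
    // Fees accumulated on the account by failed transactions; tracked in
    // program state so a successful caller is only refunded its own fee.
    total_failed_lamports: u64,
}

/// Returns the refund owed to the current successful caller, following the
/// two formulas above.
fn refund_for_caller(
    state: &mut ProgramState,
    current_lamports: u64,     // lamports currently on the write-locked account
    rent_exempt_lamports: u64, // rent-exempt minimum for the account
    write_lock_fee: u64,       // write-lock fee charged for this transaction
) -> u64 {
    // 1. total_failed_lamports += max(0, current_lamports - (rent_exempt_lamports + write_lock_fee))
    state.total_failed_lamports +=
        current_lamports.saturating_sub(rent_exempt_lamports + write_lock_fee);

    // 2. refund = current_lamports - total_failed_lamports
    current_lamports.saturating_sub(state.total_failed_lamports)
}
```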

A program can implement an ETH EIP-1559-like mechanism by periodically setting the write_lock_fee to the MIN additional_fee paid by callers, where MIN is the minimum price a caller had to pay to be included in the block and take this write lock.

The implementation needs an activation delay longer than the 240-slot timeout for a blockhash, so users know the expected fee they are signing for. Transactions using durable nonces may need to specify the maximum fee they are willing to pay, including write lock fees.

Why it’s better than only additional_fee

The huge advantage to this is that it’s possible for a program to build its own congestion control, capture fees, punish misbehaving senders, and refund well-behaving senders.

Example NFT drop
  1. Candy machine is deployed
  2. Artist sets the write_lock_fee for the candy machine mint to 0.1 sol
  3. candy machine is configured to refund the caller on a successful mint
  4. any bots that try to spam the network prior to the mint fail and get charged 0.1 sol

With fewer bots bidding, there is less load on the leaders and UX improves.

Example defi market
  1. high volatility causes prices to move and creates tons of arbs
  2. Market continuously sets the write_lock_fee to the MIN additional_fee over the last 10 slots
  3. Every epoch, market does a buy/burn of its marketplace token with the collected fees

The market created the demand for state, and the market is now capturing value that would otherwise go only to the L1. A rough sketch of the fee-update crank in step 2 follows.
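
A rough Rust sketch of the defi-market example's fee-update crank, assuming the proposed update_account_write_lock_fee syscall exists (it does not today). MarketState, observe_additional_fee, and the window handling are made up for illustration.

```rust
use solana_program::pubkey::Pubkey;

// Hypothetical proposed syscall (see the stub sketched near the top of the issue).
fn update_account_write_lock_fee(_account: &Pubkey, _fee_lamports: u64) { unimplemented!() }

const WINDOW_SLOTS: u64 = 10;

struct MarketState {
    last_update_slot: u64,
    min_additional_fee_in_window: u64,
}

/// Record the additional_fee of each landed tx that touched the market.
fn observe_additional_fee(state: &mut MarketState, fee: u64) {
    state.min_additional_fee_in_window = state.min_additional_fee_in_window.min(fee);
}

/// Every WINDOW_SLOTS, set the write-lock fee to the minimum additional_fee a
/// caller had to pay to land during the window, then reset the window.
fn crank_write_lock_fee(state: &mut MarketState, current_slot: u64, market: &Pubkey) {
    if current_slot.saturating_sub(state.last_update_slot) >= WINDOW_SLOTS
        && state.min_additional_fee_in_window != u64::MAX
    {
        update_account_write_lock_fee(market, state.min_additional_fee_in_window);
        state.last_update_slot = current_slot;
        state.min_additional_fee_in_window = u64::MAX;
    }
}
```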

tag @taozhu-chicago @sakridge @jackcmay

@aeyakovenko aeyakovenko changed the title from "Consider exponentially increasing fees for unused write accounts" to "Consider exponentially increasing fees for unused writable data accounts" Dec 14, 2021
@jstarry
Member

jstarry commented Dec 14, 2021

Why does the "unused" part matter? Why not charge increasing fees for every write?

@aeyakovenko
Member Author

aeyakovenko commented Dec 14, 2021

@jstarry why didn't the sender mark it as RO then? The example above illustrates how programs can use this to reduce traffic from aggressive senders that do not observe the state often enough before they sign.

@jstarry
Member

jstarry commented Dec 14, 2021

I don't think that focusing on unused writable data accounts adequately covers the issues with write-lock contention. It just needs to be expensive in general to write-lock an account that many bots / protocols / users want to use.

@aeyakovenko
Member Author

aeyakovenko commented Dec 14, 2021

@jstarry it does. The issue with "invalid" transactions is that they are created against a stale state. It's a much better environment if senders who are sending against stale state can't increase fees on senders who are sending against valid state.

By “much better” I mean that it allows the developer/program to capture more of the fees. If the network fees are increased indiscriminately, that takes away value capture from the layer above.

@sleueth

sleueth commented Dec 14, 2021

@aeyakovenko - Good post, and interesting solution. I think it does solve the current problem. But I have a concern: I can imagine lots of valid transactions where an account will be marked as writable but whose data will not change. Any transaction that contains some sort of optional routing will have this feature. The base case:

I invoke a Program that sends funds to account[1] XOR account[2] based on the result of some simple function. In this case, you need to mark both accounts as writable in the tx, but the data will change on only one.

Perhaps the N parameter above can be chosen in a way that blocks high-speed bots and allows human users with this valid use case through. I think you're correct insofar as this is an elegant solution to our current bot problem.

But I'm concerned that this limits the application architecture space for any high-speed application. Once N is chosen, all high-speed applications that run above some baseline k*N frequency are somewhat limited in architecture. Any address marked as writable should always be written to. This translates to a loss in idempotency and idempotency is a crucial architectural attribute available to application developers.

Maybe this is all fine and worth it. And/or maybe this is a decent patch until a more robust solution comes along. Still, worth understanding the tradeoff.

@aeyakovenko
Member Author

aeyakovenko commented Dec 14, 2021

@sleueth @aeyakovenko - Good post, and interesting solution. I think it does solve the current problem. But I have a concern: I can imagine lots of valid transactions where an account will be marked as writable but whose data will not change. Any transaction that contains some sort of optional routing will have this feature.

Why can't the router simulate ahead of time and only lock the right accounts? If it's using stale data, or wants a fast path, it needs to back off or pay a higher fee. Especially if it's locking a ton of accounts. Think about it: all the accounts it locks but doesn't use prevent another valid user from using that program.

So under normal conditions, the compute limit would allow some of these to go through without causing a spike in fees, but if there is a flood it will force the router to slow down, or simulate more accurately. I am not sure the cap should be set at 25% or 50%, but at some level such that we can segregate optimistic and failed traffic from accurate traffic.

This translates to a loss in idempotency and idempotency is a crucial architectural attribute available to application developers.

100%, but it's not "nonces". Bumping the fee just more accurately shares resources on the network.

@sleueth

sleueth commented Dec 14, 2021

@aeyakovenko I see, yes, you can choose to make the accounts writable before submitting the tx by running a simulation first. Reasonable solution. This still limits throughput insofar as you're dealing with maybe stale information by the time your tx executes. But I do agree that it's probably required... any finite resource (here, writable accounts) needs some fee to discourage effectively squatting on accounts.

So never mind me, carry on! Thank you as always for building!

@jstarry
Member

jstarry commented Dec 14, 2021

What does "signing state" mean?

@aeyakovenko
Member Author

What does "signing state" mean?

changed to "if the senders are signing a transaction that is simulated against recent state."

@buffalu
Contributor

buffalu commented Dec 15, 2021

my rambling comment...

My initial thought is to KISS. Hotter accounts + write locks -> higher fees. It's almost like account based gas fees.

The "weight" of a block is essentially proportionate to the number of entries + shreds it produces along with the compute used, so it seems like there should be a higher charge for anything that causes that. Perhaps better packing transactions could help. On that note, perhaps trying to pack TXs looking outside of the PACKETS_PER_BATCH would help get less-hot accounts process faster and would kinda cause account-based queues inside the banking stage? need to think more if ideal property or not.

The thing I need to think about more wrt simulation in the aggregator case is MEV. If I see an aggregator TX but know it's only going to hit exchange X, can I sandwich them more easily? If they're hitting multiple exchanges, that might become a little harder, but I suppose the sandwicher can simulate and know the outcome either way.

We still need dynamic global fees on write locks. But, the huge advantage to this, is that it’s possible for a program to build its own congestion control and capture fees.
A market can track the recent volume of trades and increase its own fees on small txs. Bots can’t avoid these fees via simulation, and they can’t spam optimistically looking for a cheap trade and fail.

can you explain how this would work? a market validators set or enforced app/protocol side? how would this be enforced? i guess dapps want a good experience for their users, validators want to make more money. need to balance that somehow.

@jstarry it does. The issue with "invalid" transactions is that they are created against a stale state. Eth forces every TX to update a nonce, which throttles bots. It's a much better environment if senders who are sending against stale state can't increase fees on senders who are sending against valid state.

By “much better” I mean that it allows the developer/program to capture more of the fees. If the network fees are increased indiscriminately, that takes away value capture from the layer above.

for "binary" bots where outcome is 0 or 1, it seems like they could just mutate the state a tiny bit then you end up back at square 1 where you have the same spam + run against compute limit and need to increase fees.

Eth forces every TX to update a nonce, which throttles bots.

this is a non-problem for any bots; it's how they prevent double spends (like the blockhash). similar to the nonce, any skilled bot writer could keep a deque of blockhashes and do multiple signs + spam.

i'd also want to see access patterns for these things - some data we'll hopefully have soon. these periods of degraded performance last several hours, but there are also probably quick bursts during volatility.

tl;dr: kiss. access patterns that degrade parallelism of system -> charge more.

@buffalu
Contributor

buffalu commented Dec 15, 2021

also, we've talked a bit about not even executing txs; just do sigverify + pack blocks.

in that case, it becomes a replay stage/tvu problem. seems like you come back to execution speed/parallelism in that case too.

@ryoqun
Member

ryoqun commented Dec 15, 2021

here's my two cents:

some thoughts for the above proposal (= exponential increase fee for unused writable data accounts)

  • there's risk for normal users: a victim's stalled transactions (now triggering slippage tolerance yet within recent_blockhash expiration) could be exploited by selfish validators right after 2^n-ing the fee.
  • not a fan of pushing these congestion problems to a higher level (i.e. dapp devs)
  • the inevitable N-slot-weighted triggering mechanism means some feedback latency in responding to bursts

instead how about introducing the merciless transaction execution mode, which takes advantage of the parallelizable nature of error transactions (thanks to the state rollback/aborted execution from the very definition of a tx).

so, when spam activity is ongoing, leaders start to proactively try to execute transactions against the end-of-previous-block state, without write locks, at maximum concurrency, and filter out any failed transactions to be packed into the new gutter block entry later, which is multiplexed into turbine alongside the normal entry shreds.

for successful txes from the proactive execution, the leader re-executes normally with write locks against the tip of the current block state and packs them into normal entries. until the tick is full, the leader packs as many of these failed transactions as possible (only debiting the fee payer) into the gutter block entry while packing normal transactions into the normal block entry.

pros

  • less complex than the above idea? also general-purpose, and the fee isn't fiddled with dynamically.
  • gpu friendly?
  • pretty quick to react to any spikes/peaks.
  • bots experience a hefty tx fee burden because the cluster can eat all of 'em without serialized execution.

cons

  • normal transaction needs to be executed twice
  • bankless leader unfriendly
  • dapps should be written to fail first, rather than no-op success tx (this prevents some possibly valuable accounting?)
    risk: normal transaction is susceptible to false positive, if it fails at the parent block's state (otherwise it executes successfully at the tip of the state)
    • (i think this is tolerable?)

@aeyakovenko
Member Author

aeyakovenko commented Dec 15, 2021

@buffalu

Kinda rambling

what's confusing? i write like a C engineer, structs at the top :)

can you explain how this would work? a market validators set or enforced app/protocol side? how would this be enforced? i guess dapps want a good experience for their users, validators want to make more money. need to balance that somehow.

Fees need to double when the block is full; e.g., if total gas is > average load for N blocks, fees double for the next 2*N blocks.

for "binary" bots where outcome is 0 or 1, it seems like they could just mutate the state a tiny bit then you end up back at square 1 where you have the same spam + run against compute limit and need to increase fees.

so a program that takes no action will get FIFO ordering on whatever lands. That's totally up to the dev. That's the point! The only way to avoid the platform fee is to hand control to the program, so a program like Serum can do its own congestion control and capture value from bots.

tl;dr: kiss. access patterns that degrade parallelism of system -> charge more.

yea, write locks that are not necessary degrade system perf. That's what this does.

this is a non-problem for any bots; it's how they prevent double spends (like the blockhash). similar to the nonce, any skilled bot writer could keep a deque of blockhashes and do multiple signs + spam.

yea, eth nonce thing is a bit of a non sequitur.

@aeyakovenko
Member Author

aeyakovenko commented Dec 15, 2021

there's risk for normal users: a victim's stalled transactions (now triggering slippage tolerance yet within recent_blockhash expiration) could be exploited by selfish validators right after 2^n-ing the fee.

They know the max they would pay; the blockhash signs off on it. We would need to dump all the old blockhashes for that write account and have users re-sign, which is why 8/16 seems like a reasonable retry time.

@ryoqun, your proposal is rather complex from the runtime's perspective. speculative execution that doesn't mutate state is hard. This is just messing with the fee governor.

The main innovation here is that applications can now do their own congestion control. Serum can charge bot traffic in SRM; more value captured into Serum.

@buffalu
Contributor

buffalu commented Dec 15, 2021

The main innovation here is that applications can now do their own congestion control. Serum can charge bot traffic in SRM; more value captured into Serum.

ah okay, so you're suggesting that on-chain apps will change fees and those fees go to platform instead of validator? very interesting, need to think about this more. would want to make sure incentives are aligned with validator operators for them to process less txs that don't allow them to earn as much from tx fees

do you have a rough formula for where you see tx fees being derived from? something like:
tx_fee = {number of sigs} + {protocol write lock increasing fee} + {app congestion control}?

if a validator is running MEV software, does that change anything?

@askibin
Contributor

askibin commented Dec 16, 2021

Important news and large price swings often come with high market activity: not just bots, but legitimate broad activity. If some markets started increasing fees during such periods, they would risk losing users to competitors who don't do it. Not sure if anyone values extra fees more than market share. This might apply to the blockchain as a whole.

Maybe it makes sense to keep track of the most actively write-locked accounts (both succeeded and failed), let's say the top 100 (I bet they represent 98% of the activity), and have an extra fee that is a linear function of account usage frequency. The extra fee is applied to every such account included in the transaction, but only if the transaction has failed.

This way, there are no adverse fee effects and no need for an additional transaction simulation step for well-behaving clients.

It wouldn't help against an intentional attack, though - someone could just send 1 lamport (or token) to the most active Raydium pools a hundred times per second, cheaply, and the transactions succeed. As a second layer of defense, per-account fees could be applied to all transactions if account usage frequency stays too high for too long. It would also force protocol developers to break down their stuff when it becomes too popular.

@aeyakovenko
Member Author

aeyakovenko commented Dec 16, 2021

Important news and large price swings often come with high market activity: not just bots, but legitimate broad activity. If some markets started increasing fees during such periods, they would risk losing users to competitors who don't do it. Not sure if anyone values extra fees more than market share. This might apply to the blockchain as a whole.

if it hits capacity limits, then the chain has to increase fees for that write lock aka dapp/market. so users will pay sol instead of what the program wants. giving control to the program means that they can rebate users, or holders, or do whatever they want.

Maybe it makes sense to keep track of the most actively write-locked accounts (both succeeded and failed), let's say the top 100 (I bet they represent 98% of the activity), and have an extra fee that is a linear function of account usage frequency. The extra fee is applied to every such account included in the transaction, but only if the transaction has failed.

That's effectively what this does. linear doesn't work to force bots to back off though. If the program wants to force a linear fee on usage it can do so because all the transactions that will land will succeed. All a well behaving bot/user has to do is correctly simulate against recent state.

@jstarry
Member

jstarry commented Dec 16, 2021

Important news and large price swings often come with high market activity: not just bots, but legitimate broad activity. If some markets started increasing fees during such periods, they would risk losing users to competitors who don't do it. Not sure if anyone values extra fees more than market share. This might apply to the blockchain as a whole.

if it hits capacity limits, then the chain has to increase fees for that write lock aka dapp/market. so users will pay sol instead of what the program wants. giving control to the program means that they can rebate users, or holders, or do whatever they want.

There are a few issues with the current proposal (there's some overlap here with @ryoqun's comments above too):

  1. It's difficult for programs to detect and monitor congestion themselves. Programs only have control over this matter if a transaction which invokes the program is actually processed. We can only process conflicting transactions sequentially so we need a way to inform the program of how much back pressure there actually is so they know the difference between perfectly tuned processing and heavily congested processing. Programs need to adjust fees quickly enough to prevent DOS attacks on programs from lasting too long but they are hamstrung by only getting info about congestion from what they are actually processing
  2. It doesn't take read-locks into account. Legitimate transactions which are fairly incentivized for a given program can still be read-starved. For example, consider two popular programs which always write lock their own state but read lock their complement's state. These transactions cannot be processed in parallel and so this proposal doesn't give a good generic solution to this problem
Alternate Proposal

What I suggest is that we do the inverse of what you're proposing. Instead of increasing fees for unused writable accounts, always increase fees for used accounts but send a portion of the increased fees to the program so that they can do the rebate. This still gives programs a lot of control over how to incentivize proper behavior but at the same time gives the runtime a general approach for limiting the amount of contentious transactions.

To combat read-lock starvation and other scheduling issues, the runtime could track higher level heuristics about the transactions it has processed and the relationships between processed transactions and even include a proof about transactions it couldn't schedule.

@t-nelson
Contributor

As I understand this proposal, it's not intended to prevent contention on a given program/program-owned-resource, but rather to prevent this contention from spilling out as a negative externality on the rest of the cluster. Detractors here seem to be trying to solve the other problem?

@aeyakovenko
Member Author

aeyakovenko commented Dec 16, 2021

It's difficult for programs to detect and monitor congestion themselves. Programs only have control over this matter if a transaction which invokes the program is actually processed.

this mechanism forces the bots whose invocations of the program fail to back off, because all failures become exponentially more expensive. If it's possible for us to charge for successful write locks, it means program logic is being run, so I am not sure how the suggested approach helps.

We can only process conflicting transactions sequentially so we need a way to inform the program of how much back pressure there actually is so they know the difference between perfectly tuned processing and heavily

A TX can do a system call on the cost model for the account, or it can do its own metering: trades per slot, slots since liquidation/oracle update, etc...

. Programs need to adjust fees quickly enough to prevent DOS attacks on programs from lasting too long but they are hamstrung by only getting info about congestion from what they are actually processing

Yea, they can do this logic on their own. if an oracle update is expected every 8 slots, they can 10x the fees immediately.

It doesn't take read-locks into account. Legitimate transactions which are fairly incentivized for a given program can still be read-starved. For example, consider two popular programs which always write lock their own state but read lock their complement's state. These transactions cannot be processed sequentially and so this proposal doesn't give a good generic solution to this problem

We have a ton more flexibility in read lock contention. There is no promise that a write isn't inserted before or after any sequence of reads.

What I suggest is that we do the inverse of what you're proposing. Instead of increasing fees for unused writable accounts, always increase fees for used accounts but send a portion of the increased fees to the program so that they can do the rebate

how do we know how much to increase fees by, how fast to do it and how fast to back off? what if it's not enough for oracles, but too much for traders? Also, the program can control what token the fees are implemented in, and may not want to distribute sol to its users; it may want to distribute the market token, or the project's token instead.

With the "failed policy" I feel like we can be fairly aggressive in how fast they are forced to back off, and setting the cap to 25% of the total gives the program a guarantee that some non failed TXs land to talk to it.

@ryoqun
Member

ryoqun commented Dec 17, 2021

@ryoqun, your proposal is rather complex from the runtimes perspective. speculative execution that doesn't mutate state is hard.

@aeyakovenko we can just use the same mechanism used by the simulateTransaction rpc method.

put differently, my proposal is exceptionally legalizing including many (probably bot-initiated) erroneous transactions more cheaply, if the transactions fail with the same (frozen) parent bank state. so this should avoid write_lock contention to execute them at once and state mutation is irrelevant (only fee debiting must be done, specially, also exponentially if solana want to punish hard). this is only triggered when under cluster-wide congestion.

This is just messing with the fee governor.

yeah, but shifting this onto dapps?

The main innovation here is that applications can now do their own congestion control. Serum can charge bot traffic in SRM; more value captured into Serum.

needless to say, this is cool if done correctly.

i'm a bit skeptical about whether this can be done right on-chain or not. i'm too naive. i just want to retain solana's simple selling point that it is infinitely fixed-cost/fast/cheap from the dapps' perspective.

@ryoqun
Member

ryoqun commented Dec 17, 2021

As I understand this proposal, it's not intended to prevent contention on a given program/program-owned-resource, but rather to prevent this contention from spilling out as a negative externality on the rest of the cluster. Detractors here seem to be trying to solve the other problem?

haha, good call. in that respect, how is each competing dapp incentivized to implement its own congestion mechanism properly (without disadvantaging others)? it's like deciding whether any given service (here, congestion control) should be provided by the private or public sector in politics. xD

@ryoqun
Member

ryoqun commented Dec 17, 2021

also, if the targeted program implements any such system, how do we prevent the bot transaction's calling program from peeking at the state and avoiding the CPI if the cost is too high? then, botters can rest assured and hit the cluster as much as possible?

@jstarry
Member

jstarry commented Dec 17, 2021

also, if the targeted program implements any such system, how do we prevent the bot transaction's calling program from peeking at the state and avoiding the CPI if the cost is too high? then, botters can rest assured and hit the cluster as much as possible?

I think this is actually the main issue that @aeyakovenko brought up in the issue description. If bots want to conditionally CPI into a program, they still need to write lock the state. If they don't end up doing the CPI because the conditions aren't right, they will still be penalized for not writing to any state accounts owned by the target protocol.

As I understand this proposal, it's not intended to prevent contention on a given program/program-owned-resource, but rather to prevent this contention from spilling out as a negative externality on the rest of the cluster. Detractors here seem to be trying to solve the other problem?

Thanks for this @t-nelson, sorry if I've detracted from the specific problem this issue is trying to solve. Can you elaborate on what you meant by the issue of contention "spilling out as a negative externality on the rest of the cluster"? I don't really understand.

In my own words, I would describe the current issue to be a lack of penalty to write locking state that you don't actually mutate. As long as there is no way for programs to balance usage / availability of their state either directly or indirectly, they could be subject to congestion.

this mechanism forces the bots whose invocations of the program fail to back off, because all failures become exponentially more expensive. If it's possible for us to charge for successful write locks, it means program logic is being run, so I am not sure how the suggested approach helps.

The current proposal incentivizes bots to change their tx behavior to 1) have their transaction succeed and 2) always make some change to the writable accounts they locked to avoid the unused account penalty. Once they figure out a cost effective way to do that, they can carry on spamming the program without any fee increases. Of course, the program can observe the contentious behavior and modify the protocol to prevent any cheap mutations to state.

We have a ton more flexibility in read lock contention. There is no promise that a write isn't inserted before or after any sequence of reads.

If this is the case, then I'm onboard. But doesn't this imply that we can interleave reads with writes? If the max compute units are reached for a single writable account doesn't that imply that access to that state in a block is exhausted? Is the solution allowing transactions to read from stale state to avoid conflicts with writes? (I'm happy to draft up a separate proposal for this, if so)

TX can do a system call on the cost model for the Account, or it can do its own metering. Trades per slot, Slots since liquidation/oracle update, etc...

Cool, a system call like that would probably do the trick.

how do we know how much to increase fees by, how fast to do it and how fast to back off? what if it's not enough for oracles, but too much for traders?

Good points here, maybe we should give even more control to the programs here so that they can set fees directly for the state they manage?

Also, the program can control what token the fees are implemented in, and may not want to distribute sol to its users; it may want to distribute the market token, or the project's token instead.

With the "failed policy" I feel like we can be fairly aggressive in how fast they are forced to back off, and setting the cap to 25% of the total gives the program a guarantee that some non failed TXs land to talk to it.

The failed policy is nice and simple but maybe not granular enough, and too complicated and burdensome for protocols to tune correctly. I get that you want protocols to be able to charge fees in their preferred tokens but what if we didn't do that and just let programs directly set the read or write cost of each account in SOL? I think this would help protocols separate business logic from congestion control because it would be more easily tunable and automatically enforced by the runtime. If those lock fees go back to the program, the program could then rebate fees in whichever currency they want to the user.

@aeyakovenko
Member Author

aeyakovenko commented Dec 20, 2021

@jstarry @taozhu-chicago given this model, I think what makes the most sense is prioritization of write locks > read locks when the cost model has an option. But we need to think about it. I think the optimal would be to put all the reads for a contentious write lock at the start of the block. So the rest of the block the writes go without interruption.

Write A = 100 CUs
Write B read A = B should inherit 100 CUs

@aeyakovenko
Member Author

aeyakovenko commented Dec 21, 2021

The failed policy is nice and simple but maybe not granular enough, and too complicated and burdensome for protocols to tune correctly

@jstarry why? programs now can assume that all transactions succeed, and can do whatever. Control flow is purely in the programs hands.

I get that you want protocols to be able to charge fees in their preferred tokens but what if we didn't do that and just let programs directly set the read or write cost of each account in SOL?

Why is that better though? How would a program know how much to set read costs and how would it be able to do so if attacker is spamming the account with failed writes? There is a ton of simplicity gained for the program developer if the assumption they operate under is that all txs that call the program succeed.

@mschneider
Contributor

Increasing fees on the dex level poses severe challenges for traders. Can't arb well if you can't estimate fees. Can be circumvented by using an on-chain program (with the exact behaviour we want to prevent), but now we just made it 100x harder to start trading arb strategies on the dex.

@aeyakovenko
Member Author

@jstarry i really like the idea of depositing the sol into the writable account that is being spammed

@aeyakovenko aeyakovenko changed the title from "Consider exponentially increasing fees for unused writable data accounts" to "Consider increasing fees for writable accounts" May 6, 2022
@aeyakovenko
Member Author

@dafyddd

wdyt of

  • Market continuously sets the write_lock_fee to the average additional_fee over the last 100 slots (40 seconds)

this will automatically increase the fees when the account is congested.

@nikhayes

nikhayes commented May 13, 2022

I wrote this in a thread elsewhere and I'm copying over here: Right now you can just access all the most popular writable accounts in one tx and essentially make the network single threaded. That should be very expensive to do. Solana should charge tx fees based on how often an account is read from or written to. There should be a multiplier for accessing a popular account as writable over readonly. The price for each account can be adjusted dynamically with some pricing formula (maybe some EMA of usage per slot? pricer should be exponential and not linear i think). Then compute used in the tx can be multiplied by the sum of the base charges for each account to give a full charge for the tx.

e.g. I send a tx using accounts A, B as read only and C as writable. The pricer says A = 1, B = 4 and C = 2, and writable has a 5x multiplier. And my tx costs 10k compute. So total cost = (1 + 4 + 10) * 10k = 150k lamports. Then you can update the pricer for each account.
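
A toy Rust version of the pricer described above; the per-account base prices, the 5x writable multiplier, and the lamports-per-CU scaling are the made-up numbers from the example, not an existing Solana fee model.

```rust
// Stand-in for the per-account pricer idea; all numbers are illustrative.
const WRITABLE_MULTIPLIER: u64 = 5;

/// `accounts` is a list of (base_price, is_writable); total fee =
/// (sum of adjusted per-account prices) * compute units used.
fn tx_fee_lamports(accounts: &[(u64, bool)], compute_units: u64) -> u64 {
    let price_sum: u64 = accounts
        .iter()
        .map(|&(base, writable)| if writable { base * WRITABLE_MULTIPLIER } else { base })
        .sum();
    price_sum * compute_units
}

fn main() {
    // A = 1 (read-only), B = 4 (read-only), C = 2 (writable, 5x) with 10k CU:
    // (1 + 4 + 10) * 10_000 = 150_000 lamports, matching the example above.
    assert_eq!(tx_fee_lamports(&[(1, false), (4, false), (2, true)], 10_000), 150_000);
}
```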

I don't know how feasible it is to implement my idea but I like it for a few reasons:

  1. It is still somewhat deterministic on fee pricing. Nobody likes choosing how much gas they want to pay. The mental complexity there really sucks
  2. It comes from the principle that you should be charged more the larger your externality to the network. The more popular a resource and the longer you hold it, the more you're charged. The intuition is that someone taking all Serum books, raydium, orca, mango and drift accounts into one tx and doing complex calculations for 1m compute is doing a lot more damage than someone doing 1m compute on one writable account. This also protects against the problem @mschneider was talking about where 20 liquidators all go for a liquidation and the first guy gets it. The other 19 liqors won't pay much because they'll exit before using much compute.
  3. It incentivizes devs to develop more parallelizable and less spammy mechanisms. Right now, devs are handing out economic value to people who spam a lot and are first. This leads to a lot of congestion that affects other network participants but is still profitable to the spammer. I have a theory that what you really need is distribution of writable accounts to be varied enough that entries created by leader are large.

To add to this discussion, I think what @ruuda mentions here (#24827 (comment)) about the dynamic of having "stale" transactions charged makes a lot of sense. At the beginning of an NFT drop, deterministic congestion fees will be low and start to ramp up, so presumably most transaction senders will try to reference blockhashes that are as old as possible (it doesn't matter if they just time out soon, they can send new ones), and so it might take a while for fees to ramp up enough to lower demand. In the meantime, you'll have bots still try and flood the network with transactions. If you have fee charges on "stale" transactions though, using this strategy of referencing older blockhashes would mean that your transaction has higher risk of being charged a "stale" fee, and so that would put pressure on transaction senders to find a sweet spot between longer transaction life and not getting charged a higher congestion fee. You can base the stale fee on the same deterministic compute charge that the transaction would have used, but just charge the fee payer (lower compute than transfers and highly parallelizable). There are some other ideas in the thread I liked as to how these fees could be distributed among validators. Ultimately, the higher guarantee of getting charged for transactions sent (either normal or "stale") could help disincentivize spammy behavior quite a bit I think, and would also provide validators another revenue source. They can re-share these stale fees captured from spamming with their own stakers (who might be annoyed their transactions sometimes get charged for going stale).

As a side note, priority fees seem like a lot of guesswork and seem like they'll still be kinda spammy with people trying to guess proper gas?

*Just added this which describes this possible system #25211

@ruuda
Contributor

ruuda commented May 18, 2022

A note about priorities: the discussion so far has focused on fine-grained fees that can differ per program/account, so congestion on Candy Machine would not make regular transfers more expensive. This is valuable and useful, but not the most important problem to solve right now.

Right now, the network degrades or even goes down ~daily, because bots have no way to get their transaction prioritized, except to spam. As one of the larger validators, for the past month we have been struggling with null-routing issues on the one hand and overly aggressive DoS protection on the other hand. Technical work like QUIC support is valuable, but only moves the problem. Volume will just increase until it reaches the new limit. Variable transaction fees are the only realistic way to deal with variable demand. I think we need to focus on getting variable fees out as soon as possible.

If that means that regular transfers become more expensive when there is congestion for Candy Machine, that’s a shame, but it is something that can be improved upon later, by making the fees more fine-grained as discussed above. I would argue that being able to do transfers at all, is better than not being able to transact because the network is degraded to the point of being unusable. I would introduce global variable fees first, and iterate on it later.

I wrote some thoughts in #24827 for adaptive fees (set by the network, not a priority fee that users can choose), which I think is a less invasive change compared to adding a priority fee or the more fine-grained fees discussed in this issue.

@nikhayes

I think they're pretty close to having priority fees done but it would be nice to hear your input here (#25178 (comment)) @ruuda. There was a comment further below that talked about having queues/batches/threads dedicated to transactions of certain compute ranges, which I think you could apply a more localized EIP1559 mechanism to (congestion pricing specific to transactions falling within certain compute ranges).

@thesoftwarejedi

Simplest way to implement refunds for all successful callers would be to refund current lamports - rent exempt lamports for account.

Is it really this simple? current lamports on the account would include fees collected from failing txs, so this proposed simplest way would actually give the first successful caller after a failure all collected fees from preceding failures.

A re-entry-safe helper function should be provided to the program so it can refund the current fee to the transaction fee payer.

This becomes crucial, otherwise the program would have to use the instructions sysvar on every ix to properly detect reentry.

@wkshare

wkshare commented May 24, 2022

Consider using account balance to prioritize?

@t-nelson
Contributor

this has been totally rewritten since the time i made any comments above. they no longer apply and i rescind my support for the current proposal

@nikhayes

this has been totally rewritten since the time i made any comments above. they no longer apply and i rescind my support for the current proposal

What are your main objections btw?

@mschneider
Contributor

As recent changes to tx submission seem to have resolved a lot of the bottlenecks in feeding transactions, this issue now becomes more relevant. I recently looked at the cost_tracker_stats.costlies_account distribution and noticed that all accounts related to Mango's most active perp markets (bids/asks/event queue/market) now regularly hit the 12M cost limit.

A bunch of users started to use the increased compute limit to execute custom programs before placing orders, with up to 700-800k CU per TX. We need an efficient way to penalize these actors as they are reducing QoS for all traders due to their lack of incentive to optimize their own code. Their transactions rarely fail, so solutions discussed before that only apply to failed transactions would not be enough.

An ideal solution would allow us to add a larger fee to traders that:
a) trade on highly congested markets (identified by a set of 4 accounts in our case)
b) use over-proportional amounts of compute (2x compute could cause 10x fees)
c) have no strong bias for failing or successful transactions (failing <20%)

@nikhayes

nikhayes commented Aug 22, 2022

Adding a base fee per CU across all transactions would help too and is pretty simple -- there are probably periods where they don't need to use priority, so they are only paying for a signature or two on those massive transactions. Adding a base fee per CU would help incentivize more efficiency (and pay validators :) ). If bots/rpcs co-locate with a leader validator they could also get better latency and get in cheap transactions before more highly prioritized transactions land, I'd imagine? If so, they can starve out some compute limits before priority kicks in (I guess there's transaction forwarding though). I haven't seen anyone post any in-depth analysis of how effective priority is.

@tao-stones
Contributor

An ideal solution would allow us to add a larger fee to traders that: a) trade on highly congested markets (identified by a set of 4 accounts in our case) b) use over-proportional amounts of compute (2x compute could cause 10x fees) c) have no strong bias for failing or successful transactions (failing <20%)

Furnishing FeeStructure can help, especially exponential compute_fee_bins and lamports_per_write_lock. We need help on the actual numbers tho.
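
For illustration only, here is a rough Rust stand-in for what "exponential compute_fee_bins" could mean: a set of CU bins whose fee doubles bin over bin. The struct and constructor below are assumptions, not the runtime's actual FeeStructure API, and the numbers are placeholders.

```rust
// Rough illustration of exponential compute-fee bins; names and numbers are
// stand-ins, not the runtime's actual FeeStructure fields.
struct ComputeFeeBin {
    max_compute_units: u64, // txs using up to this many CUs land in this bin
    fee_lamports: u64,      // flat fee for the bin
}

/// Build `num_bins` bins of equal CU width whose fee doubles bin over bin.
fn exponential_bins(base_fee_lamports: u64, bin_width_cus: u64, num_bins: u32) -> Vec<ComputeFeeBin> {
    (1..=num_bins)
        .map(|i| ComputeFeeBin {
            max_compute_units: bin_width_cus * i as u64,
            fee_lamports: base_fee_lamports << (i - 1), // base, 2x, 4x, ...
        })
        .collect()
}
```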

@mschneider
Contributor

Spent a few hours with @jstarry mapping out the problem & solution space for this kind of fee today. We double checked behavior of both Market Makers as well as Liquidators on Mango v3:

  • Only 1 MM uses fee priority, strong indication that traders don't experience QoS issues (yet)
  • 6 of the top 10 liquidators use fee priority, see below statistic for impact on 30d liquidation volume

[Screenshot: 30d liquidation volume statistics, split by priority fee usage]

We concluded that building more tools to improve priority fee adoption and usage measurement would help to decide in which situations the current priority fee model does not sufficiently serve users and needs a more granular fee model.

Possible next steps could be:
  • Integrate fee priority into leading wallets or dapp UIs
  • Gather more statistics on priority fee usage during large liquidation cascades

@godmodegalactus

This comment was marked as off-topic.

@ruuda
Contributor

ruuda commented Aug 25, 2022

Before going there, it would be worth sorting transactions into parallelizable batches. Right now they end up in arbitrary batches that lock whatever they need, so there’s a good probability that all batches need to execute serially.

@godmodegalactus
Contributor

I guess we can reduce the complexity of sorting these transactions into parallelizable batches by creating a short list of the accounts with the most write requests over the last n slots. This list can be managed by the validator itself; no need to propagate it to the network.
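
A minimal Rust sketch of the validator-local bookkeeping this suggests: count write-lock requests per account over a trailing window of slots and keep the hottest few. The type, field names, and window size are hypothetical; this is not existing validator code.

```rust
use std::collections::{HashMap, VecDeque};

const WINDOW_SLOTS: usize = 32; // arbitrary trailing window

#[derive(Default)]
struct HotAccountTracker {
    // One map of account key -> write-lock count per slot in the window.
    per_slot: VecDeque<HashMap<[u8; 32], u64>>,
}

impl HotAccountTracker {
    /// Record the write-locked account keys seen in the latest slot.
    fn record_slot(&mut self, write_locked: &[[u8; 32]]) {
        let mut counts = HashMap::new();
        for key in write_locked {
            *counts.entry(*key).or_insert(0) += 1;
        }
        self.per_slot.push_back(counts);
        if self.per_slot.len() > WINDOW_SLOTS {
            self.per_slot.pop_front();
        }
    }

    /// Accounts with the most write requests over the window, most contended first.
    fn hottest(&self, top_n: usize) -> Vec<([u8; 32], u64)> {
        let mut totals: HashMap<[u8; 32], u64> = HashMap::new();
        for counts in &self.per_slot {
            for (key, n) in counts {
                *totals.entry(*key).or_insert(0) += *n;
            }
        }
        let mut sorted: Vec<_> = totals.into_iter().collect();
        sorted.sort_by(|a, b| b.1.cmp(&a.1));
        sorted.truncate(top_n);
        sorted
    }
}
```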

@nikhayes

I would follow the discussion on the transaction scheduler channel of the Solana Tech discord; there are a lot of changes planned. If you have any ideas, you might get more feedback by posting them in the core research channel as well.

@mschneider
Contributor

mschneider commented Sep 4, 2022

Started to gather a bit more qualitative data:

  1. example: arb bot that we might want to punish for spamming, basically write locking super popular spot markets pairwise (serum & orca), simulating profitability on-chain with minimal priority fee: https://explorer.solana.com/tx/Uvqd35FYYER47iGFGL4thWm4iZQRkC7BRr9vTBM1rbLH7HxGZUGV9vn8crGPfCLXdGALMKzD72qdkzpLbF3EHg8

  2. an LP that we would punish for spamming, but it's actually because he's measuring tx forwarding delay and failing the transaction to prevent placing orders with stale prices:
    https://explorer.solana.com/tx/4wTpT38qDPRw1PMuYfVfmrkUK8NVkwj2fpcU7G8LffA4QsFXVdED8kC9TvYE1xT7AdZfnvu31aErtDGGQQBVsNuw

  3. another LP that we would punish for spamming, but this time it's using a custom program to drop transactions that are executed out of order:
    https://explorer.solana.com/tx/21kuas9vTg5dBwhoxMoyEBozUCh6FtPLUs4QuURzt6StMErMQ5XpPBvx1Hz5zE2im3cTni7dxKftRhffEBcevf6P

It would probably save us a ton of compute if we could drop the latter 2 transactions way earlier in the pipeline, even before the validator or mango need to charge fees. 

(2) should be possible if the user used older blockhashes to send his transactions, but practically that involves a lot of infrastructure on the client side, so improving protocol UX could make it “easier” and hence improve adoption.


(3) requires changes to the protocol afaik, but maybe @buffalu / jito can help.

i'm surfacing this, because I would like to get the users who are actually trying to circumvent short-comings of the transaction submission protocol out of the firing line, before we ramp up fees. even just 2xing fees would be really bad for order book LPs.

@mschneider
Contributor

I would follow the discussion on the transaction scheduler channel of the Solana Tech discord; there are a lot of changes planned. If you have any ideas, you might get more feedback by posting them in the core research channel as well.

don't have access to write there, so will keep it to github

@godmodegalactus
Contributor

3. another LP that we would punish for spamming, but this time it's using a custom program to drop transactions that are executed out of order:
   https://explorer.solana.com/tx/21kuas9vTg5dBwhoxMoyEBozUCh6FtPLUs4QuURzt6StMErMQ5XpPBvx1Hz5zE2im3cTni7dxKftRhffEBcevf6P

I had also noticed point 3.
There are a lot of MMs that use the sequencing program to order their orders. I think this issue can be solved if we lock the mutable accounts lazily: instead of locking them for the whole transaction, we can lock them only when an instruction needs the account. This way all the transactions which are out of sequence will be dropped before the mutable accounts are locked.

@aeyakovenko
Member Author

Won't order book LPs get a rebate? Basically the steady state here should be close to a maker rebate and a taker fee.

@mschneider
Contributor

most of the blockspace goes to cancel/replace; fills are rarely relevant. all LPs will send both maker & taker orders when refreshing prices

@hydrogenbond007
Contributor

hydrogenbond007 commented Mar 6, 2023

Having the flexibility to have an application/writable fee is one of the main features of the app-specific rollup approach; it would be really nice to have it on Solana. As developers are not able to capture many incentives from users in other forms, an application fee can incentivize them to onboard more individual users rather than just activity.

@github-actions github-actions bot added the "stale" label ([bot only] Added to stale content; results in auto-close after a week.) Mar 15, 2024
@github-actions github-actions bot closed this as not planned (Won't fix, can't repro, duplicate, stale) Mar 22, 2024