Consider increasing fees for writable accounts #21883
Why does the "unused" part matter? Why not charge increasing fees for every write?
@jstarry why didn't the sender mark it as RO then? The example above illustrates how programs can use this to reduce traffic from aggressive senders that do not observe the state often enough before they sign.
I don't think that focusing on unused writable data accounts adequately covers the issues with write-lock contention. It just needs to be expensive in general to write-lock an account that many bots / protocols / users want to use.
@jstarry it does. The issue with "invalid" transactions is that they are created against a stale state. It's a much better environment if senders who are sending against stale state can't increase fees on senders who are sending against valid state. By "much better" I mean that it allows the developer/program to capture more of the fees. If the network fees are increased indiscriminately, that takes away value capture from the layer above.
@aeyakovenko - Good post, and interesting solution. I think it does solve the current problem. But I have a concern: I can imagine lots of valid transactions where an account will be marked as writable but ends up unused.
I'm concerned that this limits the application architecture space for any high-speed application. Once N is chosen, all high-speed applications that run above some baseline k*N frequency are somewhat limited in architecture: any address marked as writable is subject to the penalty. Maybe this is all fine and worth it. And/or maybe this is a decent patch until a more robust solution comes along. Still, worth understanding the tradeoff.
Why can't the router simulate ahead of time and only lock the right accounts? If it's using stale data, or wants a fast path, it needs to back off or pay a higher fee, especially if it's locking a ton of accounts. Think about it: all the accounts it locks but doesn't use prevent another valid user from using that program. So under normal conditions, the compute limit would allow some of these to go through without causing a spike in fees, but if there is a flood it will force the router to slow down, or simulate more accurately. I am not sure the cap should be set at 25% or 50%, but at some level such that we can segregate optimistic and failed traffic from accurate traffic.
100%, but it's not "nonces". Bumping the fee just shares resources on the network more accurately.
@aeyakovenko I see, yes, you can choose to make the accounts writable before submitting the tx by running a simulation first. Reasonable solution. This still limits throughput insofar as you're dealing with possibly stale information by the time your tx executes. But I do agree that it's probably required... any finite resource (here, writable accounts) needs some fee to discourage effectively squatting on accounts. So never mind me, carry on! Thank you as always for building!
What does "signing state" mean?
changed to "if the senders are signing a transaction that is simulated against recent state." |
my rambling comment... My initial thought is to KISS. Hotter accounts + write locks -> higher fees. It's almost like account-based gas fees. The "weight" of a block is essentially proportional to the number of entries + shreds it produces along with the compute used, so it seems like there should be a higher charge for anything that increases that. Perhaps packing transactions better could help. On that note, perhaps trying to pack TXs by looking outside of the PACKETS_PER_BATCH would help less-hot accounts get processed faster and would kinda cause account-based queues inside the banking stage? Need to think more about whether that's an ideal property or not. The thing I need to think about more wrt simulation in the aggregator case is MEV. If I see an aggregator TX but know it's only going to hit exchange X, can I sandwich them more easily? If they're hitting multiple exchanges, that might become a little harder, but I suppose the sandwicher can simulate and know the outcome either way.
can you explain how this would work? a market that validators set, or something enforced app/protocol side? how would this be enforced? i guess dapps want a good experience for their users, validators want to make more money. need to balance that somehow.
for "binary" bots where outcome is 0 or 1, it seems like they could just mutate the state a tiny bit then you end up back at square 1 where you have the same spam + run against compute limit and need to increase fees.
this is a non-problem for any bots; it's how they prevent double spends (like blockhash). similar to nonces, any skilled bot writer could keep a deque of blockhashes and do multiple signs + spam. i'd also want to see access patterns for these things - some data we'll hopefully have soon. these periods of degraded performance last several hours, but there are also probably quick bursts during volatility. tl;dr: kiss. access patterns that degrade parallelism of the system -> charge more.
also, we've talked a bit about not even executing txs; just do sigverify + pack blocks. in that case, it becomes a replay stage/tvu problem. seems like you come back to execution speed/parallelism in that case too.
here's my two cents: some thoughts on the above proposal (= exponentially increasing fee for unused writable data accounts)
instead, how about introducing a merciless transaction execution mode, which takes advantage of the parallelizable nature of erroring transactions (thanks to the state rollback/aborted execution from the very definition of a tx)? so, when spam activity is going on, leaders start to proactively try to execute transactions against the end-of-previous-block state, without write locks, at maximum concurrency, and filter out any failed transactions to be packed into a new gutter block entry later, which is multiplexed into turbine alongside the normal entry shreds. for successful txes from the proactive execution, the leader re-executes normally with write locks against the tip of the current block state and packs them into normal entries. until the tick is full, the leader packs as many of these failed transactions as possible (only with fee payer debiting) into the gutter block entry while packing normal transactions into normal block entries. pros
cons
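For what it's worth, here is a rough stand-in Rust sketch of the flow described above; Bank, Tx, and the entry handling are simplified placeholders, and the parallel read-only pass is written as a plain loop:

```rust
struct Tx;
struct Bank; // stand-in for a bank (frozen parent or current working bank)

enum Outcome {
    Success,
    Failure,
}

impl Bank {
    /// Execute against a frozen snapshot without taking write locks (stand-in).
    fn execute_read_only(&self, _tx: &Tx) -> Outcome {
        Outcome::Success
    }
    /// Normal locked execution against the tip of the current block (stand-in).
    fn execute_locked(&mut self, _tx: &Tx) {}
    /// Debit only the fee payer; the rest of the failed tx's state is untouched.
    fn debit_fee_payer(&mut self, _tx: &Tx) {}
}

/// Sketch: pre-execute everything against the frozen parent state, route
/// failures into a cheap "gutter" entry (fee debit only), and re-execute the
/// survivors normally with write locks.
fn pack_block(parent: &Bank, working: &mut Bank, txs: Vec<Tx>) -> (Vec<Tx>, Vec<Tx>) {
    let mut normal_entry = Vec::new();
    let mut gutter_entry = Vec::new();
    for tx in txs {
        match parent.execute_read_only(&tx) {
            Outcome::Failure => {
                working.debit_fee_payer(&tx); // only the fee is committed
                gutter_entry.push(tx);
            }
            Outcome::Success => {
                working.execute_locked(&tx); // re-execute with write locks
                normal_entry.push(tx);
            }
        }
    }
    (normal_entry, gutter_entry)
}
```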
what's confusing? i write like a C engineer, structs at the top :)
Fees need to double when the block is full: if total gas is > the average load for N blocks, fees double for the next 2*N blocks.
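A minimal sketch of that doubling rule, with made-up names and units; this is not an existing fee governor, just an illustration of the stated policy:

```rust
/// Hypothetical fee governor: if the average gas used over the last N blocks
/// exceeds the long-run average load, the base fee doubles and stays elevated
/// for the next 2*N blocks.
struct FeeGovernor {
    base_fee: u64,       // current base fee (lamports, illustrative unit)
    avg_load: u64,       // long-run average gas per block
    elevated_until: u64, // slot until which the doubled fee applies
}

impl FeeGovernor {
    /// Call once per block with the gas used by recent blocks (most recent last).
    fn on_block(&mut self, slot: u64, recent_gas: &[u64], n: usize) {
        assert!(n > 0 && recent_gas.len() >= n);
        let recent_avg = recent_gas.iter().rev().take(n).sum::<u64>() / n as u64;
        if recent_avg > self.avg_load && slot > self.elevated_until {
            // Blocks have been fuller than average for the last N blocks:
            // double the fee and hold it for the next 2*N blocks.
            self.base_fee = self.base_fee.saturating_mul(2);
            self.elevated_until = slot + 2 * n as u64;
        } else if slot > self.elevated_until && self.base_fee > 1 {
            // Elevated window expired and load is back to normal: decay the fee.
            self.base_fee /= 2;
        }
    }
}
```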
so a program that takes no action will get FIFO ordering on whatever lands. That's totally up to the dev. That's the point! The only way to avoid the platform fee is to force control to the program, so a program like Serum can do its own congestion control and capture value from bots.
yea, write locks that are not necessary degrade system perf. That's what this does.
yea, the eth nonce thing is a bit of a non sequitur.
They know the max they would pay, the blockhash signs it, and we would need to dump all the old blockhashes for that write account and have users re-sign. Which is why 8/16 seems like a reasonable retry time. @ryoqun, your proposal is rather complex from the runtime's perspective. Speculative execution that doesn't mutate state is hard. This is just messing with the fee governor. The main innovation here is that applications can now do their own congestion control. SRM can charge bot traffic in SRM. More value captured into Serum.
ah okay, so you're suggesting that on-chain apps will charge fees and those fees go to the platform instead of the validator? very interesting, need to think about this more. would want to make sure incentives are aligned with validator operators for them to process fewer txs that don't allow them to earn as much from tx fees. do you have a rough formula for where you see tx fees being derived from? if a validator is running MEV software, does that change anything?
Important news and large price swings often come with high market activity, not just bots, but legitimate broad activity. If some markets started increasing fees during such periods, they would risk losing users to competitors who don't do it. Not sure if anyone values extra fees more than market share. This might apply to the blockchain as a whole. Maybe it makes sense to keep track of the most actively write-locked accounts (both succeeded and failed), let's say the top 100 (I bet they represent 98% of the activity), and have an extra fee that is a linear function of account usage frequency. The extra fee is applied to every such account included in the transaction, but only if the transaction has failed. This way, there are no adverse fee effects and no need for an additional transaction simulation step for well-behaving clients. It wouldn't help against an intentional attack, though - someone could just send 1 lamport (or token) to the most active Raydium pools a hundred times per second, cheaply, and the transactions succeed. As a second layer of defense, per-account fees could be applied to all transactions if account usage frequency stays too high for too long. It would also force protocol developers to break down their stuff when it becomes too popular.
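For illustration, a sketch of how a validator-side tracker for that idea might look; `HotAccountTracker`, the constants, and the method names are all hypothetical, not existing Solana code:

```rust
use std::collections::HashMap;

/// Hypothetical per-account congestion fee: track how often each account was
/// write-locked in the recent window, and charge failed transactions a fee
/// that grows linearly with how hot the accounts they locked are.
#[derive(Default)]
struct HotAccountTracker {
    write_locks: HashMap<[u8; 32], u64>, // pubkey -> write locks in the window
}

const FEE_PER_LOCK: u64 = 10; // lamports per observed write lock (illustrative)
const TOP_N: usize = 100;     // only the hottest accounts carry a penalty

impl HotAccountTracker {
    fn record(&mut self, account: [u8; 32]) {
        *self.write_locks.entry(account).or_insert(0) += 1;
    }

    /// Extra fee owed by a *failed* transaction that write-locked `accounts`.
    fn extra_fee_for(&self, accounts: &[[u8; 32]]) -> u64 {
        // Find the lock count of the TOP_N-th hottest account; if fewer than
        // TOP_N accounts are tracked, everything counts as "hot".
        let mut counts: Vec<u64> = self.write_locks.values().copied().collect();
        counts.sort_unstable_by(|a, b| b.cmp(a));
        let threshold = counts.get(TOP_N - 1).copied().unwrap_or(1);

        accounts
            .iter()
            .filter_map(|a| self.write_locks.get(a))
            .filter(|&&c| c >= threshold)
            .map(|&c| c * FEE_PER_LOCK) // linear in usage frequency
            .sum()
    }
}
```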
if it hits capacity limits, then the chain has to increase fees for that write lock aka dapp/market. so users will pay sol instead of what the program wants. giving control to the program means that they can rebate users, or holders, or do whatever they want.
That's effectively what this does. A linear fee doesn't work to force bots to back off, though. If the program wants to force a linear fee on usage it can do so, because all the transactions that land will succeed. All a well-behaved bot/user has to do is correctly simulate against recent state.
There are a few issues with the current proposal (there's some overlap here with @ryoqun's comments above too):
Alternate Proposal
What I suggest is that we do the inverse of what you're proposing. Instead of increasing fees for unused writable accounts, always increase fees for used accounts but send a portion of the increased fees to the program so that they can do the rebate. This still gives programs a lot of control over how to incentivize proper behavior but at the same time gives the runtime a general approach for limiting the amount of contentious transactions. To combat read-lock starvation and other scheduling issues, the runtime could track higher level heuristics about the transactions it has processed and the relationships between processed transactions and even include a proof about transactions it couldn't schedule.
As I understand this proposal, it's not intended to prevent contention on a given program/program-owned-resource, but rather to prevent this contention from spilling out as a negative externality on the rest of the cluster. Detractors here seem to be trying to solve the other problem?
this mechanism forces the bots whose invocations of the program fail to back off, because all failures become exponentially more expensive. If it's possible for us to charge for successful write locks, it means program logic is being run, so I am not sure how the suggested approach helps.
The TX can do a system call on the cost model for the account, or it can do its own metering: trades per slot, slots since liquidation/oracle update, etc...
Yea, they can do this logic on their own. If an oracle update is expected every 8 slots, they can 10x the fees immediately.
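A minimal sketch of that oracle-aware policy on the program side, assuming the `update_account_write_lock_fee` syscall proposed further down in this issue existed in roughly this shape; the wrapper signature and constants are illustrative, not a real Solana API:

```rust
use solana_program::{clock::Clock, pubkey::Pubkey, sysvar::Sysvar};

const ORACLE_UPDATE_INTERVAL: u64 = 8;  // slots between expected oracle updates
const BASE_WRITE_LOCK_FEE: u64 = 5_000; // lamports, illustrative

/// Stand-in for the proposed syscall; it does not exist today.
fn update_account_write_lock_fee(_account: &Pubkey, _fee_lamports: u64) {
    /* proposed syscall, stubbed for this sketch */
}

/// Raise the write-lock fee 10x in the window where an oracle update is due,
/// so bots racing the update have to pay up or back off.
fn adjust_fee_for_oracle(market_state: &Pubkey, last_oracle_update_slot: u64) {
    let clock = Clock::get().expect("clock sysvar");
    let slots_since_update = clock.slot.saturating_sub(last_oracle_update_slot);

    let fee = if slots_since_update >= ORACLE_UPDATE_INTERVAL {
        BASE_WRITE_LOCK_FEE * 10 // update is due: make write locks expensive
    } else {
        BASE_WRITE_LOCK_FEE
    };
    update_account_write_lock_fee(market_state, fee);
}
```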
We have a ton more flexibility in read lock contention. There is no promise that a write isn't inserted before or after any sequence of reads.
how do we know how much to increase fees by, how fast to do it, and how fast to back off? what if it's not enough for oracles, but too much for traders? Also, the program can control what token the fees are implemented in, and may not want to distribute sol to its users; it may want to distribute the market token, or the project's token instead. With the "failed policy" I feel like we can be fairly aggressive in how fast they are forced to back off, and setting the cap to 25% of the total gives the program a guarantee that some non-failed TXs land to talk to it.
@aeyakovenko we can just use the same mechanism used by the ... put differently, my proposal is to exceptionally legalize including many (probably bot-initiated) erroneous transactions more cheaply, if the transactions fail against the same (frozen) parent bank state. So this should avoid write_lock contention when executing them at once, and state mutation is irrelevant (only fee debiting must be done; possibly also exponentially, if solana wants to punish hard). This is only triggered under cluster-wide congestion.
yeah, but shifting this onto dapps?
needless to say, this is cool if done correctly. i'm a bit skeptical about whether this can be done right on-chain or not. i'm too naive. i just want to retain solana's simple selling point that it is infinitely fixed-cost/fast/cheap from the dapp's perspective.
haha, good call. in that respect, how is each competing dapp incentivized to implement its own congestion mechanism properly (without disadvantaging others)? it's like debating whether any given service (here, congestion control) should be provided by the private or public sector in politics. xD
also, if the targeted program implements any such system, how do we prevent the bot transaction's calling program from peeking at the state and avoiding the CPI if the cost is too high? then, botters can rest assured and hit the cluster as much as possible?
I think this is actually the main issue that @aeyakovenko brought up in the issue description. If bots want to conditionally CPI into a program, they still need to write lock the state. If they don't end up doing the CPI because the conditions aren't right, they will still be penalized for not writing to any state accounts owned by the target protocol.
Thanks for this @t-nelson, sorry if I've detracted from the specific problem this issue is trying to solve. Can you elaborate on what you meant by the issue of contention "spilling out as a negative externality on the rest of the cluster"? I don't really understand. In my own words, I would describe the current issue to be a lack of penalty for write-locking state that you don't actually mutate. As long as there is no way for programs to balance usage / availability of their state either directly or indirectly, they could be subject to congestion.
The current proposal incentivizes bots to change their tx behavior to 1) have their transaction succeed and 2) always make some change to the writable accounts they locked to avoid the unused account penalty. Once they figure out a cost effective way to do that, they can carry on spamming the program without any fee increases. Of course, the program can observe the contentious behavior and modify the protocol to prevent any cheap mutations to state.
If this is the case, then I'm onboard. But doesn't this imply that we can interleave reads with writes? If the max compute units are reached for a single writable account doesn't that imply that access to that state in a block is exhausted? Is the solution allowing transactions to read from stale state to avoid conflicts with writes? (I'm happy to draft up a separate proposal for this, if so)
Cool, a system call like that would probably do the trick.
Good points here, maybe we should give even more control to the programs here so that they can set fees directly for the state they manage?
The failed policy is nice and simple but maybe not granular enough, and too complicated and burdensome for protocols to tune correctly. I get that you want protocols to be able to charge fees in their preferred tokens, but what if we didn't do that and just let programs directly set the read or write cost of each account in SOL? I think this would help protocols separate business logic from congestion control because it would be more easily tunable and automatically enforced by the runtime. If those lock fees go back to the program, the program could then rebate fees in whichever currency they want to the user.
@jstarry @taozhu-chicago given this model, I think what makes the most sense is prioritization of write locks > read locks when the cost model has an option. But we need to think about it. I think the optimal would be to put all the reads for a contentious write lock at the start of the block, so for the rest of the block the writes go without interruption. Write A = 100 CUs
@jstarry why? programs now can assume that all transactions succeed, and can do whatever. Control flow is purely in the program's hands.
Why is that better though? How would a program know how much to set read costs to, and how would it be able to do so if an attacker is spamming the account with failed writes? There is a ton of simplicity gained for the program developer if the assumption they operate under is that all txs that call the program succeed.
Increasing fees at the dex level poses severe challenges for traders. Can't arb well if you can't estimate fees. Can be circumvented by using an on-chain program (with the exact behaviour we want to prevent), but now we just made it 100x harder to start trading arb strategies on the dex.
@jstarry i really like the idea of depositing the sol into the writable account that is being spammed |
wdyt of
this will automatically increase the fees when the account is congested.
To add to this discussion, I think what @ruuda mentions here (#24827 (comment)) about the dynamic of having "stale" transactions charged makes a lot of sense. At the beginning of an NFT drop, deterministic congestion fees will be low and start to ramp up, so presumably most transaction senders will try to reference blockhashes that are as old as possible (it doesn't matter if they just time out soon, they can send new ones), and so it might take a while for fees to ramp up enough to lower demand. In the meantime, you'll have bots still trying to flood the network with transactions. If you have fee charges on "stale" transactions though, using this strategy of referencing older blockhashes would mean that your transaction has a higher risk of being charged a "stale" fee, and so that would put pressure on transaction senders to find a sweet spot between longer transaction life and not getting charged a higher congestion fee. You can base the stale fee on the same deterministic compute charge that the transaction would have used, but just charge the fee payer (lower compute than transfers and highly parallelizable). There are some other ideas in the thread I liked as to how these fees could be distributed among validators. Ultimately, the higher guarantee of getting charged for transactions sent (either normal or "stale") could help disincentivize spammy behavior quite a bit I think, and would also provide validators another revenue source. They can re-share these stale fees captured from spamming with their own stakers (who might be annoyed their transactions sometimes get charged for going stale). As a side note, priority fees seem like a lot of guesswork and seem like they'll still be kinda spammy with people trying to guess proper gas? *Just added this, which describes this possible system: #25211
A note about priorities: the discussion so far has focused on fine-grained fees that can differ per program/account, so congestion on Candy Machine would not make regular transfers more expensive. This is valuable and useful, but not the most important problem to solve right now. Right now, the network degrades or even goes down ~daily, because bots have no way to get their transaction prioritized, except to spam. As one of the larger validators, for the past month we have been struggling with null-routing issues on the one hand and overly aggressive DoS protection on the other hand. Technical work like QUIC support is valuable, but only moves the problem. Volume will just increase until it reaches the new limit. Variable transaction fees are the only realistic way to deal with variable demand. I think we need to focus on getting variable fees out as soon as possible. If that means that regular transfers become more expensive when there is congestion for Candy Machine, that’s a shame, but it is something that can be improved upon later, by making the fees more fine-grained as discussed above. I would argue that being able to do transfers at all is better than not being able to transact because the network is degraded to the point of being unusable. I would introduce global variable fees first, and iterate on it later. I wrote some thoughts in #24827 for adaptive fees (set by the network, not a priority fee that users can choose), which I think is a less invasive change compared to adding a priority fee or the more fine-grained fees discussed in this issue.
I think they're pretty close to having priority fees done but it would be nice to hear your input here (#25178 (comment)) @ruuda. There was a comment further below talking about having queues/batches/threads dedicated to transactions of certain compute ranges, which I think you could apply a more localized EIP-1559 mechanism to (congestion pricing specific to transactions falling within certain compute ranges).
Is it really this simple?
This becomes crucial, otherwise the program would have to use the instructions sysvar on every ix to properly detect reentry.
Consider using account balance to prioritize?
this has been totally rewritten since the time i made any comments above. they no longer apply and i rescind my support for the current proposal.
What are your main objections btw?
As recent changes to tx submission seem to have resolved a lot of the bottlenecks in feeding transactions, this issue now becomes more relevant. I recently looked at the cost_tracker_stats.costlies_account distribution and noticed that all accounts related to Mango's most active perp markets (bids/asks/event queue/market) now regularly hit the 12M cost limit. A bunch of users started to use the increased compute limit to execute custom programs before placing orders, with up to 700-800k CU per TX. We need an efficient way to penalize these actors as they are reducing QOS for all traders due to their lack of incentive to optimize their own code. Their transactions rarely fail, so solutions discussed before that only apply to failed transactions would not be enough. An ideal solution would allow us to add a larger fee to traders that:
Adding a base fee per CU across all transactions would help too and is pretty simple - there are probably periods where they don't need to use priority, so they are only paying for a signature or two on those massive transactions. Adding a base fee per CU would help incentivize more efficiency (and pay validators :) ). If bots/rpcs colocate with a leader validator they could also get better latency and get in cheap transactions before more highly prioritized transactions land, I'd imagine? If so, they can starve out some compute limits before priority kicks in (I guess there's transaction forwarding though). I haven't seen anyone post any in-depth analysis of how effective priority is.
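For concreteness, a sketch of what a per-CU base fee could look like; only the 5,000 lamports-per-signature figure reflects the current default, the per-CU constant is made up:

```rust
/// Hypothetical fee: flat per-signature component plus a per-compute-unit
/// component, so an 800k CU transaction pays meaningfully more than a simple
/// transfer even when no priority fee is attached.
fn total_fee(num_signatures: u64, compute_units: u64, priority_fee: u64) -> u64 {
    const LAMPORTS_PER_SIGNATURE: u64 = 5_000; // current default base fee
    const LAMPORTS_PER_10K_CU: u64 = 50;       // illustrative per-CU base fee

    num_signatures * LAMPORTS_PER_SIGNATURE
        + (compute_units / 10_000) * LAMPORTS_PER_10K_CU
        + priority_fee
}
```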
Furnishing FeeStructure can help, especially an exponential
Spent a few hours with @jstarry mapping out the problem & solution space for this kind of fee today. We double checked behavior of both Market Makers as well as Liquidators on Mango v3:
We concluded that building more tools to improve priority fee adoption and usage measurement would help to decide in which situations the current priority fee model does not sufficiently serve users and needs a more granular fee model. Possible next steps could be:
Before going there, it would be worth sorting transactions into parallelizable batches. Right now they end up in arbitrary batches that lock whatever they need, so there’s a good probability that all batches need to execute serially.
I guess we can reduce the complexity of sorting these transactions into parallelizable batches by creating a short list of the accounts with the most write requests over the last n slots. This list can be managed by the validator itself, no need to propagate it to the network.
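As a toy illustration of how such a hot list could feed batching (the types and names are stand-ins, not the actual banking stage code):

```rust
use std::collections::HashSet;

type Pubkey = [u8; 32];

struct Tx {
    write_accounts: Vec<Pubkey>,
}

/// Split transactions into two groups: ones that write-lock a recently hot
/// account (likely to conflict, so schedule them carefully or serially) and
/// the rest (good candidates for parallel batches).
fn partition_by_hot_accounts(
    txs: Vec<Tx>,
    hot_accounts: &HashSet<Pubkey>,
) -> (Vec<Tx>, Vec<Tx>) {
    txs.into_iter()
        .partition(|tx| tx.write_accounts.iter().any(|a| hot_accounts.contains(a)))
}
```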
I would follow the discussion on the transaction scheduler channel of the Solana Tech discord. There are a lot of changes planned. If you have any ideas, you might get more feedback by posting them in the core research channel as well.
Started to gather a bit more qualitative data:
It would probably save us a ton of compute if we could drop the latter 2 transactions way earlier in the pipeline, even before the validator or mango needs to charge fees.
i'm surfacing this because I would like to get the users who are actually trying to circumvent shortcomings of the transaction submission protocol out of the firing line, before we ramp up fees. even just 2x-ing fees would be really bad for order book LPs.
don't have access to write there, so will keep it to github
I had also noticed point 3.
Won't order book LPs get a rebate? Basically the steady state here should be close to a maker rebate and a taker fee.
most of the blockspace goes to cancel/replace; fills are rarely relevant. all lps will send both maker & taker orders when refreshing prices.
Having the flexibility to set an application/writable fee is one of the main features of the app-specific rollup approach; it would be really nice to have it on solana. As developers are not able to capture much incentive from users in other forms, an application fee can incentivize them to onboard more individual users rather than just activity.
Problem
Bots spam the network, often with failed transactions. Well-behaved senders are able to avoid failing txs if they sign transactions that are simulated against recent state. Malicious senders simply flood the chain with pre-formed transactions as fast as possible.
The flood of transactions can occur when there is a known, scheduled opportunity, like a Raydium IDO or an NFT mint, or opportunistically during high volatility in markets. Programs can't just charge a large flat fee for small trades or for all transactions, because an attacker can write a custom program that checks whether there is liquidity and only then executes, but sends the TX flood anyway. The flood will take write locks on state, and all other users will be starved. The program can't defend itself against being simulated.
Proposed Solution
This proposal is in addition to the market driven `additional_fee` signed by users to prioritize access to state. #23211
`get_active_account_write_lock_fee` system call: returns the currently activated write lock fee for this account.
`update_account_write_lock_fee` system call: allows a program to set a write lock fee for accounts that it owns. The fee is applied at the start of TX processing to the originally write-locked accounts such that the write-locked account `data` remains immutable, but the additional lamports are committed even when the TX fails. Reducing the fee can be activated immediately; increasing the fee has to wait for at least 240 blockhashes, or the current system-wide blockhash timeout.
It is up to the program to distribute the lamports back to successful callers at the end of the call. The program would need to guard that it's not being called twice within the same transaction, and only refund on the first call. A re-entry safe helper function should be provided to the program so it can refund the current fee to the transaction fee payer.
`total_failed_lamports` += `max(0, current_lamports - (rent_exempt_lamports + write_lock_fee))`
`refund` = `current_lamports - total_failed_lamports`
`total_failed_lamports` would need to be tracked by the program's state.
The program can implement an eth EIP-1559-like mechanism by periodically setting the `write_lock_fee` to the MIN `additional_fee` paid by callers. MIN would be the minimum price a caller had to pay to be included in the block and take this write lock.
The implementation needs to have an activation period longer than the 240-slot timeout for a blockhash, so users know the expected fee they are signing. Transactions using durable nonces may need to specify the maximum fee they are willing to pay, including write lock fees.
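A minimal on-chain sketch of the refund bookkeeping above, implementing the two formulas as stated; the account layout, helper names, and guard handling are illustrative only, and the re-entry flag would need to be reset at the start of each transaction:

```rust
use solana_program::{account_info::AccountInfo, program_error::ProgramError};

/// Program-owned state for one fee-collecting account (layout is illustrative).
struct CongestionState {
    write_lock_fee: u64,        // current per-write-lock fee in lamports
    total_failed_lamports: u64, // fees kept from failed transactions
    refunded_this_tx: bool,     // re-entry guard: refund only on the first call
}

/// Called once per successful instruction: book the fees left behind by failed
/// txs and refund the current fee to this transaction's fee payer.
fn settle_write_lock_fee(
    state: &mut CongestionState,
    state_account: &AccountInfo,
    fee_payer: &AccountInfo,
    rent_exempt_lamports: u64,
) -> Result<(), ProgramError> {
    let current_lamports = state_account.lamports();

    // total_failed_lamports += max(0, current_lamports - (rent_exempt_lamports + write_lock_fee))
    state.total_failed_lamports += current_lamports
        .saturating_sub(rent_exempt_lamports + state.write_lock_fee);

    // refund = current_lamports - total_failed_lamports, but only on the first
    // call within this transaction (re-entry guard, as described above).
    if !state.refunded_this_tx {
        state.refunded_this_tx = true;
        let refund = current_lamports.saturating_sub(state.total_failed_lamports);
        **state_account.try_borrow_mut_lamports()? -= refund;
        **fee_payer.try_borrow_mut_lamports()? += refund;
    }
    Ok(())
}
```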
Why it’s better than only `additional_fee`
The huge advantage of this is that it’s possible for a program to build its own congestion control, capture fees, punish misbehaving senders, and refund well-behaving senders.
Example NFT drop
Set the `write_lock_fee` for the candy machine mint to 0.1 SOL. With fewer bots bidding, there is less load on the leaders and UX improves.
Example defi market
Set the `write_lock_fee` to the MIN `additional_fee` over the last 10 slots. The market created the demand for state, and the market is now capturing value that otherwise would go to the L1 only.
tag @taozhu-chicago @sakridge @jackcmay