Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vision: Meta Transaction #266

Open
xlc opened this issue Aug 8, 2022 · 27 comments
Open

Vision: Meta Transaction #266

xlc opened this issue Aug 8, 2022 · 27 comments
Labels
T1-FRAME This PR/Issue is related to core FRAME, the framework.

Comments

@xlc
Copy link
Contributor

xlc commented Aug 8, 2022

We need a way to support meta transaction to enable many useful scenarios. However meta transaction is can be very hard to implement correctly and securely.

TLDR; I don't know how to store nonce for accounts without any tokens without introducing new attack vector.

Scope

Firstly we need to define a scope. Here are a list of the potential use cases that I can think of and we need to decide what do we want to support and what is out of the scope. Support all of the use cases may result the solution be too complicated.

1. Pay for transaction fee

Allow relayer to pay transaction fee for a user. This allows user account does not hold any fee token can still able to interact with the chain.

Use cases

  • dApp onboard new users by allowing them to interact with the dApp contracts without paying transaction fee
  • Create a better proxy account patten. Main account create limited proxy account A, send small fund on fee payment account B. Account A never holds any funds and can be rotated easily without need to move tx fee funds. All the transaction from A are broadcasted by B and B pays all the tx fee.
  • Support any token to be tx fee token. User can craft batch([sendCustomToken(relayerAddr, amount), doTheActualWork()]). And if it is profitable to pay tx fee for getting amount of custom token, relayer submit the tx to make the trade. It is up to the relayer to ensure the sendCustomToken step are expected to be successful.

Flow

There are two possible flow to achieve the desired goal.

  • User sign tx
  • Send tx payload and signature to relayer
  • Relayer sign the transaction and broadcast it
  • Blockchain charges relayer for tx fee, validates the user signature, and dispatch the payload tx from user origin

OR

  • User generate tx payload
  • Relayer sign the tx payload
  • User sign the tx payload with relayer signature and broadcast it
  • Blockchain validates relayer signature, charges relayer for tx fee, and dispatch payload from user origin

2. Atomic multi origin tx

Allow atomically dispatch a transaction mixed from different origins. It can only be successfully executed IFF all the origins signs the whole payload.

Use cases

  • Atomic exchange of assets between multiple users
    • e.g. batch_all([as_user(alice, transfer(10 DOT, bob)), as_user(bob, transfer(100 aUSD, alice))])
  • Perform a sequence of actions from multiple users at once without revealing it to avoid front running
    • e.g. batch([create_swap_pool(aUSD, DOT), as_user(alice, add_liquidity(100 aUSD, 10 DOT)), as_user(bob, add_liquidity(100 aUSD, 10 DOT))]) so that no one can make a trade between Alice add liquidity and Bob add liquidity

Flow

  • Generate a transaction payload
  • All party sign the transaction payload
  • Relayer sign the transaction and broadcast it
  • Blockchain charges relayer for tx fee, execute the payload. When encountered a as_user call, ensure a valid signature is included.

3. Attest messages delivered via XCM

One issue about XCM is that it cannot blindly trust all the messages from the source. SPREE is the solution but it will not be available anytime soon. We can verify the message are indeed authorize by a user by having the user attest the message with a signature. The signing payload should cover the whole XCM so it cannot be placed under a different context.

Use cases

  • User from Astar make one transaction to send ASTR to Acala to swap to GLMR and send GLMR to Moonbeam to invoke a smart contract with origin from the user account
  • Moonbeam user can send one transaction to transfer DOT from Moonbeam to Acala to mint aUSD and send aUSD to Moonbeam and buy more DOT on a dex

Flow

  • User generate XCM payload, sign the payload and fill-in the signature and dispatch the XCM
  • Receiving parachain receives XCM and verify the signature and grant the permission to dispatch the message from user origin

Security considerations

Here are some common security considerations I came up with. Please comment below if you have anything more to add

  • Single chain replay attack prevention
    • Need some mechanism to prevent replay of signatures on a same chain
    • Usually are done with nonce
  • Multichain replay attack prevention
    • Need some mechanism to prevent replay of signatures on different chains
    • Usually are done by including the genesis hash into the payload
  • Signature Invalidation
    • In some cases, the signature are hand off to other party for broadcasting and the other party may choose to not broadcast the message for whatever reason. In the case, the signature signer need to have the ability to invalidate the signature
  • Context aware validation
    • The signature must be validated with a known context. e.g. A XCM to invoke a smart contract on Moonbeam cannot be copied and send to Astar to invoke the smart contract on Astar

The default signed extensions from Substrate provides a very good starting point on what needs to included in the signature payload to keep it secure and the signature validation mechanism should just reuse it if possible.

Implementation considerations

The common way to implement replay attack prevention mechanism is by having a sequence number / nonce that increments for each new message.

However, this is incompatible on the default storage model of Substrate. The nonce have to be stored onchain and all the onchain storages must be covered with a deposit (ED). For some use cases, the signing account will not hold any balances at all and therefore shouldn't really be able to cover the storage deposit for nonce storage.

Another point to keep in mind is that nonce can be reused on chain with ED. An account can purposefully reset's its account data to reset nonce and able to replay some transactions.

One solution is have the fee payer to give user enough balance to cover ED. But ED value can be relatively high (1 DOT on Polkadot) and therefore this solution is not universally suitable to all the chains.

Another potential solution is to expire the signature after certain amount of time so we can purge nonce data periodically but it is only suitable until we can do O(1) purge of storages with a common prefix.

@bkchr
Copy link
Member

bkchr commented Aug 9, 2022

What about using the nonce of the relayer? So, we would include this nonce in the signature to prevent replay? It would make it a little bit more complicated to send, because we would need to either pick some nonce in advance or the relayer would give us some slot and wait until the call is provided.

@xlc
Copy link
Contributor Author

xlc commented Aug 9, 2022

What about using the nonce of the relayer?

This means XCM use case is out of scope because there won't be a relayer. This is fine.

One issue I can think of is that user request a nonce from relayer and does nothing to DoS relayer. This means relayer have to free the nonce after a period of time if not used. This results user have to complete the signing and submit it within a limited time period.

The main is that remember that relayer maybe able to reset its own nonce. This means this could be an attack scenario from malicious relayer:

  • Relayer announce account 111 with nonce 0 really for business.
  • User account 222 sign payload { call: transfer(1 DOT, 333), addr: 222, realyerNonce: 0 }
  • Relayer sign payload with nonce 0 and send tx, resulting user 222 transfer 1 DOT to 333. Nonce increment to 1.
  • Relayer 111 wipe its account data to reset nonce.
  • Relayer can replay tx to make user 222 transfer 1 DOT to 333 again.

User need to have a way to ensure the payload cannot be replayed by relayer and nonce from user account won't work for chain with ED.

@bkchr
Copy link
Member

bkchr commented Aug 9, 2022

If the user account has some balance above ED, why not use the nonce of the user account? Using the relayer nonce would only be required for initial onboarding like use cases?

@burdges
Copy link

burdges commented Aug 9, 2022

Did you consider nullifiers? Aka some value the storage of which prevents double spending? We'll need nullifiers anyways for anonymity solutions, like porting zcash sapling.

The tricky part about nullifeirs is removing the old ones. ZCash never removes them, which becomes a problem.

In your tx fee case, the dApp creates its daily nullifer storage tree, gives each users a signed token saying the date revealing a different nullifier. As they're spent, these nullifiers get inserted into the on-chain nullifier tree, which prevents double spending. At the end of the day, the dApp resets it's nullifier tree (or uses some round robin with several going simultaniously)

If we want better anonymity, we don't have the fiber the nullifers per dApp, but we could fiber them per collator, assuming the collators are reliable. In other words, each collator tracks its own portion of the nullifier tree, and jut prove it's work is being done correctly, kinda like an inner PoV block. Of course if the collator shuts down then all its nullifers cannot be proven to be unspent, but you could've a small number of collators sharing different tree portions, etc.

Our Ring VRF can be treated as outputting a nullifer, so if a person is registered on the ring VRF's personhood chain then they can sign dAppname ++ date ++ try_num for their free trys, which saves the dApp ever needing to presign anything, if just gives free trys to every person in the personhood chain.

If you've one trusted node then you can use prepared batching aka half-aggregation: https://docs.rs/schnorrkel/latest/schnorrkel/struct.PreparedBatch.html

In essence, this takes a set of signatures, and merges them into one signature on all the original (signer,message) pairs. If at least two honest signatures get included, then it's no longer possible to separate the original signatures.

SnarkPack achieves the same thing for Groth16 SNARKs.

These do require some trusted merge node, like a trusted collator, which makes them different & simpler than a "fair exchange" schemes. A trusted collator helps lots with many MEV problems though.

We could do this recursively too, like so you can trust another node besides the collator, and the collator can then protect its tx fees, but this requires code nobody wrote yet. I guess the combiner could send each participant a message promising them who the full list of participants will be.

We should not imho dive too far down the rabbit hole of chains verifying the logic of other chains, because it costs us some of our greatest advantages. It maybe simpler to just make SPREEs work than to do this.

@xlc
Copy link
Contributor Author

xlc commented Aug 9, 2022

If the user account has some balance above ED, why not use the nonce of the user account? Using the relayer nonce would only be required for initial onboarding like use cases?

For many use cases (e.g. gaming), user account will never have any balances or it may not have any fee token balances. This means store nonce under user account will not be suitable as there will be no way to incentive user to wipe the account data without ED. I don't think relayer will not want to pay ED to onboard new user because user can simply transfer ED out and request it again to drain relayer. This solution forces relayer to implement some restrictions such as email verification to ensure it's account cannot be drained. It maybe an acceptable trade-off under some scenarios but certainty won't be a generic solution.

@xlc
Copy link
Contributor Author

xlc commented Aug 9, 2022

1 Did you consider nullifiers?

Things will be easy if Substrate support purge storages with common prefix with O(1) time but we don't and therefore some more tricks are required. But I think this could be the right path. Worst case scenario, we can always ensure the tx cost for such operation covers the deletion cost as well.

  1. If you've one trusted node then you can use prepared batching aka half-aggregation:

That will work. Let try to figure out what's missing to make this possible.

  1. We should not imho dive too far down the rabbit hole of chains verifying the logic of other chains, because it costs us some of our greatest advantages. It maybe simpler to just make SPREEs work than to do this.

I don't disagree with you. It will be great if we have a simple solution to achieve the desired outcome. Otherwise let's just keep building SPREE.

@bkchr
Copy link
Member

bkchr commented Aug 9, 2022

If the user account has some balance above ED, why not use the nonce of the user account? Using the relayer nonce would only be required for initial onboarding like use cases?

For many use cases (e.g. gaming), user account will never have any balances or it may not have any fee token balances. This means store nonce under user account will not be suitable as there will be no way to incentive user to wipe the account data without ED. I don't think relayer will not want to pay ED to onboard new user because user can simply transfer ED out and request it again to drain relayer. This solution forces relayer to implement some restrictions such as email verification to ensure it's account cannot be drained. It maybe an acceptable trade-off under some scenarios but certainty won't be a generic solution.

No that wasn't my point. I mean if the user doesn't have any ED, the user also doesn't has anything that could be "replayed". Like you do a transfer twice, because there is nothing to transfer. The user also has no other "stake" in the state, because no ED. So, as long as there is no ED, it should be fine to use the relayer nonce? Correct me if I'm missing something. However, if the user has some funds or something and wants to prevent that the relayer can replay its calls, we use the user nonce.

@xlc
Copy link
Contributor Author

xlc commented Aug 10, 2022

I see. But ED is only about fee token. User could have application token (maybe sone non transferable voucher) or NFT or some other assets.

@burdges
Copy link

burdges commented Aug 10, 2022

Is there a github issue for O(1) storage purge yet? We'll maybe want that elsewhere like multi-block elections.. It feels silly not to have given how painful storage can become.

Should we do a recursive version of prepared batching aka half-aggregation? Who are the trusted merge nodes in your view? collators?

Would XCMP without SPREE suffice for 3 here? It does not ensure message contents like SPREE does, but it does ensure delivery.

@xlc
Copy link
Contributor Author

xlc commented Aug 10, 2022

Is there a github issue for O(1) storage purge yet? We'll maybe want that elsewhere like multi-block elections.. It feels silly not to have given how painful storage can become.

Not sure. We should have one if we don't. @bkchr @arkpar may know?

Should we do a recursive version of prepared batching aka half-aggregation? Who are the trusted merge nodes in your view? collators?

Can you explain what kind of trust do we need to put on the trusted merge nodes?

Would XCMP without SPREE suffice for 3 here? It does not ensure message contents like SPREE does, but it does ensure delivery.

Delivery is enough. For example, combined with the relayer setup, instead of have relayer deliver the signature and pay for tx fee, we can use XCM to deliver the signature and pay for weight fee. In that way it can already unlock many useful applications.

@burdges
Copy link

burdges commented Aug 10, 2022

Can you explain what kind of trust do we need to put on the trusted merge nodes?

It's only that the merge node could publish some transactions without publishing others.

In principle the merge node could tell the participants about all other participants, so then transactions could contains a commitment, which they publish if they get cheated. It'd be annoying to code up that dispute protocol which literally never runs though.

It works cleanly & simply if there is naturally a node who could be trusted to want all the transactions to appear together. I'd envisioned collators employing prepared batching to protect themselves from fee stealing & MEV by other collators, so then the collator simply trusts itself.

@arkpar
Copy link
Member

arkpar commented Aug 11, 2022

Is there a github issue for O(1) storage purge yet? We'll maybe want that elsewhere like multi-block elections.. It feels silly not to have given how painful storage can become.

Not sure. We should have one if we don't. @bkchr @arkpar may know?

I could not find an existing issue in substrate, maybe @cheme knows one.
O(1) subtree deletion would require some support from the database engine. We've discussed some ideas on how to implement amortized O(1) deletion in paritydb 2.0, but the work on it has not stared yet.
Rocksdb supports prefixed key deletion. While it is faster than wlaking the trie, it is still not O(1) I believe.

@cheme
Copy link
Contributor

cheme commented Aug 11, 2022

I don't think there is issue.
I did one PR paritytech/substrate#5280 for it but it was a long time a go, some discussions exists there but there is clearly too many to find relevant info.
Long story short, the code was not very good and design changes could be a bit too much at the time.
I believe trick I propose in the PR should be dropped (using trie node prefixing in rocksdb to use rocksdb primitives (but as mention I also doubt it can be totally O(1) internally)).
One possible short term direction, I think, could be to focus on making the prefixed removal asynchronous (and non existent on archive node), so it is not truely O(1) but really amortized, and will be good for pov.
But this would require passing the prefix removal info (or child trie removal info) into the db pruning which quite break the current design (client db currently deals with db content and do not know about trie).
Also rocksdb pruning would maybe still need to lookup for trie nodes prefix and have some pretty involved deathrow management, which leads me to think even in the asynch scenario, support by rocksdb may be an issue or at least could have bad worst case scenarios (I did not handle that in my past PR by just considering a runtime should not create two time the same key for a child trie, but it is an assertion that is not enforced and even harder to hold in the prefix removal case).

Child trie case is probably the first to handle (good for contract and does not requires additional trie code changes).

Ideal direction would be to handle at paritydb level but that is even bigger design changes.

@bkchr
Copy link
Member

bkchr commented Aug 11, 2022

Instead of bringing these optimizations to ParityDb, we should implement the lazy flushing as discussed with @gavofyork. This means we don't write every block state directly to disk. We flush every X seconds or something like this. If in between the node would shut down, we would resync and recreate the state. We would always only loose some seconds or minutes of state, nothing to worry about. We would only need to store the transactions and the header of the blocks on disk.

If we have this, we can probably make some weight calculations without including costs for writing to disk.

@cheme
Copy link
Contributor

cheme commented Aug 11, 2022

I am not exactly sure what is lazy flushing, but from your description it does not sound orthogonal.
Notice that here the biggest part of the issue is reading all the nodes under the prefix or in the child trie, the write part is already mainly asynch (will happen on state pruning).
Proposal here was to delay this reading to pruning or never for archival node (by storing some new specific removal ops).

@bkchr
Copy link
Member

bkchr commented Aug 11, 2022

From the in memory perspective, dropping a child trie should only be about removing the child storage root from the main trie and dropping the structure that holds the child trie?

@cheme
Copy link
Contributor

cheme commented Aug 11, 2022

From the in memory perspective, dropping a child trie should only be about removing the child storage root from the main trie and dropping the structure that holds the child trie?

Same from a db perspective, just that dropping the structure is currently done by collecting all heap pointer (the costly part) and only drop them at pruning.
The change describe would be to only collect the structure first heap pointer and lookup for all other heap pointer only at pruning (so it does not get include in proof and synchronous execution time). But we then need a special semantic to this first heap pointer (drop all leaf bellow this one).
So that is this special removal ops that is currently missing.

Ps: this discussion should probably take place in its own issue.

@shawntabrizi
Copy link
Member

shawntabrizi commented Aug 13, 2022

I spoke with Gav on this issue, and we should not be over complicating the design here to support specifically users without balances at the Substrate level.

Ultimately, the only thing we can do to ensure that transactions like this do not get replayed is to increment the nonce of the "meta user", however, there are ways for a user to maintain a nonce without a balance: with provider reference counters.

The Meta transaction feature must simply ensure that a user account has a provider reference counter, which can be obtained by having an existential deposit or some custom solution by the parachain team. With that, we can safely increment and hold the state of the nonce for a user account. In this case, we should consider the provider reference counter as de-sybiling the account.

Most chains will allow users to gain their provider reference counter by simply getting some tokens... a reasonable onboarding experience if this means that all of their future transactions will be free. You can also transfer to the user some (semi)permanently locked balance, which prevents attacks from users siphoning funds. Finally, chains could have a special governance/root controlled pallet which just gives a particular account a provider reference counter. I can imagine this can be used in some kind of off chain KYC process.

Maybe there are other ideas on how to do this even more flexibly, but for now, I think there is plenty to be gained by making the assumption about incrementing nonce.

@burdges
Copy link

burdges commented Aug 13, 2022

Aside from lacking O(1) purge support, is there anything that prevents a parachain from supporting accounts with associated nullifier trees?

@Polkadot-Forum
Copy link

This issue has been mentioned on Polkadot Forum. There might be relevant details there:

https://forum.polkadot.network/t/parachain-technical-summit-next-steps/51/1

@arrudagates
Copy link
Contributor

Would the implementation of meta transactions allow for calls dispatched from inside another extrinsic charge fee from that inner call's origin instead of the outer call's signer?

@kianenigma kianenigma changed the title Meta Transaction Vision: Meta Transaction Mar 10, 2023
@Polkadot-Forum
Copy link

This issue has been mentioned on Polkadot Forum. There might be relevant details there:

https://forum.polkadot.network/t/meta-transaction-pallet/2302/2

@shunsukew
Copy link

shunsukew commented Mar 14, 2023

@xlc
This could be a dumb question.

The common way to implement replay attack prevention mechanism is by having a sequence number / nonce that increments for each new message.

However, this is incompatible on the default storage model of Substrate. The nonce have to be stored onchain and all the onchain storages must be covered with a deposit (ED). For some use cases, the signing account will not hold any balances at all and therefore shouldn't really be able to cover the storage deposit for nonce storage.

This can be applied even if having another account_id <> nonce HashMap in meta tx pallet (I mean not using frame_system AccountInfo)? And extrinsic sender who update/writes that HashMap is Relayer's account.
Usually, meta transaction in other chains holds nonce mapping in contracts, that means having separate nonce mapping.

@xlc
Copy link
Contributor Author

xlc commented Mar 14, 2023

That could work. But who owns the data? Ideally, we would like to be able to recycle the storage when it is not needed anymore. Presumably it will be the first relayer pays the deposit but we can't have the first relayer owns it due to security concern. Then we have a misincentive here that someone in charge of someone else's money. On the other hand, I don't think this is a super critical issue so maybe we can just ignore this problem like everyone else 💁‍♂️

@shunsukew
Copy link

shunsukew commented Mar 16, 2023

I see, so the discussion above is taking the deposit scheme to recycle unused storage as a premise, like Polkadot does ubiquitously.
Using relayer's deposit incentivizes the relayer to reset nonce, in most cases it should be fine, but we cannot ignore potential situations where relayer(attacker)'s gain overweight the entire cost to replay transaction.

I think, as @shawntabrizi mentioned, using provider reference counter which can maintain nonce of users with 0 balance (without ED, with some custom solution) is the good way to dig into. But my question is how to decrease reference counter and cleanup those accounts' info once they become idle, this may be part of designing.

@xlc
Copy link
Contributor Author

xlc commented Mar 16, 2023

Yeah the reference counter only move the problem. It doesn't solve it. The meta transaction pallet may not need to worry about it which is great but we still need to come up another pallet to manage the reference counter.

@juangirini juangirini transferred this issue from paritytech/substrate Aug 24, 2023
@Polkadot-Forum
Copy link

This issue has been mentioned on Polkadot Forum. There might be relevant details there:

https://forum.polkadot.network/t/socialfi-platform-to-engage-and-bring-more-users-to-polkadot/9136/10

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T1-FRAME This PR/Issue is related to core FRAME, the framework.
Projects
Status: Draft
Status: Backlog
Development

No branches or pull requests