Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problem: gravity-bridge orchestrators don't reject unprofitable messages from cronos to ethereum #99

Open
yihuang opened this issue Sep 23, 2021 · 18 comments
Assignees
Labels
gravity bridge This issue or pull request is related to the Gravity Bridge functionality

Comments

@yihuang
Copy link
Collaborator

yihuang commented Sep 23, 2021

Currently, the orchestrator relays all messages blindly, vulnerable to spamming, since the gas fee on the ethereum mainnet is so high.
With the current design of the gravity bridge, users can pay a bridge fee to orchestrators with the transferred erc20 tokens.
Ideally, the orchestrator needs to:

  • estimate the gas to be consumed on ethereum
  • get current ethereum gas price from oracle
  • get the exchange rate from token to eth from some exchange API
  • check bridge_fee * exchange_rate > estimatedGas * gasPrice
  • after received the bridge fee tokens, convert them to eth asap when the price hasn't changed much, could be handled externally.

Difficulties:

  • How to estimate the gas on ethereum? maybe need to estimate offline, and put the numbers in the config file, could be different for each token, since the erc20 implementations could be different?
  • fault tolerance, should orchestrator keep working when price data is out of date and the price oracles down?
@tomtau
Copy link
Contributor

tomtau commented Sep 23, 2021

can one at least rate-limit? e.g. relay 1 batch per hour

@yihuang yihuang changed the title Problem: gravity-bridge orchestrators don't reject non-profitable messages from cronos to ethereum Problem: gravity-bridge orchestrators don't reject unprofitable messages from cronos to ethereum Sep 23, 2021
@calvinaco
Copy link
Contributor

One requirement / discussion to add is a timeout for the batch. Say if a batch has been unchanged for say n hours, the relayer should still submit the batch.

@yihuang
Copy link
Collaborator Author

yihuang commented Sep 23, 2021

One requirement / discussion to add is a timeout for the batch. Say if a batch has been unchanged for say n hours, the relayer should still submit the batch.

The batch itself has a timeout, after batch expired, txs get back to the unbatched pool, then user have the chance to cancel the tx, and re-submit it with a bigger bridge fee.

@thomas-nguy
Copy link
Collaborator

thomas-nguy commented Sep 23, 2021

after received the bridge fee tokens, convert them to eth asap when the price hasn't changed much, could be handled externally.

note that it is good to have but not mandatory in the minimum acceptance crtiteria considering short timeline.

I think what calvin mentioned by timeout is the maximum time transaction needs to "wait" before being send to ethereum even if they specify low value or zero fee (in that case orchestrator will cover the cost, but user needs to wait relatively long time) ie rate-limit from tomtau

@yihuang
Copy link
Collaborator Author

yihuang commented Sep 24, 2021

I think what calvin mentioned by timeout is the maximum time transaction needs to "wait" before being send to ethereum even if they specify low value or zero fee (in that case orchestrator will cover the cost, but user needs to wait relatively long time) ie rate-limit from tomtau

  • when batch timeout, it'll create a new batch later with a new batch timeout.
  • when there's a set of higher valued txs, it'll replace the current pending batch, and reset the batch timeout

I'm not sure how can we add our timeout logic in this.

@thomas-nguy
Copy link
Collaborator

thomas-nguy commented Sep 24, 2021

timeout is not the correct term.

Basically we want to send the batch always every X hours whatever if it is profitable or not.

In genesis config we can setup batch timeout > rate-limit

Pseudo code will be

// Given BatchGasEstimate a config/env parameter

ethFeeValue = GetEthGasPrice() * BatchGasEstimate 

for (token_type, batch) in possible_batches {

batchValue = batch.totalFee * GetTokenPrice(token_type)

if (batchValue >= ethFeeValue) {
  send_batch()
  continue;
} else if ( current_time() > next_send_batch_time ) {
  send_batch()
} 
// otherwise do nothing
}


if ( current_time() > next_send_batch_time ) {
 next_send_batch_time += batch_send_rate
}

https://github.com/PeggyJV/gravity-bridge/blob/main/orchestrator/relayer/src/batch_relaying.rs#L133

A batch is build every 10 blocks in Cronos, contains max 100 txs and have a timeout that can be set in genesis config.

  • if someone send a very high fee tx, the batch will be cost efficient with only one tx => it will be send immediately
  • if someone send a low fee tx, the batch need to gather more tx before being send OR relayer needs to reach next_send_batch_time
  • Worse case, there are too many low fee tx (more than 100) then he will have to wait batch_send_rate multiple time

@yihuang
Copy link
Collaborator Author

yihuang commented Sep 24, 2021

if ( current_time() > next_send_batch_time ) {
next_send_batch_time += batch_send_rate
}

@thomas-nguy Do you mean next_send_batch_time = current_time() + batch_send_rate?

@thomas-nguy
Copy link
Collaborator

next_send_batch_time += batch_send_rate if we want to strictly send X batch within a period
next_send_batch_time = current_time() + batch_send_rate if we want to have strictly X time elapsed within two batches

I think both are fine

@devashishdxt
Copy link

Here's the current logic in orchestrator to make the batches more profitable. Once a MsgSendToEth transaction is out of the pool and in a batch one of three things must happen to the funds.

  1. The batch executes on Ethereum, the transfer is complete, locked tokens are burned on Cosmos.
  2. The batch times out, once it's timeout height in blocks is reached, the locked tokens are then returned to the pool.
  3. A later batch is executed (which is guaranteed to have higher fee), invalidating this one and making it impossible to submit. The locked tokens are then returned to the pool.

What changes to the current logic do we want to do as part of this issue?

One thing that I understand from the conversation is to not send a batch to ethereum if it is unprofitable. If we choose not to send a batch and continue sending future batches with higher nonce, all the batches with lower nonce will automatically get invalidated and all the transactions in batch will return to the pool (which may appear in future batches).

Other thing is to add a timeout for the batch but instead of invalidating the batch on timeout, send the batch to ethereum (even if it is unprofitable). I'm not sure how this can be achieved with current design. Any ideas @yihuang @tomtau @thomas-nguy?

@thomas-nguy
Copy link
Collaborator

@devashishdxt currently the relayer will always submit the batch whatever the cost is

https://github.com/PeggyJV/gravity-bridge/blob/main/orchestrator/relayer/src/batch_relaying.rs#L133

We want to add the pricing logic at this level.

If it is not profitable, dont call send_eth_transaction_batch except if we have reach the next_send_batch_time

Also to give more details on your explanation. A batch per token type is created every 10 blocks on cronos (x/gravity), the old one is replaced by a new one which is always more profitable (because it will contain the latest most profitable txs). At any point of time, only one batch per token exists.

If relayer decide not to send the batch, then he will need to wait 10 blocks until a new batch is constructed to see if it is more profitable to be send (unless we reach next_send_batch_time)

@devashishdxt
Copy link

devashishdxt commented Sep 28, 2021

@thomas-nguy

So we need to maintain next_send_batch_time for each token type on relayer? Or create some additional logic in cosmos module to store this time?

If we don't want to change the cosmos module, we'll end up making relayer stateful. Then each node's orchestrator may have different timeout for a given token type (I think). How do we make sure that we're not sending too many non-profitable batches?

We'll have to somehow synchronize the states of different orchestrators to make sure every node has same next_send_batch_time for a given token type.

@thomas-nguy
Copy link
Collaborator

thomas-nguy commented Sep 28, 2021

Yes we dont want to modify anything on the module. All the logic needs to be done on the relayer side.

It is fine for other relayer to have different logic since its an open market.

To avoid to make the relayer stateful, how about instead of using time, using height or nonce? If we knows that batch is constructed every 10 blocks and the average block time, we can compute what is the next batch (height or nonce) that can be eligible to be send independently of the cost.

Otherwise we probably need a DB or some share memory to sync our relayers, if we want to manage multiple instances (that would also prevent them to try to send multiple time the same batch and lose gas fee as only one will be accepted)

@devashishdxt
Copy link

To avoid to make the relayer stateful, how about instead of using time, using height or nonce?

I don't think this makes any difference. If we store any value on relayer, we're making it stateful and different nodes can have different values for this unless we synchronize this using a central DB (which I don't think is a good idea for decentralized systems).

Any comments @yihuang @tomtau.

@thomas-nguy
Copy link
Collaborator

just to clarify the requirement.

This pricing mechanism will be only unique for "our relayer". It is not absurd to use a central DB to synchronise our relayer instances (if we want to manage multiple to ensure better reliability). At the end you can consider that it is only one unique "scalable" relayer.

Anyone can run their own instance of relayer implementing complete different logic. It is still decentralized

@yihuang
Copy link
Collaborator Author

yihuang commented Sep 28, 2021

To avoid to make the relayer stateful, how about instead of using time, using height or nonce?

I don't think this makes any difference. If we store any value on relayer, we're making it stateful and different nodes can have different values for this unless we synchronize this using a central DB (which I don't think is a good idea for decentralized systems).

Any comments @yihuang @tomtau.

Technically we only need at least one orchestrator to deliver the message, we could even only run one real orchestrator/relayer with several dummy ones, to avoid contention at all. So I think it should be ok to not synchronize the state.

@devashishdxt
Copy link

To avoid to make the relayer stateful, how about instead of using time, using height or nonce?

I don't think this makes any difference. If we store any value on relayer, we're making it stateful and different nodes can have different values for this unless we synchronize this using a central DB (which I don't think is a good idea for decentralized systems).
Any comments @yihuang @tomtau.

Technically we only need at least one orchestrator to deliver the message, we could even only run one real orchestrator/relayer with several dummy ones, to avoid contention at all. So I think it should be ok to not synchronize the state.

Ok

@tomtau
Copy link
Contributor

tomtau commented Sep 28, 2021

I think the original problem is pretty complex (besides the difficulties mentioned in the original posting, such as estimating gas costs and resilience to infrastructure faults) -- especially with regards to treasury management of arbitrary tokens (what is "profitable" at the time of submitting the batch may not be profitable a few moments later).

So, I'd suggest closing this for now and opening two issues for slightly more manageable problems:

  1. being able to bind costs of unprofitable messages: have some soft-guarantee time-range when batches are submitted, so one can know how long the orchestrator can operate with existing funds... so the main difficulty would be around gas estimates
  2. have an optional "fast-pass" for selected tokens: have a way to configure that batches would be submitted earlier for chosen tokens (where one assumes the price volatility is manageable), given some simplistic notion of "profitability" (e.g. in that pseudocode Problem: gravity-bridge orchestrators don't reject unprofitable messages from cronos to ethereum #99 (comment) )

The second problem would likely need gas cost estimates too, so perhaps starting with the first one and then building on top of it later?

@yihuang
Copy link
Collaborator Author

yihuang commented Sep 28, 2021

The second problem would likely need gas cost estimates too, so perhaps starting with the first one and then building on top of it later?

I've suggested @devashishdxt take the parameters from the config file, possibly:

  • estimated gas of batch execution, possibly constants for each token
  • current eth gas price
  • token eth exchange rate

And leave the task of updating these parameters to external programs.

@tomtau tomtau added the gravity bridge This issue or pull request is related to the Gravity Bridge functionality label Oct 28, 2021
thomas-nguy pushed a commit to crypto-org-chain/gravity-bridge that referenced this issue Mar 8, 2022
…ges from cronos to ethereum

Solution: Reject unprofitable messages. Also adds logic of a timeout duration after which the batch is sent even if it is unprofitable.

Fixes crypto-org-chain/cronos#99
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
gravity bridge This issue or pull request is related to the Gravity Bridge functionality
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants