ECIP-1092 51attack solution: PirlGuard & Callisto proposal #327

Closed
Dexaran opened this issue Aug 7, 2020 · 56 comments
Labels
status:1 draft ECIP is in draft stage and can be assigned ECIP number and merged, but requires community consensus. type: std-core ECIPs of the type "Core" - changing the Classic protocol.

Comments

@Dexaran
Contributor

Dexaran commented Aug 7, 2020


lang: en
ecip: 1092
title: 51% attack solution: PirlGuard & Callisto
status: Draft
type: Core
author: dexaran (@Dexaran)
created: 8/7/2020
license: LGPLv3

Abstract

The following describes a method of preventing a 51% attack on Ethash-based POW chains.

Motivation

Ethereum Classic has already suffered a number of 51% attacks. Here you can find an article from Cointelegraph describing the 51% attack on 6 Aug 2020. Here you can find another article describing the 51% attack on 10 Jan 2019. As long as the ETC protocol remains unchanged, it is susceptible to more 51% attacks.

Specification

The proposed solution refers to the PirlGuard protocol.

The description of the protocol states that, instead of automatically syncing with any privately mined chain branch, the new protocol should require the peer proposing the privately mined chain (and thus the reversal of publicly mined and synced blocks) to mine a number of "penalty" blocks. The number of penalty blocks must depend on the number of original blocks that would be reverted if the chain reorganized and synced to the proposed (privately mined) branch. Thus the cost of a 51% attack increases dramatically, because the attacker can no longer mine a private branch and then simply broadcast it to the network, reverting all the transactions from the "main" branch.
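
As a rough illustration of the rule described above, the sketch below computes a penalty that grows with the number of local blocks a proposed branch would revert. It is written in Python for brevity and is not the Go reference implementation linked under Implementations. The names `reorg_penalty` and `accepts_reorg`, the `check_length` threshold, and the choice to sum each block's distance from the local tip are assumptions, loosely following how the computation is described in the discussion below.

```python
def reorg_penalty(local_tip, incoming_numbers, check_length):
    """Return a penalty for a batch of incoming blocks (illustrative only).

    local_tip        -- block number of the node's current head
    incoming_numbers -- block numbers of the proposed branch, oldest first
    check_length     -- batches no longer than this are never penalized
                        (an assumed rule)
    """
    incoming = list(incoming_numbers)
    if len(incoming) <= check_length:
        return 0  # short reorgs are treated as natural forks
    # Each incoming block that sits below the local tip adds its distance
    # from the tip, so the penalty grows faster than the reorg depth.
    return sum(local_tip - n for n in incoming if n < local_tip)


def accepts_reorg(local_tip, incoming_numbers, check_length):
    """In this sketch a node follows the branch only if the penalty is zero."""
    return reorg_penalty(local_tip, incoming_numbers, check_length) == 0


if __name__ == "__main__":
    tip = 1000
    for depth in (5, 20, 40):
        # A branch forking `depth` blocks below the tip and one block longer.
        branch = range(tip - depth + 1, tip + 2)
        print(depth, reorg_penalty(tip, branch, check_length=10))
```

How a non-zero penalty translates into the extra blocks a proposer must mine is not spelled out in this text, so the sketch simply rejects any branch with a positive penalty.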

Rationale

This protocol is already implemented in Pirl and Callisto Network and has a working, time-tested reference implementation. The proposed change requires minimal modifications to the Ethereum Classic protocol while offering a reliable method of protecting against the recent attacks and the threat of similar attacks in the future.

Implementations

Here is a reference implementation of the protocol in Go: https://github.com/EthereumCommonwealth/go-callisto/blob/master/core/pirl_guard.go

https://github.com/EthereumCommonwealth/go-callisto/blob/master/core/blockchain.go#L1540

Presentation

Made by @padenmoss

http://padenmoss.com/public/ECIP-1092_proposal.pdf

Copyright

This ECIP is licensed under GNU Lesser General Public License v3.0.

@BaikalMine

It's a good idea!

@TheEnthusiasticAs
Member

TheEnthusiasticAs commented Aug 9, 2020

After several days of discussion on this topic in the ETC Discord, I have summarized the pro and con arguments I could collect:

Pros:

  • a working solution on the Callisto Network
  • Can be implemented like a "copy-paste"
  • could be seen as a temporary solution till a better one is found

Cons:

  • some developers have technical doubts till they see security report(s)
  • it makes it possible for a chain with less PoW to win (PoW rule: the chain with the most PoW wins)
    -> violation of the censorship-resistance principle (because a chain that was mined offline will be penalized)
  • some other blockchain principles could be broken in ways one can't see at the moment
  • an attacker is e.g. making 3k+ block reorgs. They have the time and money to check in and then mine 3k more honestly

Please extend/correct, if you have any suggestions!

@Dexaran
Contributor Author

Dexaran commented Aug 9, 2020

some developers have technical doubts till they see security report(s)

How can "some developers" have technical doubts if there is a working solution that has already been tested in the wild for a long time?

As a security auditor I can formally write a security report but it looks like an unnecessary encumbrance for the implementation that is already operational in various perfectly working networks.

@crazypool2019

We need a prompt and timely solution, even if it is temporary, implementing this solution is the fastest we have, and we urgently need it

@BaikalMine

We need a prompt and timely solution, even if it is temporary, implementing this solution is the fastest we have, and we urgently need it

I fully support! Without this decision, the coin will not live long.

@dinc334

dinc334 commented Aug 10, 2020

Expanse was attacked several times in 2019 (https://gist.github.com/metalicjames/01222049f95f85df8c0eb253de54848b)
And our lead developer @chrisfranko decided to implement PirlGuard; after that, we didn't have any problems. So, yeah, ETC should add it as fast as they can and maybe later change it to a custom solution.

@crazypool2019

We need a prompt and timely solution, even if it is temporary, implementing this solution is the fastest we have, and we urgently need it

I fully support! Without this decision, the coin will not live long.

it's true, at this moment etc is a dead coin

@Sparke2

Sparke2 commented Aug 10, 2020

ignorance of etc teams and community to the real problem in presence of a proposed solution is just awful
this damages reputation of etc project

@crazypool2019

Dear ETC developers, please don't waste any more time, implement pirlguard once and for all

@Dexaran
Contributor Author

Dexaran commented Aug 10, 2020

I would ask all the participants to provide a proper rationale for their opinions.
Please keep the discussion free from excessive comments. In order to make your voices counted you should switch to twitter and provide your requests there. In this case you can encourage even more parties to pay attention and react.

As for the github thread - I'd prefer to see more technical details and reasoned 'pros' and 'cons' of my proposal.

Thank you for understanding.

@realcodywburns realcodywburns changed the title ECIP-? 51attack solution: PirlGuard & Callisto proposal ECIP-1092 51attack solution: PirlGuard & Callisto proposal Aug 10, 2020
@dinc334

dinc334 commented Aug 10, 2020

Pros:

  • working solution on EXP, PIRL, CLO
  • easy to implement
  • easy to understand

Cons:

  • requires a hard fork
  • not a native ETC solution

Idk what more to add.

@phyro
Member

phyro commented Aug 16, 2020

I could be misunderstanding the implementation so I apologize if that's the case. In case I'm wrong I'd appreciate if you explained to me why this would not be possible.

Let's forget about 51% attacks and say the attacker wants to split the network. The PenaltyCheckLength variable is set to 60 for all clients.

From this line we can see that if I make a reorg of 60 blocks, the if condition will be false and hence we will have no penalty which means the reorg will happen and the person will switch to the attacker chain.
So the victim will now have its last 60 blocks different from the mainnet. The attacker now also adds 1 more block to the victim chain (could also add 10 or whatever). When the mainnet mines 2 blocks, the mainnet chain will have more PoW and the victim client will want to reorg, but since it will try to reorg 62 blocks, it will get a huge penalty due to the int64(tipOfTheMainChain - incomingBlock) computation in a for loop for all reorged blocks that add to the penalty, which means the victim chain will stay split for a long time.

If this is possible, then I could time these blocks in a way that roughly 1/3 or 1/2 of the network would receive them and become victims of a split. I could even make K splits if I made K different reorg chains and send them to different nodes.
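
The scenario above can be traced step by step in a small sketch. It is only a toy model: the `Node` class, the fork height of 1000, and the exact penalty rule (no penalty for batches of at most `check_length` blocks, otherwise the sum of `tip - number` over incoming blocks below the tip) are assumptions made for illustration, and difficulty is not modeled, so every offered branch is simply assumed to be heavier than the node's current one.

```python
class Node:
    """A toy node that tracks only its head block number and branch label."""

    def __init__(self, tip, branch, check_length=60):
        self.tip = tip
        self.branch = branch
        self.check_length = check_length

    def penalty(self, incoming_numbers):
        # Assumed rule: batches of at most check_length blocks are free;
        # otherwise every block below the local tip adds (tip - number).
        if len(incoming_numbers) <= self.check_length:
            return 0
        return sum(self.tip - n for n in incoming_numbers if n < self.tip)

    def offer(self, branch, incoming_numbers):
        """Offer a (presumed heavier) branch; switch only if penalty is zero."""
        p = self.penalty(incoming_numbers)
        if p == 0:
            self.tip, self.branch = incoming_numbers[-1], branch
        return p


fork = 1000                                # last common block (hypothetical)
victim = Node(tip=fork + 59, branch="A")   # victim follows mainnet A

# Step 1: the attacker releases 60 private blocks B(fork+1) .. B(fork+60).
p1 = victim.offer("B", list(range(fork + 1, fork + 61)))

# Step 2: the attacker extends the victim's new chain by one block.
p2 = victim.offer("B", [fork + 61])

# Step 3: mainnet later reaches A(fork+62) and is offered back to the
# victim, which would now have to accept a batch of 62 blocks.
p3 = victim.offer("A", list(range(fork + 1, fork + 63)))

print(p1, p2, p3, victim.branch, victim.tip)
```

Under these assumptions the first two offers carry no penalty, while the third does, so the toy victim stays on branch B even though it was only ever shown batches it considered acceptable.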

@knocte

knocte commented Aug 17, 2020

IMHO, ETC is selling itself as a more conservative alternative to ETH: by keeping PoW, by not having infinite inflation (like the 1st blockchain: BTC). Deploying this would go completely against this narrative, because this looks like a very experimental thing (has any other blockchain adopted this?). Also, given that ETH is switching to PoS very soon, we might get a new influx of PoW miners on ETC chain, which would decrease the feasibility of more 51% attacks.

@q9f q9f added status:1 draft ECIP is in draft stage and can be assigned ECIP number and merged, but requires community consensus. type: std-core ECIPs of the type "Core" - changing the Classic protocol. labels Aug 17, 2020
@GDHex

GDHex commented Aug 17, 2020

@phyro your questions are valid and your understanding is on the right path.
But keeping a small PenaltyCheckLength (10) and keeping a healthy set of honest and synced nodes will make any attack really hard, as the codebase already handles small reorgs, which leads to a very small window. As I have said many times (and at some point I started expanding on the code but didn't have the time to complete it), this is NOT a complete 51% attack prevention.
On the other hand, PirlGuard has secured many chains and mitigated a good number of attacks. It would be trivial to test (even your edge case), implement, and even expand to suit your needs.
As the ETC chain is currently being attacked on a regular basis, why not try the solution on a testnet, where you can also evaluate all the assumptions that you might have regarding limiting, splitting, and proper parameters?
Hope that helps

@Dexaran
Contributor Author

Dexaran commented Aug 17, 2020

@phyro this is a good example of an edge case attack.

I could be misunderstanding the implementation so I apologize if that's the case. In case I'm wrong I'd appreciate if you explained to me why this would not be possible.

TL;DR

This is possible in theory but this should be a 99,9% attack and there is no way to benefit from this attack for the attacker so there are no incentives to perform such an attack for any party.

(Unless the goal of this attack is to destroy the network. However this goal could be accomplished by 51%-attacks in the current state of the network and this would be much cheaper)

The PenaltyCheckLength variable is set to 60 for all clients.

This is still a field for tuning and adjustment. Here we assume that there should be a good balance between the ability to resync after a natural split and restricting an attacker's ability to develop a longer chain and thus cause a reorganization.

There are good reasons to set PenaltyCheckLength to the smallest possible value (for example 12, as was suggested as the "transaction finality threshold" by the Ethereum Foundation in the early days of ETH, if I remember correctly). I'm ready to gather and provide the statistical data from Callisto and Pirl if necessary.

The attacker now also adds 1 more block to the victim chain (could also add 10 or whatever).

The attacker can not add more than PenaltyCheckLength blocks however. This is another reason to set it to the lowest possible value.

When the mainnet mines 2 blocks, the mainnet chain will have more PoW and the victim client will want to reorg, but since it will try to reorg 62 blocks, it will get a huge penalty due to the int64(tipOfTheMainChain - incomingBlock)

The less the value of PenaltyCheckLength the fewer penalty blocks would be assigned. This is the third reason to set it to the lowest possible value.

which means the victim chain will stay split for a long time.

Finally the chain will resync again. It should be noted that the attacker must maintain the chain offshoot for a long time, which is costly (without any possibility of making a profit from this, unlike the 51%-attack case). Once the attacker runs out of funds, the chain will eventually die (or become the main chain, depending on whether hashpower sources adopted the attacker's branch of the chain or not).

It must be noted that block explorers and most exchanges run background nodes instead of relying on a single node confirming deposits. Once there is a disparity between the background node and the node which has confirmed the deposit the exchange must immediately halt deposits until the investigation of the issue is conducted. Unlike the 'stealth mining attack' that can go for an unlimited period of time while the rest of the chain participants are completely unaware of the attack - this edge case will be spotted immediately.

It is not possible to harm any exchange (or DAPP) with this attack as every entity which has multiple geographically spread nodes will have nodes synced with both chains (the attackers branch and the "previous mainnet branch"). In the worst case the exchange can whitelist the correct branch for those nodes who got to the wrong chain due to network latency at the moment of the split.

If this is possible, then I could time these blocks in a way that roughly 1/3 or 1/2 of the network would receive them and become victims of a split

Block mining is very random and will greatly vary in time. You can't just generate a block at a given point in time unless you have 99,99% hashrate advantage.

Over a prolonged time frame like 10 hours, a minimal difference in hashrate will play its role, which is what makes 51% attacks possible (even the slightest advantage in hash power may let the attacker develop a slightly longer chain if he maintains this advantage for long enough). If you want a certain block to be generated at exactly the moment when 1/2 or 1/3 of the network has already adopted another block, then a 51% advantage is not sufficient. If you then want to generate a couple of blocks in a row within a very limited time frame (be it 60 or 12 blocks), you need much more hash power than 51%.

I could even make K splits if I made K different reorg chains and send them to different nodes.

For each of the K chains you need to pay for another 51% (at the very least, and in practice even more) of hashpower.

So, if you have an amount of funds that is orders of magnitude higher than the capitalization of the attacked network in order to invest in an attack that will not bring you any profit, then you can do it.

@Dexaran
Contributor Author

Dexaran commented Aug 17, 2020

@knocte

ETC is selling itself as a more conservative alternative to ETH: by keeping PoW, by not having infinite inflation (like the 1st blockchain: BTC). Deploying this would go completely against this narrative

The chain that has good narrative and ideology but does not work is worthless.

this looks like a very experimental thing (has any other blockchain adopted this?)

This is adopted by Callisto, PIRL, Expanse and other networks. Please, read the ECIP text again.

given that ETH is switching to PoS very soon

IF ETH is switching to PoS ever. There were plans to migrate to PoS at the very start of Ethereum in 2014. Then in 2015. Then in 2016 and 2017 and in 2018. In 2019 this should have certainly happened. Now in 2020 ETH is going to switch to PoS again... now they even have a PoS testnet.

Also, ETH was supposed to have a decentralized file system called SWARM, which also had a testnet. In 2018 iirc. Nothing has happened so far.

@phyro
Member

phyro commented Aug 17, 2020

@Dexaran thanks for the answer.

The attacker can not add more than PenaltyCheckLength blocks however so this is another reason to set it to the lowest possible value.

From my understanding, It can after the blocks have been reorged.

Block mining is very random and will greatly vary in time. You can't just generate a block at a given point in time unless you have 99,99% hashrate advantage.

Yes, but this problem can be bypassed if you just mine 80 blocks and then you have 20 windows from 60 to 80 you can try to hit.

For each of the K chains you need to pay for another 51% (at the very least, and in practice even more) of hashpower.

You can actually be smarter than I described. Mine 59 blocks and then split them. This way you can create K split by adding just a few blocks to each chain that shares most of the blocks.

I do agree that this can't really hurt 51%, but it opens the attack vector for splitting the network and if the adversary wants to damage the network, they can do so with relatively little cost.

@Dexaran
Contributor Author

Dexaran commented Aug 17, 2020

@phyro for this example let's define chain_A(n) and chain_B(n) where n is the number of a block. A(n) is the nth block of chain A. chain_A is the mainnet. We assume PenaltyCheckLength == 60 as you have proposed, while in practice I would suggest 12 or 20.

From my understanding, It can after the blocks have been reorged.

Yes - the attacker can add some blocks after the chain is already reorged. However it is not possible that the attacker will generate a relatively large number of blocks.

I.e. when the mainnet is at block A(n) the attacker can generate block B(n+1). Once the mainnet is at block A(n+60) the attacker could have generated a number of blocks on chain_B. Say the attacker has B(n+67), or B(n+70) if he got very lucky. The difficulty adjustment would not allow the attacker to generate a significant number of blocks within this short time window between mainnet blocks A(n) and A(n+60).

The attacker can only propagate as many blocks as he has generated within this time window. If he is too late to propagate his block to the network at the moment of A(n+60), the attack fails completely, since at A(n+61) it is already too late.
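
One rough way to put numbers on "very lucky" is a simple block race. The sketch below is not part of the proposal and rests on strong assumptions: each new block belongs to the attacker with probability `q` (his share of the combined hashrate), independently of all others, and both sides mine at the same fixed difficulty, so difficulty adjustment is ignored entirely. The function name `prob_attacker_reaches` is made up for this example.

```python
from math import comb


def prob_attacker_reaches(k, m, q):
    """P(attacker finds at least k blocks before the honest side finds m).

    q is the attacker's share of the combined hashrate (0 < q < 1).
    Assumes each new block is the attacker's with probability q,
    independently, and that both sides mine at the same fixed difficulty.
    """
    p = 1.0 - q
    # The number of attacker blocks found before the m-th honest block is
    # negative-binomially distributed: P(X = x) = C(x+m-1, x) * q^x * p^m.
    below_k = sum(comb(x + m - 1, x) * q**x * p**m for x in range(k))
    return 1.0 - below_k


if __name__ == "__main__":
    for q in (0.3, 0.5, 0.6):
        for k in (60, 67, 70):
            print(f"q={q:.1f} k={k}: {prob_attacker_reaches(k, 60, q):.4f}")
```

With equal hashrate on both sides, reaching 60 blocks first is a coin flip by symmetry, and each additional block the attacker needs within the same window makes the outcome less likely. A model that includes difficulty adjustment on the private chain would give different numbers.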

Yes, but this problem can be bypassed if you just mine 80 blocks and then you have 20 windows from 60 to 80 you can try to hit.

This should not be like this.

Say the attacker has split off at block A(n), which means that his chain includes blocks:

 A(n), B(n+1), B(n+2) ... B(n+60), B(n+61), ... B(n+80).

This is really unlikely and assumes that the attacker has a very large hashrate advantage if he was capable of generating that many blocks.

The mainnet chain includes blocks:

 A(n), A(n+1), A(n+2) ... A(n+59).

At the point of time the block A(n+59) is mined the attacker is PREPARING and waiting for the block A(n+60) to be mined.

Once the block A(n+60) is mined and propagated to the network, the mainnet nodes start syncing it. At that moment the attacker must propagate block B(n+60). It is worth noting that one of the blocks A(n+60) and B(n+60) is better than the other, so nodes that have access to both blocks would pick one of them, and it is deterministic which decision is correct at this block height.

The attacker then must propagate block B(n+61) to the nodes who have adopted block B(n+60) and cause them to refuse syncing with chain A.

If the attacker has failed to do so at block A(n+60), then at block A(n+61) he can no longer do anything. The window between B(n+60) and B(n+80) does not grant any advantage to the attacker, because the nodes that will make a decision regarding resyncing are on chain_A, and they don't care how much longer chain_B is once it proposes to rewrite more than 60 blocks.

In order to make your scenario with time window possible the attacker should have developed another chain_C starting at block A(n+1) so that the alternative chain C would include blocks

A(n), A(n+1), C(n+2), C(n+3) ... C(n+61)

The attacker would have one more attempt to cause the network split at block A(n+61) / C(n+61), but if he fails to do so then the attack will also fail without any new time windows.

You can actually be smarter than I described. Mine 59 blocks and then split them. This way you can create K split by adding just a few blocks to each chain that shares most of the blocks.

Developing a chain and causing a split (which is already a hard task that depends on luck, which means that it is very costly in practice) is not enough. The attacker must maintain the network for a long time, which requires him to pay for hashrate on that chain. Otherwise the nodes would eventually sync to the main branch again.

@Dexaran
Contributor Author

Dexaran commented Aug 17, 2020

Again, if some party is a multi node entity and it has nodes at both chains A and B then this party can deterministically decide which chain is correct.

Because the chain split happened at block n+60 and it is possible to determine which chain includes a better block A(n+60) or B(n+60).

This entity can whitelist the correct chain for the node that suffered network latency issues and failed to sync with the correct chain before it downloaded the next block and it was too late to reorg the chain.

@Dexaran
Contributor Author

Dexaran commented Aug 17, 2020

The cause of this edge case is network latency, not an underlying consensus flaw. If we assume that the network has zero latency and all the nodes of the network receive blocks A(n) and B(n) immediately with no downtime, then this edge case would not exist. The nodes would just pick the better block and follow the correct chain.

@GDHex

GDHex commented Aug 17, 2020

@Dexaran amazing analysis!
Yes you are right that's why we stressed about the need to have a decent amount of honest nodes. For Pirl the master nodes gave that guarantee.
Again, any assumption that needs clarification is very easily resolved by running the solution on a testnet, where any attack discussed here could be evaluated.

@GDHex

GDHex commented Aug 17, 2020

BTW PirlGuard is based on the solution provided by Zen Cash team in this video -> https://www.youtube.com/watch?v=E99wpSZs6iM

@Dexaran
Contributor Author

Dexaran commented Aug 17, 2020

Adopting each other's solutions, sharing research, sharing ideas, and building on the back of what already exists are definitely good practices in the programming and blockchain industry.

This was the original reason for the creation of "Ethereum Commonwealth" - we must avoid duplicating work and spending effort on what has already been done.

I'm glad to see like-minded persons here.

Divided we fail: https://medium.com/s/story/divided-we-fail-the-irrational-insanity-of-crypto-tribalism-6acc54465769

@chrisfranko

chrisfranko commented Aug 17, 2020 via email

@Dexaran
Contributor Author

Dexaran commented Aug 17, 2020

There is never a time where the added weight of the longest or heaviest chain is worth the 1000+ reorganization. Ever. The economic damage that happens from such a reorg is also a cost.

@chrisfranko is right.

There is no such situation at which a huge reorganization is necessary. Huge reorganizations never happen naturally - these are already an abnormal behavior of the network.

If such a split happened naturally, then some exchanges would already be damaged. This would be a disaster for the network. In practice this never happens. Only an attacker can exploit this "feature" of the protocol to harm the network and its participants.

@gitr0n1n
Contributor

gitr0n1n commented Aug 21, 2020

Github is the most proper platform to discuss technical merits of the proposals without any third-party moderation and other teams controlling the time allocation etc.

Fair enough, you're right about the Q&A between teams. It was one of my suggestions after the backlash from the call for those that desired talking during that call but didn't get the opportunity. I can see how one would prefer Github if they don't want to utilize those calls. And I feel you're correct that Github is the proper channel since that is what the ECIP process states. Of course, you shouldn't be required to do anything outside of the ECIP process. I don't think anyone was requiring this suggestion, it was more of an invitation to join that form of dialogue.

IMO some of your criticism has merit, other parts are hyperbolic. As an example, I submitted an ECIP the same day as you, ECIP 1093, and it's been treated similarly; no tweets, broken file path, auto rejection in a poorly handled CDC call.

However, your ECIP got a marketing push from ETC Labs this past week via crypto publication articles. Additionally the #general channel has been flooded with new accounts shilling your proposal for nearly two weeks. You've had community members engaging in thoughtful discussion related to the technical merits of the proposal. So it's not as if it has not gotten attention. Just some food for thought as you complain about a lack of attention.

In any case, I drafted up this little blog to try to keep track of all the evolving proposals and topics buzzing around the ECIP world:
https://ethereumclassic.org/blog/2020-08-21-core-devs-call-2020-Q3-Hardfork-Process-Feedback

@gitr0n1n
Contributor

gitr0n1n commented Aug 21, 2020

For what it's worth @Dexaran, I prefer this method over the algorithm change methods as I believe the risks of those changes are far greater than remaining on Ethash and finding a solution. Additionally, from what I've seen on the Checkpointing and Timestamping system of IOHK, I'm not a fan of the idea of relying on a separate federated chain to secure ETC. So this proposal appears to be the most logical solution I've read thus far. That's not to negate the fair criticism listed in above comments. Just my opinion on the matter.

@Dexaran
Contributor Author

Dexaran commented Aug 22, 2020

So it's not as if it has not gotten attention. Just some food for thought as you complain about a lack of attention.

I'm not complaining about the lack of attention, but I am totally against censorship.

I am a longtime contributor of the project. I came up with a solution and a team capable of implementing it in no time. I followed all the contribution guidelines. Then I asked for a quick announcement of my proposal because I originally proposed it as a temporary solution that could be implemented in a soft-fork and calm down exchanges, pools and users to give us the time we need to carefully evaluate long-term solutions.

However it turned out that someone's employee has a little bit of power in an important part of the project and he has exploited all his control to censor my proposal because he does not personally like me for some subjective reason. This goes against the benefit of the project and the community. This goes against the core principles of the project and the Crypto-Decentralists Manifesto.

Neutrality is necessary. It’s important for anyone participating in blockchain-enabled cooperation to be on an equal footing with everyone else. It doesn’t matter if you wield huge economic power or only a tiny amount. It doesn’t matter whether you’re a saintly Mother Theresa or a vicious drug dealer. It doesn’t matter whether you’re a human or a refrigerator. It doesn’t matter what you believe in, what political theory you subscribe to, or whether you’re a moral or immoral person. A participant’s ethnicity, age, sex, profession, social standing, friends or affiliations, make or model, goals, purposes or intentions — none of this matters to the blockchain even a bit. The rules of the game are exactly the same for everyone, period. Without neutrality, the system is skewed towards one set of participants at the expense of others. - A Crypto-Decentralists Manifesto

I was constantly asking for an explanation of the reasons behind not announcing my proposal. The aforementioned employee openly refused to answer my question, showed extreme incontinence, used curses and profanity. This is a cruel violation of any professional ethics.

If IOHK has 10x the resources and produces 10x more content than me, so they get 10 announcements versus my 1 announcement, then that's okay. But please, when I have something important to announce - do not censor it on purpose even if you don't like me.

After that, I insistently asked to provide the github issue link but not the pull request in the announcement. Most of our Callisto security auditing workflow is handled through Github and we have performed more than 300 security audits there so we have a great expertise in understanding user experience when it comes to using the github platform. I'm absolutely sure that most of the users never want to get to the pull request when they follow the proposal link. Instead they want to get to the human-friendly proposal description and the following discussion. As a result, many (potentially valuable) contributors will leave ECIP without participating in the discussion if they are redirected to a pull request that does not provide any valuable information about the proposal and only serves as a place where developers can validate the submitted proposal.

In the recent announcement that was finally made the Pull Request link is provided instead of issue link. This is counterproductive.

After all, once the employee made the announcement, he did it wrong. The mistake is that he provided two links to ECIP 1092 but no link to ECIP 1094, which is also referred to in the announcement.

I'm happy that the issue is resolved.

I'm not calling anyone by name on purpose. I believe that our internal conversations and disagreements should stay internal but not related to this ECIP and I hope that such a censorship and unprofessional counterproductive behavior will never happen again.

I wish him to be careful, because next time he might make a bigger mistake because of emotion. The project can be seriously damaged if the account is compromised as a result of a human mistake and someone scams thousands of people on behalf of the project using an “official” account.

In the light of the recent events I would like to re-open an old Decentralized on-chain registry of media resources.

@Dexaran
Contributor Author

Dexaran commented Aug 22, 2020

Additionally the #general channel has been flooded with new accounts shilling your proposal for nearly two weeks. You've had community members engaging in thoughtful discussion related to the technical merits of the proposal.

I'm an active contributor of large projects like Ethereum and EOS so there are some people interested in what I'm doing. Also, the proposed protocol has some followers already because it is not my invention and it is not the first time the protocol is being used to solve a problem of 51%-attacks.

Fair enough, you're right about the Q&A between teams. It was one of my suggestions after the backlash from the call for those that desired talking during that call but didn't get the opportunity.

I will try to have a representative for the next call. I agree that having a video/audio explanation and a number of representatives is better than just a pure text description.

In any case, I drafted up this little blog to try to keep track of all the evolving proposals and topics buzzing around the ECIP world

Great work, thank you.

@Dexaran
Contributor Author

Dexaran commented Aug 22, 2020

from what I've seen on the Checkpointing and Timestamping system of IOHK, I'm not a fan of the idea of relying on a separate federated chain to secure ETC. So this proposal appears to be the most logical solution I've read thus far.

After I investigated the IOHK Checkpointing proposal, I came to the conclusion that it does not interfere with ECIP 1092 and both proposals can be applied at the same time.

This would probably be the best option.

@padenmoss

padenmoss commented Aug 23, 2020

I have created a simple powerpoint to reinforce this proposal. Please feel free to reuse or request changes.
http://padenmoss.com/public/ECIP-1092_proposal.pdf

@IstoraMandiri
Contributor

IstoraMandiri commented Aug 23, 2020

Hi @Dexaran thank you for the explanation.

Note there are two types of eclipse attack outlined in the paper:

  • Eclipse attack by monopolising connections
  • Eclipse attack by table poisoning

You say:

It is worth mentioning that isolating a node is not an easy task already. We can't just tell a node "Please drop your current connections and connect to my nodes instead".

But the latter method, table poisoning, appears to be exactly this.

In any case, the idea would not be to target a single node, but to target groups of mining pools so that you create a chain partition that lasts for 60+ blocks. If a node then eventually rejoins the network, it cannot reach consensus.

I would also mention that, with or without an eclipse attack, a state actor who controls a nation's IP infrastructure (perhaps a nation state that controls the most mining power) would be able to trivially achieve a network partition, and thus a permanent chain split, by blocking certain remote connections for a short amount of time (60 blocks).

@IstoraMandiri
Contributor

IstoraMandiri commented Aug 23, 2020

I will answer your specific responses:

This is not an issue of PirlGuard in the first place. Then "isolating a node with PirlGuard" is as hard as "isolating a node without PirlGuard". So the complexity of the attack is literally the same.

The problem is not with defending the node from eclipse attacks but dealing with the aftermath of eclipse attacks.

In the case of PirlGuard, you cannot rely on the longest chain rule, as it comes down to the local (subjective) state of the network.

Finality is not subjective. It is simply governed by new consensus rules, which are certainly objective but to some extent dependent on the previous state of the network, unlike the current consensus rules, which assume that the network could be shut down and then resumed at any time. However, in reality the network is never shut down and we can remove these redundant mechanics from the consensus.

It is subjective because it does not rely on an objective point of reference -- i.e. the longest chain based on Proof of Work.

PirlGuard is subjective because it relies on the state of the network from the point of view of the node; it is literally subjective.

Naturally, long partitions never happen. You can point out any example since 2009 if you have one. I'm not aware of any natural chain splits in the past 11 years. Prevention of such chain splits is the main goal of the network and of all the complex underlying mechanics, because these splits would make the network unusable.

I don't have enough information to prove whether or not this happens, but lack of it happening doesn't prove that it can't happen.

It may be the case that it only doesn't happen because there's no incentive to have this kind of attack. It may be that it would only happen when a large market cap chain implements a consensus protocol that incentivises this attack.

In the case of a successful eclipse attack, a node or a group of nodes may definitely be isolated and never sync with the mainnet chain again. However, this would not happen if some of these nodes had a whitelisted private connection with another, distant node to compare with.

Perhaps, but can we guarantee this? Can we also guarantee that those whitelisted nodes cannot be censored or DDoSed? Could an attacker not partition the network in a way that maintains some of these whitelisted connections? Could a state actor partition the network for 60+ blocks?

These are all additional assumptions about the security model that we don't have to worry about now, but will be introduced with PirlGuard.

If the cost of the attack is not known and taking control over 100% of the nodes of the network is not possible even in theory, then the attack has little to no chance of success.

This assumption is incorrect. You don't need to take control of 100% of the nodes; you just need to partition the network temporarily.

The economics and ideological background of 51%-attacks are absolutely the same from what we can observe in the past history of our projects. These are the same at the present time.

There is no reason to assume that economics and politics are different in case of this attack.

As mentioned, I disagree because the incentive to do an attack on ETC is greater and requires roughly the same resources. The risk/reward is higher, and it might be worth waiting to execute this 0-day attack if you can get a greater payout. It may even be easier to pull off on ETC because its network is larger, more decentralized, and therefore easier to partition.

The politics are clearly different because ETC poses more of a threat and is disliked by more people than other smaller chains.

Additionally, the market cap metric matters because, when sabotaging the ETC network, it is easier to open short positions that do not affect the price of ETC due to its highly liquid market.

@Dexaran
Contributor Author

Dexaran commented Aug 24, 2020

@hitchcott

It is worth mentioning that isolating a node is not an easy task already. We can't just tell a node "Please drop your current connections and connect to my nodes instead".

But the latter method, table poisoning, appears to be exactly this.

In any case, the main vulnerability exploited in this attack scenario is the fact that a node has an empty table upon reboot and the attacker has some methods of filling the victim's db with his malicious nodes.

Simply having two nodes and never rebooting them at the same time will solve the problem per se.

In any case, the idea would not be to target a single node, but target groups of mining pools so you create a chain partition that lasts for 60+ blocks

Mining pools have every incentive to retain the consensus state. They also have the means to prevent any attack, i.e. having "private" observer nodes and background nodes. Even if mining pools refuse to do so, miners will switch their hashing power, because miners are incentivised to be on the chain that is supported by exchanges, not by pool operators. As a result, the system tends to self-balance and remain in a consensus state.

It may be the case that it only doesn't happen because there's no incentive to have this kind of attack. It may be that it would only happen when a large market cap chain implements a consensus protocol that incentivises this attack.

In case of 51%-attacks if something is vulnerable then it is attacked (no matter small cap or large cap). In case of this attack there is no reason to assume that the system of incentives will be somewhat different from that in other types of attacks.

If the cost of the attack is not known and taking control over 100% of the nodes of the network is not possible even in theory, then the attack has little to no chance of success.

This assumption is incorrect. You don't need to take control of 100% of the nodes; you just need to partition the network temporarily.

And you can't do so if you do not even know which nodes to target. If a victim has a background node and you don't know which node of the network it is, you have little to no chance of causing a network split by just isolating random nodes and hoping that one of them will be the victim's node.

However, this would not happen if some of these nodes had a whitelisted private connection with another, distant node to compare with.

Perhaps, but can we guarantee this? Can we also guarantee that those whitelisted nodes cannot be censored or DDoSed? Could an attacker not partition the network in a way that maintains some of these whitelisted connections? Could a state actor partition the network for 60+ blocks?

All the network participants have incentives to adopt the "an isolated node can not trust itself" paradigm and thus resist the attack.

  • Exchanges are not interested in losing funds.
  • Mining pools are not interested in mining a chain that is not adopted by exchanges.
  • Miners are not interested in supporting pools that mine improper chains.

The whole ecosystem is only interested in remaining in a consensus state.

The politics are clearly different because ETC poses more of a threat and is disliked by more people than other smaller chains.
Additionally, the market cap metric matters because, when sabotaging the ETC network, it is easier to open short positions that do not affect the price of ETC due to its highly liquid market.

Your statement does not correspond with what we can see in reality. In the case of other types of attacks (including 51% attacks) this difference does not play any role, because all chains are 51%-attacked as long as they are vulnerable, and the market cap makes no difference. There is no reason to assume that in the case of this particular attack the system of incentives is different.

It is subjective because it does not rely on an objective point of reference -- i.e. the longest chain based on Proof of Work.
PirlGuard is subjective because it relies on the state of the network from the point of view of the node; it is literally subjective.

It may be subjective from the point of view of a single node. From the point of view of the external observer it is always possible to determine which chain is valid objectively.

However if all the existing chains are known at the same time then it is always possible to determine which chain is the correct chain by rolling back the time to the moment of the split and determining which block of two blocks with the number split_block_number + finality_threshold is better. Then the chain with the better block is considered valid.
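The rule described above can be sketched in a few lines of Python. This is only an illustration under assumptions that the comment does not spell out: chains are plain lists of block identifiers indexed by block number, `split_block_number` is taken to be the number of the last shared block, and the meaning of "better" is left to a caller-supplied `is_better` function. All names here are hypothetical.

```python
from typing import Callable, List, Optional


def resolve_split(
    chain_a: List[str],
    chain_b: List[str],
    finality_threshold: int,
    is_better: Callable[[str, str], bool],
) -> Optional[List[str]]:
    """Pick the chain whose block at split_block_number + finality_threshold wins.

    Returns None when the chains share no block, have not diverged, or when
    either chain is too short to contain the deciding block.
    """
    if finality_threshold < 1:
        raise ValueError("finality_threshold must be at least 1")
    # Count the blocks both chains have in common, starting at genesis.
    common = 0
    for block_a, block_b in zip(chain_a, chain_b):
        if block_a != block_b:
            break
        common += 1
    if common == 0 or common == min(len(chain_a), len(chain_b)):
        return None
    split_block_number = common - 1  # assumption: number of the last shared block
    deciding = split_block_number + finality_threshold
    if deciding >= len(chain_a) or deciding >= len(chain_b):
        return None
    return chain_a if is_better(chain_a[deciding], chain_b[deciding]) else chain_b


chain_a = ["g", "a1", "a2", "a3"]
chain_b = ["g", "b1", "b2", "b3", "b4"]
# Placeholder ordering only; a real rule would compare actual block properties.
winner = resolve_split(chain_a, chain_b, 2, lambda x, y: x < y)
print(winner is chain_a)  # True
```

Note that the sketch needs both chains to be visible at once, which is exactly the "external observer" condition stated above.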

Again, the reason for the split in this case is not a shortcoming of the consensus but the imperfection of hardware or connection.

The problem is not with defending the node from eclipse attacks but dealing with the aftermath of eclipse attacks.

You are correct - in case of a network split mitigation of the consequences will be harder for those network participants who weren't prepared.

On the other hand, the network participant is obliged to pay attention to the network mechanism that he uses. The same logic can be applied to mining pools that have super-laggy nodes and therefore always mine orphaned blocks.

It is always possible to find a way to have bad hardware or a bad connection, or to not follow the security guidelines, and thus suffer from your own irresponsibility. At that point it is no longer a problem of the consensus model.

@padenmoss

padenmoss commented Aug 24, 2020

Suggestion: Change the title to "Mining Penalty for Deep Reorganization" as the special characters in the title and author field are preventing table population at https://ecips.ethereumclassic.org/core

@chrisfranko

chrisfranko commented Aug 26, 2020 via email

@creepas

creepas commented Aug 30, 2020

As there are no serious arguments against this, let's move it to last call?

@q9f q9f mentioned this issue Aug 31, 2020
@creepas

creepas commented Sep 1, 2020

I talked to many other miners and we do believe it's important to secure the network with this ECIP. If you then want to work on other solutions, fine. But we can't support a network where we are losing so much money while a 51 percent attack is going on. We need a solution fast. As this is an already tested and working solution, I'm heavily in favor of moving it to last call.

@creepas

creepas commented Sep 1, 2020

So I went through the code and it's really simple:

If you are an offline miner, you can't join the network and do the reorg because of the penalty.

It makes sense, because if you mine offline and don't interact with the network, then WHT* are you mining?

If you do mine offline, it stands to reason that you just want to prepare an attack and that's all... what else would you be doing?

Example of penalty:

Let's say you mined 5 blocks in private (offline) and now you want to sync them to the real online network:

penalty = (1+1) + (1+1+2) + (1+1+2+3) + (1+1+2+3+4) + (1+1+2+3+4+5) = 2 + 4 + 7 + 11 + 16 = 40

So to get 5 offline blocks accepted you need to mine 40 online blocks.
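The arithmetic of this example can be checked with a short Python sketch. It assumes the term for the k-th private block is 1 + (1 + 2 + ... + k), which is read off the worked example and not taken from the PirlGuard source; the function name is hypothetical. Evaluated literally, the five terms sum to 40.

```python
def penalty_for_private_blocks(n: int) -> int:
    """Total penalty for n privately mined blocks, following the worked example.

    The k-th term is 1 + (1 + 2 + ... + k). This mirrors the example's
    arithmetic only; it is not the PirlGuard implementation.
    """
    total = 0
    for k in range(1, n + 1):
        total += 1 + k * (k + 1) // 2
    return total


terms = [1 + k * (k + 1) // 2 for k in range(1, 6)]
print(terms)                          # [2, 4, 7, 11, 16]
print(penalty_for_private_blocks(5))  # 40
```

Under this reading the total grows roughly with the cube of the number of private blocks, which is why deep reorgs become expensive quickly.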

I don't think it has anything to do with "Nakamoto ideas", as it seems to penalize only the actor that wants to damage the network...

With the last attack it would be an insane number...

With that, even 60 confirmations on exchanges would be okay and safe, as the penalty would simply be too big. (It can be bigger for a start...)

But it would be much lower than the 80,000 that some exchanges have now 💯!

Maybe it's better to agree on tolerating some small reorgs, as in the original code:

Small reorg are tolerated using this params https://git.pirl.io/community/pirl/-/blob/master/params/protocol_params.go#L147 as it can happens with network latency or other events related to the nature of the connection

@Dexaran
Contributor Author

Dexaran commented Sep 10, 2020

In the absence of counterarguments against the implementation of this proposal, and in view of the obvious need for a solution to 51% attacks, I propose to move this to "Last call".
It is clear that at this stage this proposal enjoys the support of the ETC community and satisfies the requirements of rough consensus in my opinion.

@wpwrak

wpwrak commented Sep 25, 2020

Sadly, there is still no formal definition of the change. Such a definition should include, as a minimum,

  • the representation of the chain this modification is applied to,
  • the ways how this chain is updated,
  • precisely where, when, and how Pirlguard modifies the update process and with which points of the chain representation it interacts, all this described in the terms defined as part of this formal definition,
  • if not part of the above item, actions that result from Pirlguard's classification.

Only then will it be possible to meaningfully reason about the effectiveness of the protection afforded by Pirlguard, possible weaknesses, and to analyze/simulate effects emerging from the interaction of groups of nodes.

For a change to a consensus protocol, this amount of formal foundation should be considered a minimum requirement before it could move beyond any draft stage of a standardization process. Having an implementation is nice, but introduces a vast number of details that probably are not relevant to the intended design of Pirlguard.

@padenmoss

padenmoss commented Oct 1, 2020

I do not understand what formal definition you are looking for. Are the proposals and implementations referenced unclear in some way for you?

Here is a formal definition from the linked PirlGuard website:

"Once the attacker opens their node for peering it will attempt to peer with rest of the nodes on the network, telling them that they are wrong. However, once this happens PirlGuard will drop the peer and penalize them by sentencing them to mine X amount of penalty blocks due to their un-peered mining."

To answer your questions:

  • This "modification" requires a hard fork that must apply to a majority of nodes, and from then on the protocol will affect the chain whenever a block that requires a chain reorganization of pre-determined length ('X') is introduced by one node to any other node.

  • The chain is "updated" by penalizing blocks introduced that require deep chain reorganization greater than 'X', which prevents 51% attacks as an improvement.

  • The "update process" modifies consensus between nodes whenever a block requires chain reorganization deeper than 'X'.
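As a rough illustration of the check described in these three points, here is a minimal Python sketch. It assumes chains can be represented as lists of block hashes from genesis, and it uses a made-up value for 'X'; the names `MAX_FREE_REORG_DEPTH`, `reorg_depth` and `classify` are hypothetical and do not come from the PirlGuard or Callisto code.

```python
from typing import List, Tuple

# Hypothetical stand-in for the 'X' above; the value is illustrative only.
MAX_FREE_REORG_DEPTH = 2


def reorg_depth(local_chain: List[str], proposed_chain: List[str]) -> int:
    """Number of local blocks that adopting proposed_chain would revert.

    Both chains are lists of block hashes starting at the same genesis block.
    """
    common = 0
    for local_hash, proposed_hash in zip(local_chain, proposed_chain):
        if local_hash != proposed_hash:
            break
        common += 1
    return len(local_chain) - common


def classify(
    local_chain: List[str],
    proposed_chain: List[str],
    max_depth: int = MAX_FREE_REORG_DEPTH,
) -> Tuple[str, int]:
    """Return ("penalize", depth) for deep reorgs, else ("normal", depth).

    "normal" means the proposal is left to the usual fork-choice rule.
    """
    depth = reorg_depth(local_chain, proposed_chain)
    if depth > max_depth:
        return "penalize", depth
    return "normal", depth


local = ["g", "a1", "a2", "a3", "a4"]
honest = ["g", "a1", "a2", "a3", "b4", "b5"]   # replaces only the tip
attacker = ["g", "x1", "x2", "x3", "x4", "x5"]  # privately mined branch
print(classify(local, honest))    # ('normal', 1)
print(classify(local, attacker))  # ('penalize', 4)
```

What "penalize" then entails (how many penalty blocks, and how they are counted) is not modeled here.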

This is all thoroughly documented in the links provided herein. Solving this issue with a hard fork is permanent and not subject to conflicting chain splits between various client updates as was experienced at the beginning of August 2020.

@wpwrak

wpwrak commented Oct 1, 2020

A formal specification would 1) be concise, 2) allow comparing its model against related specifications, such as that of Ethash, 3) allow precise modelling of the behaviour of the proposed change, and 4) allow verifying that implementations match the specification. Pirlguard comes with a) very high-level descriptions of the general idea, and b) a patch to an existing client.

a) means that there is far too much room for interpretation, and all the elementary steps (e.g., what information is present at what time, both in terms of state and in terms of data passed between clients, how exactly it is to be used, including how it affects state) are not clearly defined. b) in a way provides that detail, but puts it in the context of a complex implementation. So in order to understand what exactly that patch is doing, you need to i) be very familiar with the implementation of go-callisto, and ii) know the Go language. Worse, the pointer to the code is not even versioned, so you'd have to consider all past and future changes to any of the code not only in the Pirlguard extension but to go-callisto as a whole.

Basing specifications on implementations is inherently dangerous. Even if the implementation tries to be concise, you can quickly end up with code that doesn't work or that does not behave the way it was intended. This is especially true if you're using one of the "new" languages, that still change fairly frequently. As an example, consider the Ethash reference in Python, at
https://eth.wiki/en/concepts/ethash/ethash
It looks nice and concise, but if you try to run it, you'll find that it doesn't work, and you may find that it produces results that differ from what real Ethash does. For a runnable implementation that does produce correct results, see
https://github.com/LinzhiChips/ethash-ecip1043/blob/master/ethash.py

go-callisto is a lot more code, so imagine all the things that you may have to consider there. Not because they would affect what Pirlguard does, but because you can't be sure that they don't unless you have checked them. That's why a specification should be concise and self-reliant. Also, for analysis, you usually want to make simulations. It's usually not too hard to make a model from a good specification, but it's very hard to turn some big implementation into a simulation. (Been there, done that, but while it was fun at the time, I would recommend putting the bar a little lower: http://umlsim.sourceforge.net/)

So, to summarize, the high-level description, a), fails to meet requirements 2, 3, and 4. The code b) fails to meet 1, 2, and 4 (with the exception of the implementation being go-callisto, or very similar to it), and makes 3 too difficult to be of practical use.

E.g., a good specification would first define the local state of the client, e.g., how the relevant part of the chain is represented there. That's also an opportunity to define all the terminology and concepts that will be used, such as head(s), distances, and so on. Then it would define the current operation of incorporating new blocks obtained from a peer, in the terms of this specification, and a low level of abstraction. Now you have a basis for presenting the modification: define any state that gets added (if needed), then define what incorporation of new blocks looks like then.
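To make the shape of such a specification more concrete, here is a toy Python model of the first two steps only: the local state of a client, and the unmodified operation of incorporating a new block. It is a sketch under simplifying assumptions (blocks carry just a hash, a parent and a difficulty; fork choice is "highest total difficulty"), all names are hypothetical, and it does not describe go-callisto or Pirlguard.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, Optional


@dataclass(frozen=True)
class Block:
    block_hash: str
    parent: Optional[str]  # None only for the genesis block
    difficulty: int


@dataclass
class ChainState:
    """Local state of one client: every known block plus the current head."""
    blocks: Dict[str, Block] = field(default_factory=dict)
    total_difficulty: Dict[str, int] = field(default_factory=dict)
    head: Optional[str] = None

    def ancestors(self, block_hash: Optional[str]) -> Iterator[str]:
        """Yield block_hash, then each ancestor back to genesis."""
        current = block_hash
        while current is not None:
            yield current
            current = self.blocks[current].parent

    def reorg_distance(self, new_hash: str) -> int:
        """Blocks on the head's chain that switching to new_hash would revert."""
        new_branch = set(self.ancestors(new_hash))
        distance = 0
        for block_hash in self.ancestors(self.head):
            if block_hash in new_branch:
                break
            distance += 1
        return distance

    def incorporate(self, block: Block) -> int:
        """Baseline rule: switch head if the new branch has more total difficulty.

        Returns the reorganization distance (0 if the head is only extended
        or the block is stored without being adopted).
        """
        if block.parent is None:
            parent_td = 0
        elif block.parent in self.blocks:
            parent_td = self.total_difficulty[block.parent]
        else:
            raise ValueError("parent block is unknown")
        self.blocks[block.block_hash] = block
        self.total_difficulty[block.block_hash] = parent_td + block.difficulty
        if self.head is None or (
            self.total_difficulty[block.block_hash] > self.total_difficulty[self.head]
        ):
            distance = self.reorg_distance(block.block_hash) if self.head else 0
            self.head = block.block_hash
            return distance
        return 0


state = ChainState()
for h, p in [("g", None), ("a1", "g"), ("a2", "a1"), ("b1", "g"), ("b2", "b1")]:
    state.incorporate(Block(h, p, 1))
print(state.head)                               # a2
print(state.incorporate(Block("b3", "b2", 1)))  # 2
print(state.head)                               # b3
```

A modification would then be stated as a change to `incorporate` and to the state it reads, which is the point where questions about timing, partitioning and client resets can be asked precisely.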

With such a specification, one could immediately analyze whether the algorithm is sensitive to the timing or partitioning of new blocks presented, whether state is affected by client resets, under what conditions clients may diverge for an undesirably long time, and so on.

If all that gets too long, you can also make a more compact specification that leaves out the underlying model and references a more detailed version. At least back when I was working with IETF specifications, that was common practice: the RFC stated the goals and provided the (patch-sized) core bits of the protocol/algorithm/etc., and then there were one or more papers with the underlying details and further analysis.

@padenmoss

The behavior of this change has been modeled on the callisto network and a similar implementation has occurred on Horizen. What you are describing is a non-existent obstacle based on a demand for comparison via ethereum classic test nets. This is wholly unnecessary, indeed a waste of resources and man hours, due to the existing network which has already implemented these changes that can be accessed by utilizing callisto. Callisto is itself a fork of ethereum classic, which could be arguably seen as a testnet.

If you want to compare PirlGuard to MESS, then compare the callisto network and the mordor test network. The precise model you desire already exists in the Horizen implementation, with multiple studies conducted on the efficacy of these changes: https://github.com/anneouyang/horizen/blob/master/README.md

No write up or description is going to change the protocol from what is already described. If you wish for a comparison to a separate model, the burden is on you to describe the problem and test a solution. ECIP-1092 is a solution that is easily explained and incorporated quickly without wasting hours on test nets and developing a system similar to what has already been successfully integrated elsewhere.

None of what you describe is necessary. You voice what you would like and expect others to react when you do not put forth effort to answer your own questions. Stop wasting time.

@phyro
Member

phyro commented Oct 1, 2020

I agree with @wpwrak. This isn't specific to PirlGuard though, this should be done for any serious change. I think any change to the core and especially the consensus should be shown secure either by proving it or by convincing people that the simplicity of the change and enough eyes will be good enough - even though the code change is 100 lines, it's far from clear to me what side-effects PirlGuard can cause to the system. In both cases, you need a framework to reason about the change and some sort of a formal spec is a great way to do that. If we don't do the research and keep introducing important changes that have very little research behind, we'll eventually make the network explode because we will miss new cases they introduce. Of course, since we're not the size of Bitcoin, we can't rely on having every change come from a peer reviewed research paper, but it needs at least a bit of rigor and some mathematical description. As was mentioned above, there needs to be some modeling of the state and description of the system variables that the change can affect. Dropping a .go file and saying "it works on a tiny network" is not sufficient in my opinion. I'm not a fan of any kind of subjectivity in the consensus, but if I compare MESS and PirlGuard proposals, there's a very big difference in the amount of work that was put into the ECIP and the analysis of the solution.

@padenmoss

If you are interested in testing the security, why not attempt to break Callisto? The resources required to build a separate test network which includes PirlGuard would probably cost more than what would be required to test your proposed attack vector on Callisto.

What I see from both @phyro and @wpwrak is a lack of personal responsibility for your accusations. Neither of you is willing to put effort into creating a formal definition of either ECIP-1092 or your criticisms of PirlGuard. All you have is speculation and requirements which you are unwilling to meet.

You say this code is not functional, yet you are unwilling to test it or to put your money where your mouth is and attack the Callisto Network.

Since you are unwilling/unable to prove the dysfunction of Callisto, I say we move ECIP-1092 forward unless detractors decide to quit being cheap cowards and break Callisto with their proposed attacks.

@wpwrak

wpwrak commented Oct 2, 2020

@padenmoss, you seem to mistake the concerns phyro and I are raising as some extravagant obstacles we put up due to some hostile attitude. Be assured that this is not the case. You are likely to encounter similar requests any time you propose changes of protocols that are used on a larger scale. The purpose of such requests is not to shoot down a proposal but to better understand it and to ensure it does not introduce new major issues.

Regarding effort, I offered two weeks ago on the ecip-1092-pirlguard channel to help with drafting such a specification. After one positive response, the rest were all negative.

I agree that a test net would be an unsuitable tool for such analysis. That's why we use models and simulations. Test networks are good for shaking out relatively easy to find implementation and interoperability issues, but simulations let you create precise conditions, reproduce experiments, and add whatever instrumentation you find useful. But before you can simulate, you need a model.

Thanks for the pointer to Horizen! This is the first time I have seen this mentioned. And yes, a compact implementation in the constrained environment of a simulator could also be a basis for a specification.

@stevanlohja
Contributor

Hey @Dexaran are you still active on this proposal? Please let us know in a few days otherwise this Issue could be closed or ECIP set to inactive.
