Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

updated ec spc and weight fn #538

Merged
merged 16 commits into from
Oct 17, 2019
Merged

updated ec spc and weight fn #538

merged 16 commits into from
Oct 17, 2019

Conversation

sternhenri
Copy link
Contributor

Updated ec with pseudocode to be moved into Go.

Three open questions:

  • (@jbenet) happy with weighting function as is (spec. forkFactor) or tune ahead of testnet?
  • (@Kubuxu) notes on making it easier to implement (sorry)
  • (@whyrusleeping) (@jbenet) notes on new ticket rule to ensure |tickets(tipset)| == |blocks(tipset)|? Decision time to keep or drop...

Idea is to ensure 1 ticket per block always, with downside being ability for miners to grind through nonces in parallel. Though they would have been able to do that in old way through a batch of k at a time (with k the randomness lookback). I claim the difference is negligible without a working VDF.

Copy link

@Kubuxu Kubuxu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some proposed changes:

src/algorithms/expected_consensus.md Show resolved Hide resolved
src/algorithms/expected_consensus.md Show resolved Hide resolved
src/algorithms/expected_consensus.md Outdated Show resolved Hide resolved
src/algorithms/expected_consensus.md Show resolved Hide resolved
src/algorithms/expected_consensus.md Outdated Show resolved Hide resolved
src/algorithms/expected_consensus.md Outdated Show resolved Hide resolved
src/algorithms/expected_consensus.md Show resolved Hide resolved
src/algorithms/expected_consensus.md Show resolved Hide resolved
@anorth
Copy link
Member

anorth commented Oct 7, 2019

FYI @ZenGround0

@whyrusleeping
Copy link
Member

@sternhenri how do i sample randomness from a null block round?

src/algorithms/expected_consensus.md Outdated Show resolved Hide resolved
src/algorithms/expected_consensus.md Outdated Show resolved Hide resolved
src/algorithms/expected_consensus.md Outdated Show resolved Hide resolved
src/algorithms/expected_consensus.md Outdated Show resolved Hide resolved
src/algorithms/expected_consensus.md Outdated Show resolved Hide resolved
src/algorithms/expected_consensus.md Outdated Show resolved Hide resolved
@jbenet
Copy link
Member

jbenet commented Oct 8, 2019

I'm not able to review the weighing function in the next few days -- i'm not paged in enough. Can somebody else? (@nicola @nikkolasg @jzimmerman)

@nicola
Copy link
Contributor

nicola commented Oct 9, 2019

Hey team, I am blocked on this being merged since I am doing changes which will require this part to compile and in order to do so, I need to either re-do some things in this PR or wait.

@jbenet
Copy link
Member

jbenet commented Oct 10, 2019

This needs to be rebased over master. there's a bunch of commits here that are already in master.

@sternhenri sternhenri changed the title updated ec and weight fn updated ec spc and weight fn [WIP] Oct 11, 2019
@sternhenri
Copy link
Contributor Author

sternhenri commented Oct 11, 2019

Loads of progress. Still WIP. Left TBD:

  • Review Block validation
  • Consensus faults
  • Power table
  • Fix randomness retrieval

@jbenet
Copy link
Member

jbenet commented Oct 11, 2019

I think you need to rebase on top of master, i think @nicola made a bunch of changes to get this area to compile


The weight should be calculated using big integer arithmetic with order of operations defined above. The multiplication by 1,000 and flooring is meant to help generate uniform weights across implementations.
We get:
- `w[r+1] = w[r] + log2b(totalPowerAtTipset(ts)) * 2^8 + (log2b(totalPowerAtTipset(ts)) * len(ts.blocks) * wRatio_num * 2^8) / (e * wRatio_den)`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

log2b(totalPowerAtTipset(ts)) will essentially become a constant at higher network sizes (it may not catch a 2x drop in power), and it's quite painful to compute.

IMO it should be either dropped, or made more 'aggressive' (something between log and linear)

Copy link

@Kubuxu Kubuxu Oct 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@magik6k it is : log2b(x big.Int) = x.BitLen() - 1
AFAIK, it is there to prevent case where minority miner forks off and slashes other miners.

For which it will work.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can still slash ~50% of power in the worst case without anyone noticing now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @magik6k, this is super useful feedback.

Remember that this is added to weight at every round, and the other factor is going to be smaller (by wRatio_num/wRatio_den), so while it is indeed a constant, a weight drop by 20% over the space of finality (say ~500 blocks) will be noticeable.

On the flip-side, a power-rental attack will have to be maintained for a bit of time before it becomes irrecoverable. Say I can rent 80% of the network for a few hours, can an honest majority collude to ignore my fork and get the heaviest back in some reasonable amount of time? This is what the more 'aggressive' functions lack...

Thereafter, @Kubuxu is right in terms of what this seeks to prevent as a main reason.

What do you think?

In any case, I need to make clear that log2b is a parameter choice as is wRatio so will update to reflect that. Log2 is simply fave candidate rn.

@sternhenri
Copy link
Contributor Author

Done with main content switches @jbenet, now just need to get it compiling before it can be merged. This branch includes:

  • Storage power actor changes
  • all power table accounting
  • all collateral explanations
  • consensus faults explanations
  • randomness sampling throughout the chain
  • specific code for ticket and election proof generation and validation
  • new ticket format for @whyrusleeping
  • weight function specc'd
    and more.

Does not include:

  • storage faults flow
  • completed block validation flow

@sternhenri sternhenri changed the title updated ec spc and weight fn [WIP] updated ec spc and weight fn Oct 13, 2019
@sternhenri
Copy link
Contributor Author

@porcuquine @jbenet any thoughts on test failure here. Work compiles fine for me. Thought I'd ask before diving in in case it's obvious to either of you!

@porcuquine
Copy link
Contributor

@sternhenri It looks like this needs make fmt-code changes to be checked in. This would be clearer if we factor the formatting check into a separate CI job. I did the quickest, dirtiest thing for v0.

@jbenet
Copy link
Member

jbenet commented Oct 15, 2019

That's great to hear @sternhenri will review today

Copy link
Member

@jbenet jbenet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM -- looks roughly right, but i'd like @zixuanzh and @nicola to verify it too


The portion of blocks a given miner generates through leader election in EC (and so the block rewards they earn) is proportional to their `Power Fraction` over time. That is, a miner whose storage represents 1% of total storage on the network should mine 1% of blocks on expectation.
In the case that no miner is eligible to produce a block in a given round of EC, the storage power consensus subsystem will be called by the block producer to attempt another leader election by incrementing the nonce appended to the ticket drawn from the past in order to attempt to craft a new valid `ElectionProof` and trying again.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add note about clock synchrony? (ie why cant i just do all them at once until i win? because people wont accept your blocks early)

At height Y in (39, 67], M1 will attempt to generate an `ElectionProof` using the storage market actor from height 39 for their own power (and an actor from Y for total network power); at height 68, M1 will use the storage market actor from height 67 for their own power, and the storage market actor from height 68 for total power and so on.
- At height 0, take the genesis block, return its ticket
- At height n+1, take the heaviest tipset in our chain at height n.
- select the block in that tipset with the smallest final ticket, return its ticket
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does "the heaviest tipset in our chain at height n" mean? either:

  • (a) it means "oh there's exactly one tipset in our chain at heigh n, because "our chain" just means the links walking back block.Parents
  • (b) or it means "oh there's many tipsets in our chat at height n, because "out chain" means all possible chains we have, with all possible tipsets, we have to pick the heaviest at each epoch.

if (a), just remove the "take the heaviest tipset in our chain at height n", there's only one.
if (b), then i think this is ambiguous and maybe wrong because - what happens if i have tipsetA and tipsetB each the heaviest tipset in epochA and epochB respectively, but NOT part of the same history (dont link to each other).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • for any tipset:
    • select the block in that tipset with the smallest final ticket, return its ticket.


#### Picking a Ticket to Seal

When starting to prepare a SEAL in round X, the miner should draw a ticket from X-F with which to compute the SEAL.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this here? shouldn't we just say tickets are how we produce randomness in the chain at a given epoch, and then let other things (ie seal) say how they use that randomness?


#### Verifying a Seal's ticket

When verifying a SEAL in round Z, a verifier should ensure that the ticket used to generate the SEAL is found in the range of rounds [Z-T-F-G, Z-T-F+G].
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be defined in Sealing, not here in SPC? unless you want it here because it's relevant to verifiable resource commitment?

SuspendPower(addr addr.Address, numSectors UVarint)
UnsuspendPower(addr addr.Address, numSectors UVarint)
RemovePower(addr addr.Address, numSectors UVarint)
RemoveAllPower(addr addr.Address, numSectors UVarint)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i dont think we need all these methods, let's only have the ones we absolutely need (smaller potential problem/audit surface)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree, we only called UpdatePower(powerDelta) in StorageMinerActor for now. powerDelta can be positive or negative which may be a problem. We also don't need addr if we are only changing power for the fromActor who calls the method.


The Miner lifecycle in the power table should be roughly as follows:
- MinerRegistration: A new miner with an associated worker public key and address is registered on the power table by the storage mining subsystem, along with their associated sector size (there is only one per worker).
- UpdatePower: These power increments and decrements (in multiples of the associated sector size) are called by various storage actor (and must thus be verified by every full node on the network). Specifically:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in multiples of the associated sector size

why not just raw Bytes at every point? or GB?

- All Power is decremented immediately after a missed PoSt.
- Power is decremented immediately after faults are declared, proportional to the faulty sector size.
- Power is incremented after a PoSt recovering from a fault.
- Power is definitively removed from the Power Table past the sector failure timeout (see {{<sref storage_faults>}})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see the sector state cycle that we're working on in #569

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To summarize, only sectors in the Active state will command power. A Sector becomes Active after first PoSt from Committed and Recovering stages. Power is immediately decremented when an Active Sector enters the Failing state (through DeclareFaults or Cron) and when an Active Sector expires.

@jbenet
Copy link
Member

jbenet commented Oct 17, 2019

This needs a rebase unfort-- let's do it soon to avoid much greater divergence.

@zixuanzh wonder how much this conflicts with #569 -- should we merge this first and rebase #569 or vv.?

@jbenet
Copy link
Member

jbenet commented Oct 17, 2019

@sternhenri I rebased for you on top of master (#569 and more) o/

@jbenet
Copy link
Member

jbenet commented Oct 17, 2019

given this is a huge PR, i'm leaning towards merging it before all the comments are addressed (and avoid further merge conflicts

@jbenet
Copy link
Member

jbenet commented Oct 17, 2019

Ok i'm going to merge @sternhenri please address comments as a follow-on PR. Also reconcile w/ new deals/post work -- some things may have been duped/broken

@jbenet jbenet merged commit cdf945e into master Oct 17, 2019
@jbenet jbenet deleted the feat/ec branch October 17, 2019 10:55
@jbenet
Copy link
Member

jbenet commented Oct 17, 2019

cc @zixuanzh pls review this, and maybe help reconcile w/ new deals work if anything is off


In order to determine that the mined block was generated by an eligible miner, one must check its `ElectionProof`.
Note: We draw the miner power from the prior round. This means that if a miner wins a block on their ProvingPeriodEnd even if they have not yet resubmitted a PoSt, they retain their power (until the next round).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure if we are saying the same thing, miners keep their power as long as they submit a PoSt when challenged by PoSt Surprise or at the end of a ProvingPeriod.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is talking specifically about the edge case that is: what if I win a block in the round in which i am challenged (and don't submit). Do I still win or is it forfeited. So this is just meant to avoid fencepost errors. I'll clarify

Any node that detects either of the above events should submit both block headers to the `StoragePowerActor`'s `ReportConsensusFault` method. The "slasher" will receive a portion (TODO: define how much) of the offending miner's {{<sref pledge_collateral>}} as a reward for notifying the network of the fault.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have previously defined the slashing mechanism in #383, we might want to check if that's still relevant and if we can use Cron here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this supercedes that.

And yes, I expect the cronActor could call these though in the case of consensus faults it wouldn't be able to (too costly).


It is important to note that there exists a third type of consensus fault directly reported by the `CronActor` on `StorageDeal` failures via the `ReportUncommittedPowerFault` method:
- (3) `uncommitted power fault` which occurs when a miner fails to submit their `PostProof` and is thus participating in leader election with undue power (see {{<sref storage_faults>}}).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes correct, storage faults are declared after a Sector being in the Failing state for three consecutive proving period. A Sector can be put into the Failing state through DeclareFaults or CronAction when a PoSt is missed. Just to be on the same page, pledge collateral will not be slashed for DeclareFaults (miners lose power for declaring faults) but will be slashed when it is spotted by CronAction on the ground of uncommitted power. The amount to be slashed is still TBD cc @jbenet .

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I expect all of this is defined in the storage faults section.

SuspendPower(addr addr.Address, numSectors UVarint)
UnsuspendPower(addr addr.Address, numSectors UVarint)
RemovePower(addr addr.Address, numSectors UVarint)
RemoveAllPower(addr addr.Address, numSectors UVarint)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree, we only called UpdatePower(powerDelta) in StorageMinerActor for now. powerDelta can be positive or negative which may be a problem. We also don't need addr if we are only changing power for the fromActor who calls the method.

type StorageMiner struct {
MinerAddress addr.Address
MinerStoragePower block.StoragePower
MinerSuspendedPower block.StoragePower
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i might be missing something here, any reasons to keep SuspendedPower vs just removing that power?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two things. I think passing address is better design since we would want to use the same methods for both the cronactor and self-reporting. So I chose to leave it there. Happy to change though.

My thought on suspended power is that it is necessary in order to account for collateral vs power. For instance if I declare a fault and my power drops, I should not be able to withdraw collateral. For that reason we need an intermediary state for power between just "present" and "absent", hence "suspended".

It may be worth our making a state machine for faults @zixuanzh

- All Power is decremented immediately after a missed PoSt.
- Power is decremented immediately after faults are declared, proportional to the faulty sector size.
- Power is incremented after a PoSt recovering from a fault.
- Power is definitively removed from the Power Table past the sector failure timeout (see {{<sref storage_faults>}})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To summarize, only sectors in the Active state will command power. A Sector becomes Active after first PoSt from Committed and Recovering stages. Power is immediately decremented when an Active Sector enters the Failing state (through DeclareFaults or Cron) and when an Active Sector expires.

VRFKeyPair filcrypto.VRFKeyPair
type ConsensusFaultType union {
DoubleForkMiningFault ConsensusFault
ParentGrindingFault ConsensusFault
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should UncommittedPowerFault live here too?


return true
func (bl *Block_I) TicketAtEpoch(epoch ChainEpoch) Ticket {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so does this mean that randomness drawn from a null round is just the randomness of the 'next' block after the series of nulls?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for lag. No, it means that it is the randomness from the prior "live" round. (null rounds just means nonce increment). I'll specify further

@whyrusleeping
Copy link
Member

Still unanswered:

@sternhenri how do i sample randomness from a null block round?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants