SEAL Replica ID should also include a historical ticket #56

Closed
nicola opened this issue Dec 9, 2018 · 30 comments

@nicola
Contributor

nicola commented Dec 9, 2018

This enforces that the miner is dedicating storage to a particular part of the network

EDIT: Based on discussion below and in spec review, I am changing the title and am adding to the original description here. –@porcuquine

When sealing begins, a ticket should be included in the data hashed to generate the replica id.

This ticket should be from a block which is FINALITY rounds back (at seal start time).

When seal proofs are verified, it must be verified that the round from which the ticket was fetched is less than RECENCY rounds back (at verification time).

This implies that RECENCY must be greater than FINALITY (by the total allowable seal time, in rounds).

RECENCY and FINALITY are both integer constants whose values are yet to be determined.
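
A minimal sketch of the two rules above, for illustration only. The numeric values are placeholders (the constants are still undetermined), `MAX_SEAL_ROUNDS` is a hypothetical name for "total allowable seal time, in rounds", and clamping at round 0 is an assumption about how early-chain rounds might be handled.

```python
# Placeholder values only: FINALITY and RECENCY are not yet determined.
FINALITY = 500
MAX_SEAL_ROUNDS = 1000  # hypothetical: total allowable seal time, in rounds
RECENCY = FINALITY + MAX_SEAL_ROUNDS  # RECENCY exceeds FINALITY by the seal time


def ticket_round_at_seal_start(current_round: int) -> int:
    """Round whose ticket goes into the replica id: FINALITY rounds back.

    Clamping to round 0 is an assumption for chains younger than FINALITY.
    """
    return max(0, current_round - FINALITY)


def ticket_is_recent(ticket_round: int, verification_round: int) -> bool:
    """True if the ticket's round is less than RECENCY rounds back."""
    age = verification_round - ticket_round
    return 0 <= age < RECENCY


# A seal started at round 2000 uses the ticket from round 1500.
start_round = 2000
ticket_round = ticket_round_at_seal_start(start_round)
# It is still acceptable if committed within MAX_SEAL_ROUNDS of the start.
print(ticket_round, ticket_is_recent(ticket_round, start_round + MAX_SEAL_ROUNDS - 1))
```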

NOTE: This change will require a change to the sealing process because, in general, a Filecoin node does not request sealing. Rather, sealing is triggered when a piece is added and this results in a sector being full. For this to work, we should pass the correct ticket (FINALITY rounds back) whenever a piece is added. Then, whenever sealing is triggered, the most recent ticket can be used to generate the replica id. cc: @laser.


This requires changes in code by @porcuquine @laser

@pooja pooja transferred this issue from another repository Jan 11, 2019
@whyrusleeping
Member

@nicola any update here?

@nicola
Contributor Author

nicola commented Jan 12, 2019

@porcuquine should add this to the Filecoin Integration (and spec), let me know if it's unclear

@porcuquine
Contributor

porcuquine commented Jan 12, 2019

I will add to the spec, then we will implement. Let me check some assumptions first:

  • This can and should be the same hashing method as used for the challenge seed (to PoSt).
  • This may sometimes require hashing multiple blocks (i.e. a parent set).
  • This can and should be the same type as proverID and minerID (other components of replicaID).
  • Specifically, this can be 31 bytes — the type we internally call FrSafe, since it is guaranteed to fit into one field element.
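
To make the assumptions above concrete, here is a rough sketch of mixing a 31-byte ticket into the replica ID. SHA-256 is only a stand-in, since the thread does not settle which hash is used, and `to_fr_safe` and `replica_id` are hypothetical names. Only two components are shown; the real replica ID has others.

```python
import hashlib

FR_SAFE_BYTES = 31  # 248 bits, assumed to fit in one field element


def to_fr_safe(data: bytes) -> bytes:
    """Hash arbitrary bytes and keep 31 bytes (SHA-256 is a stand-in hash)."""
    return hashlib.sha256(data).digest()[:FR_SAFE_BYTES]


def replica_id(prover_id: bytes, ticket: bytes) -> bytes:
    """Derive a 31-byte replica id from 31-byte components (illustrative)."""
    for name, value in (("prover_id", prover_id), ("ticket", ticket)):
        if len(value) != FR_SAFE_BYTES:
            raise ValueError(f"{name} must be {FR_SAFE_BYTES} bytes")
    return to_fr_safe(prover_id + ticket)


prover = to_fr_safe(b"example prover")
ticket_a = to_fr_safe(b"ticket from one chain")
ticket_b = to_fr_safe(b"ticket from another chain")

# The same prover gets a different replica id for each ticket.
print(replica_id(prover, ticket_a) != replica_id(prover, ticket_b))
```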

cc: @laser

@nicola
Contributor Author

nicola commented Jan 13, 2019

there should be only one block (technically, we should not use the hash of the block but the winning ticket at the current epoch)

@porcuquine
Contributor

@laser Is there now a clear answer to the question of where this 'hash of block' chain randomness would come from and what its characteristics would be? I think we should make the spec change once that's the case, but not before.

@nicola
Contributor Author

nicola commented Mar 11, 2019

Note: the spec should have this regardless of what is implemented today; otherwise it will mislead others reading the spec or interested in doing separate implementations

@pooja
Contributor

pooja commented Apr 5, 2019

@porcuquine @nicola Can you please update this issue with the latest?

@pooja pooja added the P2 label Apr 5, 2019
@porcuquine
Contributor

porcuquine commented Apr 5, 2019

@nicola @laser @ZenGround0 @sternhenri @whyrusleeping Does anyone know exactly what value is meant to be added to the replica ID? I have not been able to get this answer, but I suspect that one of you knows it.

If none of us know, I believe between us we can decide. Let's do that here, please.

@porcuquine
Contributor

Maybe this is the answer:

> there should be only one block (technically, we should not use the hash of the block but the winning ticket at the current epoch)

Is this well-defined? Does it mean the current epoch when sealing begins? If so, is there a restriction on how soon the sector must be committed? If not, what prevents generating replica IDs far in advance and using them much later? That is: how does that differ from using very old tickets to generate replica IDs?

@nicola
Contributor Author

nicola commented Apr 6, 2019

My latest comment is correct: we should use the ticket of the winning blocks

@porcuquine
Contributor

Please read my questions again, or consider this:

What if we always add the ticket produced in the very first round?

If this is invalid, what check enforces that?

If it is valid, what benefit does it provide?

In order for this to be useful, it seems we would need to enforce a certain recency on the ticket. We would need to allow for tickets at least as old as required by the fastest possible seal. In order not to force everyone to perform the fastest possible seal, we would probably want to allow tickets as old as some slower but acceptable seal time. With very long seal times, there's the likelihood that some sealing processes will be interrupted and restarted, adding time. We probably want to account for other delays (like being offline when sealing completes).

Taken together, this suggests that ticket recency should be fairly relaxed (i.e. allow for tickets which are quite old relative to the time at which the sector is committed) — since the replica ID needs to have been constructed when sealing begins.

This also suggests that ticket recency needs to be a function of sector size (since it depends on sealing time).

If we were to add this, I think we would need to:

  • First, define a mechanism by which recency would be checked. For example, will we — at sector commitment time — scan the chain backward looking for a ticket? Or perhaps we will jump back to the oldest allowable ticket (for the committed sector's size) and scan forward.
  • Second, define how ticket recency is calculated as a function of sector size.
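
As a rough illustration of the "jump back and scan forward" option in the first point, here is a sketch over a toy chain. `tickets_by_round` and `max_age` are hypothetical names. `max_age` stands in for whatever the sector-size function in the second point would return, which is left undefined here.

```python
from typing import Dict, Optional


def find_ticket_round(
    tickets_by_round: Dict[int, bytes],
    ticket: bytes,
    commit_round: int,
    max_age: int,
) -> Optional[int]:
    """Scan forward from the oldest allowable round, looking for `ticket`.

    Returns the round where the ticket was found, or None if it does not
    appear among rounds less than `max_age` rounds back from `commit_round`.
    """
    oldest = max(0, commit_round - max_age + 1)
    for round_number in range(oldest, commit_round + 1):
        # Rounds without a block simply have no entry in the mapping.
        if tickets_by_round.get(round_number) == ticket:
            return round_number
    return None


toy_chain = {10: b"a", 11: b"b", 13: b"c"}  # round 12 produced no block
print(find_ticket_round(toy_chain, b"c", commit_round=15, max_age=5))
print(find_ticket_round(toy_chain, b"a", commit_round=15, max_age=5))
```

The second lookup fails because round 10 falls outside the allowed window, which is exactly what rules out always reusing the very first ticket.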

It's entirely possible that I'm missing the point of this plan, or that the implied parts I'm asking about have already been specified elsewhere. If so, please just point me to the relevant explanation or repeat it here.

@whyrusleeping Do you know how this is supposed to work?

@whyrusleeping
Member

If we're using randomness from the chain at all for mixing here, we should be using the same chain randomness we use for everything else, smallest ticket at tipset X (or its hash).

as @porcuquine says: if we're using values from the chain at all, there needs to be some recency or it's pointless (using the first randomness value all the time defeats the purpose). The smallest we can make the limit is PackTime + SealTime + SubmitTime. Likely, we want to add in some additional grace period value that is a significant portion of the sealing time itself.
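
The lower bound described here can be written out as simple arithmetic. In this sketch all inputs are in rounds, and `grace_fraction` (the grace period as a fraction of seal time) is a hypothetical parameter whose default of 0.5 is a placeholder, not a proposed value.

```python
import math


def min_recency_rounds(
    pack_rounds: int,
    seal_rounds: int,
    submit_rounds: int,
    grace_fraction: float = 0.5,  # placeholder, not a proposed value
) -> int:
    """PackTime + SealTime + SubmitTime, plus a grace period tied to seal time."""
    if min(pack_rounds, seal_rounds, submit_rounds) < 0 or grace_fraction < 0:
        raise ValueError("all inputs must be non-negative")
    grace_rounds = math.ceil(grace_fraction * seal_rounds)
    return pack_rounds + seal_rounds + submit_rounds + grace_rounds


# With no grace period this is the bare minimum from the comment above.
print(min_recency_rounds(10, 100, 5, grace_fraction=0.0))
print(min_recency_rounds(10, 100, 5))
```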

@porcuquine
Contributor

porcuquine commented Apr 7, 2019

Thanks, @whyrusleeping.

One follow-up question: given that there would need to be a significant lag between a block being mined and it being committed-to in the replica ID of a sector, does this address the problem it's meant to?

In other words, can we define the problem we're trying to solve and ensure the recency requirements make this useful under that model?

As an example, I assume (perhaps wrongly) that this requirement is in some way meant to address the risk of forks. If that's the case, and the recency requirement is greater than the finalization period (whether defined or 'pragmatic'), then it's no help. [EDIT: I don't really understand the implications of how this would interact with finality.]

Or maybe the point is that the recency requirement will help force finality, since miners will presumably never want to accept chains which invalidate their storage. Is that the idea? If not, maybe it should be. I see finality listed as an [open question](https://github.com/filecoin-project/specs/blob/61d312f545f4b4d7f3c65061024dfb470e8c1d8e/expected-consensus.md), so I am not sure what the latest thinking might be.

@porcuquine
Contributor

The more I think about this, the less I understand the idea.

Let's say I am a storage miner, and I begin mining. I commit to a chain (i.e. one of potential alternative forks) by adding a ticket to my replica ID.

After some time passes, it turns out that my guess was wrong, and the ticket to which I committed is in fact no longer part of the current best chain.

As a result, I wasted my time and CPU, as well as making deals I can't (yet) support, so my clients also suffer.

How does this help anyone?

I'm probably just not getting it, but I still don't yet understand what problem this solves — and whether it's worth this negative outcome.

@sternhenri
Contributor

Please be gentle with this: I may be jumping into something I don't fully get, with missing context. In case it helps @porcuquine, though I strongly defer to @nicola and @whyrusleeping on this one.

While I have little understanding of some of the context here (ie I could be way off base), here is some of what I gather:

  • We want SEALING to be strongly tied to a given chain, so it enforces some protocol security, preventing miners from flip-flopping across forks, or otherwise enabling nothing at stake (see Seal and PoST security hardness discussion consensus#30 for more on why this is a really cool aspect of FIL).
  • A big part of where this can be relaxed is with PoSTs (which can be easily generated unlike SEALs).

So to me the tradeoff would be between sampling too far back (ie not enforcing much of a commitment to a given chain) and too close to the present (ie risking wasted SEALs for honest miners).

I agree with @whyrusleeping that we should use at least the same randomness as we do for consensus (assuming I've read him right), though we could argue for looking farther back in the case of SEALing (given a greater cost to being wrong, i.e. not just loss of block reward on expectation but waste of a resource and slashing). It would not make sense to look farther back than finality.

Beyond that I don't see why including the hash of a block would be preferable to including a ticket here, though I see downsides to it (grinding) depending on the threat model for PoSTs which I don't have cached.

@porcuquine porcuquine changed the title SEAL Replica ID should also include the hash of a block SEAL Replica ID should also include a historical ticket Apr 9, 2019
@porcuquine
Contributor

@nicola @whyrusleeping @sternhenri

I updated the issue and changed the title. Please review and see if this seems correct now.

@sternhenri Based on conversation with @whyrusleeping, I wrote that the ticket should be from exactly FINALITY rounds back. I think this is (just barely) consistent with your statement above that 'It would not make sense to look farther back than finality.' Are we all on the same page with these definitions?

@whyrusleeping
Member

@porcuquine more generally, you should select randomness from a block that is final, otherwise you risk having created an invalid sector.

Really, the point of all of this is to increase the cost of 'historical' forks, where someone goes back in time and tries to create a different chain that is heavier than the current real one. If sectors weren't tied to the chain, then any currently existing sector could be validly used in the attacker's fork (ignoring PoSt issues for a moment). By mixing in chain randomness here, we ensure that an attacker going back a month in time to try and create their own chain would have to completely regenerate any and all sectors they use for their fork's power.

@porcuquine
Contributor

porcuquine commented Apr 9, 2019

@whyrusleeping Understood. In practical terms are you saying the ticket should be selected from FINALITY or greater blocks back? If not, do we have a more well-defined way to specify how the ticket should be selected?

@sternhenri
Contributor

sternhenri commented Apr 9, 2019 via email

@whyrusleeping
Member

@porcuquine yes, unless miners feel like betting on a chain (which, unless we have true finality, will be probabilistic anyways, and it's always 'betting').

Really, this comes down to something like Bitcoin's '6 block confirmations' thing, where you pick a heuristic of how sure you want to be.

@porcuquine
Contributor

Okay, so from what I'm hearing, implementation should provide a value called FINALITY but this doesn't necessarily have to be specified by the protocol. Miners can theoretically set it how they like. RECENCY on the other hand needs to be a protocol-wide constant because it affects proof validity of sector commitments.

I also realize there's a further wrinkle, which is that this check cannot be performed by the FPS — so it's not technically part of proof verification. Rather, it needs to be performed by the node before even verifying the proof. If the recency check fails, then the node shouldn't even bother trying to verify the proof because even a valid proof will be 'invalid' in context. Does that sound right?
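
A sketch of that ordering, assuming the node can treat proof verification as an opaque call: the recency check runs first and the proof verifier is never invoked when it fails. `accept_commitment` and `verify_proof` are hypothetical names, not part of any existing API.

```python
from typing import Callable


def accept_commitment(
    ticket_round: int,
    current_round: int,
    recency: int,
    verify_proof: Callable[[], bool],
) -> bool:
    """Node-side gate: check ticket recency before spending time on the proof."""
    age = current_round - ticket_round
    if not (0 <= age < recency):
        # Even a valid proof would be 'invalid' in context, so skip it.
        return False
    return verify_proof()


calls = []


def fake_verifier() -> bool:
    """Stand-in for the real proof check; records that it was called."""
    calls.append("verified")
    return True


print(accept_commitment(100, 2000, 1500, fake_verifier), len(calls))
print(accept_commitment(1000, 2000, 1500, fake_verifier), len(calls))
```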

From a code perspective, how do you think these values should be specified, given that one may be configurable, and the other is to-be-determined. (It might make most sense for you to have this conversation with @laser, since I'm a bit removed from the go-filecoin code base.)

@sternhenri
Contributor

sternhenri commented Apr 9, 2019 via email

@porcuquine
Contributor

The proofs spec now includes the need for a ticket when sealing and verifying seal.

The spec doesn't yet reflect the need to verify recency of that ticket outside of the FPS. @whyrusleeping where do you think that should go? (I think proofs.md is probably not the right place.)

@laser Can you create a dev issue that will bring seal and verify seal APIs up-to-date with the spec?

Once those two points are addressed, I think this issue can be closed.

@laser
Contributor

laser commented May 29, 2019

How will the prover (the creator of a commitSector message) and the verifier (some miner which received and processes the commitSector message from the network) agree on a block from which the ticket is plucked? Reading through this thread, it does not appear to be the case that the ticket will be included in the commitSector message.

For PoSt, challenge seed randomness is plucked from the block at a height which is equal to the height of the block which marks the start of miner's current proving period (minus lookback). If no block exists at that height, we use the genesis block. A miner's current proving period start-block-height is stored in the state tree - which makes it easier for both the prover and verifier to agree on which block to sample from.
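
For comparison, the PoSt lookup described above could be sketched like this. The mapping from height to block is a toy stand-in for chain state, and `post_seed_block` is a hypothetical name rather than a go-filecoin function.

```python
from typing import Dict

GENESIS_HEIGHT = 0


def post_seed_block(
    blocks_by_height: Dict[int, str],
    proving_period_start: int,
    lookback: int,
) -> str:
    """Pick the block to sample PoSt challenge randomness from.

    Uses the block at (proving period start - lookback), falling back to
    the genesis block when no block exists at that height.
    """
    height = proving_period_start - lookback
    return blocks_by_height.get(height, blocks_by_height[GENESIS_HEIGHT])


toy_blocks = {0: "genesis", 90: "block-90", 100: "block-100"}
print(post_seed_block(toy_blocks, proving_period_start=100, lookback=10))
print(post_seed_block(toy_blocks, proving_period_start=100, lookback=15))
```

Both inputs come from the state tree, so prover and verifier compute the same answer. Sealing has no equivalent agreed-upon starting point, which is the gap raised next.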

When a miner starts sealing, however, they may not have started proving anything (and thus have no proving period start-block that the network agrees on). So, which block does the miner pluck a ticket from?

If we allow the storage miner to choose a ticket FINALITY blocks back from some arbitrarily-chosen block height, then I am not sure how a seal-verifier will be able to figure out which ticket to pass to verify_seal.

@porcuquine
Contributor

I think the miner is meant to choose the most recent known ticket when sealing. I do think this means the ticket's round (or the ticket itself — but round is probably more efficient both to store and to verify) will need to go into the commitSector message. Does that sound right, @whyrusleeping?

@porcuquine
Contributor

One more thought: we technically don't need to include anything. Since recency bounds the number of possible values, we could scan (using some sensible heuristic to minimize cost in the normal case) and attempt to verify with every valid ticket. Since verification is relatively cheap this could (in some universe) be worth the on-chain savings. That said, we aren't going to do this, and I mention only for completeness.

I spoke to @whyrusleeping, and he confirms that round number (not ticket) should be included in the commitSector message. cc: @laser
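
A sketch of what carrying the round number could look like. The field names are hypothetical, not the actual commitSector layout. The verifier resolves the round to a ticket from its own view of the chain, so the ticket itself never needs to be stored in the message.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class CommitSector:
    """Illustrative message shape; field names are assumptions."""

    sector_id: int
    proof: bytes
    ticket_round: int  # round the replica id's ticket was taken from


def ticket_for(
    msg: CommitSector, tickets_by_round: Dict[int, bytes]
) -> Optional[bytes]:
    """Look up the ticket the prover claims to have used, if the round has one."""
    return tickets_by_round.get(msg.ticket_round)


toy_tickets = {41: b"ticket-41", 42: b"ticket-42"}
message = CommitSector(sector_id=1, proof=b"proof-bytes", ticket_round=42)
print(ticket_for(message, toy_tickets))
```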

@laser
Contributor

laser commented May 30, 2019

@porcuquine

> I spoke to @whyrusleeping, and he confirms that round number (not ticket) should be included in the commitSector message. cc: @laser

Roger that. I will put up a spec-repo PR.

laser added a commit that referenced this issue Jun 2, 2019
- replace "sector access" with path
- drop "SectorStore" - which we don't use
- replace "ticket" with "roundNumber" as per [this convo](#56)
- reorder parameters to match impl
@laser
Contributor

laser commented Jul 24, 2019

After speaking with @sternhenri and @porcuquine, it is not clear to me from which round a miner should select a ticket for purposes of creating a replica ID (an input to seal).

Additionally, it is not clear to me how verification should work. The round number from which the miner plucked a ticket (to create a replica ID) is included by the miner in the commitSector message after sealing completes. A verifier presumably must reject commitSector messages whose round is outside of some range.

@sternhenri - Would you please provide some clarity? Specifically:

  • which round should a miner select?
  • how should verification (w/respect to round) work?

cc @dignifiedquire

@sternhenri
Contributor

Yes, as best I can tell (should be verified), your understanding of verification is correct. The protocol should specify the valid range for ticket plucking. Anything out of that range should be rejected.

I do think we would want to prevent miners from potentially losing valid SEALS because they plucked a ticket from a block that wasn't finalized, so I would add Finality F to this.

I'll add that there is no incentive for the miner to include a more recent ticket, only incentive to use an older ticket. An older ticket gives them more flexibility to pick subchains on which to PoSt thereafter, a newer one just makes it more likely they pick a non-final ticket.

Variables in the following explanation:

  • F -- Finality
  • X -- when miner starts SEALing
  • Z -- block height in which the SEAL appears
  • Y -- round in SEAL commitSector
  • T -- estimated time for SEAL
  • G -- necessary flexibility to account for network delay and SEAL-time variance

Specifically, in round X the miner starts working on a replica.

  • Miner draws min ticket from X - F

Due to potential variation in the time it takes to SEAL, we want to give some flex to miners (but not too much, as that would negatively impact security). Let's call that flexibility G, which should be correlated to variance in SEAL time across miners, with some padding for network-related delay (did the block get in on time, did the miner immediately submit their completed SEAL, etc). The time to SEAL is T.

Verifier V receives a block with a SEAL in round Z, indicating it was made with a ticket from round Y (which could be X - F or not, since the miner could lie). V should check:

  • Y within G of Z - T - F.
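
The check above, written out as a sketch. It reads "within G" as an inclusive bound on the absolute difference, which is an assumption, and the numbers in the example are placeholders rather than proposed parameters.

```python
def seal_ticket_round(x: int, f: int) -> int:
    """Round an honest miner draws the min ticket from when starting at X."""
    return x - f


def round_is_acceptable(y: int, z: int, t: int, f: int, g: int) -> bool:
    """Verifier check: Y must be within G of Z - T - F (inclusive, by assumption)."""
    expected = z - t - f
    return abs(y - expected) <= g


# Placeholder parameters, in rounds.
F, T, G = 500, 1000, 100

X = 2000                      # miner starts SEALing
Y = seal_ticket_round(X, F)   # honest ticket round
Z = X + 1050                  # SEAL lands on chain slightly later than T

print(round_is_acceptable(Y, Z, T, F, G))    # honest miner, small delay
print(round_is_acceptable(100, Z, T, F, G))  # ticket far too old
```

For an honest miner, Y - (Z - T - F) reduces to T minus the actual elapsed time, so G directly bounds how far the real SEAL time may drift from the estimate T.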

@sternhenri
Contributor

#512
