[fastx distsys] Nested objects, provenance, and replay prevention #98

Closed
sblackshear opened this issue Dec 29, 2021 · 7 comments

@sblackshear
Collaborator

This goes deeper into an issue with object wrapping initially raised in https://docs.google.com/presentation/d/1-xpsmJC3VEaRPH9U1Lklap858MaU_E3YQsRcaiOYI0Y/edit?usp=sharing, slide 16.

Note that this is going off of the Overleaf https://www.overleaf.com/project/6187be7580dc35362b091a73, not what is currently implemented (since the current implementation is lagging behind the spec a bit). In particular, the discussion will not mention object sequence numbers/versions, since they are an optional feature that is useful for making sync more sane, but we could just as easily choose to omit them.

Let's say the following things happen:

  1. First, a transaction T creates an object O with id id. The derived ObjRef is (id, digest(T), com(O)). The authority updates LockMap[(id, digest(T), com(O))] := T and ObjMap[id] := O
  2. A subsequent transaction T' takes this object as input and wraps it in a new object we will call W. The authority updates LockMap[(id, digest(T), com(O))] := \bot and ObjMap[id] := \bot, as well as doing some W-related updates that we will omit.
  3. A subsequent transaction T'' destroys W, unwraps O from it, and sends it to an address. The authority updates LockMap[(id, digest(T''), com(O))] := T and ObjMap[id] := O

This scheme achieves the following:

  • Replay prevention (by uniqueness of the TxDigest and id combination in the LockMap domain)
  • Object ID stability (i.e., O retains the same ID even as it flows in and out of other objects)
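
To make the three steps concrete, here is a minimal Rust sketch of the two maps. Everything in it is an assumption made for illustration: the `Authority` struct, the `publish`/`consume` names, and the use of strings in place of real digests and commitments are not from the spec, and the LockMap value is simplified to the digest of the publishing transaction.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the spec's types; real digests and commitments
// would be cryptographic hashes rather than short strings.
type ObjId = u64;
type TxDigest = &'static str;
type Commitment = &'static str;
type ObjRef = (ObjId, TxDigest, Commitment);

#[derive(Default)]
struct Authority {
    // `None` plays the role of \bot in the description above.
    lock_map: HashMap<ObjRef, Option<TxDigest>>,
    obj_map: HashMap<ObjId, Option<Commitment>>,
}

impl Authority {
    /// Steps 1 and 3: publish `obj` as a top-level object in transaction `tx`.
    /// Returns false if this exact ObjRef is already in the LockMap domain,
    /// which is how a replayed transaction is rejected in this sketch.
    fn publish(&mut self, id: ObjId, tx: TxDigest, obj: Commitment) -> bool {
        let obj_ref = (id, tx, obj);
        if self.lock_map.contains_key(&obj_ref) {
            return false;
        }
        self.lock_map.insert(obj_ref, Some(tx));
        self.obj_map.insert(id, Some(obj));
        true
    }

    /// Step 2: the object is consumed (wrapped or deleted); both entries
    /// are set to \bot. The LockMap key stays in the domain.
    fn consume(&mut self, obj_ref: ObjRef) {
        self.lock_map.insert(obj_ref, None);
        self.obj_map.insert(obj_ref.0, None);
    }
}
```

In this model, publishing O under T, consuming it in T', and publishing it again under T'' all succeed with the same `id`, while replaying the original `(id, digest(T), com(O))` is refused. Note that after `consume`, nothing in `obj_map` tells a client whether O was wrapped or deleted, which is the gap discussed next.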

Discussion

However, there is some funkiness here:

  • For tracking of object provenance, it would be nice for a client to easily determine whether step (2) above deleted O forever, or wrapped it in another object. We could choose to push this problem to application devs (e.g., say "well-written applications should emit an event like Wrapped(W.id, O.id)"), or choose to help more directly (e.g., we could do ObjMap[id] := Wrapped(digest(T)) in step 2 instead, where Wrapped is some special marker pointing to the tx that wrapped the object)
  • It could be nice for the storage layer to have the property that once ObjMap[id] is set to \bot, no subsequent assignments to ObjMap[id] can occur.
@gdanezis
Collaborator

I have a strong prejudice towards also maintaining the sequence number (or increasing it).

One reason is that if we store any structure by (obj_id, seq, tx_digest) keys in a database we can efficiently tell what refers to the 'first' and what to the 'last' version of an object. Whereas if we reuse the seq then a reader may have to traverse the full hash chain of certs / objects to understand what they are missing.
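
As a rough illustration of the point about ordered keys, the sketch below uses a `BTreeMap` as a stand-in for an ordered key-value database. The key types, the `first_and_last` helper, and the 4-byte digest are assumptions chosen to keep the example short.

```rust
use std::collections::BTreeMap;

// Hypothetical simplified key types; real IDs and digests would be wider.
type ObjId = u64;
type Seq = u64;
type TxDigest = [u8; 4];

/// Returns the (seq, tx_digest) of the first and last stored version of `id`.
/// Because keys sort by (obj_id, seq, tx_digest), this is one bounded range
/// scan rather than a walk over a hash chain of certs.
fn first_and_last(
    store: &BTreeMap<(ObjId, Seq, TxDigest), String>,
    id: ObjId,
) -> Option<((Seq, TxDigest), (Seq, TxDigest))> {
    let lo = (id, Seq::MIN, [u8::MIN; 4]);
    let hi = (id, Seq::MAX, [u8::MAX; 4]);
    let mut range = store.range(lo..=hi);

    // Smallest key for this object: its first version.
    let (&(_, first_seq, first_tx), _) = range.next()?;
    // Largest remaining key: its last version. If there was only one
    // entry, the first version is also the last.
    let (last_seq, last_tx) = match range.next_back() {
        Some((&(_, seq, tx), _)) => (seq, tx),
        None => (first_seq, first_tx),
    };
    Some(((first_seq, first_tx), (last_seq, last_tx)))
}
```

This only works if `seq` keeps increasing over the object's lifetime. If a sequence number were reused after unwrapping, two entries for the same object could share a `seq` and the ordering would no longer identify the latest version.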

@sblackshear
Collaborator Author

I agree with the importance of this use-case + think that the sequence number does not provide much value if we can't use it for this (especially if it's not needed for replay protection). I have two proposals for maintaining sequence numbers across wrapping, which I'll write in subsequent comments:

@sblackshear
Collaborator Author

  1. Force the Move representation of FastX objects to carry a sequence number. We already do this for IDs--the verifier requires every struct with the key attribute to have a field of type ID as its first field. This way, when an object gets wrapped, we'll remember its old sequence number if/when it flows out of the wrapper and becomes a top-level object again.

Notes about this approach:

  • a transaction that wraps object O will bump O's sequence number
  • a transaction that unwraps object O and then publishes it as a top level object will bump O's sequence number
  • if O is modified while it is wrapped, its sequence number will not be bumped
  • if O moves between wrappers without being published as a top-level object, its sequence number will not be bumped.
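
The four rules above can be modelled in a few lines of Rust. This is only a sketch of the bookkeeping, not Move code and not the adapter: the `Obj` struct and all function names are hypothetical.

```rust
// Hypothetical Rust model of an object that carries its own version
// (sequence number) alongside its ID, as proposal (1) suggests.
#[derive(Debug, Clone, PartialEq)]
struct Obj {
    id: u64,
    version: u64,
    payload: String,
    // An object wrapped inside this one, if any.
    inner: Option<Box<Obj>>,
}

/// A transaction that wraps `o` into `wrapper` bumps `o`'s version once.
fn wrap(mut wrapper: Obj, mut o: Obj) -> Obj {
    o.version += 1;
    wrapper.inner = Some(Box::new(o));
    wrapper
}

/// A transaction that unwraps the inner object and publishes it at top level
/// bumps its version again, continuing from the value it carried while wrapped.
fn unwrap_and_publish(wrapper: Obj) -> Option<Obj> {
    let mut o = *wrapper.inner?;
    o.version += 1;
    Some(o)
}

/// Modifying the wrapped object in place leaves its version untouched.
fn modify_wrapped(wrapper: &mut Obj, payload: &str) {
    if let Some(o) = wrapper.inner.as_mut() {
        o.payload = payload.to_string();
    }
}

/// Moving the wrapped object directly into another wrapper also leaves
/// its version untouched.
fn move_between_wrappers(from: &mut Obj, to: &mut Obj) {
    to.inner = from.inner.take();
}
```

Because the version travels inside the object, no side table is needed to restore it on unwrap. The cost is that changes made while wrapped are invisible in the version number.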

@sblackshear
Collaborator Author

  2. Enhance the runtime to understand the distinction between deleting an object and wrapping it. Deletions will be handled as before, but wrapping will write a special sentinel value into the ObjMap that contains the sequence number of the object at the time it was wrapped and the TxDigest of the transaction that wrapped it. E.g.,
enum ObjMapValue {
  /// a live object. has a sequence number inside
  Object(Object),
  /// object wrapped somewhere in another live object
  Wrapped { seq: SequenceNumber, wrapped_by: TxDigest },
  /// object permanently deleted
  Tombstone { deleted_by: TxDigest },
}

Notes about this approach:

  • When an object O gets wrapped, its entry in ObjMap is updated with a Wrapped entry containing the sequence number before wrapping and the tx that wrapped it
  • If O is subsequently unwrapped + deleted without being published as a top-level object, the runtime will replace its entry in ObjMap with a Tombstone
  • If O is subsequently unwrapped and published as a top-level object, the runtime will use the sequence number from the Wrapped entry for continuity
  • If O moves between wrappers, the runtime could detect this and update the entry, or it could choose not to.
  • Aside: Wrapped could (alternatively, or additionally) contain the ID of the top-level object that directly or transitively contains the wrapped object
  • Aside: do we need/want Tombstone? I had it in my head (and in the overleaf spec) that it's important for an authority to distinguish between an entry in the ObjMap that is empty because its corresponding ID has not yet been created and one that is permanently empty due to a deletion. But I can no longer convince myself.
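
A possible shape for these transitions, written as a small Rust state machine over the ObjMap. Several details are assumptions rather than anything settled in this thread: the function names, the simplified `Object` type, incrementing the sequence number by one on unwrap (the notes only say the stored number is used for continuity), and treating `Tombstone` as terminal.

```rust
use std::collections::HashMap;

type ObjId = u64;
type SequenceNumber = u64;
type TxDigest = &'static str; // hypothetical stand-in for a real digest

#[derive(Debug, Clone, PartialEq)]
struct Object {
    seq: SequenceNumber,
}

#[derive(Debug, Clone, PartialEq)]
enum ObjMapValue {
    Object(Object),
    Wrapped { seq: SequenceNumber, wrapped_by: TxDigest },
    Tombstone { deleted_by: TxDigest },
}

type ObjMap = HashMap<ObjId, ObjMapValue>;

/// Live -> Wrapped: remember the sequence number at the time of wrapping.
fn wrap(map: &mut ObjMap, id: ObjId, tx: TxDigest) -> bool {
    let seq = match map.get(&id) {
        Some(ObjMapValue::Object(o)) => o.seq,
        _ => return false,
    };
    map.insert(id, ObjMapValue::Wrapped { seq, wrapped_by: tx });
    true
}

/// Wrapped -> Live: continue from the stored sequence number.
/// Bumping by exactly one is an assumption of this sketch.
fn unwrap_to_top_level(map: &mut ObjMap, id: ObjId) -> bool {
    let seq = match map.get(&id) {
        Some(ObjMapValue::Wrapped { seq, .. }) => *seq,
        _ => return false,
    };
    map.insert(id, ObjMapValue::Object(Object { seq: seq + 1 }));
    true
}

/// Live or Wrapped -> Tombstone. A Tombstone (or a missing entry) is never
/// overwritten here, so a deleted ID cannot be assigned again.
fn delete(map: &mut ObjMap, id: ObjId, tx: TxDigest) -> bool {
    let deletable = matches!(
        map.get(&id),
        Some(ObjMapValue::Object(_)) | Some(ObjMapValue::Wrapped { .. })
    );
    if deletable {
        map.insert(id, ObjMapValue::Tombstone { deleted_by: tx });
    }
    deletable
}
```

With this layout a client can tell a wrapped object from a deleted one by reading a single ObjMap entry, and `wrapped_by` points at the transaction to inspect next. It does not address how the runtime discovers that a wrap happened in the first place.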

@sblackshear
Collaborator Author

I'm torn on these. (1) is definitely simpler/more direct, but it leaks more implementation details of FastX into Move, which will make it harder to reuse Move code from elsewhere in FastX (and vice versa). But maybe we can minimize the UX impact of this by requiring the first field to be ObjMetadata { id: ID, version: u64 } (or similar). I think this approach would also minimize the changes to the verifier if we adapt ID immutability and no-leak checks to apply to ObjMetadata instead--CC @lxfind to advise.

(2) is definitely a bit more complex--in particular, the logic to create a Wrapped entry will require "digging" into each output object to see if it wraps another object, which isn't great. But if we think Wrapped has significant value in making object provenance more transparent, I'm open to it (though as discussed in the meeting yesterday, I'm in favor of pushing this responsibility onto clients).

@lxfind
Contributor

lxfind commented Dec 30, 2021

On (2), an alternative is to require calling a native function (e.g. wrap) when wrapping a FastX object into another, that way we can track all the wrapping events.

@lxfind
Contributor

lxfind commented Dec 30, 2021

> On (2), an alternative is to require calling a native function (e.g. wrap) when wrapping a FastX object into another, that way we can track all the wrapping events.

Hmm this may not be enforceable. One can wrap an object by writing into a reference, which cannot be detected by a verifier. That leaves another alternative which is when defining an object struct, if it embeds another object, the embedding needs to go through a wrapper instead of direct embedding.

@sblackshear sblackshear self-assigned this Jan 9, 2022
sblackshear added a commit that referenced this issue Jan 10, 2022
Trying approach (1) to addressing #98. In particular, this:

- Extends the Move `ID` type to include a `version`. This allows all Move objects to carry their version, and thus persist it across wrapping and unwrapping. But having it live inside `ID` saves us from having to rewrite the `ID`-related bytecode verifier passes or ask the programmer to write `struct S has key { id: ID, version: Version, ... }` to declare a FastX object. In fact, there are no changes from the programmer's perspective except that they can now read an object's version inside Move if they wish to.
- Removed `version` from the Rust object types, since it now lives in the Move-managed `contents` field. Added a variety of helper functions for reading and writing the `version` from Rust.
- Changed the adapter to understand the new location of `version`. This actually simplifies the adapter code quite a bit.
- Added a test that demonstrates that sequence numbers are maintained correctly across wrapping and unwrapping.

There is one change introduced here that is important enough to mention separately: *object versions now begin at 1*. That is, if a transaction creates and then transfers an object `X`, the version of `X` in the transaction effects will be 1, not 0. The reasons why this change is needed are somewhat subtle.
- This change happens because the adapter increments the sequence number of every transferred object.
- You could imagine asking the adapter to instead check whether a transferred object was created by the current transaction, passed as an input, or unwrapped, and only incrementing the sequence number in the latter two cases. However, if sequence numbers of freshly created objects started at 0, the adapter would actually not be able to tell the difference between a freshly created object `O1` (will have seq 0) and an object `O2` that was created (will have seq 0), then subsequently wrapped (will still have seq 0), and later unwrapped (will still have seq 0!).
- Another solution to this problem would be incrementing the sequence number of all objects passed as inputs to the transaction before beginning execution. This would allow freshly created objects to start at seq 0, but it would be slightly odd in that the programmer would pass in an object with seq `S`, but would see `S+1` if they try to read the sequence number from Move during the transaction. It seems most intuitive to maintain the invariant that the object you put into the transaction is exactly what you will read inside the transaction. In addition, that would require us to do the "created or transferred/unwrapped" special-casing described above, which makes the adapter a bit more complicated.
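
The ambiguity described above can be shown with a toy model. This is a hedged sketch only: `Versioned`, `create`, `transfer`, and `transfer_alt` are hypothetical names and do not correspond to the adapter's actual code.

```rust
/// Hypothetical model of the version stored inside a Move object's `ID`.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Versioned {
    version: u64,
}

/// Objects are created inside Move with version 0.
fn create() -> Versioned {
    Versioned { version: 0 }
}

/// Adopted rule: the adapter bumps every transferred object, whatever its
/// origin, so a created-then-transferred object appears with version 1.
fn transfer(o: Versioned) -> Versioned {
    Versioned { version: o.version + 1 }
}

/// Rejected alternative: leave freshly created objects at 0 and bump only
/// inputs and unwrapped objects. The adapter must be told `fresh`, because
/// the version alone does not reveal it.
fn transfer_alt(o: Versioned, fresh: bool) -> Versioned {
    if fresh {
        o
    } else {
        transfer(o)
    }
}
```

Under `transfer_alt`, an object that was created, wrapped, and later unwrapped still reads as version 0, exactly like the result of `create()`, so the adapter has no local way to decide which branch to take. Under `transfer`, no such decision is needed.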