Add TAP supporting content addressed targets #156

adityasaky · 2022-06-20T17:06:25Z

This TAP proposes supporting Merkle DAG objects / nodes as targets in TUF metadata. This allows us to extend TUF's protections to various popular applications like Git and other content addressable systems like IPFS and OSTree.

One key change proposed in this TAP is that the ecosystem or application decides how to calculate the hash of its artifact, and this value is re-used.
Generally speaking, to record the hash of a non-file artifact, we need some representation of it. This TAP indirectly proposes using the representation used by the application or ecosystem in question by re-using the hash based identifier.
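To make the "re-used identifier" idea concrete, here is a minimal sketch of how Git derives an object ID: SHA-1 over a type-and-length header followed by the uncompressed content. The helper name `git_object_id` is ours for illustration and is not part of the TAP or of any TUF implementation.

```python
import hashlib


def git_object_id(obj_type: str, body: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to an object of this type."""
    # Git hashes "<type> <length>\0" followed by the uncompressed content.
    header = f"{obj_type} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


# The empty blob has a well-known ID in every SHA-1 Git repository.
print(git_object_id("blob", b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

A TUF implementation following this TAP would record such an ID as provided by the ecosystem rather than hashing a file itself.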

The TAP goes into some detail about the properties the application or ecosystem under consideration must possess to ensure the integrity of the hash values. Feedback as a whole is welcome! There are some discussion points inline that I'm also going to copy here:

Should conforming implementations also implement base TUF for regular files?
What makes sense for length for Git objects? The commit object files pre-compression?
- This is a use-case specific question that's perhaps better handled when considering the ecosystem specifically rather than this TAP as a whole, but I also wonder if we can provide more guidance on fields that are affected by these changes.

znewman01 · 2022-07-07T15:25:43Z

Very cool, I can see many ways in which this would be useful :)

However, I fail to see what about this proposal is specific to Merkle DAGs. For instance, it seems like Docker images could also benefit (longer digression below). But it really just seems to be describing a pluggable way of adding targets that aren't simple files to TUF.

I agree that Merkle DAGs are a good candidate for non-file types of things we want to hash, and that we need to be really careful about the characteristics of such hashing techniques when we add them. But they seem to be inessential to the proposal.

Docker's not the best example here because there is a single file that you can sha256 hash to get the digest---the "manifest", which is a JSON file. I suppose in some sense the manifest represents a Merkle DAG, because it contains hashes of the layers of the image. But checking that a full Docker image has the appropriate digest involves hashing the layers and comparing them to the manifest. So I think it fits.
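The two-step check described above (hash the manifest, then hash each layer and compare against the manifest) can be sketched as follows. This is a simplified illustration: it assumes a manifest with a `layers` list whose entries carry a `digest` field, ignores the image config and media types, and the names `sha256_digest` and `verify_image` are hypothetical.

```python
import hashlib
import json


def sha256_digest(data: bytes) -> str:
    """Format a SHA-256 hash the way image digests are usually written."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def verify_image(expected_digest: str, manifest_bytes: bytes,
                 blobs: dict[str, bytes]) -> bool:
    """Check the manifest digest, then every layer digest the manifest lists."""
    if sha256_digest(manifest_bytes) != expected_digest:
        return False
    manifest = json.loads(manifest_bytes)
    for layer in manifest["layers"]:
        blob = blobs.get(layer["digest"])
        # A missing or altered layer fails verification.
        if blob is None or sha256_digest(blob) != layer["digest"]:
            return False
    return True


# Toy image with a single layer (all data here is made up).
layer = b"layer-data"
manifest = json.dumps({"layers": [{"digest": sha256_digest(layer)}]}).encode()
print(verify_image(sha256_digest(manifest), manifest,
                   {sha256_digest(layer): layer}))
```

Only the manifest digest would need to appear in TUF metadata; the layer checks follow from the hashes embedded in the manifest.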

trishankatdatadog · 2022-07-07T19:16:38Z

Cc @erickt

adityasaky · 2022-07-07T19:54:42Z

Hi Zack, thanks for your comments! I want to note some thoughts:

I agree that Merkle DAG objects are a subset of entities we can record in TUF metadata. Initially, the idea with this specific TAP was to not clear the way for all non-file entities (that would require something like in-toto's ITE-4: https://github.com/in-toto/ITE/blob/master/ITE/4/README.adoc). In most cases, we'd have to specifically define how to calculate the hashes of these abstract entities, as there may well be nothing we can use as is. ITE-4 handles that using something called a "hashable representation".
The reason it talks about Merkle DAG ecosystems specifically is because apart from opening the door to non-file entities, it also allows for hash values that the TUF implementation didn't explicitly calculate, but was instead provided with. While there are varying degrees of verifiability for these values (it's quite straightforward to verify a Git commit's ID for example), one other thing this draft says is that the valid existence of a node in the graph can be used for that node's verification (Git does this well, for example, erroring out when there are invalid / tampered commit objects). This is where the idea of trusted ecosystems that validate their Merkle DAGs came from, given regular auditing of these applications / ecosystems to ensure new changes haven't undermined the assumptions made.
We may not want to go this route for any ecosystems, instead requiring values to be explicitly validated each time. In this case, the TUF implementation would likely have to be aware of the nuances of the ecosystem in greater detail than it'd need to verify the existence of a node in a trusted application. I think this could get quite complicated when we get to "storage backends" like IPFS. I'm currently working on a python-tuf proof-of-concept that's looked at Git so far, but IPFS is what I want to play with next.

There was interest in the last community meeting for something like ITE-4, i.e., more open to non-file entities than what is here, so you're not alone in that regard.

merkle-dag-targets.md

POUFs/TAF-POUF/pouf2.md

lukpueh · 2023-02-23T16:13:56Z

This is interesting. Which of TUF's security properties do we actually care for here? Is TUF's integrity protection even relevant when targets are content addressable? In other words, does the client need to verify the target hash at all, or is it enough that the target path is in targets metadata?

adityasaky · 2023-03-07T18:00:50Z

> Which of TUF's security properties do we actually care for here?

That's a great point and I think emphasizing that in the text will greatly help. With content addressed systems, we care about all of TUF's properties minus artifact integrity. Let me take a pass on that.
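A rough sketch of what "all properties minus artifact integrity" could look like on the client side: the client does not rehash the artifact, it only checks that the identifier reported by the ecosystem matches the one listed in targets metadata that has already passed TUF's usual signature, expiry, and rollback checks. The flat `path -> identifier` mapping and the name `is_authorized` are assumptions for illustration, not the TAP's metadata format.

```python
def is_authorized(trusted_targets: dict[str, str], target_path: str,
                  reported_id: str) -> bool:
    """Return True if the ecosystem-reported ID matches trusted metadata."""
    expected = trusted_targets.get(target_path)
    # No rehashing here: integrity of the content behind the ID is left
    # to the content addressed system itself.
    return expected is not None and expected == reported_id


# Hypothetical, already-verified targets metadata reduced to a mapping.
trusted = {"example-repo/main": "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391"}
print(is_authorized(trusted, "example-repo/main",
                    "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391"))
```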

Also note that we've proposed a prototype of this TAP as a GSoC 2023 task. That should help us clarify some of these ideas and better evaluate how this TAP would work in practice.

adityasaky · 2023-03-22T14:08:45Z

@lukpueh I reworked the TAP to focus on TUF properties that matter outside of artifact integrity validation. LMK what you think!

tap19.md

trishankatdatadog · 2023-03-22T19:57:25Z

(Sorry, just wanted to say pls count me out of reviewing this TAP now as I will be on a few weeks of PTO. Thanks!)

jkjell

This looks great and sounds really interesting! I dropped a bunch of noob questions in my review. One thing I wasn't quite sure where to put is that many of these content addressable systems are commonly, and often implicitly, used in a distributed environment. This often leads to separate integrity checks at the "server" and the "client" of the application. I don't know if that needs to be addressed more explicitly or would simply be covered in the "Security Assessment" of the ecosystem.

POUFs/TAF-POUF/pouf2.md

tap19.md

lukpueh

Thanks for the updates, @adityasaky, this looks a lot better!

tap19.md

renatav

I think that there are a couple of things that should be covered by the POUF according to the TAP that we haven't addressed in the TAF POUF, like backwards compatibility. Is that a blocker?

tap19.md

POUFs/TAF-POUF/pouf2.md

tap19.md

POUFs/TAF-POUF/pouf2.md

JustinCappos

I have a few minor concerns with this as stated in my comments. I am supportive in general, but would like to hear more from other community members (especially Lukas, Jussi, and Marina).

If we had a more definitive policy about having "core TUF" and "TUF extensions" this would be an easy, immediate approve from me as a TUF extension.

znewman01 · 2023-04-25T23:24:34Z

CC @sudo-bmitch who mentioned that OCI is a little weird—they wrap the content-addressed blobs in some metadata, then use that as the hash

(I know OCI isn't a primary use case here, but it could be an interesting one.)

sudo-bmitch · 2023-04-26T00:00:30Z

The TL;DR on OCI is you have the following:

- A tag that points to a manifest (effectively a mutable symbolic link to a hash)
- A manifest, which is an OCI JSON structure containing either a list of manifests or blob hashes (but not both)
- Blobs, which are any data you want, stored by hash

If a blob isn't referenced by a manifest, a registry will usually garbage collect it after some time. So to structure things in OCI you'd want to identify what the individual blobs need to be, manifests to reference those blobs, and what tags to use to locate the manifests. The sha256 hash of the blobs will be the same across different CAS implementations, but the hash of an OCI manifest will probably only exist in the OCI implementation (but it may be similar in concept to the Git directory listing and hash).
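The tag, manifest, and blob relationship described above can be modelled roughly as below. This is a toy in-memory model, not a registry API: the `Manifest` class, the `reachable_blobs` helper, and the placeholder digest strings are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Manifest:
    """A manifest lists either other manifests or blobs, never both."""
    manifests: list[str] = field(default_factory=list)
    blobs: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.manifests and self.blobs:
            raise ValueError("a manifest references manifests or blobs, not both")


def reachable_blobs(tags: dict[str, str],
                    manifests: dict[str, Manifest]) -> set[str]:
    """Collect blob digests reachable from any tag; the rest may be collected."""
    seen: set[str] = set()
    blobs: set[str] = set()
    stack = list(tags.values())
    while stack:
        digest = stack.pop()
        if digest in seen or digest not in manifests:
            continue
        seen.add(digest)
        stack.extend(manifests[digest].manifests)
        blobs.update(manifests[digest].blobs)
    return blobs


tags = {"latest": "m-index"}
store = {
    "m-index": Manifest(manifests=["m-image"]),
    "m-image": Manifest(blobs=["b1", "b2"]),
    "m-orphan": Manifest(blobs=["b3"]),  # not tagged, so b3 is unreachable
}
print(sorted(reachable_blobs(tags, store)))  # ['b1', 'b2']
```

In this picture, a TUF target would most naturally name the digest of the manifest a tag resolves to, since the tag itself is mutable.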

mnm678

One nit, but this looks ready to merge as a draft

tap19.md

Signed-off-by: Aditya Sirish <[email protected]> Co-authored-by: Renata Vaderna <[email protected]> Co-authored-by: John Ericson <[email protected]>

JustinCappos

Given the recent changes, I'm happy to approve merging this as a draft.

adityasaky · 2023-05-18T19:53:27Z

@sudo-bmitch thanks for the OCI specific information! This TAP should cleanly apply to OCI by using the digest of the "root" manifest. I'd prefer to add this use case in a separate PR though rather than in this one, so that we can explore this structure some more.

adityasaky force-pushed the merkle-dag-tap branch 2 times, most recently from 39bbc23 to fee9704 Compare June 22, 2022 14:45

adityasaky force-pushed the merkle-dag-tap branch from fee9704 to a1a3e77 Compare July 24, 2022 14:31

erickt reviewed Jul 24, 2022

merkle-dag-targets.md (four review comments, outdated and resolved)

adityasaky force-pushed the merkle-dag-tap branch from a1a3e77 to 027da94 Compare July 26, 2022 06:02

adityasaky force-pushed the merkle-dag-tap branch 2 times, most recently from 6d83f29 to 50b65db Compare November 30, 2022 15:45

adityasaky changed the title ~~Add TAP supporting Merkle DAG targets~~ Add TAP supporting content addressed targets Jan 25, 2023

lukpueh reviewed Feb 23, 2023

POUFs/TAF-POUF/pouf2.md (outdated, resolved)

This was referenced Mar 7, 2023

Prototype support for content addressable systems such as IPFS theupdateframework/python-tuf#2325

Open

Add TAP enabling a verifiable chain of TUF metadata versions #157

Closed

adityasaky force-pushed the merkle-dag-tap branch 3 times, most recently from 941d7c5 to 91d5caf Compare March 15, 2023 21:46

adityasaky force-pushed the merkle-dag-tap branch from 111ba4b to 98f8556 Compare March 22, 2023 14:09

mnm678 reviewed Mar 22, 2023

tap19.md (outdated, resolved)

tap19.md (resolved)

jkjell reviewed Mar 23, 2023

POUFs/TAF-POUF/pouf2.md (outdated, resolved)

tap19.md (resolved)

tap19.md (resolved)

lukpueh reviewed Mar 29, 2023

tap19.md (outdated, resolved)

tap19.md (outdated, resolved)

adityasaky force-pushed the merkle-dag-tap branch from ebaace5 to 5112977 Compare April 12, 2023 15:40

adityasaky requested review from lukpueh and mnm678 April 12, 2023 16:22

renatav reviewed Apr 12, 2023

tap19.md (outdated, resolved)