This repository has been archived by the owner on Feb 8, 2023. It is now read-only.

IPFS: add a transport layer (and append-only data structures) #148

Open
gritzko opened this issue Jul 24, 2016 · 8 comments

Comments

@gritzko

gritzko commented Jul 24, 2016

(moved from an e-mail thread with @jbenet @haadcode @diasdavid)
Consider TCP/IP (Transmission Control Protocol / Internet Protocol). Once you open a network address, you get a data stream. That affords a certain flexibility, so IP becomes the hourglass waist: IP can run on top of anything, and everything can be implemented on top of IP. Most application-level protocols run on top of TCP, which provides the data-pipe API.

[Image: the TCP/IP hourglass diagram (Zittrain)]

IPFS as such is a git-like graph of static blobs. That makes it simple enough to be the waist, but too limiting to become everybody's base API (data pipe). Can it be the waist then?
I should be able to open a content address (hash or sig) to get a stream of frames.
That is the base API for 95% of stuff.

Streams of immutable frames take a convenient middle ground between mutability and immutability.
IMO, IPFS needs two further steps of generalization:

  • add append-only frame streams as a generalization of the DAG
  • add hyperlogs/partially ordered logs as a further generalization (this allows for multiple concurrent writers).

That way IPFS becomes the waist of sufficient expressive power, on par with TCP/IP.
Merkleweb is simple enough (IP) and streams provide the API to build on.
The current diagram shows "naming" between "merkledag" and "applications".
That is why it feels so strange: naming tries to play the role of the transport layer here!

[Image: IPFS stack diagram]

The merkledag is the "thin waist" of authenticated data structures: a minimal set of information needed to represent and transfer arbitrary authenticated data structures. More complex data structures are implemented on top of the merkledag.

IPFS needs a transport layer and a transport API.
Something like https://github.com/haadcode/ipfs-log must be a part of the waist.
That will require pubsub/multicast machinery, obviously.

As a side note, I would recommend borrowing some of the crypto machinery from RFC 7574, which was designed for this kind of use case. (Just ping me if you need it generalized to partial orders.)

(23 Jul: created; 24 Jul: edited for clarity, moved to notes)

@jbenet
Member

jbenet commented Jul 24, 2016

some quick notes:

  • thanks very much for your thoughts and recommendations! 😄 👍
  • We disagree on the semantics of some terms. it is not my hope here to find agreement on "what is more right", just to establish how we think about it.
  • as mentioned elsewhere, we ARE working on pubsub and it WILL be exposed to ipfs applications. but it's not meant to be the thin waist for authenticated data structures.
  • the stack diagram above is not complete, at all. that's just a projection that's useful to think about.
  • this is a more complete diagram, but even still incomplete:
  • of course naming is not a transport... naming is one of several layers that can be (is) added between IPLD and applications. it depends very much on what they use. naming is shown there because many applications use it, and there's an important feature for IPFS: IPNS naming. Also IPNS naming itself depends on underlying transport changes (eg dht vs pub/sub).
  • for us, thin-waist does not always require a transport. for example JSON is a thin waist of APIs and it works over a large variety of transports (HTTP, REST, custom RPCs, telehash, and more).
  • HTML5 (HTML, JS, CSS3) is a thin waist of many application platforms, also deployed over a variety of transports, in the web (http, http2), native apps (electron, cordova), and more.
  • of course the transport is critical. but not the main point of our "thin waist" description. (similar to HTML/JSON vs HTTP). the transport actually varies depending on the use case. you can glob all the relevant transports together and say it's a waist with a common interface (IPFS or even libp2p), but we prefer to think of the underlying data structures and semantics as "the thin waist for authenticated data structures". that's much more important and more likely to be used in other systems that have very specific transport/distribution mechanics.
  • Think of IPFS as a transport for IPLD (eg HTTP to HTML/JSON). Of course, IPFS aims to cover many general cases and provide a good enough transport for all IPLD data structs, but it won't cover all cases / applications.

@jbenet
Member

jbenet commented Jul 24, 2016

btw i should mention that i think it will take us a few rounds of discussion like this and being exposed to similar use cases/requirements to synchronize our views, so it's totally fine to disagree lots for a while.

@gritzko
Author

gritzko commented Jul 27, 2016

A Merkle DAG is a really good abstraction for the waist. Naturally so.
But it is strictly immutable, so to implement any mutability you'll need side channels. Hence, IPLD is not the waist.

Consider TCP/IP. The stuff below the waist works in terms of packet/datagram/frame forwarding. The stuff above works in terms of data streams. Both abstractions are very natural, very general, very convenient. The magic is how streams turn into packets and vice-versa.

IPLD object forwarding is perfectly OK for the lower part of the hourglass: objects are immutable, cacheable, integrity-checked. But the upper part, IMO, needs that data-stream foundation to build on.

I imagine:

    // get a static blob
    var stream = ipfs_api.open("LONG_LONG_HASH");
    // get a git-like DAG
    var dag_stream = ipfs_api.open("HEAD_HASH", O_RECURSIVE);
    // get a live video feed
    var live_video_stream = ipfs_api.open("STREAM_PUBLIC_KEY", O_FOLLOW);
    // get a partially ordered database log (hyperlog)
    var db_op_log_stream = ipfs_api.open("INITIAL_PUBLIC_KEY",
        O_FOLLOW | O_FOLLOW_INVITED_KEYS);

The upper interface is then a stream of immutable IPLD objects/frames; the lower interface is object/frame forwarding. You can focus on the magic in between and let the crowd's creativity blossom above and below.

[Image: whiteboard sketch, 27 Jul 2016]

Also, the "pub/sub" mental model is possibly off the mark a bit. When we open a TCP connection to receive live data, we don't consider it "pub/sub". We just "read" from a network "address". If we can "just read" from a content address (hash or key), then we have it.

28Jul edit: hash or key

@jbenet
Member

jbenet commented Aug 1, 2016

@gritzko okay i think we have found agreement! \o/

And wow, what agreement! in the last few days i think i came to understand more of what you initially proposed (sorry if we were speaking past each other). glad we synchronized much faster :) and i think it's great.

This last post jibes enormously well with what @mikolalysenko @nicola @diasdavid and I have been discussing in Lisbon the past few days. We've reviewed a bunch of pub/sub lit and landed on "pub/sub of IPLD objects" being the best way to make pub/sub work, but also to upgrade the IPFS core interface with a corecursive programming model (in addition to the existing recursive support).

The gist (people are writing this up) is that we want to be able to subscribe to a given key (i.e. representing some IPLD object), and receive (as emitted pub/sub messages) objects that link to the key, either directly or in a log. The stream can fork (i.e. objects can be sent that do not form a strict log) for purposes of partition tolerance. Different heads can be merged back and published (like git, blockchains, hyperlog, orbit-db). Any object gaps (due to partitions, being offline, failures, or omission) can just be retrieved normally. There's a lot more, but this gist should give you the idea that i think we're on the same page :) 👍
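
A rough in-memory sketch of that "pub/sub of IPLD objects" idea, assuming a toy `subscribe`/`publish` interface (none of these names are a real API, and real links would be hashes, not labels): objects that link back to a subscribed key are delivered to subscribers, two objects with the same `prev` form a fork, and a later object listing both heads merges them, git-style.

```javascript
// Illustrative only: a tiny in-memory pub/sub of link-carrying objects.
const subscribers = new Map(); // topic key -> array of callbacks

function subscribe(key, cb) {
  if (!subscribers.has(key)) subscribers.set(key, []);
  subscribers.get(key).push(cb);
}

// Publishing an object on `key` notifies every subscriber of that key.
// In the real design, the object would link to the key (directly or
// through a log) and gaps could be fetched on demand.
function publish(key, obj) {
  (subscribers.get(key) || []).forEach((cb) => cb(obj));
}

const seen = [];
subscribe('ROOT_KEY', (obj) => seen.push(obj));
// Two concurrent writers fork from the same head...
publish('ROOT_KEY', { prev: ['ROOT_KEY'], data: 'fork A' });
publish('ROOT_KEY', { prev: ['ROOT_KEY'], data: 'fork B' });
// ...and a later object merges both heads back into one history.
publish('ROOT_KEY', { prev: ['fork A', 'fork B'], data: 'merge' });
```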

@mikolalysenko

@gritzko it is spooky how close that is to what @nicola and I were talking about a few days back.

@nicola
Member

nicola commented Aug 1, 2016

@gritzko, @mikolalysenko and I were talking exactly about this:

Also, the "pub/sub" mental model is possibly off the mark a bit. When we open a TCP connection to receive live data, we don't consider it "pub/sub". We just "read" from a network "address". If we can "just read" from a content address (hash or key), then we have it.

I will sync again with @mikolalysenko (the gist would have been ready if I had not been sick!)

@nicola
Member

nicola commented Aug 1, 2016

This is more or less what we had at the end of the night:
[Image: notebook photo]

@jbenet
Member

jbenet commented Aug 1, 2016

Thanks for the notebook picture @nicola. the serendipity with this issue continues. I wrote this, got interrupted before posting, and when i came back:


5 participants