This repository has been archived by the owner on Jun 29, 2022. It is now read-only.

add basic spec for hamt #109

Closed
wants to merge 2 commits into from

Conversation

whyrusleeping
Contributor

I don't think this is done yet, but it should be good enough to start an implementation with. Please review for missing details, grammar, and confusing stuff.

@mikeal
Contributor

mikeal commented Apr 1, 2019

@rvagg has actually been working on an implementation. It would be nice to eventually align his, the Go implementation, and the Java HAMT that Peergos wrote.

@whyrusleeping
Contributor Author

Yeah, would definitely love some review from implementers. Tagging him and Ian now.

@whyrusleeping whyrusleeping requested a review from rvagg April 1, 2019 22:35
@whyrusleeping
Contributor Author

cc @ianopolous

The `KV` is serialized as a cbor array (major type 4) with the 'key' field
serialized as a cbor string (major type 3) (TODO: should this just be major
type 2? it's probably good to support arbitrary bytes as keys) and placed in the
Contributor

👍

Member

Yep, we also use byte[] as keys. In our main usage they are random 32 byte arrays (not a hash).

type 2? it's probably good to support arbitrary bytes as keys) and placed in the
zeroth position of the array, and the value serialized { in some way } and
placed in array position 1.
Contributor

A cbor map would be even more compact (although, I guess, IPLD doesn't currently support binary keys...).

Contributor Author

hrm.. yeah. that might get complicated



## Set Value
Contributor

Weren't we considering some kind of hashing with replacement system to completely fill up each layer? Or was that too expensive?

Contributor Author

I didn't end up investigating it too much; it feels like it might get pretty expensive.

Contributor

The buckets system is probably good enough for most cases. What test data did you use when testing depth?

Contributor Author

Just random keys and values: did some number of inserts, measured average densities, depths, etc. I don't think I committed the code, but it was just the tests in go-hamt-ipld, with some stats collection. Could rig it up again pretty quickly.


```go
type Node struct {
	// ...
}
```
Contributor

Could we add an optional seed (defaults to 0)? That way we have room to fix the hashmap DoS attack if necessary.

Member

I've been messing with hash algorithm pluggability thinking that (a) different algorithms (and different key lengths) might be optimal in different scenarios, and (b) having the ability to switch it out provides some future-proofing in case of fundamental flaws being discovered in a chosen algorithm. Having space for a seed would also open up space for some keyed algorithms too.
What I can't see (yet) is what kind of use-cases of IPLD are there that attacks against the hash would matter? What's the threat model where this is a concern or is it just a matter of being safe for some as yet unforeseen scenario?


The hashmap DoS attack works by an attacker inserting keys that hash to the same bucket in a hash map. Something very similar can be done with a HAMT by selecting keys that land in the same branch of the trie.


A seed/nonce works around it by making it so the attacker can't simply predict which branch of the tree a given key will land in.

Contributor Author

I'm not actually sure how adding a seed helps here. It needs to be deterministic, and if it's deterministic, then the attacker can know it too and it doesn't make their lives any harder.

I guess forcing a rehash at each layer makes the attack linearly more expensive, but doesn't necessarily prevent attacks.

Contributor

So, we actually do need to support, e.g., sha256 if we want both security and determinism on systems like Filecoin. Currently, given an N byte insecure hash function, an attacker could create a tree N deep at the target hash, filling the last layer. This could be used to prevent anyone from using a specific key.

Contributor

I have a hard time imagining wanting to use non-deterministic balancing in a distributed system. It seems that would produce massive flapping if actually used in a frequently updated dataset, and/or introduce a need for coordination where there previously wasn't one (which is pretty much universally a Bad Thing in a distributed system). Is there a concrete situation where we can imagine using such a thing, and using it well?

"Use a SHA (or other cryptographic function) when it matters" sounds like a much better approach. We're already tossing around enough cryptographic functions that it doesn't sound likely to be much of a cost center to introduce another.

Contributor

I have a hard time imagining wanting to use non-deterministic balancing in a distributed system. It seems that would produce massive flapping if actually used in a frequently updated dataset, and/or introduce a need for coordination where there previously wasn't one (which is pretty much universally a Bad Thing in a distributed system). Is there a concrete situation where we can imagine using such a thing, and using it well?

So, the simple use-case is a block-chain where the blockchain determines the seed. Once every N blocks, the hamt would be reseeded automatically.

In general, this will also work for single-writer, multi-reader setups. That's usually the most common case.


"Use a SHA (or other cryptographic function) when it matters" sounds like a much better approach. We're already tossing around enough cryptographic functions that it doesn't sound likely to be much of a cost center to introduce another.

I agree, although this still won't be optimal. That is, an attacker could pretty easily create a very deep hamt.


Basically, I'd prefer to leave room for future improvements now instead of having to introduce them later. However, custom hash functions are probably enough.

Contributor Author

@Stebalien what you're suggesting would mean rewriting the entire HAMT every epoch. I think that's far worse than a (worst case) 32 deep lookup.

Contributor

I'm saying it could be rebalanced as necessary. However, I agree the better solution is to just use SHA256.

Note: The worst case without using sha256 isn't a 32 deep lookup, it's a full kv list at the max depth preventing further modifications.
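For illustration, the seeded-hash idea debated above amounts to prefixing the seed to the key before hashing, so that an attacker who can't predict the seed can't precompute keys that land in the same branch. FNV-1a stands in for murmur3 here purely because it's in the Go standard library; nothing below is part of the spec:

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
)

// seededHash hashes an 8-byte seed followed by the key. Changing the
// seed changes every key's placement, which is the rebalancing cost
// the thread is weighing against the DoS protection.
func seededHash(seed uint64, key []byte) uint64 {
	h := fnv.New64a()
	var s [8]byte
	binary.LittleEndian.PutUint64(s[:], seed)
	h.Write(s[:])
	h.Write(key)
	return h.Sum64()
}

func main() {
	// The same key lands somewhere different under a different seed.
	fmt.Println(seededHash(0, []byte("foo")) != seededHash(1, []byte("foo")))
}
```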

To look up a value in the HAMT, first hash the key using a 128 bit murmur3 hash.
Then, for each layer take the first W bits of the hash, and use that to compute
the index for your key, as follows:
Contributor

Isn't W always a single byte?

Member

if it's fixed to an arity of 256, but I'm wondering why 256 is chosen here, is it mainly for the ease of accessing the hash in 8-bit chunks?


I think we cap the number of links in protobuf nodes at ~170, so I think the conservative approach was taken and the number of links was kept roughly similar.

Contributor Author

Yeah, wanted to make it a number that would result in acceptable maximum node sizes, but 256 specifically was chosen simply because it makes reading the next index off the hash easy.

Contributor

So, there is a pathological case here: large keys. Should we impose a size limit (256 bytes?) as most filesystems do?

Contributor Author

I think a keysize limit is worthwhile, though that should probably be left up to the application.
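The index calculation sketched in the quoted text (take the next W bits of the key hash, then count the set bits below that position in the node's bitfield to find the slot in the compacted Pointers array) looks roughly like this for W=8; the 256-bit bitfield layout and function names here are illustrative, not taken from go-hamt-ipld:

```go
package main

import (
	"fmt"
	"math/bits"
)

// nextBits returns the W-bit chunk of the key hash for a given depth;
// with W=8 it's simply the depth'th byte.
func nextBits(hash []byte, depth int) int {
	return int(hash[depth])
}

// indexForBitPos maps a bit position (0-255) to an index in the
// compacted child array by counting the set bits below it in a 256-bit
// bitfield, stored here as four uint64 words.
func indexForBitPos(bitfield [4]uint64, bitPos int) int {
	idx := 0
	for i := 0; i < bitPos/64; i++ {
		idx += bits.OnesCount64(bitfield[i])
	}
	return idx + bits.OnesCount64(bitfield[bitPos/64]&((1<<(bitPos%64))-1))
}

func main() {
	var bf [4]uint64
	bf[0] = 0b1011 // positions 0, 1 and 3 are occupied
	fmt.Println(indexForBitPos(bf, 3)) // prints 2: two entries sit below position 3
}
```

The key-size concern above matters because the key itself (not just its hash) is stored in the bucket, so unbounded keys mean unbounded node sizes.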

If the lookup terminated on a non-nil Pointer with existing KVs:

1.) If the KVs array has fewer than three items in it, insert the new key value
Member

Why fix this at 3?


From my rough math, the maximum size of a HAMT object with 256-byte keys, 36-byte CIDs, 256 slots each filled with 3 items, and 20% CBOR encoding overhead comes out to about 256KiB, which is the default size of an IPFS block.

Member

ah, this is good context, thanks

Contributor Author

Yeah, I also ran some experiments with different numbers, and setting it to 4 didn't seem to improve things much.

I'll make '3' a constant and define it
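The rough sizing math earlier in this thread can be sanity-checked in a few lines (all figures come from the comment above, not from the spec itself):

```go
package main

import "fmt"

// maxNodeSize estimates the worst-case serialized size of a fully
// loaded node: 256 slots, 3 KV pairs per slot, 256-byte keys, 36-byte
// CIDs, plus ~20% CBOR framing overhead.
func maxNodeSize() int {
	const (
		slots    = 256 // node arity (W = 8)
		perSlot  = 3   // max KV pairs per bucket
		keyBytes = 256 // assumed key-size cap
		cidBytes = 36  // typical CIDv1 length
	)
	raw := slots * perSlot * (keyBytes + cidBytes)
	return raw * 12 / 10 // +20% encoding overhead
}

func main() {
	fmt.Printf("~%d KiB\n", maxNodeSize()/1024) // ~262 KiB, near the 256 KiB default block size
}
```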


1.) If the KVs array has fewer than three items in it, insert the new key value
pair into the KVs array in order.
Member

*key order

but we also need to define what that means, strict byte comparison order probably

insert those four items into that node starting from the current depth
(meaning, if the current tree depth is 3, skip the first `3 * W` bits of the
key hash before starting index calculation).
Member

and put the cid of the resulting node in the current node where the KV array was


Now, count the total number of KV pairs across all Pointers in the current
Node. If that number is less than four, gather the remaining KV pairs, delete
Member

Except if we are at depth 0 (the root)

@achingbrain
Member

@rvagg has been working on an implementation actually. it would be nice to eventually align his, the Go implementation, and the Java HAMT that peergos wrote.

There's also hamt-sharding which is used by js-IPFS and was extracted from the unixfs-importer that @pgte wrote.

@mikeal
Contributor

mikeal commented Apr 3, 2019

@achingbrain that’s good to know. I thought it was still inside unixfs-importer and I had assumed it was tied to dag-pb but it looks like it doesn’t rely on any serializer which is great!

the node, and re-insert them. If the node they are re-inserted into also then
has less than four elements in it (the newly reinserted elements are the only
ones in the node) then recurse.
Node. If that number is less than four, and we are not at the root node of the
Member

less than M+1
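Putting the deletion discussion together, the collapse check being refined here (fold a node back into its parent when its total KV count drops below M+1, except at the root) can be sketched as follows; `shouldCollapse` and its signature are illustrative, not from the spec:

```go
package main

import "fmt"

// shouldCollapse reports whether a node can be folded back into its
// parent after a delete: it must not be the root, must hold no links
// to child nodes, and its total KV count must be at most bucketSize
// (i.e. less than M+1, per the review comment).
func shouldCollapse(totalKVs, bucketSize, depth int, hasChildLinks bool) bool {
	if depth == 0 || hasChildLinks {
		return false
	}
	return totalKVs < bucketSize+1
}

func main() {
	fmt.Println(shouldCollapse(3, 3, 2, false)) // true: 3 KVs fit in one bucket
	fmt.Println(shouldCollapse(3, 3, 0, false)) // false: never collapse the root
}
```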

@mikeal
Contributor

mikeal commented Apr 3, 2019

A few top level comments.

  • This is very cbor specific, and we’ve been trying to stay away from format specific specs when we can instead leverage the data model. That way, the same spec can be implemented across different dag implementations that implement the full data model.
  • We need to add some kind of identifier to the root node of the structure that tells us the structure is a hamt. Collections in general need to be self describing, so we need to find a good way to do that which we can then leverage for other collections down the road. Something like a _collection property that is set to hamt-v1.
  • It would be nice if we could add some basic path selectors to the bottom of the spec to show how to look up a key in the graph.

@rvagg
Member

rvagg commented Apr 4, 2019

So my experimental JavaScript version is @ https://github.com/rvagg/iamap; it's more similar to the Peergos CHAMP (and Steindorfer's CHAMP PoC) than go-hamt-ipld. It's agnostic to the storage system, but there's an IPLD-ish example that encodes with dag-cbor and uses it as a Map whose values are links to Sets (the same data structure, but where the keys are the thing we care about and the values just exist to make it work; iterating over it with keys() gives us an unsorted Set).

The major difference with the current go-hamt-ipld is that I went with CHAMP's separate datamap and nodemap with the elements stored as data from the left and nodes from the right, which is also what @ianopolous did with the Peergos CHAMP. However, I'm not convinced the benefits of that can be taken advantage of when you're dealing with a non-memory backing store so maybe I'll just collapse that into a single map.

Aside from that, the other thing I've done is just make things flexible:

  • Hash algorithm—you have to register a hash function before you can use a map, and it checks those against the multicodec table. The annoying bit is that you can't be certain how many bytes a hash function produces without either testing it or being told up front. I have considered taking the suffix of the multicodec name, because it's accurate in enough cases (not all, but maybe special-casing to disallow those is good enough).
  • bit width—how many bits for each level, therefore how many elements
  • bucket size

The pre-serialized shape of a node is:

```
{
  codec: multicodec<byte[]>,
  bitWidth: <int>,
  bucketSize: <int>,
  depth: <int>,
  dataMap: <int>,
  nodeMap: <int>,
  elements: [
    [ [ key<byte[]>, value<any> ], [ key<byte[]>, value<any> ], [ key<byte[]>, value<any> ] ],
    [ [ key<byte[]>, value<any> ], [ key<byte[]>, value<any> ] ],
    { link: <CID> }
  ]
}
```

Where elements are either key/value pairs or links to child nodes.

I was thinking of moving that to an array to make it more naturally compact, something like:

```
[
  [ codec, bitWidth, bucketSize ],
  depth,
  dataMap,
  nodeMap,
  [ ... elements ... ]
]
```

Where the map's configuration options are the first part (and could expand easily, perhaps adding a seed or some other configuration option for a future variant), the rest is the state of the current node. This would be a very compact serialization across multiple formats without too much special casing.

So far, as I play with this I have some suggestions and some unresolved questions:

  • datamap + nodemap vs single map—with the former there's minor traversal benefits and organisation of elements is a little nicer, but might not be worth the complexity
  • Storing map configuration in the root node vs all nodes—I like having them in all nodes because it makes state management during traversal super easy, but it's not strictly necessary and you could shave off ~12 bytes or so without it (is that worth it?).
  • Allowing arbitrary values, not just CIDs—currently I'm allowing anything and it's turning out to be pretty helpful. If you have simple values, especially ones that are smaller than a CID then it's quick and uncomplicated. The Set use-case is pretty neat too and that just needs a small token as a value. If we get this right then it would be very simple to expose both an IPLD Map and an IPLD Set using the same underlying structure. Steindorfer even gives us examples with his OOPSLA'15 artifacts (TrieMap_5Bits.java is the standard CHAMP and TrieSet_5Bits.java is a Set version).
  • Configurability of both bit-width and bucket size—I know the current go-hamt-ipld is for a specific filecoin use-case but I'd really like to have some flexibility in data shape for other use-cases that may operate under completely different conditions with completely different data types. Perhaps in the browser you'd prefer deep structures so fetching smaller nodes is quick? Perhaps you have tiny values that you are storing in-line so want much larger bit-width? Being able to vary depth & "thickness" at least gives us the ability to experiment. It seems to me that it's too early to have a good understanding of what use-cases this might have outside of IPFS & filecoin and how different shapes might impact those.
  • I think I'm in favour of hash pluggability. It's a neat feature and I'm enjoying the flexibility it affords (and being able to force identity hashes for tests works great). A quick & short hash is great for small and simple data, but maybe you have a massive data set that could blow out a smaller hash (murmur3 @ 128 gives you 16 levels if you're taking 8 bits each level, which is a lot, but only if the hash is distributing perfectly evenly, a theoretical max of 2^(hashBytes/bitWidth)*bucketSize I think, 16M for this current spec) or maybe you need the assurance of randomness that a proper cryptographic hash gives you?

@ianopolous
Member

* Allowing arbitrary values, not just CIDs—currently I'm allowing anything and it's turning out to be pretty helpful. If you have simple values, especially ones that are smaller than a CID then it's quick and uncomplicated. The Set use-case is pretty neat too and that just needs a small token as a value. If we get this right then it would be very simple to expose both an IPLD Map and an IPLD Set using the same underlying structure. Steindorfer even gives us examples with his [OOPSLA'15 artifacts](https://github.com/msteindorfer/oopsla15-artifact/tree/master/pdb.values/src/org/eclipse/imp/pdb/facts/util) (TrieMap_5Bits.java is the standard CHAMP and TrieSet_5Bits.java is a Set version).

Note that even with values restricted to cids, you can still do this using a raw cid with identity multihash.

I'm also very pro using different hash functions. We have several different CHAMPs in Peergos and we either use SHA256 or the identity (in that case our keys are cryptographically random 32 bytes and not attacker controlled)

@whyrusleeping
Contributor Author

Action items/key questions so far:

  • Pluggable hashes, default to cryptographically secure
  • maximum key size?
  • parameters in every node?

Allowing arbitrary values, not just CIDs

Arbitrary values are already allowed; I don't see anything in the spec that implies otherwise. Am I missing something?

parameters in every node

I guess ~12 bytes isn't the end of the world. It just feels annoying from a 'save the trees' perspective.

@rvagg
Member

rvagg commented Apr 26, 2019

This is describing go-ipfs-hamt as it is now. Is that being used already? I know that hamt-sharding is used for js-ipfs-mfs, so is it the same way in Go?

If it's the case that this is describing something that's publishing IPLD data today, a unixfs-v1 feature, how about we rename it to something like "unixfs-v1-hamt.md" and be clear that this is specifically for that purpose and then go about making sure that it accurately describes what is being used today. Then that frees up possibilities for more types in the data-structures/ directory.

I've been thinking of this PR in terms of an "ideal HAMT" spec for IPLD, but maybe I was wrong and this is about capturing current state? In #110 I've introduced the term "Multi-block Collections" with an introductory description in the data-structures/ directory. I'm imagining the directory filling out with a bunch of specs for different shaped collections: maps, lists, sets, queues, etc., all foreshadowed in that introductory document. In that case, the current unixfs HAMT deserves recording as it is because it's a thing in the wild, but it shouldn't constrain thinking about expansions to HAMT or even other data structures that implement similar user interfaces (Maps).

If this is purely about future work then never mind all that. I'm taking a bit of a detour to @warpfork's IPLD Schemas to see how far we can stretch them to describe these complex block-spanning data structures. I'll try and get back to my JS HAMT with Schemas soon and hopefully we'll have a good language to help spec this out.

@rvagg rvagg mentioned this pull request Jun 12, 2019
@rvagg
Member

rvagg commented Jun 12, 2019

Here's my iteration on this spec: #131

It's very close to this one with the following major differences:

  • Flexible hash algorithm, bitwidth and bucket size, encoded into the root block, but with recommended defaults that are the same as this one.
  • Block layout is a little different, using map instead of bf and data instead of p, then within the data array the "pointers" are either keyed with "link" or "bucket" as the discriminator. The root block has the 3 parameters above too.
  • I've tried to avoid being specific to codec or programming language, schemas are helping a bit with that.
  • Moar text.

@mikeal
Contributor

mikeal commented Oct 8, 2020

Closing as this is quite old and we have a much more detailed HAMT spec now.

@mikeal mikeal closed this Oct 8, 2020
@mvdan mvdan deleted the feat/hamt-spec branch October 13, 2020 21:59