From 53b72f01f68ad2c9c10c28ff21f5cda0929832ba Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Fri, 19 Jan 2018 22:47:08 -0800
Subject: [PATCH 01/27] start work on hyperdb DEP

---
 proposals/0000-hyperdb.md | 237 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 237 insertions(+)
 create mode 100644 proposals/0000-hyperdb.md

diff --git a/proposals/0000-hyperdb.md b/proposals/0000-hyperdb.md
new file mode 100644
index 0000000..05764d1
--- /dev/null
+++ b/proposals/0000-hyperdb.md
@@ -0,0 +1,237 @@
+
+Title: **DEP-0000: HyperDB**
+
+Short Name: `0000-hyperdb`
+
+Type: Standard
+
+Status: Undefined (as of 2018-01-XXX)
+
+Github PR: (add HTTPS link here after PR is opened)
+
+Authors: noffle (Stephen Whitmore), bnewbold (Bryan Newbold)
+
+
+# Summary
+[summary]: #summary
+
+HyperDB is a new abstraction layer providing a general purpose distributed
+key/value store over the Dat protocol. It is an iteration on the hyperdrive
+directory tree implementation, providing a general purpose key/value store on
+top of the hypercore append-only log abstraction layer. Keys are path-like
+strings (eg, `/food/fruit/kiwi`), and values are arbitrary binary blobs (with
+expected size of under a megabyte).
+
+hyperdrive is expected to be re-implemented on top of HyperDB for improved
+performance with many (millions) of files. The hyperdrive API should be largely
+unchanged, but the `metadata` format will be backwards-incompatible.
+
+
+# Motivation
+[motivation]: #motivation
+
+HyperDB is expected to drastically improve performance of dat clients when
+working with archives containing tens of thousands of files in single
+directories. This is a real-world bottleneck for several current users, with
+basic local actions such as adding a directory taking an unacceptably long time
+to complete.
+
+A secondary benefit is to refactor the trie-structured key/value API out of
+hyperdrive, allowing third party code to build on this abstraction layer.
+
+
+# Usage Documentation
+[usage-documentation]: #usage-documentation
+
+HyperDB is structured to be used much like a traditional hierarchical
+filesystem. A value can be written and read at locations like `/foo/bar/baz`,
+and the API supports querying or tracking values at subpaths, like how watching
+for changes on `/foo/bar` will report both changes to `/foo/bar/baz` and also
+`/foo/bar/19`.
+
+## New API
+
+`add(key, value)`
+
+`get(key)`
+
+`delete(key)`
+
+# Reference Documentation
+[reference-documentation]: #reference-documentation
+
+## Set of append-only logs (feeds)
+
+A HyperDB is fundamentally a set of
+[hypercore](https://github.com/mafintosh/hypercore)s. A *hypercore* is a secure
+append-only log that is identified by a public key, and can only be written to
+by the holder of the corresponding private key.
+
+Each entry in a hypercore has a *sequence number*, that increments by 1 with
+each write, starting at 0 (`seq=0`).
+
+HyperDB builds its hierarchical key-value store on top of these hypercore
+feeds, and also provides facilities for authorization, and replication of those
+member hypercores.
+
+## Incremental index
+
+HyperDB builds an *incremental index* with every new key/value pairs ("nodes")
+written. This means a separate data structure doesn't need to be maintained
+elsewhere for fast writes and lookups: each node written has enough information
+to look up any other key quickly and otherwise navigate the database.
+
+Each node stores the following basic information:
+
+- `key`: the key that is being created or modified. e.g. `/home/sww/dev.md`
+- `value`: the value stored at that key.
+- `seq`: the sequence number of this entry in the hypercore. 0 is the first, 1
+  the second, and so forth.
+- `path`: a 2-bit hash sequence of the key's components
+- `trie`: a navigation structure used with `path` to find a desired key
+
+### Prefix trie
+
+Given a HyperDB with hundreds of entries, how can a key like `/a/b/c` be looked
+up quickly?
+
+Each node stores a *prefix [trie](https://en.wikipedia.org/wiki/Trie)* that
+assists with finding the shortest path to the desired key.
+
+When a node is written, its *prefix hash* is computed. This done by first
+splitting the key into its components (`a`, `b`, and `c` for `/a/b/c`), and then
+hashing each component into a 32-character hash, where one character is a 2-bit
+value (0, 1, 2, or 3). The `prefix` hash for `/a/b/c` is
+
+```js
+node.path = [
+1, 2, 0, 1, 2, 0, 2, 2, 3, 0, 1, 2, 1, 3, 0, 3, 0, 0, 2, 1, 0, 2, 0, 0, 2, 0, 0, 3, 2, 1, 1, 2,
+0, 1, 2, 3, 2, 2, 2, 0, 3, 1, 1, 3, 0, 3, 1, 3, 0, 1, 0, 1, 3, 2, 0, 2, 2, 3, 2, 2, 3, 3, 2, 3,
+0, 1, 1, 0, 1, 2, 3, 2, 2, 2, 0, 0, 3, 1, 2, 1, 3, 3, 3, 3, 3, 3, 0, 3, 3, 2, 3, 2, 3, 0, 1, 0,
+4 ]
+```
+
+Each component is divided by a newline. `4` is a special value indicating the
+end of the prefix.
+
+#### Example
+
+Consider a fresh HyperDB. We write `/a/b = 24` and get back this node:
+
+```js
+{ key: '/a/b',
+  value: '24',
+  trie: [],
+  seq: 0,
+  path:
+   [ 1, 2, 0, 1, 2, 0, 2, 2, 3, 0, 1, 2, 1, 3, 0, 3, 0, 0, 2, 1, 0, 2, 0, 0, 2, 0, 0, 3, 2, 1, 1, 2,
+     0, 1, 2, 3, 2, 2, 2, 0, 3, 1, 1, 3, 0, 3, 1, 3, 0, 1, 0, 1, 3, 2, 0, 2, 2, 3, 2, 2, 3, 3, 2, 3,
+     4 ] }
+```
+
+If you compare this path to the one for `/a/b/c` above, you'll see that the
+first 64 2-bit characters match. This is because `/a/b` is a prefix of
+`/a/b/c`.  Since this is the first entry, `seq` is 0.
+
+Now we write `/a/c = hello` and get this node:
+
+```js
+{ key: '/a/c',
+  value: 'hello',
+  trie: [ , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , [ , , [ { feed: 0, seq: 0 } ] ] ],
+  seq: 1,
+  path: 
+   [ 1, 2, 0, 1, 2, 0, 2, 2, 3, 0, 1, 2, 1, 3, 0, 3, 0, 0, 2, 1, 0, 2, 0, 0, 2, 0, 0, 3, 2, 1, 1, 2,
+     0, 1, 1, 0, 1, 2, 3, 2, 2, 2, 0, 0, 3, 1, 2, 1, 3, 3, 3, 3, 3, 3, 0, 3, 3, 2, 3, 2, 3, 0, 1, 0,
+     4 ] }
+```
+
+Its `seq` is 1, since the last was 0.  Also, this and the previous node have
+the first 32 characters of their `path` in common (the prefix `/a`).
+
+Notice though that `trie` is set. It's a long but sparse array. It has 35
+entries, with the last one referencing the first node inserted (`a/b/`). Why?
+
+(If it wasn't stored as a sparse array, you'd actually see 64 entries (the
+length of the `path`). But since the other 29 entries are also empty, hyperdb
+doesn't bother allocating them.)
+
+If you visually compare this node's `path` with the previous node's `path`, how
+many entries do they have in common? At which entry do the 2-bit numbers
+diverge?
+
+At the 35th entry.
+
+What this is saying is "if the hash of the key you're looking for differs from
+mine on the 35th entry, you want to travel to `{ feed: 0, seq: 0 }` to find the
+node you're looking for.
+
+This is how finding a node works, starting at any other node:
+
+1. Compute the 2-bit hash sequence of the key you're after (e.g. `a/b`)
+2. Lookup the newest entry in the feed.
+3. Compare its `path` against the hash you just computed.
+4. If you discover that the `path` and your hash match, then this is the node
+   you're looking for!
+5. Otherwise, once a 2-bit character from `path` and your hash disagree, note
+   the index # where they differ and look up that value in the node's `trie`.
+   Fetch that node at the given feed and sequence number, and go back to step 3.
+   Repeat until you reach step 4 (match) or there is no entry in the node's trie
+   for the key you're after (no match).
+
+
+# Drawbacks
+[drawbacks]: #drawbacks
+
+A backwards-incompatible change will have negative effects on the broader dat
+ecosystem: clients will need to support both versions protocol for some time
+(increasing maintenance burden), future clients may not interoperate with old
+archives, etc. These downsides can partially be avoided by careful roll-out.
+
+For the specific use case of Dat archives, HyperDB will trivially increase
+metadata size (and thus disk and network consumption) for archives with few
+files.
+
+# Rationale and alternatives
+[alternatives]: #alternatives
+
+TODO: 
+
+- Why is this design the best in the space of possible designs?
+- What other designs have been considered and what is the rationale for not choosing them?
+- What is the impact of not doing this?
+
+# Unresolved questions
+[unresolved]: #unresolved-questions
+
+TODO: 
+- What parts of the design do you expect to resolve through the DEP consensus process before this gets merged?
+- What parts of the design do you expect to resolve through implementation and code review, or are left to independent library or application developers?
+- What related issues do you consider out of scope for this DEP that could be addressed in the future independently of the solution that comes out of this DEP?
+
+# Migration logistics
+[migration]: #migration
+
+HyperDB is not backwards compatible with the existing hyperdrive
+implementation, meaning dat clients will need to support multiple on-disk
+representations during a transition period.
+
+a new abstraction layer between hypercore (replicated append-only
+logs) and hyperdrive (versioned file system abstraction).  HyperDB provides an
+efficient key/value database API, with path-like strings as keys and arbitrary
+binary data (up to a reasonable chunk size) as values. HyperDB will require
+breaking changes to dat clients, but will not require changes to the network
+wire protocol.
+
+# Changelog
+[changelog]: #changelog
+
+As of January 2018, @mafintosh is leading development of a hyperdb nodejs
+module on [github](https://github.com/mafintosh/hyperdb), which is the basis
+for this DEP.
+
+- 2017-12-06: @noffle publishes `ARCHITECTURE.md` overview in the
+  [hyperdb github repo][arch_md]
+- 2018-01-XXX: First complete draft submitted for review
+
+[arch_md]: https://github.com/mafintosh/hyperdb/blob/master/ARCHITECTURE.md

From 6ca744c8d0f5e5a68123707abb54efa499a633fd Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Sun, 4 Feb 2018 19:09:56 -0800
Subject: [PATCH 02/27] hyperdb progress

---
 proposals/0000-hyperdb.md | 131 +++++++++++++++++++++++++++-----------
 1 file changed, 94 insertions(+), 37 deletions(-)

diff --git a/proposals/0000-hyperdb.md b/proposals/0000-hyperdb.md
index 05764d1..e81fc59 100644
--- a/proposals/0000-hyperdb.md
+++ b/proposals/0000-hyperdb.md
@@ -5,7 +5,7 @@ Short Name: `0000-hyperdb`
 
 Type: Standard
 
-Status: Undefined (as of 2018-01-XXX)
+Status: Undefined (as of 2018-02-XX)
 
 Github PR: (add HTTPS link here after PR is opened)
 
@@ -17,14 +17,14 @@ Authors: noffle (Stephen Whitmore), bnewbold (Bryan Newbold)
 
 HyperDB is a new abstraction layer providing a general purpose distributed
 key/value store over the Dat protocol. It is an iteration on the hyperdrive
-directory tree implementation, providing a general purpose key/value store on
-top of the hypercore append-only log abstraction layer. Keys are path-like
-strings (eg, `/food/fruit/kiwi`), and values are arbitrary binary blobs (with
-expected size of under a megabyte).
+directory tree implementation, building top of the hypercore append-only log
+abstraction layer. Keys are path-like strings (eg, `/food/fruit/kiwi`), and
+values are arbitrary binary blobs (generally under a megabyte).
 
-hyperdrive is expected to be re-implemented on top of HyperDB for improved
-performance with many (millions) of files. The hyperdrive API should be largely
-unchanged, but the `metadata` format will be backwards-incompatible.
+Hyperdrive (used by the Dat application) is expected to be re-implemented on
+top of HyperDB for improved performance with many files (eg, millions). The
+hyperdrive API should be largely unchanged, but the `metadata` format will be
+backwards-incompatible.
 
 
 # Motivation
@@ -37,44 +37,98 @@ basic local actions such as adding a directory taking an unacceptably long time
 to complete.
 
 A secondary benefit is to refactor the trie-structured key/value API out of
-hyperdrive, allowing third party code to build on this abstraction layer.
+hyperdrive, allowing third party code to build applications directly on this
+abstraction layer.
 
 
 # Usage Documentation
 [usage-documentation]: #usage-documentation
 
+*This section describes HyperDB's interface and behavior in the abstract for
+application programmers. It is not intended to be exact documentation of any
+particular implementation (including the reference Javascript module).*
+
 HyperDB is structured to be used much like a traditional hierarchical
 filesystem. A value can be written and read at locations like `/foo/bar/baz`,
 and the API supports querying or tracking values at subpaths, like how watching
 for changes on `/foo/bar` will report both changes to `/foo/bar/baz` and also
 `/foo/bar/19`.
 
-## New API
+Lower-level details of the hypercore append-only log, disk serialization, and
+networked synchronization features that HyperDB builds on top of are not
+described in detail here. Multi-writer hypercore details are also not discussed
+in this DEP. References to formal documentation of these details are TODO.
 
-`add(key, value)`
+A HyperDB database instance can be represented by a single hypercore feed (or
+several feeds in a multi-writer context), and is named, referenced, and
+discovered using the public and discovery keys of the hypercore feed (or the
+original feed if there are several). In a single-writer configuration, only a
+single node (holding the secret key) can mutate the database (eg, via `put` or
+`delete` actions).
 
-`get(key)`
+**Keys** can be any UTF-8 string. Path segments are separated by the forward
+slash character (`/`).
 
-`delete(key)`
+**Values** can be any binary blob, including empty (of zero length). For
+example, values could be UTF-8 encoded strings, JSON encoded objects, protobuf
+messages, or a raw `uint64` integer (of either endian-ness). Length is the only
+form of type or metadata stored about the value; deserialization and validation
+are left to library and application developers.
 
-# Reference Documentation
-[reference-documentation]: #reference-documentation
+TODO: key corner-cases:
+- `///` same as `//`? `///a` vs. `//a`?
+
+TODO: size limits on keys and values?
+
+TODO: "prefix" end in a trailing slash, or optional?
+
+A leading `/` is optional: `db.get('/hello')` and `db.get('hello')` are
+equivalent.
+
+## Core API Semantics
+
+`db.put(key, value)`: inserts `value` (arbitrary bytes) under the path `key`.
+Returns an error (eg, via callback) if there was a problem.
+
+`db.get(key)`: Reading a non-existant `key` is an error.
+
+`db.delete(key)`: Removes the key from the database. Deleting a non-existant
+key is an error.
+
+`db.list(prefix)`: returns a flat (not nested) list of all keys currently in
+the database.
+
+TODO: in what order?
 
-## Set of append-only logs (feeds)
+TODO: what if node is a file, not a prefix?
 
-A HyperDB is fundamentally a set of
-[hypercore](https://github.com/mafintosh/hypercore)s. A *hypercore* is a secure
-append-only log that is identified by a public key, and can only be written to
-by the holder of the corresponding private key.
+An example pseudo-code session working with a database might be:
 
-Each entry in a hypercore has a *sequence number*, that increments by 1 with
-each write, starting at 0 (`seq=0`).
+TODO:
 
-HyperDB builds its hierarchical key-value store on top of these hypercore
-feeds, and also provides facilities for authorization, and replication of those
-member hypercores.
+TODO: Read Stream
 
-## Incremental index
+TODO: Changes Stream
+
+TODO: Diff Stream
+
+
+# Reference Documentation
+[reference-documentation]: #reference-documentation
+
+The new Node protobuf message schema is:
+
+```
+  message Node {
+    optional string key = 1;
+    optional bytes value = 2;
+    repeated uint64 clock = 3;
+    optional bytes trie = 4;
+    optional uint64 feedSeq = 6; // TODO remove and merge into index (trie+feedSeq)
+  }
+```
+
+## Incremental Index
 
 HyperDB builds an *incremental index* with every new key/value pairs ("nodes")
 written. This means a separate data structure doesn't need to be maintained
@@ -209,19 +263,22 @@ TODO:
 - What parts of the design do you expect to resolve through implementation and code review, or are left to independent library or application developers?
 - What related issues do you consider out of scope for this DEP that could be addressed in the future independently of the solution that comes out of this DEP?
 
-# Migration logistics
+# Dat Migration logistics
 [migration]: #migration
 
-HyperDB is not backwards compatible with the existing hyperdrive
-implementation, meaning dat clients will need to support multiple on-disk
-representations during a transition period.
+HyperDB is not backwards compatible with the existing hyperdrive metadata,
+meaning dat clients may need to support both versions during a transition
+period. This applies both to archives saved to disk (eg, in SLEEP) and to
+archives received and published to peers over the network.
+
+No changes to the Dat network wire protocol itself are necessary, only changes
+to content passed over the protocol. The Dat `content` feed, containing raw
+file data, is not impacted by HyperDB, only the contents of the `metadata`
+feed.
 
-a new abstraction layer between hypercore (replicated append-only
-logs) and hyperdrive (versioned file system abstraction).  HyperDB provides an
-efficient key/value database API, with path-like strings as keys and arbitrary
-binary data (up to a reasonable chunk size) as values. HyperDB will require
-breaking changes to dat clients, but will not require changes to the network
-wire protocol.
+Upgrading a Dat (hyperdrive) archive to HyperDB will necessitate creating a new
+feed from scratch, meaning new public/private key pairs, and that public key
+URL links will need to change.
 
 # Changelog
 [changelog]: #changelog
@@ -232,6 +289,6 @@ for this DEP.
 
 - 2017-12-06: @noffle publishes `ARCHITECTURE.md` overview in the
   [hyperdb github repo][arch_md]
-- 2018-01-XXX: First complete draft submitted for review
+- 2018-02-XX: First complete draft submitted for review
 
 [arch_md]: https://github.com/mafintosh/hyperdb/blob/master/ARCHITECTURE.md

From 6570f5f3139cd2f917da01f6ce20cde257f66801 Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Sat, 3 Mar 2018 14:28:42 -0800
Subject: [PATCH 03/27] hyperdb progress

---
 proposals/0000-hyperdb.md | 142 ++++++++++++++++++++++++--------------
 1 file changed, 91 insertions(+), 51 deletions(-)

diff --git a/proposals/0000-hyperdb.md b/proposals/0000-hyperdb.md
index e81fc59..2a0656e 100644
--- a/proposals/0000-hyperdb.md
+++ b/proposals/0000-hyperdb.md
@@ -5,11 +5,11 @@ Short Name: `0000-hyperdb`
 
 Type: Standard
 
-Status: Undefined (as of 2018-02-XX)
+Status: Undefined (as of 2018-03-XX)
 
 Github PR: (add HTTPS link here after PR is opened)
 
-Authors: noffle (Stephen Whitmore), bnewbold (Bryan Newbold)
+Authors: [Stephen Whitmore](https://github.com/noffle), [Bryan Newbold](https://github.com/bnewbold)
 
 
 # Summary
@@ -36,9 +36,11 @@ directories. This is a real-world bottleneck for several current users, with
 basic local actions such as adding a directory taking an unacceptably long time
 to complete.
 
-A secondary benefit is to refactor the trie-structured key/value API out of
-hyperdrive, allowing third party code to build applications directly on this
-abstraction layer.
+A secondary benefit is to refactor the [trie][trie]-structured key/value API
+out of hyperdrive, allowing third party code to build applications directly on
+this abstraction layer.
+
+[trie]: https://en.wikipedia.org/wiki/Trie
 
 
 # Usage Documentation
@@ -56,8 +58,10 @@ for changes on `/foo/bar` will report both changes to `/foo/bar/baz` and also
 
 Lower-level details of the hypercore append-only log, disk serialization, and
 networked synchronization features that HyperDB builds on top of are not
-described in detail here. Multi-writer hypercore details are also not discussed
-in this DEP. References to formal documentation of these details are TODO.
+described in detail here; see the [DEP repository][deps]. Multi-writer
+hypercore semantics are also not discussed in this DEP.
+
+[deps]: https://github.com/datprotocol/DEPs
 
 A HyperDB database instance can be represented by a single hypercore feed (or
 several feeds in a multi-writer context), and is named, referenced, and
@@ -67,7 +71,10 @@ single node (holding the secret key) can mutate the database (eg, via `put` or
 `delete` actions).
 
 **Keys** can be any UTF-8 string. Path segments are separated by the forward
-slash character (`/`).
+slash character (`/`). Repeated slashes (`//`) are disallowed. Leading and
+trailing `/` are optional in application code: `/hello` and `hello` are
+equivalent. A key can be both a "path segment" and key at the same time; eg,
+`/a/b/c` and `/a/b` can both be keys at the same time.
 
 **Values** can be any binary blob, including empty (of zero length). For
 example, values could be UTF-8 encoded strings, JSON encoded objects, protobuf
@@ -75,48 +82,53 @@ messages, or a raw `uint64` integer (of either endian-ness). Length is the only
 form of type or metadata stored about the value; deserialization and validation
 are left to library and application developers.
 
-TODO: key corner-cases:
-- `///` same as `//`? `///a` vs. `//a`?
-
-TODO: size limits on keys and values?
-
-TODO: "prefix" end in a trailing slash, or optional?
-
-A leading `/` is optional: `db.get('/hello')` and `db.get('hello')` are
-equivalent.
-
 ## Core API Semantics
 
+A `db` is instantiated by opening an existing hypercore with hyperdb content
+(read-only, or optionally read-write if the secret key is available), or
+creating a new one. A handle could represent any specific revision in history,
+or the "latest" revision.
+
 `db.put(key, value)`: inserts `value` (arbitrary bytes) under the path `key`.
-Returns an error (eg, via callback) if there was a problem.
+Requires read-write access.  Returns an error (eg, via callback) if there was a
+problem.
 
-`db.get(key)`: Reading a non-existant `key` is an error.
+`db.get(key)`: Reading a non-existant `key` is an error. Read-only.
 
 `db.delete(key)`: Removes the key from the database. Deleting a non-existant
-key is an error.
+key is an error. Requires read-write access.
 
 `db.list(prefix)`: returns a flat (not nested) list of all keys currently in
-the database.
-
-TODO: in what order?
+the database under the given prefix. Prefixes operate on a path-segment basis:
+`/ab` is not a valid prefix for key `/abcd`, but is valid for `/ab/cd`. If the
+prefix does not exist, returns an empty list. The order of returned keys is
+implementation (or configuration) specific. Read-only.
 
-TODO: what if node is a file, not a prefix?
+If the hypercore underlying a hyperdb is only partially replicated, behavior is
+implementation-specific. For example, a `get()` call could block until the
+relevant value is replicated, or the implementation could return an error.
 
 An example pseudo-code session working with a database might be:
 
-TODO:
-
-TODO: Read Stream
-
-TODO: Changes Stream
-
-TODO: Diff Stream
-
+    db.put('/life/animal/mammal/kitten', '{"cuteness": 500.3}')
+    db.put('/life/plant/bush/bananna', '{"delicious": 103.4}')
+    db.delete('/life/plant/bush/bananna')
+    db.put('/life/plant/tree/bananna', '{"delicious": 103.4}')
+    db.get('/life/animal/mammal/kitten')
+    => {"cuteness": 500.3}
+    db.list('/life/')
+    => ['/life/animal/mammal/kitten', '/life/plant/tree/bananna']
 
 # Reference Documentation
 [reference-documentation]: #reference-documentation
 
-The new Node protobuf message schema is:
+A HyperDB hypercore feed typically consists of a sequence of entries, all of
+which are protobuf-encoded `Node` messages. Higher-level protocols may make
+exception to this: for example, hyperdrive reserves the first entry of the
+(`metadata`) feed for a special entry that refers to the second (`content`)
+feed.
+
+The `Node` protobuf message schema is:
 
 ```
   message Node {
@@ -124,10 +136,18 @@ The new Node protobuf message schema is:
     optional bytes value = 2;
     repeated uint64 clock = 3;
     optional bytes trie = 4;
-    optional uint64 feedSeq = 6; // TODO remove and merge into index (trie+feedSeq)
+    optional uint64 feedSeq = 6;
   }
 ```
 
+TODO(mafintosh): where is this schema actually defined for the `next` branch of
+the hyperdb repo?
+
+`key` and `value` are as expected. A non-existant `value` indicates that this
+message is a deletion (see below). The `clock` element is reserved for use in
+the forthcoming `multi-writer` standard, not described here. An empty list is
+the safe (and expected) value for `clock` in single-writer use cases.
+
 ## Incremental Index
 
 HyperDB builds an *incremental index* with every new key/value pairs ("nodes")
@@ -139,7 +159,7 @@ Each node stores the following basic information:
 
 - `key`: the key that is being created or modified. e.g. `/home/sww/dev.md`
 - `value`: the value stored at that key.
-- `seq`: the sequence number of this entry in the hypercore. 0 is the first, 1
+- `feedSeq`: the sequence number of this entry in the hypercore. 0 is the first, 1
   the second, and so forth.
 - `path`: a 2-bit hash sequence of the key's components
 - `trie`: a navigation structure used with `path` to find a desired key
@@ -233,6 +253,14 @@ This is how finding a node works, starting at any other node:
    Repeat until you reach step 4 (match) or there is no entry in the node's trie
    for the key you're after (no match).
 
+# Examples
+[examples]: #examples
+
+TODO: lookup a "deep" key
+
+TODO: deleting a key
+
+TODO: listing a prefix
 
 # Drawbacks
 [drawbacks]: #drawbacks
@@ -246,24 +274,16 @@ For the specific use case of Dat archives, HyperDB will trivially increase
 metadata size (and thus disk and network consumption) for archives with few
 files.
 
+
 # Rationale and alternatives
 [alternatives]: #alternatives
 
-TODO: 
+TODO: scaling issues with existing system; handling many entries
 
-- Why is this design the best in the space of possible designs?
-- What other designs have been considered and what is the rationale for not choosing them?
-- What is the impact of not doing this?
+TODO: why multi-writer fields included here
 
-# Unresolved questions
-[unresolved]: #unresolved-questions
 
-TODO: 
-- What parts of the design do you expect to resolve through the DEP consensus process before this gets merged?
-- What parts of the design do you expect to resolve through implementation and code review, or are left to independent library or application developers?
-- What related issues do you consider out of scope for this DEP that could be addressed in the future independently of the solution that comes out of this DEP?
-
-# Dat Migration logistics
+# Dat migration logistics
 [migration]: #migration
 
 HyperDB is not backwards compatible with the existing hyperdrive metadata,
@@ -280,12 +300,32 @@ Upgrading a Dat (hyperdrive) archive to HyperDB will necessitate creating a new
 feed from scratch, meaning new public/private key pairs, and that public key
 URL links will need to change.
 
+
+# Unresolved questions
+[unresolved]: #unresolved-questions
+
+Details of keys to work out (before Draft status):
+
+- must a prefix end in a trailing slash, or optional?
+- can there be, eg, keys `/a/b` and `/a/b/c` at the same time?
+
+There are implied "reasonable" limits on the size (in bytes) of both keys and
+values, but they are not formally specified. Protobuf messages have a hard
+specified limit of 2 GByte (due to 32-bit signed arthimetic), and most
+implementations impose a (configurable) 64 MByte limit. Should this DEP impose
+specific limits on key and value sizes? Would be good to decide before Draft
+status.
+
+Apart from leaving fields in the protobuf message specification, multi-writer
+concerns are out of scope for this DEP.
+
+
 # Changelog
 [changelog]: #changelog
 
-As of January 2018, @mafintosh is leading development of a hyperdb nodejs
-module on [github](https://github.com/mafintosh/hyperdb), which is the basis
-for this DEP.
+As of March 2018, @mafintosh is leading development of a hyperdb nodejs module
+on [github](https://github.com/mafintosh/hyperdb), which is the basis for this
+DEP.
 
 - 2017-12-06: @noffle publishes `ARCHITECTURE.md` overview in the
   [hyperdb github repo][arch_md]

From 7dc1759de6afc61acfe8b47e34ab18160e6c803e Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Sun, 4 Mar 2018 14:50:29 -0800
Subject: [PATCH 04/27] significantly more progress on hyperdb DEP

---
 proposals/0000-hyperdb.md | 418 ++++++++++++++++++++++++++++++--------
 1 file changed, 331 insertions(+), 87 deletions(-)

diff --git a/proposals/0000-hyperdb.md b/proposals/0000-hyperdb.md
index 2a0656e..9a144b2 100644
--- a/proposals/0000-hyperdb.md
+++ b/proposals/0000-hyperdb.md
@@ -82,7 +82,9 @@ messages, or a raw `uint64` integer (of either endian-ness). Length is the only
 form of type or metadata stored about the value; deserialization and validation
 are left to library and application developers.
 
+
 ## Core API Semantics
+[core_api]: #core_api
 
 A `db` is instantiated by opening an existing hypercore with hyperdb content
 (read-only, or optionally read-write if the secret key is available), or
@@ -119,83 +121,213 @@ An example pseudo-code session working with a database might be:
     db.list('/life/')
     => ['/life/animal/mammal/kitten', '/life/plant/tree/bananna']
 
+
 # Reference Documentation
 [reference-documentation]: #reference-documentation
 
 A HyperDB hypercore feed typically consists of a sequence of entries, all of
-which are protobuf-encoded `Node` messages. Higher-level protocols may make
+which are protobuf-encoded Node messages. Higher-level protocols may make
 exception to this: for example, hyperdrive reserves the first entry of the
 (`metadata`) feed for a special entry that refers to the second (`content`)
 feed.
 
-The `Node` protobuf message schema is:
+The sequence of entries includes an incremental index: the most recent entry in
+the feed contains metadata pointers that can be followed to efficiently look up
+any key in the database without needing to linear scan the entire history or
+generate an independent index data structure. Of course implementations are
+free to maintain such an index if they prefer.
+
+The Node protobuf message schema is:
 
 ```
   message Node {
     optional string key = 1;
     optional bytes value = 2;
     repeated uint64 clock = 3;
-    optional bytes trie = 4;
-    optional uint64 feedSeq = 6;
+    optional bytes trie = 4;    // TODO: actually a {feed, seq} pair
+    optional bytes path = 5;    // TODO: actual type?
+    optional uint64 seq = 6;
+    optional bytes feed = 7;    // TODO: actual type?
   }
 ```
 
 TODO(mafintosh): where is this schema actually defined for the `next` branch of
 the hyperdb repo?
 
-`key` and `value` are as expected. A non-existant `value` indicates that this
-message is a deletion (see below). The `clock` element is reserved for use in
-the forthcoming `multi-writer` standard, not described here. An empty list is
-the safe (and expected) value for `clock` in single-writer use cases.
+- `key`: UTF-8 key that this node describes. Leading and trailing forward
+  slashes (`/`) are always striped before storing in protobuf.
+- `value`: arbitrary byte array. A non-existant `value` entry indicates that
+  this Node indicates a deletion of the key; this is distinct from specifying
+  an empty (zero-length) value.
+- `clock`: reserved for use in the forthcoming `multi-writer` standard, not
+  described here. An empty list is the safe (and expected) value for `clock` in
+  single-writer use cases.
+- `trie`: a structured array of pointers to other Node entries in the feed,
+  used for navigating the tree of keys.
+- `path`: a 2-bit hash sequence of `key`'s components.
+- `seq`: the zero-indexed sequence number of this Node (hyperdb) entry in
+  the feed. Note that this may be offset from the hypercore entry index if
+  there are any leading non-hyperdb entries.
+- `feed`: reserved for use with `multi-writer`. The safe single-writer value is
+  to use the current feed's hypercore public key.
+
+TODO(mafintosh): should `feed` actually be the current hypercore publickey?
+
+
+## Path Hashing 
+[path_hashing]: #path_hashing
+
+Every key path has component-wise fixed-size hash representation that is used
+by the trie. This is written to the `path` field each Node protobuf
+message.
+
+The path hash is represented by an array of bytes. Elements are 2-bit encoded
+(values 0, 1, 2, 3), except for an optional terminating element which has value
+4. Each path element consists of 32 values, representing a 64-bit hash of that
+path element. For example, the key `/tree/willow` has two path segments (`tree`
+and `willow`), and will be represented by a 65 element path hash array (two 32
+element hashes plus a terminator).
+
+The hash algorithm used is `SipHash-2-4`, with an 8-byte output and
+16-byte key; the input is the UTF-8 encoded path string segment, without any
+`/` separators or terminating null bytes. An implementation of this hash
+algorithm is included in the libsodium library, under the function
+`crypto_shorthash()`. A 16-byte "secret" key is required; for this use case we
+use all zeros.
+
+When converting the 8-bytehash to an array of 2-bit bytes, the ordering is
+proceed byte-by-byte, and for each take the two lowest-value bits (aka, `hash &
+0x3`) as byte index `0`, the next two bits (aka, `hash & 0xC`) as byte index
+`1`, etc. When concatanating path hashes into a longer array, the first
+("left-most") path element hash will correspond to byte indexes 0 through 31;
+the terminator (`4`) will have the highest byte index.
+
+For example, consider the key `/tree/willow`. `tree` has a hash `[0xAC, 0xDC,
+0x05, 0x6C, 0x63, 0x9D, 0x87, 0xCA]`, which converts into the array:
+
+    [ 0, 3, 2, 2, 0, 3, 1, 3, 1, 1, 0, 0, 0, 3, 2, 1, 3, 0, 2, 1, 1, 3, 1, 2, 3, 1, 0, 2, 2, 2, 0, 3 ]
+
+
+`willow` has a 64-bit hash `[0x72, 0x30, 0x34, 0x39, 0x35, 0xA8, 0x21, 0x44]`,
+which converts into the array:
+
+    [ 2, 0, 3, 1, 0, 0, 3, 0, 0, 1, 3, 0, 1, 2, 3, 0, 1, 1, 3, 0, 0, 2, 2, 2, 1, 0, 2, 0, 0, 1, 0, 1 ]
+
+These combine into the unified byte array with 65 elements:
 
-## Incremental Index
+    [ 0, 3, 2, 2, 0, 3, 1, 3, 1, 1, 0, 0, 0, 3, 2, 1, 3, 0, 2, 1, 1, 3, 1, 2, 3, 1, 0, 2, 2, 2, 0, 3,
+      2, 0, 3, 1, 0, 0, 3, 0, 0, 1, 3, 0, 1, 2, 3, 0, 1, 1, 3, 0, 0, 2, 2, 2, 1, 0, 2, 0, 0, 1, 0, 1,
+      4 ]
 
-HyperDB builds an *incremental index* with every new key/value pairs ("nodes")
-written. This means a separate data structure doesn't need to be maintained
-elsewhere for fast writes and lookups: each node written has enough information
-to look up any other key quickly and otherwise navigate the database.
+As another example, the key `/a/b/c` converts into the 97-byte hash array:
 
-Each node stores the following basic information:
+    [ 1, 2, 0, 1, 2, 0, 2, 2, 3, 0, 1, 2, 1, 3, 0, 3, 0, 0, 2, 1, 0, 2, 0, 0, 2, 0, 0, 3, 2, 1, 1, 2,
+      0, 1, 2, 3, 2, 2, 2, 0, 3, 1, 1, 3, 0, 3, 1, 3, 0, 1, 0, 1, 3, 2, 0, 2, 2, 3, 2, 2, 3, 3, 2, 3,
+      0, 1, 1, 0, 1, 2, 3, 2, 2, 2, 0, 0, 3, 1, 2, 1, 3, 3, 3, 3, 3, 3, 0, 3, 3, 2, 3, 2, 3, 0, 1, 0,
+      4 ]
 
-- `key`: the key that is being created or modified. e.g. `/home/sww/dev.md`
-- `value`: the value stored at that key.
-- `feedSeq`: the sequence number of this entry in the hypercore. 0 is the first, 1
-  the second, and so forth.
-- `path`: a 2-bit hash sequence of the key's components
-- `trie`: a navigation structure used with `path` to find a desired key
+<!---
 
-### Prefix trie
+Generation code (javascript) for the above:
 
-Given a HyperDB with hundreds of entries, how can a key like `/a/b/c` be looked
-up quickly?
+    var sodium = require('sodium-universal')
+    var toBuffer = require('to-buffer')
+
+    var KEY = Buffer.alloc(16)
+    var OUT = Buffer.alloc(8)
+
+    sodium.crypto_shorthash(OUT, toBuffer('tree'), KEY)
+    console.log("tree: ", OUT)
+    console.log(hash('tree', true))
+
+    sodium.crypto_shorthash(OUT, toBuffer('willow'), KEY)
+    console.log("willow: ", OUT)
+    console.log(hash('willow', true))
+
+    sodium.crypto_shorthash(OUT, toBuffer('a'), KEY)
+    console.log("a: ", OUT)
+    console.log(hash('a', true))
+
+Then, to manually "expand" arrays in python3:
+
+    hash_array = [0x72, 0x30, 0x34, 0x39, 0x35, 0xA8, 0x21, 0x44]
+    path = []
+    tmp = [(x & 0x3, (x >> 2) & 0x3, (x >> 4) & 0x3, (x >> 6) & 0x3) for x in hash_array]
+    [path.extend(e) for e in tmp]
+    path
+
+--->
+
+
+## Incremental Index Trie
+[trie_index]: #trie_index
 
 Each node stores a *prefix [trie](https://en.wikipedia.org/wiki/Trie)* that
-assists with finding the shortest path to the desired key.
-
-When a node is written, its *prefix hash* is computed. This done by first
-splitting the key into its components (`a`, `b`, and `c` for `/a/b/c`), and then
-hashing each component into a 32-character hash, where one character is a 2-bit
-value (0, 1, 2, or 3). The `prefix` hash for `/a/b/c` is
-
-```js
-node.path = [
-1, 2, 0, 1, 2, 0, 2, 2, 3, 0, 1, 2, 1, 3, 0, 3, 0, 0, 2, 1, 0, 2, 0, 0, 2, 0, 0, 3, 2, 1, 1, 2,
-0, 1, 2, 3, 2, 2, 2, 0, 3, 1, 1, 3, 0, 3, 1, 3, 0, 1, 0, 1, 3, 2, 0, 2, 2, 3, 2, 2, 3, 3, 2, 3,
-0, 1, 1, 0, 1, 2, 3, 2, 2, 2, 0, 0, 3, 1, 2, 1, 3, 3, 3, 3, 3, 3, 0, 3, 3, 2, 3, 2, 3, 0, 1, 0,
-4 ]
-```
+can be used to look up other keys, or to list all keys with a given prefix.
+This is stored under the `trie` field of the Node protobuf message.
+
+The trie effectively mirrors the `path` hash array. Each element in the `trie`
+points to the newest Node which has an identical path up to that specific
+prefix location. Elements can be null; any trailing null elements can be left
+blank.
+
+The data structure of the trie is a sparse array of pointers to other Node
+entries. Each pointer indicates a feed index and a sequence number; for the
+non-multi-writer case, the feed index is always 0, so we consider only the
+sequence number (aka, entry index).
+
+To lookup a key in the database, the recipe is to:
+
+1. Calculate the `path` array for the key you are looking for.
+2. Select the most-recent ("latest") Node for the feed.
+3. Compare path arrays. If the paths match exactly, you have found the Node you
+   were looking for!
+4. If not, find the first index at which the two arrays differ, and look up the
+   corresponding element in this Node's `trie` array. If the element is empty,
+   then your key does not exist in this hyperdb.
+5. If the trie element is not empty, then follow that pointer to select the
+   next `Node`. Recursively repeat this process from step #3; you will be
+   descending the `trie` in a search, and will either terminate in the Node you
+   are looking for, or find that the key is not defined in this hyperdb.
+
+Similarly, to write a key to the database:
+
+1. Calculate the `path` array for the key, and start with an empty `trie` of
+   the same length; you will write to the `trie` array from the current index,
+   which starts at 0.
+2. Select the most-recent ("latest") Node for the feed.
+3. Compare path arrays. If the paths match exactly, then you are overwriting
+   the current Node, and can copy the "remainder" of it's `trie` up to your
+   current `trie` index.
+4. If not, find the first index at which the two arrays differ. Copy all `trie`
+   elements (empty or not) into the new `trie` for indicies between the
+   "current index" and the "differing index".
+5. Next look up the corresponding element in this Node's `trie` array at the
+   differing index. If this element is empty, then you have found the most
+   similar Node. Write a pointer to this node to the `trie` at
+   the differing index, and you are done (all remaining `trie` elements are
+   empty, and can be omitted).
+6. If the differing tree element has a pointer (is not empty), then follow that
+   pointer to select the next `Node`. Recursively repeat this process from step
+   #3.
+
+To delete a value, follow the same procedure as adding a key, but write the
+`Node` without a `value` (in protobuf, this is distinct from having a `value`
+field with zero bytes)..
 
-Each component is divided by a newline. `4` is a special value indicating the
-end of the prefix.
+# Examples
+[examples]: #examples
 
-#### Example
+## Simple Put and Get
 
-Consider a fresh HyperDB. We write `/a/b = 24` and get back this node:
+Starting with an empty HyperDB `db`, if we `db.put('/a/b', '24')` we expect to
+see a single `Node`:
 
-```js
-{ key: '/a/b',
+```
+{ key: 'a/b',
   value: '24',
-  trie: [],
+  trie:
+   [ ],
   seq: 0,
   path:
    [ 1, 2, 0, 1, 2, 0, 2, 2, 3, 0, 1, 2, 1, 3, 0, 3, 0, 0, 2, 1, 0, 2, 0, 0, 2, 0, 0, 3, 2, 1, 1, 2,
@@ -203,16 +335,18 @@ Consider a fresh HyperDB. We write `/a/b = 24` and get back this node:
      4 ] }
 ```
 
-If you compare this path to the one for `/a/b/c` above, you'll see that the
-first 64 2-bit characters match. This is because `/a/b` is a prefix of
-`/a/b/c`.  Since this is the first entry, `seq` is 0.
+Note that the first 64 bytes in `path` match those of the `/a/b/c` example from
+the [path hashing][path_hash] section, because the first two path components
+are the same. Since this is the first entry, `seq` is 0.
 
-Now we write `/a/c = hello` and get this node:
+Now we `db.put('/a/c', 'hello')` and expect a second Node:
 
-```js
-{ key: '/a/c',
+```
+{ key: 'a/c',
   value: 'hello',
-  trie: [ , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , [ , , [ { feed: 0, seq: 0 } ] ] ],
+  trie:
+   [ , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , 
+     , , { feed: 0, seq: 0 } ],
   seq: 1,
   path: 
    [ 1, 2, 0, 1, 2, 0, 2, 2, 3, 0, 1, 2, 1, 3, 0, 3, 0, 0, 2, 1, 0, 2, 0, 0, 2, 0, 0, 3, 2, 1, 1, 2,
@@ -220,47 +354,91 @@ Now we write `/a/c = hello` and get this node:
      4 ] }
 ```
 
-Its `seq` is 1, since the last was 0.  Also, this and the previous node have
-the first 32 characters of their `path` in common (the prefix `/a`).
+The `seq` is incremented to 1. The first 32 characters of `path` are common
+with the first Node (they share a common prefix `/a`).
 
-Notice though that `trie` is set. It's a long but sparse array. It has 35
-entries, with the last one referencing the first node inserted (`a/b/`). Why?
+`trie` is defined, but mostly sparse. The first 32 elements of common prefix
+match the first Node, and then two additional hash elements (`[0, 1]`) happen
+to match as well; there is not a differing entry until index 34 (zero-indexed).
+At this entry there is a reference pointing to the first Node. An additional 29
+trailing null entries have been trimmed in reduce metadata overhead.
 
-(If it wasn't stored as a sparse array, you'd actually see 64 entries (the
-length of the `path`). But since the other 29 entries are also empty, hyperdb
-doesn't bother allocating them.)
+Next we insert a third node with `db.put('/x/y', 'other')`, and get a third Node:
 
-If you visually compare this node's `path` with the previous node's `path`, how
-many entries do they have in common? At which entry do the 2-bit numbers
-diverge?
+```
+{ key: 'x/y',
+  value: 'other',
+  trie:
+   [ , { feed: 0, seq: 1 } ],
+  seq: 2,
+  path: 
+   [ 1, 1, 0, 0, 3, 1, 2, 3, 3, 1, 1, 1, 2, 2, 1, 1, 1, 0, 2, 3, 3, 0, 1, 2, 1, 1, 2, 3, 0, 0, 2, 1,
+     0, 2, 1, 0, 1, 1, 0, 1, 0, 1, 3, 1, 0, 0, 2, 3, 0, 1, 3, 2, 0, 3, 2, 0, 1, 0, 3, 2, 0, 2, 1, 1,
+     4 ] }
+```
 
-At the 35th entry.
+Consider the lookup-up process for `db.get('/a/b')` (which we expect to
+successfully return `'24'`, as written in the first Node). First we calculate
+the `path` for the key `a/b`, which will be the same as the first Node. Then we
+take the "latest" Node, with `seq=2`. We compare the `path` arrays, starting at
+the first element, and find the first difference at index 1 (`1 == 1`, then `1
+!= 2`). We look at index 1 in the current Node's `trie` and find a pointer to
+`seq = 1`, so we fetch that Node and recurse. Comparing `path` arrays, we now
+get all the way to index 34 before there is a difference. We again look in the
+`trie`, find a pointer to `seq = 0`, and fetch the first Node and recurse. Now
+the path elements match exactly; we have found the Node we are looking for, and
+it has an existant `value`, so we return the `value`.
 
-What this is saying is "if the hash of the key you're looking for differs from
-mine on the 35th entry, you want to travel to `{ feed: 0, seq: 0 }` to find the
-node you're looking for.
+Consider a lookup for `db.get('/a/z')`; this key does not exist, so we expect
+to return with "key not found". We calculate the `path` for this key:
 
-This is how finding a node works, starting at any other node:
+    [ 1, 2, 0, 1, 2, 0, 2, 2, 3, 0, 1, 2, 1, 3, 0, 3, 0, 0, 2, 1, 0, 2, 0, 0, 2, 0, 0, 3, 2, 1, 1, 2,
+      1, 2, 3, 0, 1, 0, 1, 1, 1, 1, 2, 1, 1, 1, 0, 1, 0, 3, 3, 2, 0, 3, 3, 1, 1, 0, 2, 1, 0, 1, 1, 2,
+      4 ]
 
-1. Compute the 2-bit hash sequence of the key you're after (e.g. `a/b`)
-2. Lookup the newest entry in the feed.
-3. Compare its `path` against the hash you just computed.
-4. If you discover that the `path` and your hash match, then this is the node
-   you're looking for!
-5. Otherwise, once a 2-bit character from `path` and your hash disagree, note
-   the index # where they differ and look up that value in the node's `trie`.
-   Fetch that node at the given feed and sequence number, and go back to step 3.
-   Repeat until you reach step 4 (match) or there is no entry in the node's trie
-   for the key you're after (no match).
+Similar to the first lookup, we start with `seq = 2` and follow the pointer to
+`seq = 1`. This time, when we compare `path` arrays, the first differing entry
+is at index `32`. There is no `trie` entry at this index, which tells us that
+the key does not exist in the database.
 
-# Examples
-[examples]: #examples
+## Listing a Prefix
+
+Continuing with the state of the database above, we call `db.list('/a')` to
+list all keys with the prefix `/a`.
+
+We generate a `path` array for the key `/a`, without the terminating symbol
+(`4`):
+
+    [ 1, 2, 0, 1, 2, 0, 2, 2, 3, 0, 1, 2, 1, 3, 0, 3, 0, 0, 2, 1, 0, 2, 0, 0, 2, 0, 0, 3, 2, 1, 1, 2 ]
 
-TODO: lookup a "deep" key
+Using the same process as a `get()` lookup, we find the first Node that
+entirely matches this prefix, which will be Node `seq = 1`. If we had failed to
+find any Node with a complete prefix match, then we would return an empty list
+of matching keys.
 
-TODO: deleting a key
+Starting with the first prefix-matching node, we save that key as a match
+(unless the Node is a deletion), then select all `trie` pointers with an index
+higher than the prefix length, and recursively inspect all pointed-to Nodes.
 
-TODO: listing a prefix
+## Deleting a Key
+
+Continuing with the state of the database above, we call `db.delete('/a/c')` to
+remove that key from the database.
+
+The process is almost entirely the same as inserting a new Node at that key,
+except that the `value` field is undefined. The new Node (`seq = 3`) is:
+
+```
+{ key: 'a/c',
+  value: ,
+  trie: [ , { feed: 0, seq: 2 }, , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , 
+          , , { feed: 0, seq: 0 } ],
+  seq: 3,
+  path: 
+   [ 1, 2, 0, 1, 2, 0, 2, 2, 3, 0, 1, 2, 1, 3, 0, 3, 0, 0, 2, 1, 0, 2, 0, 0, 2, 0, 0, 3, 2, 1, 1, 2,
+     0, 1, 1, 0, 1, 2, 3, 2, 2, 2, 0, 0, 3, 1, 2, 1, 3, 3, 3, 3, 3, 3, 0, 3, 3, 2, 3, 2, 3, 0, 1, 0,
+     4 ] }
+```
 
 # Drawbacks
 [drawbacks]: #drawbacks
@@ -275,12 +453,61 @@ metadata size (and thus disk and network consumption) for archives with few
 files.
 
 
+# Overhead and Scaling
+[overhead]: #overhead
+
+The metadata overhead for a single database entry varies based on the size of
+the database. In a "heavy" case, considering a two-path-segment key with an
+entirely saturated `trie` and `uint32` size feed and sequence pointers, and
+ignoring multi-writer fields:
+
+- `trie`: 4 * 2 * 64 bytes = 512 bytes
+- `seq`: 4 bytes
+- `path`: 65 bytes
+- total: 581 bytes
+
+In a "light" case, with few `trie` entries and single-byte varint sequence
+numbers:
+
+- `trie`: 2 * 2 * 4 bytes = 16 bytes
+- `seqq: 1 byte
+- `path`: 65 bytes
+- total: 82
+
+For a database with most keys having N path segments, the cost of a `get()`
+scales with the number of entries M as `O(log(M))` with best case 1 lookup and
+worst case `4 * 32 * N = 128 * N` lookups (for a saturated `trie`).
+
+TODO: prove or verify the above `O(log(M))` intuition
+
+The cost of a `put()` or `delete()` is proportional to the cost of a `get()`.
+
+The cost of a `list()` is linear (`O(M)`) in the number of matching entries,
+plus the cost of a single `get()`.
+
+The total metadata overhead for a database with M entries scales with `O(M
+* log(M))`.
+
+
 # Rationale and alternatives
 [alternatives]: #alternatives
 
-TODO: scaling issues with existing system; handling many entries
+A major motivator for HyperDB is to improve scaling performance with tens of
+thousands through millions of files per directory in the existing hyperdrive
+implementation. The current implementation requires the most recent node in a
+directory to point to all other nodes in the directory. Even with pointer
+compression, this requires on the order of `O(N^2)` bytes; the HyperDB
+implementation scales with `O(N log(N))`.
 
-TODO: why multi-writer fields included here
+The HyperDB specification (this document) is complicated by the inclusion of
+new protobuf fields to support "multi-writer" features which are not described
+here. The motivation to include these fields now to make only a single
+backwards-incompatible schema change, and to make a second software-only change
+in the future to enable support for these features. Schema and data format
+changes are considered significantly more "expensive" for the community and
+software ecosystem compared to software-only changes. Attempts have been made
+in this specification to indicate the safe "single-writer-only" values to use
+for these fields.
 
 
 # Dat migration logistics
@@ -300,14 +527,31 @@ Upgrading a Dat (hyperdrive) archive to HyperDB will necessitate creating a new
 feed from scratch, meaning new public/private key pairs, and that public key
 URL links will need to change.
 
+Further logistical details are left to the forthcoming Multi-Writer DEP.
+
 
 # Unresolved questions
 [unresolved]: #unresolved-questions
 
-Details of keys to work out (before Draft status):
-
-- must a prefix end in a trailing slash, or optional?
-- can there be, eg, keys `/a/b` and `/a/b/c` at the same time?
+How are hash collisions handled? Eg, two keys which have the same `path` hash
+array. How unlikely is this situation? Eg, how many keys in a database before
+the birthday paradox results in a realistic chance of a collision? How should
+implementations behave if they detect a collision (matching path but not
+matching key)?
+
+Need to think through deletion process with respect to listing a path prefix;
+will previously deleted nodes be occulded, or potentially show up in list
+results?
+
+Are the "deletion" semantics here correct, in that deletion Nodes persist as
+"tombstones", or should deleted keys be omitted from future `trie` fields to
+remove their presence?
+
+Should all-zeros really be used for path hashing, or should we use the
+hypercore feed public key? It certainly makes implementation and documentation
+simpler to use all-zeros, and potentially makes it easier to copy or migrate
+HyperDB content between hypercore feeds. Referencing the feed public key breaks
+abstraction layers and the separation of concerns.
 
 There are implied "reasonable" limits on the size (in bytes) of both keys and
 values, but they are not formally specified. Protobuf messages have a hard

From 271bae5f65417b2963edb3b88714f4dee8bca85a Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Sun, 4 Mar 2018 17:11:19 -0800
Subject: [PATCH 05/27] hyperdb metadata updates

---
 proposals/0000-hyperdb.md | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/proposals/0000-hyperdb.md b/proposals/0000-hyperdb.md
index 9a144b2..76b549f 100644
--- a/proposals/0000-hyperdb.md
+++ b/proposals/0000-hyperdb.md
@@ -7,9 +7,12 @@ Type: Standard
 
 Status: Undefined (as of 2018-03-XX)
 
-Github PR: (add HTTPS link here after PR is opened)
+Github PR: [https://github.com/datprotocol/DEPs/pull/3](https://github.com/datprotocol/DEPs/pull/3)
 
-Authors: [Stephen Whitmore](https://github.com/noffle), [Bryan Newbold](https://github.com/bnewbold)
+Authors:
+[Bryan Newbold](https://github.com/bnewbold),
+[Stephen Whitmore](https://github.com/noffle),
+[Mathias Buus](https://github.com/mafintosh)
 
 
 # Summary
@@ -567,12 +570,12 @@ concerns are out of scope for this DEP.
 # Changelog
 [changelog]: #changelog
 
-As of March 2018, @mafintosh is leading development of a hyperdb nodejs module
-on [github](https://github.com/mafintosh/hyperdb), which is the basis for this
-DEP.
+As of March 2018, Mathias Buus (@mafintosh) is leading development of a hyperdb
+nodejs module on [github](https://github.com/mafintosh/hyperdb), which is the
+basis for this DEP.
 
-- 2017-12-06: @noffle publishes `ARCHITECTURE.md` overview in the
-  [hyperdb github repo][arch_md]
-- 2018-02-XX: First complete draft submitted for review
+- 2017-12-06: Stephen Whitmore (@noffle) publishes `ARCHITECTURE.md` overview
+  in the [hyperdb github repo][arch_md]
+- 2018-03-04: First draft for review
 
 [arch_md]: https://github.com/mafintosh/hyperdb/blob/master/ARCHITECTURE.md

From a3433ddb5829082cee6c47bd5a972e51d52490ea Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Sun, 4 Mar 2018 17:21:30 -0800
Subject: [PATCH 06/27] change github PR link syntax

---
 proposals/0000-hyperdb.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/proposals/0000-hyperdb.md b/proposals/0000-hyperdb.md
index 76b549f..6209f2f 100644
--- a/proposals/0000-hyperdb.md
+++ b/proposals/0000-hyperdb.md
@@ -7,7 +7,7 @@ Type: Standard
 
 Status: Undefined (as of 2018-03-XX)
 
-Github PR: [https://github.com/datprotocol/DEPs/pull/3](https://github.com/datprotocol/DEPs/pull/3)
+Github PR: [Draft](https://github.com/datprotocol/DEPs/pull/3)
 
 Authors:
 [Bryan Newbold](https://github.com/bnewbold),

From 885848ae6b18b829bb0295d1d1ea4be9ae911644 Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Wed, 14 Mar 2018 21:31:19 -0700
Subject: [PATCH 07/27] my spelling is banannas

Thanks @dcposch!
---
 proposals/0000-hyperdb.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/proposals/0000-hyperdb.md b/proposals/0000-hyperdb.md
index 6209f2f..94e9da6 100644
--- a/proposals/0000-hyperdb.md
+++ b/proposals/0000-hyperdb.md
@@ -116,13 +116,13 @@ relevant value is replicated, or the implementation could return an error.
 An example pseudo-code session working with a database might be:
 
     db.put('/life/animal/mammal/kitten', '{"cuteness": 500.3}')
-    db.put('/life/plant/bush/bananna', '{"delicious": 103.4}')
-    db.delete('/life/plant/bush/bananna')
-    db.put('/life/plant/tree/bananna', '{"delicious": 103.4}')
+    db.put('/life/plant/bush/banana', '{"delicious": 103.4}')
+    db.delete('/life/plant/bush/banana')
+    db.put('/life/plant/tree/banana', '{"delicious": 103.4}')
     db.get('/life/animal/mammal/kitten')
     => {"cuteness": 500.3}
     db.list('/life/')
-    => ['/life/animal/mammal/kitten', '/life/plant/tree/bananna']
+    => ['/life/animal/mammal/kitten', '/life/plant/tree/banana']
 
 
 # Reference Documentation

From 3f1b3fc4d2989eb2c7e1c09184dcf56f8c2e4c15 Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Wed, 14 Mar 2018 22:15:21 -0700
Subject: [PATCH 08/27] start documenting hash collision procedure

---
 proposals/0000-hyperdb.md | 62 +++++++++++++++++++++++----------------
 1 file changed, 37 insertions(+), 25 deletions(-)

diff --git a/proposals/0000-hyperdb.md b/proposals/0000-hyperdb.md
index 94e9da6..6de1114 100644
--- a/proposals/0000-hyperdb.md
+++ b/proposals/0000-hyperdb.md
@@ -182,7 +182,11 @@ TODO(mafintosh): should `feed` actually be the current hypercore publickey?
 
 Every key path has component-wise fixed-size hash representation that is used
 by the trie. This is written to the `path` field each Node protobuf
-message.
+message. The concatenation of all path hashes yields a hash for the entire key.
+Note that analagously to a hash map data structure, there can be more than one
+key (string) with the same key hash in the same database with no problems: the
+hash points to a bucket of keys, which can be iterated over linearly to find
+the correct value. 
 
 The path hash is represented by an array of bytes. Elements are 2-bit encoded
 (values 0, 1, 2, 3), except for an optional terminating element which has value
@@ -194,9 +198,9 @@ element hashes plus a terminator).
 The hash algorithm used is `SipHash-2-4`, with an 8-byte output and
 16-byte key; the input is the UTF-8 encoded path string segment, without any
 `/` separators or terminating null bytes. An implementation of this hash
-algorithm is included in the libsodium library, under the function
-`crypto_shorthash()`. A 16-byte "secret" key is required; for this use case we
-use all zeros.
+algorithm is included in the libsodium library in the form of the
+`crypto_shorthash()` function. A 16-byte "secret" key is required; for this use
+case we use all zeros.
 
 When converting the 8-bytehash to an array of 2-bit bytes, the ordering is
 proceed byte-by-byte, and for each take the two lowest-value bits (aka, `hash &
@@ -229,6 +233,12 @@ As another example, the key `/a/b/c` converts into the 97-byte hash array:
       0, 1, 1, 0, 1, 2, 3, 2, 2, 2, 0, 0, 3, 1, 2, 1, 3, 3, 3, 3, 3, 3, 0, 3, 3, 2, 3, 2, 3, 0, 1, 0,
       4 ]
 
+Note that "hash collisions" are rare with this hashing scheme, but are likely
+to occur with large databases (millions of keys), and collision have been
+generated as a proof of concept. Implementions should take care to properly
+handle collisions by verifying keys and following bucket pointers (see the next
+section).
+
 <!---
 
 Generation code (javascript) for the above:
@@ -283,12 +293,17 @@ To lookup a key in the database, the recipe is to:
 
 1. Calculate the `path` array for the key you are looking for.
 2. Select the most-recent ("latest") Node for the feed.
-3. Compare path arrays. If the paths match exactly, you have found the Node you
-   were looking for!
-4. If not, find the first index at which the two arrays differ, and look up the
-   corresponding element in this Node's `trie` array. If the element is empty,
-   then your key does not exist in this hyperdb.
-5. If the trie element is not empty, then follow that pointer to select the
+3. Compare path arrays. If the paths match exactly, compare keys; they match,
+   you have found the you were looking for! Check whether the `value` is
+   defined or not; if not, this Node represents that the key was deleted from
+   the database.
+4. If the paths match, but not the keys, look for a pointer in the last `trie`
+   array index, and iterate from step #3 with the new Node.
+5. If the paths don't entirely match, find the first index at which the two
+   arrays differ, and look up the corresponding element in this Node's `trie`
+   array. If the element is empty, then your key does not exist in this
+   hyperdb.
+6. If the trie element is not empty, then follow that pointer to select the
    next `Node`. Recursively repeat this process from step #3; you will be
    descending the `trie` in a search, and will either terminate in the Node you
    are looking for, or find that the key is not defined in this hyperdb.
@@ -299,24 +314,27 @@ Similarly, to write a key to the database:
    the same length; you will write to the `trie` array from the current index,
    which starts at 0.
 2. Select the most-recent ("latest") Node for the feed.
-3. Compare path arrays. If the paths match exactly, then you are overwriting
-   the current Node, and can copy the "remainder" of it's `trie` up to your
-   current `trie` index.
-4. If not, find the first index at which the two arrays differ. Copy all `trie`
-   elements (empty or not) into the new `trie` for indicies between the
-   "current index" and the "differing index".
-5. Next look up the corresponding element in this Node's `trie` array at the
+3. Compare path arrays. If the paths match exactly, and the keys match, then
+   you are overwriting the current Node, and can copy the "remainder" of it's
+   `trie` up to your current `trie` index.
+4. If the paths match, but not the keys, you are adding a new key to an
+   existing hash bucket. Copy the `trie` and extend it to the full length. Add
+   a pointer to the Node with the same hash at the final array index.
+5. If the paths don't entirely match, find the first index at which the two
+   arrays differ. Copy all `trie` elements (empty or not) into the new `trie`
+   for indicies between the "current index" and the "differing index".
+6. Next look up the corresponding element in this Node's `trie` array at the
    differing index. If this element is empty, then you have found the most
    similar Node. Write a pointer to this node to the `trie` at
    the differing index, and you are done (all remaining `trie` elements are
    empty, and can be omitted).
-6. If the differing tree element has a pointer (is not empty), then follow that
+7. If the differing tree element has a pointer (is not empty), then follow that
    pointer to select the next `Node`. Recursively repeat this process from step
    #3.
 
 To delete a value, follow the same procedure as adding a key, but write the
 `Node` without a `value` (in protobuf, this is distinct from having a `value`
-field with zero bytes)..
+field with zero bytes).
 
 # Examples
 [examples]: #examples
@@ -536,12 +554,6 @@ Further logistical details are left to the forthcoming Multi-Writer DEP.
 # Unresolved questions
 [unresolved]: #unresolved-questions
 
-How are hash collisions handled? Eg, two keys which have the same `path` hash
-array. How unlikely is this situation? Eg, how many keys in a database before
-the birthday paradox results in a realistic chance of a collision? How should
-implementations behave if they detect a collision (matching path but not
-matching key)?
-
 Need to think through deletion process with respect to listing a path prefix;
 will previously deleted nodes be occulded, or potentially show up in list
 results?

From 6ed881f01d340310a8ce28db05ba4dd0705ad9f0 Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Wed, 14 Mar 2018 22:45:05 -0700
Subject: [PATCH 09/27] document hyperdb deletions

---
 proposals/0000-hyperdb.md | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/proposals/0000-hyperdb.md b/proposals/0000-hyperdb.md
index 6de1114..8c054f9 100644
--- a/proposals/0000-hyperdb.md
+++ b/proposals/0000-hyperdb.md
@@ -334,7 +334,7 @@ Similarly, to write a key to the database:
 
 To delete a value, follow the same procedure as adding a key, but write the
 `Node` without a `value` (in protobuf, this is distinct from having a `value`
-field with zero bytes).
+field with zero bytes). Deletion nodes may persist in the database forever.
 
 # Examples
 [examples]: #examples
@@ -558,9 +558,11 @@ Need to think through deletion process with respect to listing a path prefix;
 will previously deleted nodes be occulded, or potentially show up in list
 results?
 
-Are the "deletion" semantics here correct, in that deletion Nodes persist as
-"tombstones", or should deleted keys be omitted from future `trie` fields to
-remove their presence?
+Can the deletion process (currently leaving "tombstone" entries in the `trie`
+forever) be improved, such that these entries don't need to be iterated over?
+mafintosh mentioned this might be in the works. Does this DEP need to "leave
+room" for those changes, or should we call out the potential for future change?
+(Probably not, should only describe existing solutions)
 
 Should all-zeros really be used for path hashing, or should we use the
 hypercore feed public key? It certainly makes implementation and documentation

From deb2cf2bdec1eecbeed0231de088f44936d28e57 Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Wed, 14 Mar 2018 22:45:58 -0700
Subject: [PATCH 10/27] confirmed all-zero hashing key

---
 proposals/0000-hyperdb.md | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/proposals/0000-hyperdb.md b/proposals/0000-hyperdb.md
index 8c054f9..19f1d5d 100644
--- a/proposals/0000-hyperdb.md
+++ b/proposals/0000-hyperdb.md
@@ -564,12 +564,6 @@ mafintosh mentioned this might be in the works. Does this DEP need to "leave
 room" for those changes, or should we call out the potential for future change?
 (Probably not, should only describe existing solutions)
 
-Should all-zeros really be used for path hashing, or should we use the
-hypercore feed public key? It certainly makes implementation and documentation
-simpler to use all-zeros, and potentially makes it easier to copy or migrate
-HyperDB content between hypercore feeds. Referencing the feed public key breaks
-abstraction layers and the separation of concerns.
-
 There are implied "reasonable" limits on the size (in bytes) of both keys and
 values, but they are not formally specified. Protobuf messages have a hard
 specified limit of 2 GByte (due to 32-bit signed arthimetic), and most

From 1dea887643575a2fd31e62570618f43fd42e3aa6 Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Wed, 14 Mar 2018 22:55:34 -0700
Subject: [PATCH 11/27] it's all 'new'

---
 proposals/0000-hyperdb.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/proposals/0000-hyperdb.md b/proposals/0000-hyperdb.md
index 19f1d5d..c399efb 100644
--- a/proposals/0000-hyperdb.md
+++ b/proposals/0000-hyperdb.md
@@ -18,8 +18,8 @@ Authors:
 # Summary
 [summary]: #summary
 
-HyperDB is a new abstraction layer providing a general purpose distributed
-key/value store over the Dat protocol. It is an iteration on the hyperdrive
+HyperDB is an abstraction layer providing a general purpose distributed
+key/value store over the hypercore protocol. It is an iteration on the hyperdrive
 directory tree implementation, building top of the hypercore append-only log
 abstraction layer. Keys are path-like strings (eg, `/food/fruit/kiwi`), and
 values are arbitrary binary blobs (generally under a megabyte).

From d8cfc17e1e6efae943cf4de0d4d042d43aae57a9 Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Wed, 14 Mar 2018 22:55:49 -0700
Subject: [PATCH 12/27] hyperdb or HyperDB?

---
 proposals/0000-hyperdb.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/proposals/0000-hyperdb.md b/proposals/0000-hyperdb.md
index c399efb..1f2866f 100644
--- a/proposals/0000-hyperdb.md
+++ b/proposals/0000-hyperdb.md
@@ -574,6 +574,9 @@ status.
 Apart from leaving fields in the protobuf message specification, multi-writer
 concerns are out of scope for this DEP.
 
+Should we use "hyperdb" or "HyperDB"? bnewbold intends to switch to "hyperdb"
+in this document before Draft status.
+
 
 # Changelog
 [changelog]: #changelog

From 639fa24f0e1f95a2a1d9f5c4f920a08fd3541d9a Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Sun, 18 Mar 2018 15:34:25 -0700
Subject: [PATCH 13/27] start refactoring for 'version 3' hyperdb changes

Will need copy editing, and there are a number of new changes that need
to be understood and integrated.
---
 proposals/0000-hyperdb.md | 344 ++++++++++++++++++++++----------------
 1 file changed, 196 insertions(+), 148 deletions(-)

diff --git a/proposals/0000-hyperdb.md b/proposals/0000-hyperdb.md
index 1f2866f..3e4fb07 100644
--- a/proposals/0000-hyperdb.md
+++ b/proposals/0000-hyperdb.md
@@ -19,10 +19,11 @@ Authors:
 [summary]: #summary
 
 HyperDB is an abstraction layer providing a general purpose distributed
-key/value store over the hypercore protocol. It is an iteration on the hyperdrive
-directory tree implementation, building top of the hypercore append-only log
-abstraction layer. Keys are path-like strings (eg, `/food/fruit/kiwi`), and
-values are arbitrary binary blobs (generally under a megabyte).
+key/value store over the hypercore protocol. It is an iteration on the
+hyperdrive directory tree implementation, building top of the hypercore
+append-only log abstraction layer. Keys are path-like strings (eg,
+`/food/fruit/kiwi`), and values are arbitrary binary blobs (generally under a
+megabyte).
 
 Hyperdrive (used by the Dat application) is expected to be re-implemented on
 top of HyperDB for improved performance with many files (eg, millions). The
@@ -128,65 +129,82 @@ An example pseudo-code session working with a database might be:
 # Reference Documentation
 [reference-documentation]: #reference-documentation
 
-A HyperDB hypercore feed typically consists of a sequence of entries, all of
-which are protobuf-encoded Node messages. Higher-level protocols may make
-exception to this: for example, hyperdrive reserves the first entry of the
-(`metadata`) feed for a special entry that refers to the second (`content`)
-feed.
+A HyperDB hypercore feed typically consists of a sequence of protobuf-encoded
+messages of either "Entry" or "InflatedEntry" type. Higher-level protocols may
+make exception to this, for example by prepending an application-specific
+metadata message as the first entry in the feed. There is sometimes a second
+"content" feed associated with the primary hyperdb key/value feed, to store
+data that does not fit in the (limited) `value` size constraint.
 
 The sequence of entries includes an incremental index: the most recent entry in
 the feed contains metadata pointers that can be followed to efficiently look up
 any key in the database without needing to linear scan the entire history or
-generate an independent index data structure. Of course implementations are
-free to maintain such an index if they prefer.
-
-The Node protobuf message schema is:
-
-```
-  message Node {
-    optional string key = 1;
-    optional bytes value = 2;
-    repeated uint64 clock = 3;
-    optional bytes trie = 4;    // TODO: actually a {feed, seq} pair
-    optional bytes path = 5;    // TODO: actual type?
-    optional uint64 seq = 6;
-    optional bytes feed = 7;    // TODO: actual type?
-  }
-```
-
-TODO(mafintosh): where is this schema actually defined for the `next` branch of
-the hyperdb repo?
+generate an independent index data structure. Implementations are, of course,
+free to maintain their own index if they prefer.
+
+The Entry and InflatedEntry protobuf message schemas are:
+
+    message Entry {
+      required string key = 1;
+      optional bytes value = 2;
+      required bytes trie = 3;
+      repeated uint64 clock = 4;
+      optional uint64 inflate = 5;
+    }
+    
+    message InflatedEntry {
+      message Feed {
+        required bytes key = 1;
+      }
+    
+      required string key = 1;
+      optional bytes value = 2;
+      required bytes trie = 3;
+      repeated uint64 clock = 4;
+      optional uint64 inflate = 5;
+      repeated Feed feeds = 6;
+      optional bytes contentFeed = 7;
+    }
+
+Some fields are are specific to the multi-writer features described in their
+own DEP and mentioned only partially here. The fields common to both message
+types are:
 
 - `key`: UTF-8 key that this node describes. Leading and trailing forward
   slashes (`/`) are always striped before storing in protobuf.
 - `value`: arbitrary byte array. A non-existant `value` entry indicates that
-  this Node indicates a deletion of the key; this is distinct from specifying
+  this Entry indicates a deletion of the key; this is distinct from specifying
   an empty (zero-length) value.
-- `clock`: reserved for use in the forthcoming `multi-writer` standard, not
-  described here. An empty list is the safe (and expected) value for `clock` in
-  single-writer use cases.
-- `trie`: a structured array of pointers to other Node entries in the feed,
+- `trie`: a structured array of pointers to other Entry entries in the feed,
   used for navigating the tree of keys.
-- `path`: a 2-bit hash sequence of `key`'s components.
-- `seq`: the zero-indexed sequence number of this Node (hyperdb) entry in
-  the feed. Note that this may be offset from the hypercore entry index if
-  there are any leading non-hyperdb entries.
-- `feed`: reserved for use with `multi-writer`. The safe single-writer value is
+- `clock`: reserved for use in the forthcoming `multi-writer` standard An empty
+  list is the safe (and expected) value for `clock` in single-writer use cases.
+- `inflate`: TODO(mafintosh)
+
+The fields specific to InflatedEntry are:
+
+- `feeds`: reserved for use with `multi-writer`. The safe single-writer value is
   to use the current feed's hypercore public key.
+- `contentFeed`: for applications which require a parallel "content" hypercore
+  feed for larger data, this field can be used to store the 32-byte public key
+  for that feed.
 
-TODO(mafintosh): should `feed` actually be the current hypercore publickey?
+For the case of a single-writer feed, not using multi-writer features, it is
+sufficient to write a single InflatedEntry message as the first entry in the
+hypercore feed, with `feeds` containing a single entry (a pointer to the
+current feed itself), and `contendFeed` optionally set to a pointer to a paired
+contend feed.
 
 
 ## Path Hashing 
 [path_hashing]: #path_hashing
 
 Every key path has component-wise fixed-size hash representation that is used
-by the trie. This is written to the `path` field each Node protobuf
-message. The concatenation of all path hashes yields a hash for the entire key.
-Note that analagously to a hash map data structure, there can be more than one
-key (string) with the same key hash in the same database with no problems: the
-hash points to a bucket of keys, which can be iterated over linearly to find
-the correct value. 
+by the trie. The concatenation of all path hashes yields a "path hash array"
+for the entire key.  Note that analagously to a hash map data structure, there
+can be more than one key (string) with the same key hash in the same database
+with no problems: the hash points to a linked-list "bucket" of Entries, which
+can be iterated over linearly to find the correct value. 
 
 The path hash is represented by an array of bytes. Elements are 2-bit encoded
 (values 0, 1, 2, 3), except for an optional terminating element which has value
@@ -212,26 +230,34 @@ the terminator (`4`) will have the highest byte index.
 For example, consider the key `/tree/willow`. `tree` has a hash `[0xAC, 0xDC,
 0x05, 0x6C, 0x63, 0x9D, 0x87, 0xCA]`, which converts into the array:
 
-    [ 0, 3, 2, 2, 0, 3, 1, 3, 1, 1, 0, 0, 0, 3, 2, 1, 3, 0, 2, 1, 1, 3, 1, 2, 3, 1, 0, 2, 2, 2, 0, 3 ]
+```
+[ 0, 3, 2, 2, 0, 3, 1, 3, 1, 1, 0, 0, 0, 3, 2, 1, 3, 0, 2, 1, 1, 3, 1, 2, 3, 1, 0, 2, 2, 2, 0, 3 ]
+```
 
 
 `willow` has a 64-bit hash `[0x72, 0x30, 0x34, 0x39, 0x35, 0xA8, 0x21, 0x44]`,
 which converts into the array:
 
-    [ 2, 0, 3, 1, 0, 0, 3, 0, 0, 1, 3, 0, 1, 2, 3, 0, 1, 1, 3, 0, 0, 2, 2, 2, 1, 0, 2, 0, 0, 1, 0, 1 ]
+```
+[ 2, 0, 3, 1, 0, 0, 3, 0, 0, 1, 3, 0, 1, 2, 3, 0, 1, 1, 3, 0, 0, 2, 2, 2, 1, 0, 2, 0, 0, 1, 0, 1 ]
+```
 
 These combine into the unified byte array with 65 elements:
 
-    [ 0, 3, 2, 2, 0, 3, 1, 3, 1, 1, 0, 0, 0, 3, 2, 1, 3, 0, 2, 1, 1, 3, 1, 2, 3, 1, 0, 2, 2, 2, 0, 3,
-      2, 0, 3, 1, 0, 0, 3, 0, 0, 1, 3, 0, 1, 2, 3, 0, 1, 1, 3, 0, 0, 2, 2, 2, 1, 0, 2, 0, 0, 1, 0, 1,
-      4 ]
+```
+[ 0, 3, 2, 2, 0, 3, 1, 3, 1, 1, 0, 0, 0, 3, 2, 1, 3, 0, 2, 1, 1, 3, 1, 2, 3, 1, 0, 2, 2, 2, 0, 3,
+  2, 0, 3, 1, 0, 0, 3, 0, 0, 1, 3, 0, 1, 2, 3, 0, 1, 1, 3, 0, 0, 2, 2, 2, 1, 0, 2, 0, 0, 1, 0, 1,
+  4 ]
+```
 
 As another example, the key `/a/b/c` converts into the 97-byte hash array:
 
-    [ 1, 2, 0, 1, 2, 0, 2, 2, 3, 0, 1, 2, 1, 3, 0, 3, 0, 0, 2, 1, 0, 2, 0, 0, 2, 0, 0, 3, 2, 1, 1, 2,
-      0, 1, 2, 3, 2, 2, 2, 0, 3, 1, 1, 3, 0, 3, 1, 3, 0, 1, 0, 1, 3, 2, 0, 2, 2, 3, 2, 2, 3, 3, 2, 3,
-      0, 1, 1, 0, 1, 2, 3, 2, 2, 2, 0, 0, 3, 1, 2, 1, 3, 3, 3, 3, 3, 3, 0, 3, 3, 2, 3, 2, 3, 0, 1, 0,
-      4 ]
+```
+[ 1, 2, 0, 1, 2, 0, 2, 2, 3, 0, 1, 2, 1, 3, 0, 3, 0, 0, 2, 1, 0, 2, 0, 0, 2, 0, 0, 3, 2, 1, 1, 2,
+  0, 1, 2, 3, 2, 2, 2, 0, 3, 1, 1, 3, 0, 3, 1, 3, 0, 1, 0, 1, 3, 2, 0, 2, 2, 3, 2, 2, 3, 3, 2, 3,
+  0, 1, 1, 0, 1, 2, 3, 2, 2, 2, 0, 0, 3, 1, 2, 1, 3, 3, 3, 3, 3, 3, 0, 3, 3, 2, 3, 2, 3, 0, 1, 0,
+  4 ]
+```
 
 Note that "hash collisions" are rare with this hashing scheme, but are likely
 to occur with large databases (millions of keys), and collision have been
@@ -277,190 +303,216 @@ Then, to manually "expand" arrays in python3:
 
 Each node stores a *prefix [trie](https://en.wikipedia.org/wiki/Trie)* that
 can be used to look up other keys, or to list all keys with a given prefix.
-This is stored under the `trie` field of the Node protobuf message.
+This is stored under the `trie` field of the Entry protobuf message.
 
-The trie effectively mirrors the `path` hash array. Each element in the `trie`
-points to the newest Node which has an identical path up to that specific
+The trie effectively mirrors the path hash array. Each element in the `trie`
+points to the newest Entry which has an identical path up to that specific
 prefix location. Elements can be null; any trailing null elements can be left
 blank.
 
-The data structure of the trie is a sparse array of pointers to other Node
-entries. Each pointer indicates a feed index and a sequence number; for the
-non-multi-writer case, the feed index is always 0, so we consider only the
-sequence number (aka, entry index).
+The data structure of the trie is a sparse array of pointers to other Entry
+entries. Each pointer indicates a feed index and an entry index pointer; for
+the non-multi-writer case, the feed index is always 0, so we consider only the
+entry index.
 
 To lookup a key in the database, the recipe is to:
 
-1. Calculate the `path` array for the key you are looking for.
-2. Select the most-recent ("latest") Node for the feed.
-3. Compare path arrays. If the paths match exactly, compare keys; they match,
+1. Calculate the path hash array for the key you are looking for.
+2. Select the most-recent ("latest") Entry for the feed.
+3. Compare path hash arrays. If the paths match exactly, compare keys; they match,
    you have found the you were looking for! Check whether the `value` is
-   defined or not; if not, this Node represents that the key was deleted from
+   defined or not; if not, this Entry represents that the key was deleted from
    the database.
 4. If the paths match, but not the keys, look for a pointer in the last `trie`
-   array index, and iterate from step #3 with the new Node.
+   array index, and iterate from step #3 with the new Entry.
 5. If the paths don't entirely match, find the first index at which the two
-   arrays differ, and look up the corresponding element in this Node's `trie`
+   arrays differ, and look up the corresponding element in this Entry's `trie`
    array. If the element is empty, then your key does not exist in this
    hyperdb.
 6. If the trie element is not empty, then follow that pointer to select the
-   next `Node`. Recursively repeat this process from step #3; you will be
-   descending the `trie` in a search, and will either terminate in the Node you
+   next Entry. Recursively repeat this process from step #3; you will be
+   descending the `trie` in a search, and will either terminate in the Entry you
    are looking for, or find that the key is not defined in this hyperdb.
 
 Similarly, to write a key to the database:
 
-1. Calculate the `path` array for the key, and start with an empty `trie` of
+1. Calculate the path hash array for the key, and start with an empty `trie` of
    the same length; you will write to the `trie` array from the current index,
    which starts at 0.
-2. Select the most-recent ("latest") Node for the feed.
-3. Compare path arrays. If the paths match exactly, and the keys match, then
-   you are overwriting the current Node, and can copy the "remainder" of it's
+2. Select the most-recent ("latest") Entry for the feed.
+3. Compare path hash arrays. If the paths match exactly, and the keys match, then
+   you are overwriting the current Entry, and can copy the "remainder" of it's
    `trie` up to your current `trie` index.
 4. If the paths match, but not the keys, you are adding a new key to an
    existing hash bucket. Copy the `trie` and extend it to the full length. Add
-   a pointer to the Node with the same hash at the final array index.
+   a pointer to the Entry with the same hash at the final array index.
 5. If the paths don't entirely match, find the first index at which the two
    arrays differ. Copy all `trie` elements (empty or not) into the new `trie`
    for indicies between the "current index" and the "differing index".
-6. Next look up the corresponding element in this Node's `trie` array at the
+6. Next look up the corresponding element in this Entry's `trie` array at the
    differing index. If this element is empty, then you have found the most
-   similar Node. Write a pointer to this node to the `trie` at
+   similar Entry. Write a pointer to this node to the `trie` at
    the differing index, and you are done (all remaining `trie` elements are
    empty, and can be omitted).
 7. If the differing tree element has a pointer (is not empty), then follow that
-   pointer to select the next `Node`. Recursively repeat this process from step
+   pointer to select the next `Entry`. Recursively repeat this process from step
    #3.
 
 To delete a value, follow the same procedure as adding a key, but write the
-`Node` without a `value` (in protobuf, this is distinct from having a `value`
+`Entry` without a `value` (in protobuf, this is distinct from having a `value`
 field with zero bytes). Deletion nodes may persist in the database forever.
 
+## Binary Trie Encoding
+[trie_encoding]: #trie_encoding
+
+TODO(mafintosh)
+
+
 # Examples
 [examples]: #examples
 
 ## Simple Put and Get
 
 Starting with an empty HyperDB `db`, if we `db.put('/a/b', '24')` we expect to
-see a single `Node`:
+see a single `Entry` and index 0:
 
 ```
 { key: 'a/b',
   value: '24',
   trie:
-   [ ],
-  seq: 0,
-  path:
-   [ 1, 2, 0, 1, 2, 0, 2, 2, 3, 0, 1, 2, 1, 3, 0, 3, 0, 0, 2, 1, 0, 2, 0, 0, 2, 0, 0, 3, 2, 1, 1, 2,
-     0, 1, 2, 3, 2, 2, 2, 0, 3, 1, 1, 3, 0, 3, 1, 3, 0, 1, 0, 1, 3, 2, 0, 2, 2, 3, 2, 2, 3, 3, 2, 3,
-     4 ] }
+   [ ] }
 ```
 
-Note that the first 64 bytes in `path` match those of the `/a/b/c` example from
+For reference, the path hash array for this key (index 0) is:
+
+```
+[ 1, 2, 0, 1, 2, 0, 2, 2, 3, 0, 1, 2, 1, 3, 0, 3, 0, 0, 2, 1, 0, 2, 0, 0, 2, 0, 0, 3, 2, 1, 1, 2,
+  0, 1, 2, 3, 2, 2, 2, 0, 3, 1, 1, 3, 0, 3, 1, 3, 0, 1, 0, 1, 3, 2, 0, 2, 2, 3, 2, 2, 3, 3, 2, 3,
+  4 ]
+```
+
+Note that the first 64 bytes in path match those of the `/a/b/c` example from
 the [path hashing][path_hash] section, because the first two path components
-are the same. Since this is the first entry, `seq` is 0.
+are the same. Since this is the first entry, the entry index is 0.
 
-Now we `db.put('/a/c', 'hello')` and expect a second Node:
+Now we `db.put('/a/c', 'hello')` and expect a second Entry:
 
 ```
 { key: 'a/c',
   value: 'hello',
   trie:
    [ , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , 
-     , , { feed: 0, seq: 0 } ],
-  seq: 1,
-  path: 
-   [ 1, 2, 0, 1, 2, 0, 2, 2, 3, 0, 1, 2, 1, 3, 0, 3, 0, 0, 2, 1, 0, 2, 0, 0, 2, 0, 0, 3, 2, 1, 1, 2,
-     0, 1, 1, 0, 1, 2, 3, 2, 2, 2, 0, 0, 3, 1, 2, 1, 3, 3, 3, 3, 3, 3, 0, 3, 3, 2, 3, 2, 3, 0, 1, 0,
-     4 ] }
+     , , { feed: 0, index: 0 } ] }
+```
+
+The path hash array for this key (index 1) is:
+
+```
+[ 1, 2, 0, 1, 2, 0, 2, 2, 3, 0, 1, 2, 1, 3, 0, 3, 0, 0, 2, 1, 0, 2, 0, 0, 2, 0, 0, 3, 2, 1, 1, 2,
+  0, 1, 1, 0, 1, 2, 3, 2, 2, 2, 0, 0, 3, 1, 2, 1, 3, 3, 3, 3, 3, 3, 0, 3, 3, 2, 3, 2, 3, 0, 1, 0,
+  4 ]
 ```
 
-The `seq` is incremented to 1. The first 32 characters of `path` are common
-with the first Node (they share a common prefix `/a`).
+The first 32 characters of path are common with the first Entry (they share a
+common prefix, `/a`).
 
 `trie` is defined, but mostly sparse. The first 32 elements of common prefix
-match the first Node, and then two additional hash elements (`[0, 1]`) happen
+match the first Entry, and then two additional hash elements (`[0, 1]`) happen
 to match as well; there is not a differing entry until index 34 (zero-indexed).
-At this entry there is a reference pointing to the first Node. An additional 29
+At this entry there is a reference pointing to the first Entry. An additional 29
 trailing null entries have been trimmed in reduce metadata overhead.
 
-Next we insert a third node with `db.put('/x/y', 'other')`, and get a third Node:
+Next we insert a third node with `db.put('/x/y', 'other')`, and get a third Entry:
 
 ```
 { key: 'x/y',
   value: 'other',
   trie:
-   [ , { feed: 0, seq: 1 } ],
-  seq: 2,
-  path: 
-   [ 1, 1, 0, 0, 3, 1, 2, 3, 3, 1, 1, 1, 2, 2, 1, 1, 1, 0, 2, 3, 3, 0, 1, 2, 1, 1, 2, 3, 0, 0, 2, 1,
-     0, 2, 1, 0, 1, 1, 0, 1, 0, 1, 3, 1, 0, 0, 2, 3, 0, 1, 3, 2, 0, 3, 2, 0, 1, 0, 3, 2, 0, 2, 1, 1,
-     4 ] }
+   [ , { feed: 0, index: 1 } ],
+```
+
+The path hash array for this key (index 2) is:
+
+```
+[ 1, 1, 0, 0, 3, 1, 2, 3, 3, 1, 1, 1, 2, 2, 1, 1, 1, 0, 2, 3, 3, 0, 1, 2, 1, 1, 2, 3, 0, 0, 2, 1,
+  0, 2, 1, 0, 1, 1, 0, 1, 0, 1, 3, 1, 0, 0, 2, 3, 0, 1, 3, 2, 0, 3, 2, 0, 1, 0, 3, 2, 0, 2, 1, 1,
+  4 ]
 ```
 
 Consider the lookup-up process for `db.get('/a/b')` (which we expect to
-successfully return `'24'`, as written in the first Node). First we calculate
-the `path` for the key `a/b`, which will be the same as the first Node. Then we
-take the "latest" Node, with `seq=2`. We compare the `path` arrays, starting at
-the first element, and find the first difference at index 1 (`1 == 1`, then `1
-!= 2`). We look at index 1 in the current Node's `trie` and find a pointer to
-`seq = 1`, so we fetch that Node and recurse. Comparing `path` arrays, we now
-get all the way to index 34 before there is a difference. We again look in the
-`trie`, find a pointer to `seq = 0`, and fetch the first Node and recurse. Now
-the path elements match exactly; we have found the Node we are looking for, and
-it has an existant `value`, so we return the `value`.
+successfully return `'24'`, as written in the first Entry). First we calculate
+the path for the key `a/b`, which will be the same as the first Entry. Then we
+take the "latest" Entry, with entry index 2. We compare the path hash arrays,
+starting at the first element, and find the first difference at index 1 (`1 ==
+1`, then `1 != 2`). We look at index 1 in the current Entry's `trie` and find a
+pointer to entry index 1, so we fetch that Entry and recurse. Comparing path
+hash arrays, we now get all the way to index 34 before there is a difference.
+We again look in the `trie`, find a pointer to entry index 0, and fetch the
+first Entry and recurse. Now the path elements match exactly; we have found the
+Entry we are looking for, and it has an existant `value`, so we return the
+`value`.
 
 Consider a lookup for `db.get('/a/z')`; this key does not exist, so we expect
-to return with "key not found". We calculate the `path` for this key:
+to return with "key not found". We calculate the path hash array for this key:
 
-    [ 1, 2, 0, 1, 2, 0, 2, 2, 3, 0, 1, 2, 1, 3, 0, 3, 0, 0, 2, 1, 0, 2, 0, 0, 2, 0, 0, 3, 2, 1, 1, 2,
-      1, 2, 3, 0, 1, 0, 1, 1, 1, 1, 2, 1, 1, 1, 0, 1, 0, 3, 3, 2, 0, 3, 3, 1, 1, 0, 2, 1, 0, 1, 1, 2,
-      4 ]
+```
+[ 1, 2, 0, 1, 2, 0, 2, 2, 3, 0, 1, 2, 1, 3, 0, 3, 0, 0, 2, 1, 0, 2, 0, 0, 2, 0, 0, 3, 2, 1, 1, 2,
+  1, 2, 3, 0, 1, 0, 1, 1, 1, 1, 2, 1, 1, 1, 0, 1, 0, 3, 3, 2, 0, 3, 3, 1, 1, 0, 2, 1, 0, 1, 1, 2,
+  4 ]
+```
 
-Similar to the first lookup, we start with `seq = 2` and follow the pointer to
-`seq = 1`. This time, when we compare `path` arrays, the first differing entry
-is at index `32`. There is no `trie` entry at this index, which tells us that
-the key does not exist in the database.
+Similar to the first lookup, we start with entry index 2 and follow the pointer to
+entry index 1. This time, when we compare path hash arrays, the first differing
+entry is at array index `32`. There is no `trie` entry at this index, which
+tells us that the key does not exist in the database.
 
 ## Listing a Prefix
 
 Continuing with the state of the database above, we call `db.list('/a')` to
 list all keys with the prefix `/a`.
 
-We generate a `path` array for the key `/a`, without the terminating symbol
+We generate a path hash array for the key `/a`, without the terminating symbol
 (`4`):
 
-    [ 1, 2, 0, 1, 2, 0, 2, 2, 3, 0, 1, 2, 1, 3, 0, 3, 0, 0, 2, 1, 0, 2, 0, 0, 2, 0, 0, 3, 2, 1, 1, 2 ]
+```
+[ 1, 2, 0, 1, 2, 0, 2, 2, 3, 0, 1, 2, 1, 3, 0, 3, 0, 0, 2, 1, 0, 2, 0, 0, 2, 0, 0, 3, 2, 1, 1, 2 ]
+```
 
-Using the same process as a `get()` lookup, we find the first Node that
-entirely matches this prefix, which will be Node `seq = 1`. If we had failed to
-find any Node with a complete prefix match, then we would return an empty list
+Using the same process as a `get()` lookup, we find the first Entry that
+entirely matches this prefix, which will be entry index 1. If we had failed to
+find any Entry with a complete prefix match, then we would return an empty list
 of matching keys.
 
 Starting with the first prefix-matching node, we save that key as a match
-(unless the Node is a deletion), then select all `trie` pointers with an index
-higher than the prefix length, and recursively inspect all pointed-to Nodes.
+(unless the Entry is a deletion), then select all `trie` pointers with an index
+higher than the prefix length, and recursively inspect all pointed-to Entries.
 
 ## Deleting a Key
 
 Continuing with the state of the database above, we call `db.delete('/a/c')` to
 remove that key from the database.
 
-The process is almost entirely the same as inserting a new Node at that key,
-except that the `value` field is undefined. The new Node (`seq = 3`) is:
+The process is almost entirely the same as inserting a new Entry at that key,
+except that the `value` field is undefined. The new Entry (at entry index 3)
+is:
 
 ```
 { key: 'a/c',
   value: ,
-  trie: [ , { feed: 0, seq: 2 }, , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , 
-          , , { feed: 0, seq: 0 } ],
-  seq: 3,
-  path: 
-   [ 1, 2, 0, 1, 2, 0, 2, 2, 3, 0, 1, 2, 1, 3, 0, 3, 0, 0, 2, 1, 0, 2, 0, 0, 2, 0, 0, 3, 2, 1, 1, 2,
-     0, 1, 1, 0, 1, 2, 3, 2, 2, 2, 0, 0, 3, 1, 2, 1, 3, 3, 3, 3, 3, 3, 0, 3, 3, 2, 3, 2, 3, 0, 1, 0,
-     4 ] }
+  trie: [ , { feed: 0, index: 2 }, , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , 
+          , , { feed: 0, index: 0 } ] }
+```
+
+The path hash array for this Entry (key) is:
+
+```
+[ 1, 2, 0, 1, 2, 0, 2, 2, 3, 0, 1, 2, 1, 3, 0, 3, 0, 0, 2, 1, 0, 2, 0, 0, 2, 0, 0, 3, 2, 1, 1, 2,
+  0, 1, 1, 0, 1, 2, 3, 2, 2, 2, 0, 0, 3, 1, 2, 1, 3, 3, 3, 3, 3, 3, 0, 3, 3, 2, 3, 2, 3, 0, 1, 0,
+  4 ]
 ```
 
+
+
 # Drawbacks
 [drawbacks]: #drawbacks
 
@@ -479,21 +531,17 @@ files.
 
 The metadata overhead for a single database entry varies based on the size of
 the database. In a "heavy" case, considering a two-path-segment key with an
-entirely saturated `trie` and `uint32` size feed and sequence pointers, and
+entirely saturated `trie` and `uint32`-sized feed and entry index pointers, and
 ignoring multi-writer fields:
 
 - `trie`: 4 * 2 * 64 bytes = 512 bytes
-- `seq`: 4 bytes
-- `path`: 65 bytes
-- total: 581 bytes
+- total: 512 bytes
 
-In a "light" case, with few `trie` entries and single-byte varint sequence
-numbers:
+In a "light" case, with few `trie` entries and single-byte varint feed and
+entry index pointers:
 
 - `trie`: 2 * 2 * 4 bytes = 16 bytes
-- `seqq: 1 byte
-- `path`: 65 bytes
-- total: 82
+- total: 16
 
 For a database with most keys having N path segments, the cost of a `get()`
 scales with the number of entries M as `O(log(M))` with best case 1 lookup and

From 355cb31ab7f4da540dd276f55ecb1d57999ef6a2 Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Sun, 18 Mar 2018 21:20:34 -0700
Subject: [PATCH 14/27] HyperDB -> hyperdb

---
 proposals/0000-hyperdb.md | 37 +++++++++++++++++--------------------
 1 file changed, 17 insertions(+), 20 deletions(-)

diff --git a/proposals/0000-hyperdb.md b/proposals/0000-hyperdb.md
index 3e4fb07..277cafb 100644
--- a/proposals/0000-hyperdb.md
+++ b/proposals/0000-hyperdb.md
@@ -1,5 +1,5 @@
 
-Title: **DEP-0000: HyperDB**
+Title: **DEP-0000: Hyperdb**
 
 Short Name: `0000-hyperdb`
 
@@ -18,7 +18,7 @@ Authors:
 # Summary
 [summary]: #summary
 
-HyperDB is an abstraction layer providing a general purpose distributed
+Hyperdb is an abstraction layer providing a general purpose distributed
 key/value store over the hypercore protocol. It is an iteration on the
 hyperdrive directory tree implementation, building top of the hypercore
 append-only log abstraction layer. Keys are path-like strings (eg,
@@ -26,7 +26,7 @@ append-only log abstraction layer. Keys are path-like strings (eg,
 megabyte).
 
 Hyperdrive (used by the Dat application) is expected to be re-implemented on
-top of HyperDB for improved performance with many files (eg, millions). The
+top of hyperdb for improved performance with many files (eg, millions). The
 hyperdrive API should be largely unchanged, but the `metadata` format will be
 backwards-incompatible.
 
@@ -34,7 +34,7 @@ backwards-incompatible.
 # Motivation
 [motivation]: #motivation
 
-HyperDB is expected to drastically improve performance of dat clients when
+Hyperdb is expected to drastically improve performance of dat clients when
 working with archives containing tens of thousands of files in single
 directories. This is a real-world bottleneck for several current users, with
 basic local actions such as adding a directory taking an unacceptably long time
@@ -50,24 +50,24 @@ this abstraction layer.
 # Usage Documentation
 [usage-documentation]: #usage-documentation
 
-*This section describes HyperDB's interface and behavior in the abstract for
+*This section describes Hyperdb's interface and behavior in the abstract for
 application programmers. It is not intended to be exact documentation of any
 particular implementation (including the reference Javascript module).*
 
-HyperDB is structured to be used much like a traditional hierarchical
+Hyperdb is structured to be used much like a traditional hierarchical
 filesystem. A value can be written and read at locations like `/foo/bar/baz`,
 and the API supports querying or tracking values at subpaths, like how watching
 for changes on `/foo/bar` will report both changes to `/foo/bar/baz` and also
 `/foo/bar/19`.
 
 Lower-level details of the hypercore append-only log, disk serialization, and
-networked synchronization features that HyperDB builds on top of are not
+networked synchronization features that Hyperdb builds on top of are not
 described in detail here; see the [DEP repository][deps]. Multi-writer
 hypercore semantics are also not discussed in this DEP.
 
 [deps]: https://github.com/datprotocol/DEPs
 
-A HyperDB database instance can be represented by a single hypercore feed (or
+A Hyperdb database instance can be represented by a single hypercore feed (or
 several feeds in a multi-writer context), and is named, referenced, and
 discovered using the public and discovery keys of the hypercore feed (or the
 original feed if there are several). In a single-writer configuration, only a
@@ -129,7 +129,7 @@ An example pseudo-code session working with a database might be:
 # Reference Documentation
 [reference-documentation]: #reference-documentation
 
-A HyperDB hypercore feed typically consists of a sequence of protobuf-encoded
+A hyperdb hypercore feed typically consists of a sequence of protobuf-encoded
 messages of either "Entry" or "InflatedEntry" type. Higher-level protocols may
 make exception to this, for example by prepending an application-specific
 metadata message as the first entry in the feed. There is sometimes a second
@@ -373,7 +373,7 @@ TODO(mafintosh)
 
 ## Simple Put and Get
 
-Starting with an empty HyperDB `db`, if we `db.put('/a/b', '24')` we expect to
+Starting with an empty hyperdb `db`, if we `db.put('/a/b', '24')` we expect to
 see a single `Entry` and index 0:
 
 ```
@@ -521,7 +521,7 @@ ecosystem: clients will need to support both versions protocol for some time
 (increasing maintenance burden), future clients may not interoperate with old
 archives, etc. These downsides can partially be avoided by careful roll-out.
 
-For the specific use case of Dat archives, HyperDB will trivially increase
+For the specific use case of Dat archives, hyperdb will trivially increase
 metadata size (and thus disk and network consumption) for archives with few
 files.
 
@@ -561,14 +561,14 @@ The total metadata overhead for a database with M entries scales with `O(M
 # Rationale and alternatives
 [alternatives]: #alternatives
 
-A major motivator for HyperDB is to improve scaling performance with tens of
+A major motivator for hyperdb is to improve scaling performance with tens of
 thousands through millions of files per directory in the existing hyperdrive
 implementation. The current implementation requires the most recent node in a
 directory to point to all other nodes in the directory. Even with pointer
-compression, this requires on the order of `O(N^2)` bytes; the HyperDB
+compression, this requires on the order of `O(N^2)` bytes; the hyperdb
 implementation scales with `O(N log(N))`.
 
-The HyperDB specification (this document) is complicated by the inclusion of
+The hyperdb specification (this document) is complicated by the inclusion of
 new protobuf fields to support "multi-writer" features which are not described
 here. The motivation to include these fields now to make only a single
 backwards-incompatible schema change, and to make a second software-only change
@@ -582,17 +582,17 @@ for these fields.
 # Dat migration logistics
 [migration]: #migration
 
-HyperDB is not backwards compatible with the existing hyperdrive metadata,
+Hyperdb is not backwards compatible with the existing hyperdrive metadata,
 meaning dat clients may need to support both versions during a transition
 period. This applies both to archives saved to disk (eg, in SLEEP) and to
 archives received and published to peers over the network.
 
 No changes to the Dat network wire protocol itself are necessary, only changes
 to content passed over the protocol. The Dat `content` feed, containing raw
-file data, is not impacted by HyperDB, only the contents of the `metadata`
+file data, is not impacted by hyperdb, only the contents of the `metadata`
 feed.
 
-Upgrading a Dat (hyperdrive) archive to HyperDB will necessitate creating a new
+Upgrading a Dat (hyperdrive) archive to hyperdb will necessitate creating a new
 feed from scratch, meaning new public/private key pairs, and that public key
 URL links will need to change.
 
@@ -622,9 +622,6 @@ status.
 Apart from leaving fields in the protobuf message specification, multi-writer
 concerns are out of scope for this DEP.
 
-Should we use "hyperdb" or "HyperDB"? bnewbold intends to switch to "hyperdb"
-in this document before Draft status.
-
 
 # Changelog
 [changelog]: #changelog

From 5c0d406ccd09fb4ad2142591f27a80718ddef7b4 Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Sun, 18 Mar 2018 21:27:19 -0700
Subject: [PATCH 15/27] what should be resolved before Draft

---
 proposals/0000-hyperdb.md | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/proposals/0000-hyperdb.md b/proposals/0000-hyperdb.md
index 277cafb..f9a180d 100644
--- a/proposals/0000-hyperdb.md
+++ b/proposals/0000-hyperdb.md
@@ -604,13 +604,14 @@ Further logistical details are left to the forthcoming Multi-Writer DEP.
 
 Need to think through deletion process with respect to listing a path prefix;
 will previously deleted nodes be occulded, or potentially show up in list
-results?
+results? Should be resolved before Draft.
 
 Can the deletion process (currently leaving "tombstone" entries in the `trie`
 forever) be improved, such that these entries don't need to be iterated over?
 mafintosh mentioned this might be in the works. Does this DEP need to "leave
 room" for those changes, or should we call out the potential for future change?
-(Probably not, should only describe existing solutions)
+Probably not, should only describe existing solutions. This can be resolved
+after Draft.
 
 There are implied "reasonable" limits on the size (in bytes) of both keys and
 values, but they are not formally specified. Protobuf messages have a hard

From c0069fcdd2d75e6a0c6278df3accdfa9491cabcd Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Sun, 18 Mar 2018 21:27:32 -0700
Subject: [PATCH 16/27] changelog: hyperdb v3 released

---
 proposals/0000-hyperdb.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/proposals/0000-hyperdb.md b/proposals/0000-hyperdb.md
index f9a180d..5e2b6b4 100644
--- a/proposals/0000-hyperdb.md
+++ b/proposals/0000-hyperdb.md
@@ -634,5 +634,6 @@ basis for this DEP.
 - 2017-12-06: Stephen Whitmore (@noffle) publishes `ARCHITECTURE.md` overview
   in the [hyperdb github repo][arch_md]
 - 2018-03-04: First draft for review
+- 2018-03-15: Hyperdb v3.0.0 is released
 
 [arch_md]: https://github.com/mafintosh/hyperdb/blob/master/ARCHITECTURE.md

From 1659b7ea7c8aacc9a9cfc20cd7714adfa0e9bd6e Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Sun, 8 Apr 2018 20:06:21 -0700
Subject: [PATCH 17/27] syntax; trie encoding; security section

---
 proposals/0000-hyperdb.md | 202 +++++++++++++++++++++++++++++---------
 1 file changed, 158 insertions(+), 44 deletions(-)

diff --git a/proposals/0000-hyperdb.md b/proposals/0000-hyperdb.md
index 5e2b6b4..66ca162 100644
--- a/proposals/0000-hyperdb.md
+++ b/proposals/0000-hyperdb.md
@@ -130,11 +130,11 @@ An example pseudo-code session working with a database might be:
 [reference-documentation]: #reference-documentation
 
 A hyperdb hypercore feed typically consists of a sequence of protobuf-encoded
-messages of either "Entry" or "InflatedEntry" type. Higher-level protocols may
-make exception to this, for example by prepending an application-specific
-metadata message as the first entry in the feed. There is sometimes a second
-"content" feed associated with the primary hyperdb key/value feed, to store
-data that does not fit in the (limited) `value` size constraint.
+messages of "Entry" type. Higher-level protocols may make exception to this,
+for example by prepending an application-specific metadata message as the first
+entry in the feed. There is sometimes a second "content" feed associated with
+the primary hyperdb key/value feed, to store data that does not fit in the
+(limited) `value` size constraint.
 
 The sequence of entries includes an incremental index: the most recent entry in
 the feed contains metadata pointers that can be followed to efficiently look up
@@ -142,17 +142,9 @@ any key in the database without needing to linear scan the entire history or
 generate an independent index data structure. Implementations are, of course,
 free to maintain their own index if they prefer.
 
-The Entry and InflatedEntry protobuf message schemas are:
+The Entry protobuf message schema is:
 
     message Entry {
-      required string key = 1;
-      optional bytes value = 2;
-      required bytes trie = 3;
-      repeated uint64 clock = 4;
-      optional uint64 inflate = 5;
-    }
-    
-    message InflatedEntry {
       message Feed {
         required bytes key = 1;
       }
@@ -179,10 +171,9 @@ types are:
   used for navigating the tree of keys.
 - `clock`: reserved for use in the forthcoming `multi-writer` standard An empty
   list is the safe (and expected) value for `clock` in single-writer use cases.
-- `inflate`: TODO(mafintosh)
-
-The fields specific to InflatedEntry are:
-
+- `inflate`: a "pointer" (reference to a feed index number) to the most recent
+  entry in the feed that has the optional `feeds` and `contentFeed` fields set.
+  Not defined if `feeds` and `contentFeed` are defined.
 - `feeds`: reserved for use with `multi-writer`. The safe single-writer value is
   to use the current feed's hypercore public key.
 - `contentFeed`: for applications which require a parallel "content" hypercore
@@ -190,10 +181,14 @@ The fields specific to InflatedEntry are:
   for that feed.
 
 For the case of a single-writer feed, not using multi-writer features, it is
-sufficient to write a single InflatedEntry message as the first entry in the
-hypercore feed, with `feeds` containing a single entry (a pointer to the
-current feed itself), and `contendFeed` optionally set to a pointer to a paired
-contend feed.
+sufficient to write a single Entry message as the first entry in the hypercore
+feed, with `feeds` containing a single entry (a pointer to the current feed
+itself), and `contendFeed` optionally set to a pointer to a paired contend
+feed.
+
+If *either* `feeds` *or* `contentFeed` are definined in an entry, *both* fields
+must be defined, so the new message can be refered to via `inflated`. In this
+case the entry is refered to as an "inflated entry".
 
 
 ## Path Hashing 
@@ -265,7 +260,7 @@ generated as a proof of concept. Implementions should take care to properly
 handle collisions by verifying keys and following bucket pointers (see the next
 section).
 
-<!---
+<!--
 
 Generation code (javascript) for the above:
 
@@ -295,7 +290,7 @@ Then, to manually "expand" arrays in python3:
     [path.extend(e) for e in tmp]
     path
 
---->
+-->
 
 
 ## Incremental Index Trie
@@ -306,29 +301,39 @@ can be used to look up other keys, or to list all keys with a given prefix.
 This is stored under the `trie` field of the Entry protobuf message.
 
 The trie effectively mirrors the path hash array. Each element in the `trie`
-points to the newest Entry which has an identical path up to that specific
-prefix location. Elements can be null; any trailing null elements can be left
-blank.
+is called a "bucket". Each non-empty bucket points to the newest Entries which
+have an identical path up to that specific prefix location; because the trie
+has 4 "values" at each node, there can be pointers to up to 3 other values at a
+given element in the trie array. Buckets can be empty if there are no nodes
+with path hashes that differ for the first time the given bucket (aka, there
+are no "branches" at this node in the trie). Only non-null elements will be
+transmitted or stored on disk.
 
 The data structure of the trie is a sparse array of pointers to other Entry
-entries. Each pointer indicates a feed index and an entry index pointer; for
-the non-multi-writer case, the feed index is always 0, so we consider only the
-entry index.
+entries. Each pointer indicates a feed index and an entry index pointer, and is
+associated with a 2-bit value; for the non-multi-writer case, the feed index is
+always 0, so we consider only the entry index.
+
+For a `trie` with `N` buckets, each may have zero or more pointers. Typically
+there are a maximum of 3 pointers per bucket, corresponding to the 4 possible
+values minus the current Entry's value, but in the event of hash collisions (in
+the path array space) , there may be multiple pointers in the same bucket
+corresponding to the same value.
 
 To lookup a key in the database, the recipe is to:
 
 1. Calculate the path hash array for the key you are looking for.
 2. Select the most-recent ("latest") Entry for the feed.
-3. Compare path hash arrays. If the paths match exactly, compare keys; they match,
-   you have found the you were looking for! Check whether the `value` is
+3. Compare path hash arrays. If the paths match exactly, compare keys; they
+   match, you have found the you were looking for! Check whether the `value` is
    defined or not; if not, this Entry represents that the key was deleted from
    the database.
 4. If the paths match, but not the keys, look for a pointer in the last `trie`
    array index, and iterate from step #3 with the new Entry.
 5. If the paths don't entirely match, find the first index at which the two
    arrays differ, and look up the corresponding element in this Entry's `trie`
-   array. If the element is empty, then your key does not exist in this
-   hyperdb.
+   array. If the element is empty, or doesn't have a pointer corresponding to
+   your 2-bit value, then your key does not exist in this hyperdb.
 6. If the trie element is not empty, then follow that pointer to select the
    next Entry. Recursively repeat this process from step #3; you will be
    descending the `trie` in a search, and will either terminate in the Entry you
@@ -362,15 +367,100 @@ To delete a value, follow the same procedure as adding a key, but write the
 `Entry` without a `value` (in protobuf, this is distinct from having a `value`
 field with zero bytes). Deletion nodes may persist in the database forever.
 
+
 ## Binary Trie Encoding
 [trie_encoding]: #trie_encoding
 
-TODO(mafintosh)
+The following scheme is used to encode trie data structures (sparse, indexed
+arrays of pointers to entries) into a variable-length bytestring as the `trie`
+field of an Entry protobuf message.
+
+Consider a trie array with `N` buckets and `M` non-empty buckets (`0 <= M <=
+N`). In the encoded field, there will be `M` concatanated bytestrings of the
+form:
+
+- trie index (varint), followed by...
+- bucket bitfield (packed in a varint), encoding which of the 4 values at this
+  node of the trie have pointers, followed by one or more...
+- pointer sets, each referencing an entry at (feed index, entry index):
+    - feed index (varint, with a extra "more pointers at this value" low bit,
+      encoded as `feed_index << 1 | more_bit`)
+    - entry index (varint)
+
+In the common case for a small/sparse hyperdb, there will a small number of
+non-empty buckets (small `M`), a usually a single `(feed index, entry index)`
+pointer for those non-empty buckets. For a very large/dense hyperdb (millions
+of key/value pairs), there will be many non-empty buckets (`M` approaching
+`N`), and buckets may have up to the full 4 pointer sets. Even with millions of
+entries, hash collisions will be very rare; in those cases there will be
+multiple pointers in the same pointer set.
+
+Consider an entry with path hash:
+
+```
+[ 1, 1, 0, 0, 3, 1, 2, 3, 3, 1, 1, 1, 2, 2, 1, 1, 1, 0, 2, 3, 3, 0, 1, 2, 1, 1, 2, 3, 0, 0, 2, 1,
+  0, 2, 1, 0, 1, 1, 0, 1, 0, 1, 3, 1, 0, 0, 2, 3, 0, 1, 3, 2, 0, 3, 2, 0, 1, 0, 3, 2, 0, 2, 1, 1,
+  4 ]
+```
+
+and trie:
+
+```
+[ , { val: 1, feed: 0, index: 1 } ]
+```
+
+In this case `N` is 64 (or you could count as 2 if you ignore trailing empty
+entries) and `M` is 1. There will be a single bytestring chunk:
+
+- index varint is `1` (second element in the trie array)
+- bitfield is `0b0010` (varint 2): there is only a pointer set for value 1 (the second value)
+- there is a single pointer in the pointer set, which is (`feed=0 << 1 | 0`,
+  `index=1`), or (varint 2, varint 1)
+
+Combined, the `trie` bytestring will be:
+
+```
+[0x01, 0x02, 0x02, 0x02]
+```
+
+For a more complex example, consider the same path hash, but the trie:
+
+```
+[ , { val: 1, feed: 0, index: 1; val: 2, feed: 5, index: 3; val: 3, feed: 6, index: 98 }, , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,
+  , , , , , { val: 0, feed: 0, index, 23; val: 0, feed: 1, index: 17 ]
+```
+
+Now `M` is 2. The first bytestring chunk will be:
+
+- index varint is `1` (second element in the trie array)
+- bitfield is `0b1110` (varint 9): there are three pointer sets
+- first pointer set is (`feed=0 << 1 | 0`, `index=1`) or (varint 2, varint 1)
+- second pointer set is (`feed=5 << 1 | 0`, `index=3`) or (varint 10, varint 3)
+- third pointer set is (`feed=6 << 1 | 0`, `index=98`) or (varint 12, varint 98)
+
+the second bytestring chunk would be:
+
+- index varint is `37` (38th element in the trie array)
+- btifield is `0b0001` (varint 1); there is a single pointer set... but it
+  contains a hash collision, so there are two pointers
+- first pointer is (`feed=0` << 1 | 1`, `index=23`) or (varint 1, varint=23);
+  note the `more` bit is set high!
+- second pointer is (`feed=1 << 1 | 0`, `index=17`) or (varint 2, varint 17);
+  note the `more` bit is low, as usual. In the extremely unlikely case of
+  multiple collisions there could have been more pointers with `more` high
+  preceeding this one.
+
+The overall bytestring would be:
+
+```
+[0x01, 0x09, 0x02, 0x01, 0x0A, 0x03, 0x0C, 0x62, 0x25, 0x1, 0x01, 0x17, 0x02, 0x11]
+```
 
 
 # Examples
 [examples]: #examples
 
+
 ## Simple Put and Get
 
 Starting with an empty hyperdb `db`, if we `db.put('/a/b', '24')` we expect to
@@ -402,7 +492,7 @@ Now we `db.put('/a/c', 'hello')` and expect a second Entry:
   value: 'hello',
   trie:
    [ , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , 
-     , , { feed: 0, index: 0 } ] }
+     , , { element: 2, feed: 0, index: 0 } ] }
 ```
 
 The path hash array for this key (index 1) is:
@@ -428,7 +518,7 @@ Next we insert a third node with `db.put('/x/y', 'other')`, and get a third Entr
 { key: 'x/y',
   value: 'other',
   trie:
-   [ , { feed: 0, index: 1 } ],
+   [ , { val: 1, feed: 0, index: 1 } ],
 ```
 
 The path hash array for this key (index 2) is:
@@ -499,8 +589,8 @@ is:
 ```
 { key: 'a/c',
   value: ,
-  trie: [ , { feed: 0, index: 2 }, , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , 
-          , , { feed: 0, index: 0 } ] }
+  trie: [ , { val: 1, feed: 0, index: 2 }, , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , 
+          , , { val: 1, feed: 0, index: 0 } ] }
 ```
 
 The path hash array for this Entry (key) is:
@@ -512,7 +602,6 @@ The path hash array for this Entry (key) is:
 ```
 
 
-
 # Drawbacks
 [drawbacks]: #drawbacks
 
@@ -547,8 +636,6 @@ For a database with most keys having N path segments, the cost of a `get()`
 scales with the number of entries M as `O(log(M))` with best case 1 lookup and
 worst case `4 * 32 * N = 128 * N` lookups (for a saturated `trie`).
 
-TODO: prove or verify the above `O(log(M))` intuition
-
 The cost of a `put()` or `delete()` is proportional to the cost of a `get()`.
 
 The cost of a `list()` is linear (`O(M)`) in the number of matching entries,
@@ -558,6 +645,26 @@ The total metadata overhead for a database with M entries scales with `O(M
 * log(M))`.
 
 
+# Privacy and Security Considerations
+[privacy]: #privacy
+
+The basic key/value semantics of hyperdb (as discussed in this DEP, not
+considering multi-writer changes) are not known to introduce new privacy issues
+when compared with, eg, storing binary values at key-like paths using the
+current ("legacy") hyperdrive system.
+
+A malicious writer could cause trouble for readers, even readers who do not
+trust the application-level contents of a feed. Implementions which may be
+exposed to arbitrary feeds from unknown sources on the internet should take
+care to the following scenarios: A malicious writer may be able to produce very
+frequent key path hash collisions, which could degrade to linear performance. A
+malicious writer could send broken trie structures that contain pointer loops,
+duplicate pointers, or other invalid contents. A malicous writer could write
+arbitrary data in value fields in an attempt to exploit de-serialization bugs.
+A malicious writer could leverage non-printing unicode characters to create
+database entries with user-indistinguishable names (keys).
+
+
 # Rationale and alternatives
 [alternatives]: #alternatives
 
@@ -602,9 +709,13 @@ Further logistical details are left to the forthcoming Multi-Writer DEP.
 # Unresolved questions
 [unresolved]: #unresolved-questions
 
+Should we declare that `contendFeed` pointers *not* change over the history of
+a feed? See also <https://github.com/datprotocol/DEPs/issues/13>
+
 Need to think through deletion process with respect to listing a path prefix;
 will previously deleted nodes be occulded, or potentially show up in list
-results? Should be resolved before Draft.
+results? Should be reviewed (by a non-author of this document) before accepted
+as a Draft.
 
 Can the deletion process (currently leaving "tombstone" entries in the `trie`
 forever) be improved, such that these entries don't need to be iterated over?
@@ -613,6 +724,9 @@ room" for those changes, or should we call out the potential for future change?
 Probably not, should only describe existing solutions. This can be resolved
 after Draft.
 
+Review (or prove?) the `O(log(M))` intuition for `get()` operations. This could
+happen after Draft status.
+
 There are implied "reasonable" limits on the size (in bytes) of both keys and
 values, but they are not formally specified. Protobuf messages have a hard
 specified limit of 2 GByte (due to 32-bit signed arthimetic), and most
@@ -621,7 +735,7 @@ specific limits on key and value sizes? Would be good to decide before Draft
 status.
 
 Apart from leaving fields in the protobuf message specification, multi-writer
-concerns are out of scope for this DEP.
+concerns are explicitly out of scope for this DEP.
 
 
 # Changelog

From e9e1f25bb7d9a027149770be21fad5d2dad3edfa Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Wed, 18 Apr 2018 08:03:11 -0700
Subject: [PATCH 18/27] months have passed!

---
 proposals/0000-hyperdb.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/proposals/0000-hyperdb.md b/proposals/0000-hyperdb.md
index 66ca162..546a4e0 100644
--- a/proposals/0000-hyperdb.md
+++ b/proposals/0000-hyperdb.md
@@ -5,7 +5,7 @@ Short Name: `0000-hyperdb`
 
 Type: Standard
 
-Status: Undefined (as of 2018-03-XX)
+Status: Undefined (as of 2018-04-18)
 
 Github PR: [Draft](https://github.com/datprotocol/DEPs/pull/3)
 
@@ -749,5 +749,6 @@ basis for this DEP.
   in the [hyperdb github repo][arch_md]
 - 2018-03-04: First draft for review
 - 2018-03-15: Hyperdb v3.0.0 is released
+- 2018-04-18: This DEP submitted for Draft review.
 
 [arch_md]: https://github.com/mafintosh/hyperdb/blob/master/ARCHITECTURE.md

From 30a910404a796e5a1dfa2c6e41202e469a41f5b9 Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Wed, 18 Apr 2018 08:59:24 -0700
Subject: [PATCH 19/27] add hash collision examples

---
 proposals/0000-hyperdb.md | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/proposals/0000-hyperdb.md b/proposals/0000-hyperdb.md
index 546a4e0..26411ff 100644
--- a/proposals/0000-hyperdb.md
+++ b/proposals/0000-hyperdb.md
@@ -260,6 +260,12 @@ generated as a proof of concept. Implementions should take care to properly
 handle collisions by verifying keys and following bucket pointers (see the next
 section).
 
+An example hash collision (useful for testing; thanks to Github user
+`dcposch`):
+
+    /mpomeiehc
+    /idgcmnmna
+
 <!--
 
 Generation code (javascript) for the above:

From a3a23eb9822c845334343cfc3ae2824006a35fb1 Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Wed, 18 Apr 2018 08:59:46 -0700
Subject: [PATCH 20/27] rework trie 'termination' case and example

---
 proposals/0000-hyperdb.md | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/proposals/0000-hyperdb.md b/proposals/0000-hyperdb.md
index 26411ff..4663d3e 100644
--- a/proposals/0000-hyperdb.md
+++ b/proposals/0000-hyperdb.md
@@ -386,8 +386,9 @@ N`). In the encoded field, there will be `M` concatanated bytestrings of the
 form:
 
 - trie index (varint), followed by...
-- bucket bitfield (packed in a varint), encoding which of the 4 values at this
-  node of the trie have pointers, followed by one or more...
+- bucket bitfield (packed in a varint), encoding which of the 5 values (4
+  values if the index is not modulo 32) at this node of the trie have pointers,
+  followed by one or more...
 - pointer sets, each referencing an entry at (feed index, entry index):
     - feed index (varint, with a extra "more pointers at this value" low bit,
       encoded as `feed_index << 1 | more_bit`)
@@ -433,21 +434,22 @@ For a more complex example, consider the same path hash, but the trie:
 
 ```
 [ , { val: 1, feed: 0, index: 1; val: 2, feed: 5, index: 3; val: 3, feed: 6, index: 98 }, , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,
-  , , , , , { val: 0, feed: 0, index, 23; val: 0, feed: 1, index: 17 ]
+  , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,
+  { val: 4, feed: 0, index, 23; val: 4, feed: 1, index: 17 ]
 ```
 
 Now `M` is 2. The first bytestring chunk will be:
 
 - index varint is `1` (second element in the trie array)
-- bitfield is `0b1110` (varint 9): there are three pointer sets
+- bitfield is `0b01110` (varint 9): there are three pointer sets
 - first pointer set is (`feed=0 << 1 | 0`, `index=1`) or (varint 2, varint 1)
 - second pointer set is (`feed=5 << 1 | 0`, `index=3`) or (varint 10, varint 3)
 - third pointer set is (`feed=6 << 1 | 0`, `index=98`) or (varint 12, varint 98)
 
 the second bytestring chunk would be:
 
-- index varint is `37` (38th element in the trie array)
-- btifield is `0b0001` (varint 1); there is a single pointer set... but it
+- index varint is `64` (65th element in the trie array; the terminating value)
+- btifield is `0b10000` (varint 1); there is a single pointer set... but it
   contains a hash collision, so there are two pointers
 - first pointer is (`feed=0` << 1 | 1`, `index=23`) or (varint 1, varint=23);
   note the `more` bit is set high!
@@ -459,7 +461,7 @@ the second bytestring chunk would be:
 The overall bytestring would be:
 
 ```
-[0x01, 0x09, 0x02, 0x01, 0x0A, 0x03, 0x0C, 0x62, 0x25, 0x1, 0x01, 0x17, 0x02, 0x11]
+[0x01, 0x09, 0x02, 0x01, 0x0A, 0x03, 0x0C, 0x62, 0x40, 0x10, 0x01, 0x17, 0x02, 0x11]
 ```
 
 

From 2cffaa1c5b5215a83558647a0658aa691428d127 Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Wed, 18 Apr 2018 17:29:24 -0700
Subject: [PATCH 21/27] embaressing nmber of tipos

Thank you @ralphtheninja!
---
 proposals/0000-hyperdb.md | 35 ++++++++++++++++++-----------------
 1 file changed, 18 insertions(+), 17 deletions(-)

diff --git a/proposals/0000-hyperdb.md b/proposals/0000-hyperdb.md
index 4663d3e..6fd1872 100644
--- a/proposals/0000-hyperdb.md
+++ b/proposals/0000-hyperdb.md
@@ -20,7 +20,7 @@ Authors:
 
 Hyperdb is an abstraction layer providing a general purpose distributed
 key/value store over the hypercore protocol. It is an iteration on the
-hyperdrive directory tree implementation, building top of the hypercore
+hyperdrive directory tree implementation, building on top of the hypercore
 append-only log abstraction layer. Keys are path-like strings (eg,
 `/food/fruit/kiwi`), and values are arbitrary binary blobs (generally under a
 megabyte).
@@ -99,9 +99,9 @@ or the "latest" revision.
 Requires read-write access.  Returns an error (eg, via callback) if there was a
 problem.
 
-`db.get(key)`: Reading a non-existant `key` is an error. Read-only.
+`db.get(key)`: Reading a non-existent `key` is an error. Read-only.
 
-`db.delete(key)`: Removes the key from the database. Deleting a non-existant
+`db.delete(key)`: Removes the key from the database. Deleting a non-existent
 key is an error. Requires read-write access.
 
 `db.list(prefix)`: returns a flat (not nested) list of all keys currently in
@@ -163,14 +163,15 @@ own DEP and mentioned only partially here. The fields common to both message
 types are:
 
 - `key`: UTF-8 key that this node describes. Leading and trailing forward
-  slashes (`/`) are always striped before storing in protobuf.
-- `value`: arbitrary byte array. A non-existant `value` entry indicates that
+  slashes (`/`) are always stripped before storing in protobuf.
+- `value`: arbitrary byte array. A non-existent `value` entry indicates that
   this Entry indicates a deletion of the key; this is distinct from specifying
   an empty (zero-length) value.
 - `trie`: a structured array of pointers to other Entry entries in the feed,
   used for navigating the tree of keys.
-- `clock`: reserved for use in the forthcoming `multi-writer` standard An empty
-  list is the safe (and expected) value for `clock` in single-writer use cases.
+- `clock`: reserved for use in the forthcoming `multi-writer` standard. An
+  empty list is the safe (and expected) value for `clock` in single-writer use
+  cases.
 - `inflate`: a "pointer" (reference to a feed index number) to the most recent
   entry in the feed that has the optional `feeds` and `contentFeed` fields set.
   Not defined if `feeds` and `contentFeed` are defined.
@@ -218,7 +219,7 @@ case we use all zeros.
 When converting the 8-bytehash to an array of 2-bit bytes, the ordering is
 proceed byte-by-byte, and for each take the two lowest-value bits (aka, `hash &
 0x3`) as byte index `0`, the next two bits (aka, `hash & 0xC`) as byte index
-`1`, etc. When concatanating path hashes into a longer array, the first
+`1`, etc. When concatenating path hashes into a longer array, the first
 ("left-most") path element hash will correspond to byte indexes 0 through 31;
 the terminator (`4`) will have the highest byte index.
 
@@ -323,7 +324,7 @@ always 0, so we consider only the entry index.
 For a `trie` with `N` buckets, each may have zero or more pointers. Typically
 there are a maximum of 3 pointers per bucket, corresponding to the 4 possible
 values minus the current Entry's value, but in the event of hash collisions (in
-the path array space) , there may be multiple pointers in the same bucket
+the path array space), there may be multiple pointers in the same bucket
 corresponding to the same value.
 
 To lookup a key in the database, the recipe is to:
@@ -359,7 +360,7 @@ Similarly, to write a key to the database:
    a pointer to the Entry with the same hash at the final array index.
 5. If the paths don't entirely match, find the first index at which the two
    arrays differ. Copy all `trie` elements (empty or not) into the new `trie`
-   for indicies between the "current index" and the "differing index".
+   for indices between the "current index" and the "differing index".
 6. Next look up the corresponding element in this Entry's `trie` array at the
    differing index. If this element is empty, then you have found the most
    similar Entry. Write a pointer to this node to the `trie` at
@@ -382,7 +383,7 @@ arrays of pointers to entries) into a variable-length bytestring as the `trie`
 field of an Entry protobuf message.
 
 Consider a trie array with `N` buckets and `M` non-empty buckets (`0 <= M <=
-N`). In the encoded field, there will be `M` concatanated bytestrings of the
+N`). In the encoded field, there will be `M` concatenated bytestrings of the
 form:
 
 - trie index (varint), followed by...
@@ -449,9 +450,9 @@ Now `M` is 2. The first bytestring chunk will be:
 the second bytestring chunk would be:
 
 - index varint is `64` (65th element in the trie array; the terminating value)
-- btifield is `0b10000` (varint 1); there is a single pointer set... but it
+- bitfield is `0b10000` (varint 1); there is a single pointer set... but it
   contains a hash collision, so there are two pointers
-- first pointer is (`feed=0` << 1 | 1`, `index=23`) or (varint 1, varint=23);
+- first pointer is (`feed=0 << 1 | 1`, `index=23`) or (varint 1, varint=23);
   note the `more` bit is set high!
 - second pointer is (`feed=1 << 1 | 0`, `index=17`) or (varint 2, varint 17);
   note the `more` bit is low, as usual. In the extremely unlikely case of
@@ -547,7 +548,7 @@ pointer to entry index 1, so we fetch that Entry and recurse. Comparing path
 hash arrays, we now get all the way to index 34 before there is a difference.
 We again look in the `trie`, find a pointer to entry index 0, and fetch the
 first Entry and recurse. Now the path elements match exactly; we have found the
-Entry we are looking for, and it has an existant `value`, so we return the
+Entry we are looking for, and it has an existent `value`, so we return the
 `value`.
 
 Consider a lookup for `db.get('/a/z')`; this key does not exist, so we expect
@@ -658,7 +659,7 @@ The total metadata overhead for a database with M entries scales with `O(M
 
 The basic key/value semantics of hyperdb (as discussed in this DEP, not
 considering multi-writer changes) are not known to introduce new privacy issues
-when compared with, eg, storing binary values at key-like paths using the
+when compared with, e.g., storing binary values at key-like paths using the
 current ("legacy") hyperdrive system.
 
 A malicious writer could cause trouble for readers, even readers who do not
@@ -699,7 +700,7 @@ for these fields.
 
 Hyperdb is not backwards compatible with the existing hyperdrive metadata,
 meaning dat clients may need to support both versions during a transition
-period. This applies both to archives saved to disk (eg, in SLEEP) and to
+period. This applies both to archives saved to disk (e,g., in SLEEP) and to
 archives received and published to peers over the network.
 
 No changes to the Dat network wire protocol itself are necessary, only changes
@@ -721,7 +722,7 @@ Should we declare that `contendFeed` pointers *not* change over the history of
 a feed? See also <https://github.com/datprotocol/DEPs/issues/13>
 
 Need to think through deletion process with respect to listing a path prefix;
-will previously deleted nodes be occulded, or potentially show up in list
+will previously deleted nodes be occluded, or potentially show up in list
 results? Should be reviewed (by a non-author of this document) before accepted
 as a Draft.
 

From 0f6294fb18cad09dab2ac7fc101dfbaeb3f40773 Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Wed, 18 Apr 2018 17:35:24 -0700
Subject: [PATCH 22/27] run a spellcheck

---
 proposals/0000-hyperdb.md | 30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/proposals/0000-hyperdb.md b/proposals/0000-hyperdb.md
index 6fd1872..89ba2c8 100644
--- a/proposals/0000-hyperdb.md
+++ b/proposals/0000-hyperdb.md
@@ -21,12 +21,12 @@ Authors:
 Hyperdb is an abstraction layer providing a general purpose distributed
 key/value store over the hypercore protocol. It is an iteration on the
 hyperdrive directory tree implementation, building on top of the hypercore
-append-only log abstraction layer. Keys are path-like strings (eg,
+append-only log abstraction layer. Keys are path-like strings (e.g.,
 `/food/fruit/kiwi`), and values are arbitrary binary blobs (generally under a
 megabyte).
 
 Hyperdrive (used by the Dat application) is expected to be re-implemented on
-top of hyperdb for improved performance with many files (eg, millions). The
+top of hyperdb for improved performance with many files (e.g., millions). The
 hyperdrive API should be largely unchanged, but the `metadata` format will be
 backwards-incompatible.
 
@@ -71,13 +71,13 @@ A Hyperdb database instance can be represented by a single hypercore feed (or
 several feeds in a multi-writer context), and is named, referenced, and
 discovered using the public and discovery keys of the hypercore feed (or the
 original feed if there are several). In a single-writer configuration, only a
-single node (holding the secret key) can mutate the database (eg, via `put` or
+single node (holding the secret key) can mutate the database (e.g., via `put` or
 `delete` actions).
 
 **Keys** can be any UTF-8 string. Path segments are separated by the forward
 slash character (`/`). Repeated slashes (`//`) are disallowed. Leading and
 trailing `/` are optional in application code: `/hello` and `hello` are
-equivalent. A key can be both a "path segment" and key at the same time; eg,
+equivalent. A key can be both a "path segment" and key at the same time; e.g.,
 `/a/b/c` and `/a/b` can both be keys at the same time.
 
 **Values** can be any binary blob, including empty (of zero length). For
@@ -96,7 +96,7 @@ creating a new one. A handle could represent any specific revision in history,
 or the "latest" revision.
 
 `db.put(key, value)`: inserts `value` (arbitrary bytes) under the path `key`.
-Requires read-write access.  Returns an error (eg, via callback) if there was a
+Requires read-write access.  Returns an error (e.g., via callback) if there was a
 problem.
 
 `db.get(key)`: Reading a non-existent `key` is an error. Read-only.
@@ -187,9 +187,9 @@ feed, with `feeds` containing a single entry (a pointer to the current feed
 itself), and `contendFeed` optionally set to a pointer to a paired contend
 feed.
 
-If *either* `feeds` *or* `contentFeed` are definined in an entry, *both* fields
-must be defined, so the new message can be refered to via `inflated`. In this
-case the entry is refered to as an "inflated entry".
+If *either* `feeds` *or* `contentFeed` are defined in an entry, *both* fields
+must be defined, so the new message can be referred to via `inflated`. In this
+case the entry is refereed to as an "inflated entry".
 
 
 ## Path Hashing 
@@ -197,7 +197,7 @@ case the entry is refered to as an "inflated entry".
 
 Every key path has component-wise fixed-size hash representation that is used
 by the trie. The concatenation of all path hashes yields a "path hash array"
-for the entire key.  Note that analagously to a hash map data structure, there
+for the entire key.  Note that analogously to a hash map data structure, there
 can be more than one key (string) with the same key hash in the same database
 with no problems: the hash points to a linked-list "bucket" of Entries, which
 can be iterated over linearly to find the correct value. 
@@ -257,7 +257,7 @@ As another example, the key `/a/b/c` converts into the 97-byte hash array:
 
 Note that "hash collisions" are rare with this hashing scheme, but are likely
 to occur with large databases (millions of keys), and collision have been
-generated as a proof of concept. Implementions should take care to properly
+generated as a proof of concept. Implementations should take care to properly
 handle collisions by verifying keys and following bucket pointers (see the next
 section).
 
@@ -457,7 +457,7 @@ the second bytestring chunk would be:
 - second pointer is (`feed=1 << 1 | 0`, `index=17`) or (varint 2, varint 17);
   note the `more` bit is low, as usual. In the extremely unlikely case of
   multiple collisions there could have been more pointers with `more` high
-  preceeding this one.
+  preceding this one.
 
 The overall bytestring would be:
 
@@ -616,7 +616,7 @@ The path hash array for this Entry (key) is:
 
 A backwards-incompatible change will have negative effects on the broader dat
 ecosystem: clients will need to support both versions protocol for some time
-(increasing maintenance burden), future clients may not interoperate with old
+(increasing maintenance burden), future clients may not inter-operate with old
 archives, etc. These downsides can partially be avoided by careful roll-out.
 
 For the specific use case of Dat archives, hyperdb will trivially increase
@@ -663,12 +663,12 @@ when compared with, e.g., storing binary values at key-like paths using the
 current ("legacy") hyperdrive system.
 
 A malicious writer could cause trouble for readers, even readers who do not
-trust the application-level contents of a feed. Implementions which may be
+trust the application-level contents of a feed. Implementations which may be
 exposed to arbitrary feeds from unknown sources on the internet should take
 care to the following scenarios: A malicious writer may be able to produce very
 frequent key path hash collisions, which could degrade to linear performance. A
 malicious writer could send broken trie structures that contain pointer loops,
-duplicate pointers, or other invalid contents. A malicous writer could write
+duplicate pointers, or other invalid contents. A malicious writer could write
 arbitrary data in value fields in an attempt to exploit de-serialization bugs.
 A malicious writer could leverage non-printing unicode characters to create
 database entries with user-indistinguishable names (keys).
@@ -738,7 +738,7 @@ happen after Draft status.
 
 There are implied "reasonable" limits on the size (in bytes) of both keys and
 values, but they are not formally specified. Protobuf messages have a hard
-specified limit of 2 GByte (due to 32-bit signed arthimetic), and most
+specified limit of 2 GByte (due to 32-bit signed arithmetic), and most
 implementations impose a (configurable) 64 MByte limit. Should this DEP impose
 specific limits on key and value sizes? Would be good to decide before Draft
 status.

From 89e0315d74741696c497d09ef09320bbffc5bfa1 Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Wed, 18 Apr 2018 17:39:56 -0700
Subject: [PATCH 23/27] clarify list() defaults recursive

---
 proposals/0000-hyperdb.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/proposals/0000-hyperdb.md b/proposals/0000-hyperdb.md
index 89ba2c8..c05bd0b 100644
--- a/proposals/0000-hyperdb.md
+++ b/proposals/0000-hyperdb.md
@@ -108,7 +108,8 @@ key is an error. Requires read-write access.
 the database under the given prefix. Prefixes operate on a path-segment basis:
 `/ab` is not a valid prefix for key `/abcd`, but is valid for `/ab/cd`. If the
 prefix does not exist, returns an empty list. The order of returned keys is
-implementation (or configuration) specific. Read-only.
+implementation (or configuration) specific. Default listing is recursive
+(implementations may have a flag to control this behavior).  Read-only.
 
 If the hypercore underlying a hyperdb is only partially replicated, behavior is
 implementation-specific. For example, a `get()` call could block until the

From 21f48cf852cbdcae38163dd58fb68eaaf9dd70e1 Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Sun, 6 May 2018 19:51:14 -0700
Subject: [PATCH 24/27] log(M) reviewed by mafintosh

---
 proposals/0000-hyperdb.md | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/proposals/0000-hyperdb.md b/proposals/0000-hyperdb.md
index c05bd0b..8990bd4 100644
--- a/proposals/0000-hyperdb.md
+++ b/proposals/0000-hyperdb.md
@@ -734,9 +734,6 @@ room" for those changes, or should we call out the potential for future change?
 Probably not, should only describe existing solutions. This can be resolved
 after Draft.
 
-Review (or prove?) the `O(log(M))` intuition for `get()` operations. This could
-happen after Draft status.
-
 There are implied "reasonable" limits on the size (in bytes) of both keys and
 values, but they are not formally specified. Protobuf messages have a hard
 specified limit of 2 GByte (due to 32-bit signed arithmetic), and most

From 6643a253f589e80d092cdff14fa007eff69f6390 Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Sun, 6 May 2018 20:01:31 -0700
Subject: [PATCH 25/27] contend typos

---
 proposals/0000-hyperdb.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/proposals/0000-hyperdb.md b/proposals/0000-hyperdb.md
index 8990bd4..e0f9f2c 100644
--- a/proposals/0000-hyperdb.md
+++ b/proposals/0000-hyperdb.md
@@ -185,7 +185,7 @@ types are:
 For the case of a single-writer feed, not using multi-writer features, it is
 sufficient to write a single Entry message as the first entry in the hypercore
 feed, with `feeds` containing a single entry (a pointer to the current feed
-itself), and `contendFeed` optionally set to a pointer to a paired contend
+itself), and `contentFeed` optionally set to a pointer to a paired content
 feed.
 
 If *either* `feeds` *or* `contentFeed` are defined in an entry, *both* fields

From 7eadbff9cfd4899e59627e823ae9fe0df1aef06f Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Sun, 6 May 2018 20:01:46 -0700
Subject: [PATCH 26/27] clarify that contentFeed is immutable

---
 proposals/0000-hyperdb.md | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/proposals/0000-hyperdb.md b/proposals/0000-hyperdb.md
index e0f9f2c..634ef15 100644
--- a/proposals/0000-hyperdb.md
+++ b/proposals/0000-hyperdb.md
@@ -175,12 +175,13 @@ types are:
   cases.
 - `inflate`: a "pointer" (reference to a feed index number) to the most recent
   entry in the feed that has the optional `feeds` and `contentFeed` fields set.
-  Not defined if `feeds` and `contentFeed` are defined.
+  Not defined if `feeds` and `contentFeed` are defined in the same message.
 - `feeds`: reserved for use with `multi-writer`. The safe single-writer value is
   to use the current feed's hypercore public key.
 - `contentFeed`: for applications which require a parallel "content" hypercore
   feed for larger data, this field can be used to store the 32-byte public key
-  for that feed.
+  for that feed. This field should have a single value for the entire history
+  of the feed (aka, it is not mutable).
 
 For the case of a single-writer feed, not using multi-writer features, it is
 sufficient to write a single Entry message as the first entry in the hypercore
@@ -719,9 +720,6 @@ Further logistical details are left to the forthcoming Multi-Writer DEP.
 # Unresolved questions
 [unresolved]: #unresolved-questions
 
-Should we declare that `contendFeed` pointers *not* change over the history of
-a feed? See also <https://github.com/datprotocol/DEPs/issues/13>
-
 Need to think through deletion process with respect to listing a path prefix;
 will previously deleted nodes be occluded, or potentially show up in list
 results? Should be reviewed (by a non-author of this document) before accepted

From aede603f4cf2ef68cc5e0ade7c819f518d27f420 Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Sun, 6 May 2018 20:04:28 -0700
Subject: [PATCH 27/27] DEP-0003: Hyperdb as Draft

---
 proposals/{0000-hyperdb.md => 0003-hyperdb.md} | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)
 rename proposals/{0000-hyperdb.md => 0003-hyperdb.md} (99%)

diff --git a/proposals/0000-hyperdb.md b/proposals/0003-hyperdb.md
similarity index 99%
rename from proposals/0000-hyperdb.md
rename to proposals/0003-hyperdb.md
index 634ef15..1158e4e 100644
--- a/proposals/0000-hyperdb.md
+++ b/proposals/0003-hyperdb.md
@@ -1,11 +1,11 @@
 
-Title: **DEP-0000: Hyperdb**
+Title: **DEP-0003: Hyperdb**
 
-Short Name: `0000-hyperdb`
+Short Name: `0003-hyperdb`
 
 Type: Standard
 
-Status: Undefined (as of 2018-04-18)
+Status: Draft (as of 2018-05-06)
 
 Github PR: [Draft](https://github.com/datprotocol/DEPs/pull/3)
 
@@ -755,5 +755,6 @@ basis for this DEP.
 - 2018-03-04: First draft for review
 - 2018-03-15: Hyperdb v3.0.0 is released
 - 2018-04-18: This DEP submitted for Draft review.
+- 2018-05-06: Merged as Draft after WG approval.
 
 [arch_md]: https://github.com/mafintosh/hyperdb/blob/master/ARCHITECTURE.md