Hypercore Header DEP #34

pfrazee · 2018-06-26T18:02:05Z

Standardizes a HypercoreHeader message which can be used as the first entry in hypercores to declare the datastructure along with custom fields.

RangerMauve

I really like the MyCustomHeader example! Didn't know you could do something like that. :D

bnewbold

I still kind of wish we could deal with nested schemas like: hyperdb-v2/hyperdrive-v4, but I think the discussion consensus was to just keep it simple.

I'm really excited that this came out backwards compatible with existing
hyperdrive!

I don't remember that much about the wire representation of protobuf, but we
will implicitly still be trying to parse a lot of random bytestrings to see if
they "match" this schema. Should we mention that folks should specifically
avoid having a protobuf message that has a string field as the first entry,
because that could accidentally get interpreted as a "type" string? Do we need
to set a sane max version length (1024 bytes?) to prevent gnarly/weird parsing?

bnewbold · 2018-06-27T06:19:39Z

proposals/0000-hypercore-header.md

+# Rationale and alternatives
+[alternatives]: #alternatives
+
+This standard is backwards compatible with v1 of hyperdrive, which used an `Index` message as the first entry in the metadata hypercore.


Oh, :party:!

bnewbold · 2018-06-27T06:19:43Z

proposals/0000-hypercore-header.md

+}
+```
+
+Data structures should not add additional fields to the `HypercoreHeader`. They should put any additional data into the `extension`. This is easy to do with protobuf's nested messages. For example:


Should implementations fail fast on additional fields, or should they ignore additional fields? The later is future-compatible, the former is not.

Ignore, probably. I'll add a note.

bnewbold · 2018-06-27T06:19:48Z

proposals/0000-hypercore-header.md

+}
+```
+
+The `type` string should be unique to the data structure (refer to existing DEPs to avoid conflicts). When a breaking change is made to the data-structure, it should be given a new `type` string. For example, the `"hyperdrive"` type string will be updated to `"hyperdrive-v2"` for its second version.


Does this imply that the - character is special or reserved? Should successor versions use the -vNNN syntax?
To clarify my bikeshedding, I think encoding the version as an unstructured string is a reasonable choice, but I think a default delimiter between "schema" and "version/variant" should be chosen.

Some common aesthetically pleasing separators would be -, /, ., _. How do you read one-two_three-four? As two identifiers (one-two followed by three-four), or three (one, two_three, four)?

To propose something concrete, lets say dash (-) is the separator, and identifiers should have dashes in them. Let's also say that the version is optional but recommended ("hyperdrive-v0"), and that the x- or X- prefixes are interpreted as "experimental" (aka, no DEP exists yet).

I'm okay with some conventions but I'm not sure I want to standardize a syntax. Just not sure it's worth it. Reason being:

Let's also say that the version is optional but recommended ("hyperdrive-v0"), and that the x- or X- prefixes are interpreted as "experimental" (aka, no DEP exists yet)

This is totally reasonable, but you then need to interpret "hyperdrive-v0" == "hyperdrive", and I just don't know if we need that level of detail for this.

bnewbold · 2018-06-27T06:19:51Z

proposals/0000-hypercore-header.md

+
+A program that is reading a hypercore will examine the `type` to determine how to process the hypercore.
+
+```js


I have a soft preference for using pseudo code rather than literal javascript in the DEPs. In additional to being more implementation-agnostic, an argument against using actual code is that it comes with an implication that this exact API is stable, or that this exact code is the canonical code that people should always copy/paste into their applications even years down the line.

Okay I'll rewrite with that in mind.

pfrazee · 2018-06-27T16:07:39Z

I still kind of wish we could deal with nested schemas

I'm not against revisiting this. It should pretty much always be layered (one structure on top of another) so all we need is a standard separator. The downside is that it adds some complexity (multiple types emitted instead of just 1).

I don't remember that much about the wire representation of protobuf, but we
will implicitly still be trying to parse a lot of random bytestrings to see if
they "match" this schema.

Hmm, yeah that's true.

Should we mention that folks should specifically avoid having a protobuf message that has a string field as the first entry, because that could accidentally get interpreted as a "type" string?

Tricky. If it ends up possible that non-header messages have strings at tag 1, then we have to interpret an unknown type string in the same way we interpret no header at all. We could add some magic numbers to the schema, except that wouldn't be backwards compatible. Might not be a huge deal.

Do we need to set a sane max version length (1024 bytes?) to prevent gnarly/weird parsing?

Is there a way to put a max string length on protobuf fields? @mafintosh

pfrazee · 2018-07-02T17:21:58Z

proposals/0000-hypercore-header.md

+
+```protobuf
+message HypercoreHeader {
+  required string type = 1;


I'm starting to wonder if we should call this dataStructure instead of type, because "type" is a really generic word that could be used elsewhere in the stack while "data structure" is very specific to what this is.

Noting that this can be updated without any backwards compat issues because of protobuf

type is very generic. I'd bikeshed to contentType or contentSchema, but loosely held opinion (dataStructure works for me).

yoshuawuyts · 2018-07-02T21:14:31Z

proposals/0000-hypercore-header.md

+
+A program that is reading a hypercore will examine the `type` to determine how to process the hypercore.
+
+```


file type here might be helpful to read the rendered version

We intentionally are trying to write pseudo code

yoshuawuyts · 2018-07-02T21:16:18Z

proposals/0000-hypercore-header.md

+
+At time of writing, Dat's standard data structures are being expanded upon (hyperdb) while existing data structures are being updated with breaking changes (hyperdrive). In order for these changes to be deployed smoothly to the network, it's important that the following needs be met:
+
+ 1. Programs should correctly identify the data structure type and version within a hypercore. This will enable the program to use the correct handling code (eg hyperdb, hyperdrive-v1, hyperdrive-v2).


the example of hyperdrive-v1 is somewhat at odds with the example of hyperdrive throughout the rest of the article. Might be best to pick one and stick with it?

Agreed that we should change it to hyperdrive

yoshuawuyts · 2018-07-02T21:18:17Z

Are the proposed version names for hyperdrive / hyperdb also part of the DEP, or do they only serve as an example? If they only serve as an example, it might be good to make it extra clear that the names aren't being specified in this DEP.

edit: typo

pfrazee · 2018-07-02T21:21:36Z

Are the proposed version names for hyperdrive / hyperdb also part of the DEP, or do they only serve as an example? If they only serve as an example, it might be good to make it extra clear that the names aren't being specified in this DEP.

Agree. They're just examples

bnewbold · 2018-07-06T15:46:14Z

To summarize the current state of this in my eyes:

needs a couple tweaks to naming and examples in the DEP text, but I would approve to merge trusting that Paul will make the changes first
I'd like to move forward, but with the expectation that we should code this up ("working code") and make sure that it allows beaker/dat to disambiguate and handle both old and new hyperdrive before 100% commiting to this scheme (aka, move to Draft ASAP, also prove this out in code to get to Active status ASAP, as getting this correct for the long-term is crucial for the ecosystem)

pfrazee · 2018-07-06T15:54:53Z

Just fixed that "hypercore-v1" thing and added a note that the strings will be in the data structures' DEPs.

I renamed type to dataStructureType. I'm open to discussion on this. I have some code that's trying to handle dats generally in preparation for more types. In it, I referred to each general item as a "dat" and the instance of hypercore/hyperdrive/hyperdb as that dat's "data structure". So:

var dat = {
  key: '...',
  dataStructureType: 'hyperdrive',
  dataStructure: hyperdrive(...)
}

The word dataStructure is a little clunky but it does seem to work alright.

pfrazee added 2 commits June 26, 2018 12:40

Add 0000-hypercore-header.md

f0a4869

Add maf as an author to hypercore-header

96db1ff

RangerMauve approved these changes Jun 26, 2018

View reviewed changes

bnewbold reviewed Jun 27, 2018

View reviewed changes

pfrazee added 2 commits June 27, 2018 11:30

Use more generic pseudo-code

913a0b6

Ignore additional fields

1a22f61

pfrazee commented Jul 2, 2018

View reviewed changes

yoshuawuyts reviewed Jul 2, 2018

View reviewed changes

pfrazee mentioned this pull request Jul 6, 2018

Upcoming Meeting Agenda - 4th July 2018 dat-ecosystem/consortium#25

Closed

6 tasks

hypercore-v1 -> hypercore

5569167

header.type -> header.dataStructureType

a412d3e

Publish 0007-hypercore-header

a100c71

pfrazee merged commit 61681ac into dat-ecosystem-archive:master Jul 6, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Hypercore Header DEP #34

Hypercore Header DEP #34

pfrazee commented Jun 26, 2018

RangerMauve left a comment

bnewbold left a comment

bnewbold Jun 27, 2018

bnewbold Jun 27, 2018

pfrazee Jun 27, 2018

bnewbold Jun 27, 2018

pfrazee Jun 27, 2018

bnewbold Jun 27, 2018

pfrazee Jun 27, 2018

pfrazee commented Jun 27, 2018

pfrazee Jul 2, 2018

mafintosh Jul 6, 2018

bnewbold Jul 6, 2018

yoshuawuyts Jul 2, 2018

pfrazee Jul 2, 2018

yoshuawuyts Jul 2, 2018

pfrazee Jul 2, 2018

mafintosh Jul 6, 2018

yoshuawuyts commented Jul 2, 2018 •

edited

Loading

pfrazee commented Jul 2, 2018

bnewbold commented Jul 6, 2018

pfrazee commented Jul 6, 2018


		A program that is reading a hypercore will examine the `type` to determine how to process the hypercore.

		```js


		At time of writing, Dat's standard data structures are being expanded upon (hyperdb) while existing data structures are being updated with breaking changes (hyperdrive). In order for these changes to be deployed smoothly to the network, it's important that the following needs be met:

		1. Programs should correctly identify the data structure type and version within a hypercore. This will enable the program to use the correct handling code (eg hyperdb, hyperdrive-v1, hyperdrive-v2).

Hypercore Header DEP #34

Hypercore Header DEP #34

Conversation

pfrazee commented Jun 26, 2018

RangerMauve left a comment

Choose a reason for hiding this comment

bnewbold left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

pfrazee commented Jun 27, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

yoshuawuyts commented Jul 2, 2018 • edited Loading

pfrazee commented Jul 2, 2018

bnewbold commented Jul 6, 2018

pfrazee commented Jul 6, 2018

yoshuawuyts commented Jul 2, 2018 •

edited

Loading