I think this is a better direction than the more dramatic changes proposed in an earlier version.
I would just clarify a few things so that expectations are absolutely clear to the implementer:
- Is the return value a promise, an actual value, or both? I think it should be clear, and the reader should be advised which option to prefer in which case, e.g. do not return a promise unless the underlying implementation requires it; that would allow IPFS to better optimize throughput.
- Is the error part of the API? When is the implementer expected to error, and how? What happens when an error occurs?
I also find the util / resolver separation confusing, and it’s unclear what its purpose is. Maybe this is an opportunity to remove that separation and switch to a flat structure instead.
README.md
Outdated
`callback` must have the signature `function (err, dagNode)`, where `err` is an Error if the function fails and `dagNode` is the dagNode that got deserialized in the process.
Returns a Promise containing the Javascript object. This object must be able to be serialized with a `serialize()` call.
Maybe go as far as saying `await serialize(await deserialize(binaryBlob))` should resolve to bytes equal to `binaryBlob`.
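To make the round-trip expectation concrete, here is a minimal sketch. It assumes a toy JSON-based format; the `serialize`, `deserialize`, and `roundTrips` functions below are hypothetical and not part of any existing IPLD format.

```javascript
// Hypothetical JSON-based format, used only to illustrate the round-trip property.
const encoder = new TextEncoder()
const decoder = new TextDecoder()

// Serialize an IPLD Node (here: any JSON-compatible value) into bytes.
function serialize (node) {
  return encoder.encode(JSON.stringify(node))
}

// Deserialize bytes back into an IPLD Node.
function deserialize (binaryBlob) {
  return JSON.parse(decoder.decode(binaryBlob))
}

// Round-trip check: bytes -> node -> bytes should give back equal bytes.
async function roundTrips (binaryBlob) {
  const again = await serialize(await deserialize(binaryBlob))
  return again.length === binaryBlob.length &&
    again.every((byte, i) => byte === binaryBlob[i])
}
```

In this toy format the property only holds for blobs that are already in the encoder's own output form (no extra whitespace, for instance), which is probably something the spec wording would need to account for.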
README.md
Outdated
> resolves a path in block, returns the value and or a link and the partial missing path. This way the IPLD Resolver can fetch the link and continue to resolve.
> Resolves a path within the blob, returns the value and or a link and the partial missing path. This way the `js-ipld` can fetch the link and continue to resolve.
Might be worth giving this structure a name, e.g. `Cursor`, to ease communication.
I used `ResolverResult`. I don't like it much, but I also didn't like `Cursor` :) Ideas for better names are welcome.
@@ -114,25 +125,26 @@ If `path` is the root `/`, the result is a nested object that contains all paths
Example is no longer up to date here
- value: <> - The value resolved or an IPLD link if it was unable to resolve it through.
- remainderPath: <> - The remaining path that was not resolved under block scope.
- `value` (`IPLD Data`): the value resolved, on of those from the [IPLD Data model](https://github.com/ipld/specs/blob/master/IPLD-Data-Model-v1.md)
- remainderPath (`string`): the remaining path that was not resolved under block scope

If `path` is the root `/`, the result is a nested object that contains all paths that `tree()` returns. The values are the same as accessing them directly with the full path. Example:
Shouldn’t it still be `{value:..., remainderPath:''}`? Otherwise the caller will need to deal with the fact that sometimes it’s a cursor and at other times it’s not, and what if the node happens to have a `remainderPath` field?
Unless I’m misunderstanding, if `path` is `/` the resulting cursor will have a `value` equal to the one returned by `deserialize(bytes)`. If so, that is worth mentioning IMO; if not, that is also worth pointing out.
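The uniform result shape discussed above could look roughly like the sketch below. Everything here is hypothetical: the `Link` class stands in for however a format represents links, and `resolve` operates on an already-deserialized node for simplicity.

```javascript
// Hypothetical stand-in for a link to another block.
class Link {
  constructor (target) {
    this.target = target
  }
}

// Always returns { value, remainderPath }, including for the root path.
function resolve (node, path) {
  const segments = path.split('/').filter(Boolean)
  let value = node
  // Walk the path until it is exhausted or a link is reached.
  while (segments.length > 0 && !(value instanceof Link)) {
    const key = segments[0]
    if (value === null || typeof value !== 'object' || !(key in value)) {
      throw new Error(`Path not found: ${path}`)
    }
    value = value[key]
    segments.shift()
  }
  return { value, remainderPath: segments.join('/') }
}

// The node itself has a `remainderPath` field, which stays unambiguous
// because the result is always wrapped.
const node = { name: 'a', remainderPath: 'data field', child: new Link('some-cid') }
const root = resolve(node, '/') // value is `node`, remainderPath is ''
const partial = resolve(node, '/child/x/y') // value is the Link, remainderPath is 'x/y'
```

With this shape the caller never has to guess whether it received a wrapped result or raw data.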
README.md
Outdated
`IpldNode` is a previously deserialized binary blob.

Returns a Promise containing a `Uint8Array` with the serialized version of the given IPLD Node.
Can it return a `Uint8Array` without wrapping it in a promise? If so, that is worth mentioning.
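As an illustration of why both return styles can coexist, the sketch below shows a caller that awaits the result and therefore works with either. The function names are hypothetical and only serve the example.

```javascript
// Hypothetical format method returning a plain Uint8Array.
function serializeSync (node) {
  return new TextEncoder().encode(JSON.stringify(node))
}

// Hypothetical format method returning a Promise of a Uint8Array.
function serializeAsync (node) {
  return Promise.resolve(serializeSync(node))
}

// A caller that awaits the result works with either implementation,
// because `await` accepts plain values as well as promises.
async function byteLength (serialize, node) {
  const bytes = await serialize(node)
  return bytes.byteLength
}
```

Note that `await` on a plain value still defers continuation to a later microtask, so a caller that knows the method is synchronous can skip `await` entirely.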
#### `util.deserialize(binaryBlob, callback)`
The result is a JavaScript object. Its fields are the public API that can be resolved through. It’s up to the format to add convenient methods for manipulating the data.
I think you meant to say “returns a promise for a JS object representing a node...”
I also assume that wrapping it in a promise isn’t required; however, that is worth communicating.
> get the CID of a binary blob
#### `util.cid(binaryBlob[, options])`
Can `options` be made non-optional? That would simplify the implementer's side of things; as for consumption, it’s always indirect anyway (as far as I can tell).
I would expect that normally you would just call it without any options as you would want to use the default values for the CID version and the hash algorithm.
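A rough sketch of what "optional `options` with defaults" could mean in practice is shown below. The option names (`version`, `hashAlg`), the defaults, and the returned object are assumptions made for illustration; the function returns a plain descriptive object, not a real CID. Only Node's built-in `crypto.createHash` is used.

```javascript
const { createHash } = require('crypto')

// Hypothetical defaults; a real format would pick its own.
const DEFAULTS = { version: 1, hashAlg: 'sha256' }

// Sketch of `cid(binaryBlob[, options])`: asynchronous, with optional options.
// Returns a plain descriptive object, NOT a real CID.
async function cid (binaryBlob, options = {}) {
  const { version, hashAlg } = { ...DEFAULTS, ...options }
  const digest = createHash(hashAlg).update(binaryBlob).digest('hex')
  return { version, hashAlg, digest }
}
```

Calling `cid(bytes)` uses the defaults, while `cid(bytes, { hashAlg: 'sha512' })` overrides only the hash algorithm.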
I thought that if I want to be able to return a Promise, I would always need to (I was actually already concerned about the perf implications). But after a quick test it looks like I can also
I should indeed add more about that in the doc.
Me too. Though I see this as a first iteration we can ship quickly. For example
Yes you can await on any value, promise or not (in fact if it has However if I’m not mistaken
I would still leave Also
So, help me out here because I'm seeing a bigger breaking change than just async: the current version uses a format-specific "dagNode", which is returned by `deserialize(blob)` and required by `serialize(blob)`. You then use `resolve()` on that "dagNode" to get at properties of it and even instantiate a full JavaScript object representing the data. I believe that's right, isn't it? js-ipld-dag-pb dedicates a heap of code to this.
This new implementation seems to do away with that and talks about an "IPLD Node", which should take the shape of the deserialized data itself? (`Object.keys(thing)` gives you keys of the underlying data, presumably giving you the ability to use getters on it to lazy decode? but not asynchronously then). "add convenient methods for manipulating the data" suggests it can be decorated with additional methods? Maybe they can't be enumerable but they also couldn't clash with possible key values of the data, which leads to the `toString` / `valueOf` problem. "`deserialize()` and `resolve('/')` is indeed the same" suggests that the object returned by `deserialize()` is the actual data, because isn't `resolve()` responsible for decoding the node into a proper JavaScript object?
I think I like the original version, or at least my reading of it: just async `deserialize()`, it returns some format-specific object, `serialize()` knows how to deal with only that type of object, `resolve()` lets you get at the underlying data in a predictable and uniform way across formats and can even be used to take the format-specific object and instantiate a plain old JS object from it. Async `resolve()` solves all of the laziness problems raised in #50.
In line with this, the Definitions section needs to be edited or removed; it's referring to "dagNode" still while the rest has moved on to this new "IPLD Node" and `IpldNode` language.
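The clash between data keys and convenience methods mentioned above can be shown with a small sketch. `ExampleNode` is a hypothetical wrapper, not taken from any existing format: data fields become own properties and helpers live on the prototype.

```javascript
// Hypothetical node wrapper: data fields are own properties,
// convenience methods live on the prototype.
class ExampleNode {
  constructor (data) {
    Object.assign(this, data)
  }

  // Convenience method; class methods are not enumerable.
  size () {
    return Object.keys(this).length
  }
}

const plain = new ExampleNode({ name: 'a', links: [] })
// Only the data keys are enumerable: ['name', 'links']
const plainKeys = Object.keys(plain)

// A data key named `size` shadows the prototype method on this instance.
const clashing = new ExampleNode({ size: 'large' })
const clashingType = typeof clashing.size // 'string', no longer a function
```

The data stays intact in the clashing case, but calling `clashing.size()` would throw, which is the kind of surprise the comment is pointing at.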
README.md
Outdated
`callback` must have the signature `function (err, binaryBlob)`, where `err` is an Error is the function fails and `binaryBlob` is a Buffer containing the serialized version.
> Deserialize into internal representation.
*"into the internal representation of an IPLD Node"
That was actually completely wrong. Correct is: "Deserialize a binary blob into an IPLD Node."
README.md
Outdated
- value: <> - The value resolved or an IPLD link if it was unable to resolve it through.
- remainderPath: <> - The remaining path that was not resolved under block scope.
- `value` (`IPLD Data`): the value resolved, on of those from the [IPLD Data model](https://github.com/ipld/specs/blob/master/IPLD-Data-Model-v1.md)
"the value resolved, on of those from the" -> "the value resolved, whose type is one of the"
@rvagg Not sure what you're getting at here,
You mean
I think you're misunderstanding something here. In other words That is also why I'm suggesting to make I hope this clarifies it. js-ipld-dag-pb dedicates a heap of code to this. Not familiar with that code but hopefully above explains it.
I don't believe it has anything to do with lazy decoding. I would assume it's about convenience methods like
My understanding is that I think you're misunderstanding As of comment about P.S. Name collision isn't a big deal if all the methods are in the prototype (unless user attempts to call that method), but since node interface is designed by the same author as a format likely collisions can be avoided.
This however makes me wonder if
I hope I've addressed all the review comments with the new commit. Things I haven't addressed:
I agree that this is very important and needs to be part of this spec. Though as it is not part of it yet, I'd like to wait until we have agreed on the API (in the hope of saving some time).
This was indeed the idea. I've put it in to make it clear that you are allowed to do that. I didn't want to be too strict to begin with as it might be useful in some cases.
That is 100% correct. But that's something I'd like to change in order to make IPLD Formats nicer to use.
@rvagg: Do your concerns still hold try. I start to implement those changes (and see how this goes).
@vmx I don't think I'm in a position where I can hold firm opinions about this stuff, so don't treat my comments as blockers. I'll be interested to see it in code form.
@mikeal and @Gozala: we had a discussion over at js-ipld-dag-pb about what's sync and what's async, and in the latest iteration of the refactoring @vmx has eradicated the vast majority of async and faux-async operations across the current formats, particularly dag-pb, which has only a single point where it does an async operation: generating a CID from an existing node. So the current iteration of this doc reflects that and is mostly-sync because the current formats are mostly-sync (and historically have been implemented faux-async with generous helpings of That's going to be a problem for anything that inserts crypto into the flow, but I think the current position is something like: there's no current implementation that inserts such needs and the performance problems of the faux-async approach have already been sufficiently highlighted, so let's reserve async for only cases where it's actually necessary. Then js-ipld-stack can be the forum for getting it right with encryption as a first-class concern. That's where we either need to explore
Since the code is already written in If we try to do encryption at the block layer we’ll end up with a multicodec for every encryption technique and we’d need to build encrypted dag codecs in order to replicate them. None of that sounds ideal to me.
I think you'll need that even if it's a separate layer, because links from encrypted blocks should be hidden from entities that don't have permission (the ability to decrypt them). That is not to say it should necessarily be done at the block layer, but rather that if encrypted blocks were to become first class, all of the replication / selection stack will need to have a design constraint reflecting that sometimes DAG traversal would need to deal with decryption.
Few notes:
@Gozala We are indeed swinging in the other extreme direction by making almost everything sync. The current API doesn't work that well for numerous reasons, hence we do experiments over at https://github.com/ipld/js-ipld-stack. Until we get there I think it makes sense to change the current API to async/await and to the bare minimum of what we need now, and put all other things we want/need in the future into ipld-stack.
Ya, you’ll notice that
It is now clearer that the deserialized data needs to have only types specified by the IPLD Data Model.

BREAKING CHANGE: Switch from callback-based interface to a Promise-based one

The major change is switching to Promises for the function signatures.

Other changes:
- The `multicodec` property is now called `codec`
- `tree()` returns an Iterable
- All methods except for `cid()` are now synchronous
It is now clearer that the deserialized data needs to have only types
specified by the IPLD Data Model.

BREAKING CHANGE: Switch from callback-based interface to a Promise-based one

The major change is switching to Promises for the function signatures.

Other changes:
- `multicodec` property is now called `format`
- `tree()` returns an Iterable

/cc @Gozala
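For reference, "`tree()` returns an Iterable" could be sketched with a generator as below. This is only an illustration over plain JavaScript data; the function signature and the path format without a leading slash are assumptions, not taken from the spec.

```javascript
// Hypothetical `tree()` as a generator: lazily yields every path inside a node.
function * tree (node, prefix = '') {
  // Scalars have no nested paths.
  if (node === null || typeof node !== 'object') return
  for (const [key, value] of Object.entries(node)) {
    const path = prefix === '' ? key : `${prefix}/${key}`
    yield path
    // Recurse into nested objects and arrays.
    yield * tree(value, path)
  }
}

// Collects: ['a', 'b', 'b/c', 'b/c/0', 'b/c/1']
const paths = Array.from(tree({ a: 1, b: { c: [10, 20] } }))
```

Because the result is an Iterable, callers can also consume it with `for...of` and stop early without computing the remaining paths.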