Clarify handling blocks that aren't serialized objects #2067

Closed
ianopolous opened this issue Dec 14, 2015 · 12 comments
Labels
need/community-input Needs input from the wider community

Comments

@ianopolous
Member

Currently, arbitrary data can be stored and retrieved with block.{put/get}. However, the resulting blocks cannot be pinned, as pinning tries to deserialize an object.

It would seem either blocks should be enforced to always be serialized objects, or pinning should be able to handle non-object blocks.

@jbenet
Member

jbenet commented Dec 14, 2015

I think:

blocks should be enforced to always be serialized objects

@whyrusleeping
Member

Once CIDv1 lands, you will be able to address raw blocks. That should solve this issue

@whyrusleeping whyrusleeping added this to the ipld integration milestone Aug 19, 2016
@robcat

robcat commented Aug 20, 2016

@whyrusleeping

Once CIDv1 lands, you will be able to address raw blocks. That should solve this issue

I don't see how the issue will solve itself.
Will the blockstore also separately store the <mcodec> mentioned in ipfs/specs#130?

@em-ly em-ly added the need/community-input Needs input from the wider community label Aug 25, 2016
@whyrusleeping
Member

@robcat the mcodec is embedded in the cid of the block, so yeah, the blockstore will technically be storing the mcodec.

@robcat

robcat commented Nov 24, 2016

@whyrusleeping
But how exactly?

Example: A CID with unary encoding gets pulled. I'm expecting the blockstore to store it in binary form on my (binary) disk and to be able to retrieve it with the unary CID.

@Kubuxu
Member

Kubuxu commented Nov 24, 2016

@robcat try: echo 'raw data' | ipfs block put -f raw -

@robcat

robcat commented Nov 24, 2016

Ok I got it, the block gets named <version><mcodec><mhash> (in base32). This solves this issue.

But in my view the blockstore shouldn't be aware of the CID version and codec; these exist at a higher level (i.e. the object layer, where the deserialization happens).
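
To make the <version><mcodec><mhash> layout described above concrete, here is a minimal Python sketch that assembles a CIDv1 for a raw block and renders it in multibase base32. It assumes sha2-256 and the standard multiformats codes (0x01 for CIDv1, 0x55 for raw, 0x12 for sha2-256). The function name raw_cidv1 is hypothetical, and the exact name and base that go-ipfs uses on disk or prints on the command line may differ from this rendering.

```python
import base64
import hashlib

CID_VERSION_1 = 0x01  # CID version 1
CODEC_RAW = 0x55      # multicodec code for raw binary
MH_SHA2_256 = 0x12    # multihash code for sha2-256


def raw_cidv1(data: bytes) -> str:
    """Hypothetical helper: build <version><mcodec><mhash> and base32-encode it."""
    digest = hashlib.sha256(data).digest()
    # multihash = <hash code><digest length><digest>
    multihash = bytes([MH_SHA2_256, len(digest)]) + digest
    # Every prefix value here is below 0x80, so each varint fits in one byte.
    cid_bytes = bytes([CID_VERSION_1, CODEC_RAW]) + multihash
    # Multibase base32 (lowercase, unpadded) is marked with the prefix "b".
    encoded = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + encoded


cid = raw_cidv1(b"raw data\n")
print(cid)
```

Because the first four bytes are fixed for this combination of version, codec, and hash, every identifier produced this way starts with the same few characters; only the digest part varies with the data.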

@Kubuxu
Copy link
Member

Kubuxu commented Nov 24, 2016

There is a reason for that; I will let @whyrusleeping elaborate.

@whyrusleeping
Member

whyrusleeping commented Nov 25, 2016

@robcat Short answer, it could go either way.

Long answer: We could potentially make the blockstore not care about formats and still handle most things without them. We opted to store formats because it makes everything we store on disk self-describing: without any other context, you can look at a block and tell what it is. This is really helpful for writing tools like ipfs-see-all that crawl the repo and do analytics on it. It's also somewhat required for blockstore.AllKeysChan to work correctly; otherwise, it would have to return just generic multihashes, and we would be left guessing about the types of things.
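
As a rough illustration of the "self-describing" point, the sketch below parses the binary form of a CIDv1 key and recovers the version, codec, and hash function with no other context. It is not go-ipfs code: describe_key and read_varint are hypothetical names, the lookup tables list only a few well-known multiformats codes, and the example key uses a placeholder digest of zero bytes.

```python
# A few well-known multiformats codes; the real tables are much larger.
CODEC_NAMES = {0x55: "raw", 0x70: "dag-pb", 0x71: "dag-cbor"}
HASH_NAMES = {0x12: "sha2-256", 0x13: "sha2-512"}


def read_varint(buf: bytes, pos: int):
    """Decode one unsigned varint (7 bits per byte, high bit = continue)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte < 0x80:
            return value, pos
        shift += 7


def describe_key(key: bytes) -> dict:
    """Hypothetical helper: split <version><mcodec><mhash> into its parts."""
    version, pos = read_varint(key, 0)
    codec, pos = read_varint(key, pos)
    hash_code, pos = read_varint(key, pos)
    digest_len, pos = read_varint(key, pos)
    if len(key) - pos != digest_len:
        raise ValueError("digest length does not match the key length")
    return {
        "version": version,
        "codec": CODEC_NAMES.get(codec, hex(codec)),
        "hash": HASH_NAMES.get(hash_code, hex(hash_code)),
        "digest_len": digest_len,
    }


# Placeholder key: CIDv1, raw codec, sha2-256, 32 zero bytes as the digest.
example_key = bytes([0x01, 0x55, 0x12, 0x20]) + bytes(32)
info = describe_key(example_key)
print(info)
```

A key that carried only the multihash would still yield the hash function and digest, but nothing about how to interpret the block's contents, which is the guessing problem mentioned above.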

@mib-kd743naq
Contributor

Elaborating on #2067 (comment) a bit: there must be some sort of flag to allow hash specification. Currently -f raw assumes sha2-256 and that's that...
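
To show what a hash specification would change at the byte level, here is a small Python sketch that builds a multihash with a selectable hash function. The hash_name parameter and the multihash function are hypothetical illustrations, not an ipfs flag or API; the codes 0x12 (sha2-256) and 0x13 (sha2-512) are the standard multihash codes.

```python
import hashlib

# hash name -> (multihash code, hashlib constructor)
HASHES = {
    "sha2-256": (0x12, hashlib.sha256),
    "sha2-512": (0x13, hashlib.sha512),
}


def multihash(data: bytes, hash_name: str = "sha2-256") -> bytes:
    """Hypothetical helper: return <hash code><digest length><digest>."""
    code, constructor = HASHES[hash_name]
    digest = constructor(data).digest()
    # Both codes and both digest lengths are below 0x80: one byte each.
    return bytes([code, len(digest)]) + digest


mh256 = multihash(b"raw data\n")
mh512 = multihash(b"raw data\n", "sha2-512")
print(mh256[:2].hex(), len(mh256))  # prints "1220 34"
print(mh512[:2].hex(), len(mh512))  # prints "1340 66"
```

The two-byte prefix is the only thing that tells a reader which hash function produced the digest, so a block stored under a sha2-512 multihash gets a different key than the same data stored under sha2-256.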

@whyrusleeping
Member

@mib-kd743naq mmm, yeah. Great point.

@whyrusleeping
Member

We can now add 'raw blocks' to ipfs that can be pinned as normal

7 participants