Prioritizing unixfs-v2 #19
Comments
This is a good question - thanks for raising it. Would love thoughts from @Stebalien @alanshaw @daviddias on the extent they see this contributing to go/js goals (and the extent to which work we're doing now might be deprecated by this transition) |
It's currently part of a P2 milestone in the JS roadmap and I literally just (ROUGHLY) timelined this to be completed in Q3. That said, which implementation is ready? JS? We should get it in earlier if so, and people can start playing with it and then switch the default when we're happy? |
@alanshaw there’s an independent implementation in JS, but it wouldn’t be good for more than a reference point: it wasn’t designed to be integrated into IPFS, only to test that the spec was implementable. For instance, it doesn’t use […]. That said, if you give me a list of the interfaces it needs to use and what functionality you’d like to see from the implementation, I can go off and write another implementation fairly quickly. I could even get this into the Q2 OKRs if necessary. |
As of this commit the 2019 IPFS roadmap has: "Go and JS IPFS enable modern IPFS data formats (UnixFSv2, CIDv1, raw blocks) by default and in a reproducible way" The last part is also relevant to recent discussions on reproducible file imports here: ipld/legacy-unixfs-v2#15 |
@Stebalien says this would be great to happen in Q2 - this could be a great excuse to integrate go-ipld-prime into go-ipfs. Aiming to have this as optional / in heavy testing by EOQ puts us on a really good trajectory toward 1.0. Note, we'd want to switch to Rabin (or alt) chunking at the same time. I guess the question is whether @warpfork would be freeing up for this given his deep expertise/thinking. |
Great goal, I'm onboard -- and simultaneous switch to (new) Rabin would indeed be ideal -- 50/50 on if that's actually reachable by early summer. But I'd be perfectly happy to be reaching in that direction. |
Big 👍 for reproducible file imports being shipped with unixfsv2. Sidenote: apart from storing metadata in unixfsv2 by default (ipld/legacy-unixfs-v2#15), we need tools to be able to deterministically freeze/reproduce all parameters during import (eg. |
We found a rather elegant way to handle the chunker part of this w/ the IPLD type system. We can easily choose the same chunker when updating a file based on its Binary Type. However, we need IPFS to have non-configurable logic on which chunker to choose for new files to make this entirely deterministic. We’ll also need the file metadata IPFS adds to files and directories to be consistent and non-configurable for the same reasons. As far as the spec goes, it doesn’t guarantee determinism because so many things can be done optionally, but we can guarantee determinism in the way IPFS produces unixfsv2 files/directories if we are willing to remove the configuration. |
Oh, another thing I’ve said to people but probably haven’t written down yet. There is no “ideal chunker” for every file you come across. Different files will have different optimal chunkers. Rabin produces far more small chunks than you would like with compressed media, and doesn’t provide any useful de-duplication there. Compressed media ideally has a chunker that understands the compression algorithm and can chunk at keyframe boundaries, which makes range requests in the file operate much more efficiently. At some point we’re going to want a content-type-dependent chooser that selects specific chunkers for compressed media, Rabin for text, and a fixed-size chunker for everything else (or maybe Rabin, not sure what the profile here is). |
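To make the chooser idea above concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: `choose_chunker`, `fixed_size_chunker` and `rolling_sum_chunker` are hypothetical names, the rolling-sum boundary rule is a simplified stand-in for Rabin fingerprinting (not the real algorithm), and compressed media falls back to fixed-size chunks because a codec-aware chunker is out of scope here.

```python
from typing import Callable, List

Chunker = Callable[[bytes], List[bytes]]


def fixed_size_chunker(size: int) -> Chunker:
    """Split data into consecutive chunks of at most `size` bytes."""
    def chunk(data: bytes) -> List[bytes]:
        return [data[i:i + size] for i in range(0, len(data), size)]
    return chunk


def rolling_sum_chunker(window: int = 16, mask: int = 0x3F,
                        min_size: int = 64, max_size: int = 4096) -> Chunker:
    """Content-defined chunker using a rolling byte sum.

    Simplified stand-in for Rabin: a boundary is cut where the sum of the
    last `window` bytes has all `mask` bits set, or at `max_size`.
    """
    def chunk(data: bytes) -> List[bytes]:
        chunks: List[bytes] = []
        start = 0
        rolling = 0
        for i, byte in enumerate(data):
            rolling += byte
            if i - start >= window:
                # Drop the byte that just left the window.
                rolling -= data[i - window]
            length = i - start + 1
            at_boundary = length >= min_size and (rolling & mask) == mask
            if at_boundary or length >= max_size:
                chunks.append(data[start:i + 1])
                start = i + 1
                rolling = 0
        if start < len(data):
            chunks.append(data[start:])
        return chunks
    return chunk


def choose_chunker(content_type: str) -> Chunker:
    """Pick a chunker from the content type alone, with no user-facing knobs."""
    if content_type.startswith("text/"):
        return rolling_sum_chunker()
    # Compressed media and everything else: fixed-size chunks in this sketch.
    return fixed_size_chunker(256 * 1024)
```

Because the choice depends only on the content type, two importers running this logic on the same bytes would produce the same chunk boundaries, which is the property the determinism discussion in this thread is after.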
Mega +1 from me on this too. I sometimes describe this concern as "IPFS is content-addressable on read... and content-plus-a-bunch-of-flags-addressable on write"... which is an issue that ranges from merely terrifying to an outright blocker depending on the application. "content-plus-a-bunch-of-flags-addressable" gives up a lot of the benefits content-addressability promises in the first place! There are lots of parts of the IPFS stack where it's perfectly sensible for libraries to be designed to be super configurable... but in IPFS as an application as a whole, we should be getting significantly less configurable for lots of these things, because too much flexibility is the source of this problem. It can be (seemingly paradoxically) better for the ecosystem as a whole if we don't expose so many knobs that the ecosystem fractures itself based on relatively inconsequential twiddlings of those knobs!
👍👍👍 |
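A toy illustration of the "content-plus-a-bunch-of-flags-addressable" point, as a hedged sketch: the hypothetical `import_root` below hashes fixed-size chunks with SHA-256 and then hashes the concatenated leaf digests. This is not how IPFS computes CIDs or lays out unixfs DAGs; it only shows that the same bytes end up with different root identifiers once an import parameter (here the chunk size) changes.

```python
import hashlib


def import_root(data: bytes, chunk_size: int) -> str:
    """Return a toy root identifier for `data` split into fixed-size chunks."""
    leaves = [
        hashlib.sha256(data[i:i + chunk_size]).digest()
        for i in range(0, len(data), chunk_size)
    ]
    # The root depends on the leaf digests, and therefore on the chunk size.
    return hashlib.sha256(b"".join(leaves)).hexdigest()


if __name__ == "__main__":
    payload = bytes(range(256)) * 4  # 1024 bytes of sample content
    print(import_root(payload, 256))
    print(import_root(payload, 512))  # same content, different root
```

With a single fixed, non-configurable chunk size, every importer would derive the same root for the same content, which is the trade-off argued for above.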
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days. |
Files in IPFS are currently encoded using the old dag-pb and often with CIDv0.
For some time we've been working on a specification for encoding files using dag-cbor (or any IPLD codec that supports the full data model).
There were numerous reasons we started the transition away from dag-pb, including ease of development and performance. The longer we put off completing this transition in IPFS, the more "old" data we'll be creating. Additionally, a lot of performance work we might do in IPFS may end up getting thrown out in this transition since it's based on the old encoding system.
We expect the unixfs-v2 spec to continue to evolve over time based on feedback from implementations. However, we now have one independent implementation and think it's time for IPFS to begin adopting it and working with the IPLD team on incorporating any feedback into the spec.
Given our workload and the limited resources in IPLD, we'd like to know what priority this has in the IPFS project and where it should fit in the roadmap and OKRs so that we can appropriately support IPFS' adoption.