Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bitswap: Use multihashes, not CIDs, when checking if a CID is in a wantlist. #420

Open
Stebalien opened this issue Aug 31, 2017 · 19 comments

Comments

@Stebalien
Copy link
Member

Stebalien commented Aug 31, 2017

Version information:

go-ipfs version: 0.4.11-dev-48476b292
Repo version: 5
System version: amd64/linux
Golang version: go1.8.3

Type:

Enhancement

Severity:

Implementation: Low
Figuring this out: High

Description:

Currently, we bitswap using CIDs. This means that, for example, if a peer asks for a CIDv1 thing that we store as a CIDv0 thing, we won't be able to give it to them. This is going to be a problem if we decide to switch to base32 as the default multihash encoding.

See ipfs/notes#259 for more design related discussion (this issue is to track the actual bug in go-ipfs).

@Stebalien
Copy link
Member Author

We also need to do this for our:

  • caching blockstores.
  • Our datastores.
  • etc...

@ghost
Copy link

ghost commented Aug 31, 2017

I'm not sure how this is going to establish backward-compatibility -- older nodes will assume the queries are still CIDs?

@Stebalien
Copy link
Member Author

This is just internal; go-ipfs is internally comparing CIDs when it should be comparing multihashes. That is, when we receive a wantlist, we shouldn't take the CID into account when checking if we have the data.

@Stebalien
Copy link
Member Author

The only external part is "providing" blocks. The backwards-compatable fix would be for DHT nodes to ignore CID codecs when checking if they have any matching provider records but that's not the best solution.

@ghost
Copy link

ghost commented Aug 31, 2017

Yeah I get that part, and it helps with the problem that we'll be receiving base32 CIDv1s of stuff that's actually stored with a CIDv0. But older nodes (that compare based on CID, not multihash) won't be able to deal with that, right? So they won't be able to reply to queries.

Check out what we changed in Bitswap a year ago when introducing CID there.

@ghost
Copy link

ghost commented Aug 31, 2017

Here's the Bitswap PR from back then -- the notes about it got lost somewhere, @whyrusleeping or @diasdavid will hopefully know where they are.

@ghost
Copy link

ghost commented Aug 31, 2017

ipfs/kubo#3297

@Stebalien
Copy link
Member Author

Note. CIDs on the network is totally fine and kind of useful (it means one always knows how to interpret a block as IPLD); we'll even need them for DAGSwap.

The problem is that we shouldn't, e.g., refuse to respond to a CIDv1 request because we initially computed a CIDv0 for the block when storing it.

@kevina
Copy link
Contributor

kevina commented Aug 31, 2017

@Stebalien what @lgierth is saying is that if we convert a CIDv0 to CIDv1 and the block is stored as a CIDv0 on an older node then that node will not return with the block even though it has it.

@Stebalien
Copy link
Member Author

Ah. And that's where the backwards compat issue comes in. I wonder if we should fix this issue, have a short upgrade period, and then switch to base32 by default.

Note: Fixing this issue won't introduce any problems, it's just that fixing this issue won't let us switch to base32 CIDv1's by default without any back-compat issues...

We could always bump the bitswap version and always send CIDv0 if possible (if the format is DagProtobuf to old peers (making the assumption that they probably don't have many CIDv1 nodes).

@ghost
Copy link

ghost commented Aug 31, 2017

Yes using multihashes is definitely helpful -- agree that we could bump the bitswap version and use that as a demarkation line.

Could you track down any writing on this from last year? I recall there was a pretty good document about the Bitswap aspect of the CID introduction.

@Stebalien
Copy link
Member Author

I have no idea where that might be; I'm just basing this off of what @diasdavid has been saying.

@ghost
Copy link

ghost commented Aug 31, 2017

It even had a neat table explaining the semantic differences of v0<->v0, v0<->v1, v1<->v1, and v1<->v0 communications. It's a bit funny because this issue here hints at basically reverting the CID change from back then, that's why I'm wondering where these docs have gone.

@Winterhuman
Copy link

@Stebalien Doesn't Bitswap use multihashes now, or should this issue still be open?

@aschmahmann
Copy link
Contributor

Ya, because the blockstore is now keyed by multihash checking is effectively done by multihash even when the API requests by CID.

@Stebalien
Copy link
Member Author

Unfortunately, this isn't entirely fixed. If I add a new block, we check existing wantlists by CID, not multihash. Worse, fixing this is a bit tricky unless/until we change the Multihash representation to be string instead of []byte (for performance reasons).

@Jorropo
Copy link
Contributor

Jorropo commented Aug 1, 2022

@Stebalien we can keep using whatever representation and just use unsafe to cast strings to bytes.

More realistically it's an edge case and we are rewriting the server anyway so might as well fix it.

@Stebalien
Copy link
Member Author

Stebalien commented Oct 11, 2022 via email

@Jorropo
Copy link
Contributor

Jorropo commented Oct 11, 2022

an other which (I think, I havn't tested it) doesn't work is -1 relay of messages with different codecs:

  1. Peer A ask B block X with the raw multicodec.
  2. B also want X but as CBOR, so B ask C about X with CBOR codec.
  3. B receive X from C.
  4. B is supposed to send (or send a HAVE) to A about X, but because the codec don't match it doesn't.

@Jorropo Jorropo changed the title Use multihashes, not CIDs, when checking if a CID is in a wantlist. bitswap: Use multihashes, not CIDs, when checking if a CID is in a wantlist. Jul 27, 2023
@Jorropo Jorropo transferred this issue from ipfs/kubo Jul 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants