When Brave is configured to use a public gateway, enforce checking IPFS hashes automatically #13500

Open
bbondy opened this issue Jan 12, 2021 · 6 comments
Labels
feature/web3/ipfs OS/Desktop priority/P3 The next thing for us to work on. It'll ride the trains.

Comments

@bbondy
Member

bbondy commented Jan 12, 2021

Currently you only have a guarantee that the files you're accessing on IPFS are what they say they are if you're using a local node. This task is to check the contents of files that are loaded against the CID so that even if you're using a gateway, you can be sure the gateway is not doing anything sketchy.

@bbondy bbondy added the priority/P3 The next thing for us to work on. It'll ride the trains. label Jan 13, 2021
@lidel

lidel commented Mar 13, 2021

I keep collecting notes about verifiable HTTP responses in ipfs/in-web-browsers#128. It is a surprise to many that it is not a clear-cut thing.

TL;DR: files bigger than 256KB are chunked and represented as a DAG in which each level is hashed (like in git), so the root CID is not the hash of the file but the hash of the DAG representation of the file.

This means that right now it is not possible to verify responses bigger than 256KB without knowing what the DAG looks like, and for that you need to run an IPFS node.
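
To make the point above concrete, here is a toy sketch (not the real dag-pb/unixfs encoding, which wraps blocks in protobuf and derives CIDs from them). It only assumes SHA-256 and the 256KB chunk size mentioned in this thread; the names chunk and toy_root_hash are made up for illustration. It shows that once data is split into chunks and the parent is hashed over the child hashes, the root hash no longer equals the hash of the file bytes.

```python
import hashlib

CHUNK_SIZE = 256 * 1024  # chunk size mentioned in the thread


def chunk(data, size=CHUNK_SIZE):
    """Split data into fixed-size chunks (at least one, possibly empty)."""
    return [data[i:i + size] for i in range(0, len(data), size)] or [b""]


def toy_root_hash(data, size=CHUNK_SIZE):
    """Toy two-level hash tree: hash each chunk, then hash the child hashes.

    Assumption: this is a simplified stand-in for a Merkle DAG,
    NOT how IPFS actually computes a CID.
    """
    leaves = [hashlib.sha256(c).digest() for c in chunk(data, size)]
    if len(leaves) == 1:
        # A single chunk is its own root, like a lone raw block.
        return leaves[0]
    return hashlib.sha256(b"".join(leaves)).digest()


small = b"a" * 1000              # fits in one chunk
big = b"a" * (CHUNK_SIZE + 1)    # spills into a second chunk

# For the small file the root is just the file hash.
print(toy_root_hash(small) == hashlib.sha256(small).digest())
# For the big file the root is a hash over child hashes, not over the file.
print(toy_root_hash(big) == hashlib.sha256(big).digest())
```

A client that only receives the reassembled bytes of the big file cannot recompute the root without also knowing where the chunk boundaries are and how the parent node was built.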

Verifiable gateway via IPFS node in offline mode and CAR import

We are looking into various ways of solving this (details are listed on the linked issue), but in the case of Brave I see an additional way of getting verifiable gateway responses: ipfs:// backed by a public gateway and CAR export/import:

  • When a public gateway is selected as the resolution method for IPFS resources, we would run an IPFS node in offline mode (ipfs daemon --offline), which means it does not connect to the swarm but still provides a local gateway.
  • When CAR export is supported on gateways (Make ipfs.dag.export built-in feature of HTTP gateways ipfs/in-web-browsers#170), we could make ipfs:// in Brave download a CAR from the gateway and then run ipfs dag import file.car to load it into the local datastore before sending the request to the local gateway.
  • This way Brave could show ipfs:// in the address bar and provide integrity verification via go-ipfs.
    • The node running in offline mode does not participate in p2p networking; instead it would effectively fetch data via the public gateway.
    • Data would be cached in the local repo, so cached resources would work even if the gateway goes bust.
    • The CAR format ensures the DAG is transported as-is, so if the CID matches, it will be present in the local cache. If not, the node running in offline mode would return an error, indicating that the gateway response did not include the requested CID.

@bbondy does this sound feasible, or should we wait for gateway responses that do not require go-ipfs?

@bbondy
Member Author

bbondy commented Mar 15, 2021

We can't install go-ipfs without the user opting into it, and I think asking the user to opt into this would be complicated for the user UI-wise.

@lidel what if we add some basic protocol support directly into Brave? This is maybe a start of future things to come. Maybe you can describe how we could do this at the protocol level?

@lidel

lidel commented Mar 18, 2021

Ok, so let's scope the verification problem to files represented with unixfs (files and directories).
Below is a broad-strokes explainer that should make it easier to reason about what needs to be done:

In IPFS, unixfs files can be represented as a CID with one of two multicodecs:

  • dag-pb - a block of raw bytes wrapped in a unixfsv1 protobuf manifest that lists optional links to child nodes, which are included in the calculation of the final hash (~ for files bigger than 256KB)
  • raw - a single block of raw bytes without any metadata or children (small files and raw leaves of a bigger dag-pb)

If you want to validate a CID without running an IPFS node you need to:

  1. Look at the multicodec in the CID:
  2. if it is raw then you can just hash the payload and compare it with the hash inside the CID. Done.
  3. if it is dag-pb you need to read the protobuf envelope somehow to know if the CID represents only a single block, or is a parent and additional blocks need to be fetched.
    • this is because behind the scenes the HTTP gateway re-assembles all blocks from the dag-pb tree and returns only the raw bytes of the entire original file. In other words, the envelopes of all individual blocks are "lost in translation" between IPFS and HTTP, which makes CID validation impossible with the raw data alone.

(2) is easy and could be implemented for small files as a PoC
(3) is difficult because metadata information can't be fetched from the same gateway that we are trying to verify :trollface:
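
For illustration, the easy case (2) could be sketched roughly as below. This is a minimal sketch, not Brave's or go-ipfs's implementation: it assumes a base32 CIDv1 string (multibase prefix "b") with a sha2-256 multihash, and the helper names (read_varint, parse_cidv1, verify_raw_block, make_raw_cid) are invented for this example. The codec and hash codes (0x55 raw, 0x70 dag-pb, 0x12 sha2-256) come from the multicodec table.

```python
import base64
import hashlib

RAW_CODEC = 0x55      # multicodec code for "raw"
DAG_PB_CODEC = 0x70   # multicodec code for "dag-pb"
SHA2_256 = 0x12       # multihash code for sha2-256


def read_varint(buf, pos):
    """Decode one unsigned varint from buf starting at pos."""
    value = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def parse_cidv1(cid):
    """Split a base32 CIDv1 string into (codec, hash_code, digest)."""
    if not cid.startswith("b"):
        raise ValueError("only base32 (multibase 'b') CIDv1 is handled here")
    body = cid[1:].upper()
    body += "=" * (-len(body) % 8)  # restore the padding multibase strips
    raw = base64.b32decode(body)
    version, pos = read_varint(raw, 0)
    if version != 1:
        raise ValueError("not a CIDv1")
    codec, pos = read_varint(raw, pos)
    hash_code, pos = read_varint(raw, pos)
    length, pos = read_varint(raw, pos)
    digest = raw[pos:pos + length]
    if len(digest) != length:
        raise ValueError("truncated multihash digest")
    return codec, hash_code, digest


def verify_raw_block(cid, payload):
    """Return True if payload hashes to the digest inside a raw CID."""
    codec, hash_code, digest = parse_cidv1(cid)
    if codec != RAW_CODEC:
        # Case (3): dag-pb cannot be checked from reassembled bytes alone.
        raise ValueError("not a raw CID; the DAG is needed to verify it")
    if hash_code != SHA2_256:
        raise ValueError("this sketch only supports sha2-256")
    return hashlib.sha256(payload).digest() == digest


def make_raw_cid(payload):
    """Build a CIDv1 (raw, sha2-256, base32) so the sketch can be exercised."""
    digest = hashlib.sha256(payload).digest()
    raw = bytes([1, RAW_CODEC, SHA2_256, len(digest)]) + digest
    return "b" + base64.b32encode(raw).decode("ascii").lower().rstrip("=")


cid = make_raw_cid(b"hello ipfs")
print(verify_raw_block(cid, b"hello ipfs"))   # payload matches the CID
print(verify_raw_block(cid, b"tampered"))     # payload does not match
```

A dag-pb CID makes verify_raw_block raise instead of returning a result, which mirrors the point of (3): the bytes returned by a regular gateway response are not enough to check it.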

Due to this, we could:

  • (A) have two gateways (one for data and one for metadata), or re-implement a dag-pb parser to act as a validator plus a libp2p network stack for asking the IPFS swarm directly for each block, which is effectively re-implementing part of go-ipfs + wasting user bandwidth because data is fetched twice (and if we access ipfs natively, there is no point in using a gateway in the first place, so this point makes no sense)
  • (B) park this work until we solve Verifiable HTTP Gateway Responses ipfs/in-web-browsers#128 upstream, in a way that does not require re-implementing a big chunk of go-ipfs (eg. via huge blocks [low client overhead] or CAR support [needs some re-implementation on client])

@aschmahmann mind doing a sanity check on this? I don't see (C), but lmk if I missed something.

@bbondy
Member Author

bbondy commented Mar 18, 2021

I guess B is best, to avoid collusion between two known, preconfigured gateways.

@lidel

lidel commented May 5, 2021

Quick update: go-ipfs 0.9.0 will expose /api/v0/dag/export on every public gateway (ipfs/kubo#8111).
It enables thin clients to fetch an archive of the entire DAG in a trustless way.

A client working in offline mode (ipfs daemon --offline) will be able to import the exported archive via ipfs dag import --pin-roots=false

@lidel

lidel commented Nov 22, 2023

FYSA this is now possible thanks to verifiable Block and CAR responses on HTTP Gateways:

There are also:

2 participants