feat(gateway): Block and CAR response formats #8758
one must imagine Sisyphus happy
This is an MVP which reuses the HTTP header logic from serveFile, plus a custom Content-Disposition to ensure browsers don't render garbage.
This is a PoC implementation that returns a CAR as a chunked stream. It sets neither Cache-Control nor Content-Length. TBD whether we want or can have these.
Work in progress, dropping some notes so I don't forget.
2022-03-03 discussion on transporting CARs over HTTP: currently sourcing how to handle errors.
Personal notes for a future feature request:
GET parameters are needlessly hard to deal with in many languages. Even though most of them have escaping features, you sometimes concatenate them manually, with the first concatenation being
TL;DR: I want an alternative be
That also saves resending the header if you do multiple CAR requests in a row over HTTP/2.
FYI, getting that info requires a full traversal from the root (you can actually skip raw leaves, but that is beside the point).
First, CAR intricacies:
In most cases, the info we trivially have is the If we had I do say
Is variability you cannot account for without a traversal. Hopefully those are only small and don't change much. (Is it even OK to send wrong but close content lengths?)
Secondly, duped blocks:
If we want to be smart, we don't send duped blocks multiple times. However, you don't know how much of your DAG size is duplicated, which can have a massive impact on the real content size.
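The point about duplicate blocks can be illustrated with a toy DAG. This is only a sketch: the `block` type, the `streamedBytes` helper and the CID strings are all hypothetical, and CAR framing overhead (header, per-block length prefix and CID) is ignored. It shows that the number of bytes a deduplicating writer would emit is only known after walking the DAG.

```go
package main

import "fmt"

// block is a toy stand-in for an IPLD block: a payload size plus child links.
// It is NOT a go-ipfs type.
type block struct {
	size  int
	links []string
}

// streamedBytes walks the DAG from root and returns two totals:
// deduped counts every CID once (what a deduplicating CAR writer would send),
// naive counts a block every time a link to it is followed.
func streamedBytes(dag map[string]block, root string) (deduped, naive int) {
	seen := map[string]bool{}
	var walk func(cid string)
	walk = func(cid string) {
		b := dag[cid]
		naive += b.size
		if !seen[cid] {
			seen[cid] = true
			deduped += b.size
		}
		for _, l := range b.links {
			walk(l)
		}
	}
	walk(root)
	return deduped, naive
}

func main() {
	// "a" is referenced three times: twice from the root and once from "b".
	dag := map[string]block{
		"root": {size: 10, links: []string{"a", "a", "b"}},
		"a":    {size: 100},
		"b":    {size: 50, links: []string{"a"}},
	}
	deduped, naive := streamedBytes(dag, "root")
	fmt.Println(deduped, naive) // prints "160 360"
}
```

In this toy case the naive total is more than double the deduplicated one, which is why a Content-Length derived from a reported DAG size could be far off.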
FWIW I've heard the same "uselessly hard" feedback about
This is to say, there is no silver bullet: there are too many use cases. We need to support both (query param and Accept header). I've seen no consensus around the MIME type for Block and CAR, so I skipped it in the initial PoC, waiting for ipld/go-car#238 to be resolved first, but I think you are right: we should support both from the start:
or to make things less ambiguous, reuse existing IPLD concepts:
I'd rather not risk this type of hackery – it would provide better UX via progress bars etc., but some overly smart HTTP clients may close the connection right after they receive the expected number of bytes, which would truncate the CAR stream.
I know, that's why I think we should have both. :)
- extracted file-like content type responses to separate .go files
- Accept HTTP header with support for application/vnd.ipld.* types (TBD, we did not register them yet, so for illustration purposes only)
Include block and car in unixfs_get_latency_seconds for now, so we keep basic visibility into gateway behavior until better metrics are added by #8441
fa78402 to ee7b0ae
6a51127 to 43dc5bf
.raw may be handled by something, depending on the OS, and .bin seems to be universally "binary file" across all systems: https://en.wikipedia.org/wiki/List_of_filename_extensions_(A%E2%80%93E)
This test uses the official CARv1 fixture from https://ipld.io/specs/transport/car/fixture/carv1-basic/ The CAR has two dag-cbor roots, and we use one of them, which represents a nice DAG with dag-cbor, dag-pb and raw blocks.
#8758 (comment) lidel needs some sleep
LGTM
CodeQL is unhappy, but this is safe because the user-controlled input is passed through fmt's %q formatter, which leaves us with fairly safe UTF-8.
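To illustrate why `%q` helps here, a small sketch: `%q` produces a double-quoted Go string literal, so embedded quotes and control characters such as CR and LF come out as escape sequences instead of raw bytes. The helper name `contentDisposition` and the sample filenames are hypothetical, and the PR's real header construction may differ.

```go
package main

import "fmt"

// contentDisposition is a hypothetical helper showing the %q idea:
// the filename is quoted and escaped before it reaches the header value.
func contentDisposition(filename string) string {
	return fmt.Sprintf("attachment; filename=%q", filename)
}

func main() {
	fmt.Println(contentDisposition("bafy.bin"))
	// The quote, CR and LF below are emitted as the two-character escapes
	// \" \r \n, so the value stays on a single header line.
	fmt.Println(contentDisposition("evil\"\r\nX-Injected: 1"))
}
```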
Related histogram metrics are added in a separate PR: #8443
Summary
This PR aims to add support for requesting an alternative response format via:
- ?format= URL parameter
- Accept: application/vnd.ipld.{format} HTTP header
This MVP supports two formats:
- raw – fetching a single block
- car – fetching the entire DAG behind a CID as a CARv1 stream
TLDR Demo
Downloading a Block
Note: we return Content-Type: application/vnd.ipld.raw – see ipfs/in-web-browsers#191
Downloading a CAR
Note: Content-Type: application/vnd.ipld.car; version=1 – see [IANA #1228646] Request for media type application/vnd.ipld.car in-web-browsers#192
Rationale
Why we need both
References
TODO
serveFile
- Last-Modified only when Cache-Control is missing
- Cache-Control for immutable content paths
- TODO about Last-Modified when UnixFS 1.5 is supported
- ?as= (aka ?format=)
serveBlock
- attachment + implicit filename of {cid}.ipfs.block
- application/vnd.* – need decision in Specify mime type for car ipld/go-car#238 (comment)
- TBD: ability to test if block is present in local cache via HEAD /ipfs/{cid} (local test for block, no network fetch)
- t0117-gateway-block.sh
serveCar
- TBD (is content-length deterministic? blocks may have different order, but returning full DAG should always return same total of bytes): too risky
- attachment + implicit filename of {cid}.ipfs.car
- X-Stream-Error so the receiving end may be able to debug things, as long as it's not a browser
- W/"<etag_value>" – it indicates that the response can't be cached if byte-for-byte reproducibility is required
- TBD: support selector=multibase(cbor selector): ?format=car&selector= is tracked in https://github.com/ipfs/go-ipfs/issues/8769
- application/vnd.* – need decision in Specify mime type for car ipld/go-car#238 (comment) + always include version in response + handle when processing request
- t0118-gateway-car.sh
- create more useful metrics, deprecate unixfsGetMetric
TODO (future PRs)
After this PR lands, we can add key IPLD formats (serveDagCbor, serveDagJson) – ipfs/in-web-browsers#182