This repository has been archived by the owner on May 5, 2022. It is now read-only.

Specifying Range in Link preload header for HTTP/2 Push #139

Closed
roger-on-github opened this issue Jul 12, 2019 · 28 comments

@roger-on-github

roger-on-github commented Jul 12, 2019

Hi there preload folks. We would like a way to signal an HTTP Range when a resource is specified in a rel=preload Link header.

Our use case for this is media delivery over HTTP, specifically the Low-Latency HLS design described here: https://developer.apple.com/documentation/http_live_streaming/protocol_extension_for_low-latency_hls_preliminary_specification

LL-HLS makes use of HTTP/2 Push to eliminate one round trip per (partial) media segment download. Client round-trip times to media servers can exceed 100 ms, or even 200 ms, particularly on cellular networks, even in the U.S. When playing at very low delay-from-live (2 s or less), the client can only buffer around 1 s ahead of the playhead, because that’s all there is. New segments must be loaded in a timely fashion to prevent playback from stalling, so reducing load time by 10% of the overall budget is a significant win.

The LL-HLS spec includes a BYTERANGE attribute because there are some use cases (such as inserting prerecorded, prepackaged ads into a low-latency live stream) where it’s an advantage to deliver media segments as sub-ranges of larger resources. At the core protocol level, HTTP/2 can already push Range responses. But many HTTP caches (i.e. CDNs) rely on the Link header to communicate the push downstream. We need a way to inform the HTTP cache server of the range.

A recent discussion on the IETF-HTTP-WG list suggests that we’re not the only ones interested in such a feature.

Is this something that you guys could help out with?

@wilaw

wilaw commented Jul 12, 2019

+1 Akamai Media would like to publicly support this request.

This approach would help enable a CDN-friendly implementation of LL-HLS. We encourage the development of a standard means of specifying a byte-range preload.

Will Law

@roger-on-github
Author

Julian Reschke commented on the HTTP-WG list:

If you do this, please consider range units other than "bytes". For instance, by embedding the range unit name into the parameter:

Link: </media.mp4>; rel=preload; as=video; type=video/mp4;
range-bytes=1380-14226
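
To make the suggestion concrete, here is a minimal sketch of how a cache or edge server might read such a parameter out of a single Link header value. The `range-<unit>` parameter is only the proposal quoted above and was never standardized; the function name `parse_range_param` is hypothetical, and the parser is deliberately simplified (it assumes one link per value and no semicolons inside quoted strings).

```python
import re
from typing import Optional, Tuple


def parse_range_param(link_value: str) -> Optional[Tuple[str, str, int, int]]:
    """Return (target, unit, first, last) from one Link value, or None.

    Assumes the hypothetical "range-<unit>=<first>-<last>" parameter
    proposed in this thread; it is not part of any published spec.
    """
    parts = [p.strip() for p in link_value.split(";")]
    target = parts[0]
    # The link target must be wrapped in angle brackets.
    if not (target.startswith("<") and target.endswith(">")):
        return None
    for param in parts[1:]:
        name, _, value = param.partition("=")
        unit_match = re.fullmatch(r"range-([a-z]+)", name.strip().lower())
        if not unit_match:
            continue
        span = re.fullmatch(r"(\d+)-(\d+)", value.strip().strip('"'))
        if span:
            return (target[1:-1], unit_match.group(1),
                    int(span.group(1)), int(span.group(2)))
    return None
```

Embedding the unit in the parameter name means a consumer that only understands `bytes` can skip a `range-<other-unit>` parameter without misreading its value.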

@yoavweiss
Contributor

From the HTTPWG mailing list discussion, I believe the desire here is for the browser to send range requests when it receives those preload links, regardless of H2 push.

/cc @annevk @jakearchibald who may have opinions on this

@annevk
Member

annevk commented Jul 15, 2019

This seems quite problematic from a security perspective as it would allow rather exact probing of a cross-origin resource. If we enforce CORS when such features are used this might be reasonable.

@roger-on-github
Author

Can you expand on how adding a range attribute to the Link header would allow exact probing of a cross-origin resource? Note that the server already requires authority over the linked resource to perform the push.

@annevk
Member

annevk commented Jul 16, 2019

Are you suggesting that the feature set of Link and <link> diverge when it comes to rel=preload? (The latter has events.)

I'm rather wary of letting connection authority influence the same-origin policy.

cc @sleevi

@roger-on-github
Author

roger-on-github commented Jul 16, 2019

They shouldn't diverge without a good reason, no. But before getting to that discussion I'd like to understand the attack you have in mind. How would an attacker use <link> with a range attribute to do exact probing of a cross-origin resource?

@annevk
Member

annevk commented Jul 17, 2019

Well, it sounds like you might have a different processing model in mind than I had anticipated, so getting clarity on that first would be good.

@roger-on-github
Author

Okay, that's fair. The processing model defined in Section 3.1 (Processing, which includes by reference Section 4.2.4.3, "Fetching and processing a resource from a link element", of the HTML Standard) and Section 3.3 (Server Push (HTTP/2)) should work fine, with two additions. When the user agent fetches the linked resource (3.1), it creates a Range request header from the range attribute in the link and adds it to the request. When a server pushes the preload link resource (3.3), it adds the Range request header to its push promise headers and sends that byte range rather than the entire resource.
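
As a rough illustration of the two additions described above, the sketch below builds the Range request header from a parsed range attribute and cuts the matching partial response out of a full resource. The function names are hypothetical and the code assumes the `bytes` unit with inclusive offsets, as in HTTP range requests; it is not taken from the preload spec or from any implementation.

```python
from typing import Dict, Tuple


def range_request_header(unit: str, first: int, last: int) -> Dict[str, str]:
    """Header a UA would add to its fetch, or a server to its push promise."""
    if first < 0 or last < first:
        raise ValueError("invalid range")
    return {"Range": f"{unit}={first}-{last}"}


def pushed_partial_response(body: bytes, first: int,
                            last: int) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for an inclusive byte range."""
    if first >= len(body):
        raise ValueError("range not satisfiable")
    # Clamp the end of the range to the last byte of the resource.
    last = min(last, len(body) - 1)
    chunk = body[first:last + 1]
    headers = {
        "Content-Range": f"bytes {first}-{last}/{len(body)}",
        "Content-Length": str(len(chunk)),
    }
    return 206, headers, chunk
```

The push only helps if the header in the push promise matches the request the client later makes, which is why both sides derive it from the same attribute.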

@sleevi

sleevi commented Jul 17, 2019

Thanks for the ping, @annevk.

Trying to make sure I understand things (I loosely followed the httpbis discussion, and have since caught up on it again):

  • The motivation for extending the rel=preload Link seems to be that certain CDNs use rel=preload as a signal to use HTTP/2 PUSH
  • Those HTTP/2 PUSH streams will go unclaimed if the UA does not also know to issue a Range request for the same resource
  • HTTP/2 PUSH + Range requests is an area of... if not lack of UA consistency, then certainly, a dark area of performance quirks and leopards on filing cabinets

I think it's useful to wholly ignore the HTTP/2 PUSH concerns for a second, not least because of the controversy/concerns around PUSH from various UAs. I'm also a bit uncomfortable coupling our security assumptions within the HTML/W3C model to the underlying assumptions of a particular protocol (e.g. that the connection is authoritative for cross-origin PUSH, which is already something that gives me the skeevies from a security PoV).

If I understand @annevk's concern, the question is about whether or not it's acceptable to give an origin the ability to specify sub-ranges of resources, potentially cross-origin. Anne, did I get that right?

If so, we can probably divide this further into two sets of considerations:

  • Is it safe to allow same-origin 'attacker' specification of sub-ranges?
  • Is it safe to allow cross-origin 'attacker' specification of sub-ranges?
    • With a CORS preflight?
    • Without a CORS preflight, but some other mechanism?
    • Without a CORS preflight at all?

I think these questions further devolve into "Should the attacker be able to specify the Range header?"

We have Range among the privileged no-CORS request-header names, but we remove it from the set of headers that attackers are allowed to specify.

Did I understand everything properly? Apologies if I've completely missed the mark, but just wanting to make sure I've got the right understanding about where some of the concerns may be from various parties.

@roger-on-github
Author

I think that both questions are worth considering, but I'd like to understand how any information can be leaked that the attacker didn't have already or that the UA wouldn't be able to obtain just by fetching the resource without the Range header.

Putting range information into the link certainly provides some information about the resource - by design - but annevk's original concern was about "probing" and I still don't understand how that would work.

@sleevi

sleevi commented Jul 17, 2019

@roger-on-github Right, I think that's what I was trying to capture with the "Should the attacker be able to specify the Range header?" question. It sounds like you're approaching it from the inverse: "What is the bad that could happen if the attacker specifies the Range header (and doesn't require CORS)?" Is that fair?

If I recall correctly, some of the past discussion in the Fetch spec has captured some of the concerns re: Range requests:

I'll let Anne respond if they have more context, but I do share an unease about giving developers direct control over Range requests, which this implicitly and explicitly would do.

@annevk
Member

annevk commented Jul 19, 2019

So one thing this would allow you to do I think is figure out the length of a "no-cors" cross-origin response. Although there are sidechannel attacks possible on this information, we generally go to great lengths to avoid exposing the exact length (e.g., when storing such responses).
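
For context on where the exact length lives in the protocol: a 206 Partial Content response carries a Content-Range header of the form `bytes <first>-<last>/<complete-length>`, so even a one-byte range response can state the size of the whole resource. The sketch below only parses that header; the function name is hypothetical, and whether a given browser would expose this value for a "no-cors" response is exactly the open question here, not something the code demonstrates.

```python
import re
from typing import Optional


def complete_length(content_range: str) -> Optional[int]:
    """Extract the complete length from a Content-Range value, if stated.

    Returns None when the header is malformed or the length is "*".
    """
    match = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+|\*)", content_range.strip())
    if match is None or match.group(3) == "*":
        return None
    return int(match.group(3))
```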

@roger-on-github
Author

Thanks @sleevi . I had to spend some time wrapping my head around all that. It seems like the primary threat vector envisioned is an entity from origin X combining an attack payload with a range request to origin Y to exfiltrate information from that request.

So a conservative place to start might be to say that rel=preload with a range attribute require that the linked resource pass the same-origin test (as the page in the <link> case, or the responding server in the Link case) in order to proceed.

@LPardue

LPardue commented Jul 24, 2019

Some of the challenges articulated here are caused by overloading preload for two different meanings with different security concerns. There's an escape valve here. Define a relation that is explicitly for push instructions to an edge, and explicitly not interpreted by the browser.

@roger-on-github
Author

@LPardue that might be a direction to explore if we can't arrive at a solid security story. But I haven't seen any objections to my last proposal.

@yoavweiss
Contributor

@roger-on-github we've traditionally avoided magically setting fetch parameters based on arbitrary attributes, but you could achieve something similar by requiring a cross-origin attribute with an "anonymous" (or empty) value in order to process range requests here.

At the same time, I'm not sure it doesn't open other edge cases, and I'm also not sure if the interaction between the HTTP cache and range requests is well-defined and properly implemented in all browsers.

@LPardue's suggestion seems like something that would significantly simplify things, as you won't have to worry about all of this for rel values that don't have a browser processing model.

@roger-on-github
Author

I'm okay with restricting the use case to the Link header. The simplest approach would be to say that the range attribute is not allowed in a link element.

@yoavweiss
Contributor

@roger-on-github I don't see how that would solve the concerns raised.

@roger-on-github
Author

@yoavweiss Restricting the range attribute to the Link header would mean that it would have no browser (javascript) processing model, correct? That would prevent any javascript attacks. That seems to address all the concerns that were raised.

@rniwa

rniwa commented Aug 7, 2019

@yoavweiss Restricting the range attribute to the Link header would mean that it would have no browser (javascript) processing model, correct? That would prevent any javascript attacks. That seems to address all the concerns that were raised.

No, because the script can always create and insert a new link element with rel=preload and mount the same attack. Just allowing it in the link element is not a meaningful mitigation.

@roger-on-github
Author

@rniwa perhaps you misunderstood my suggestion. I'm not proposing allowing it in the link element; I'm proposing to prohibit it in the link element.

@wilaw

wilaw commented Aug 8, 2019

I have rethought my position on this issue since originally posting, and after some consideration, now agree with @LPardue . Preload is being overloaded in order to solve a particular problem in live streaming, yet it is carrying with it many unintended complications in security, browser interpretation and CORS access. In retrospect, an easier path forward would be to decouple design of preload from this streaming workflow use-case and for Apple to create and specify a response header of their own creation specifically for the purpose of prompting 'part' H2 pushes by edge servers. This custom origin response header could be unambiguously defined within the HLS specification and I am reasonably sure that all CDNs would be willing to support it outside of W3C definition, given the gravitas of the HLS market.

@yoavweiss
Contributor

@wilaw - that would indeed be a cleaner outcome from my perspective. Note that all you'd need to define for this is a new link relation (e.g. "push") that will use a lot of the same Web Linking infrastructure, only with its own defined semantics (that browsers can ignore), rather than the overloaded preload semantics, which browsers need to take into account.

@LPardue

LPardue commented Aug 8, 2019

Preload is being overloaded in order to solve a particular problem in live streaming, yet it is carrying with it many unintended complications in security, browser interpretation and CORS access.

I wasn't singling out the proposal here. Preload is already overloaded for push, which is unfortunate. Adding attributes to fine tune the overloaded relation intent is a smell - we already have nopush.

A new link relation type would deprecate the nopush attribute and provide a partition to evolve server push behaviour without needing to always worry about side effects to the non-push use of preload.
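
To illustrate the difference, here is a minimal sketch of the decision an edge server makes under each model. Today's convention treats `rel=preload` as a push trigger unless `nopush` is present; with a dedicated relation the edge would key off that relation alone. The relation name `push` is only the example floated in this thread, and the function name and signature are hypothetical.

```python
from typing import Set


def edge_should_push(rels: Set[str], params: Set[str],
                     dedicated_relation: str = "push") -> bool:
    """Decide whether an edge server pushes the linked resource.

    rels:   relation types on the link, e.g. {"preload"}
    params: parameter names on the link, e.g. {"as", "nopush"}
    """
    # Dedicated relation (hypothetical): unambiguous, browsers can ignore it.
    if dedicated_relation in rels:
        return True
    # Overloaded preload: push unless the author opted out with nopush.
    return "preload" in rels and "nopush" not in params
```

Under the dedicated relation, the second branch and the `nopush` opt-out could eventually be retired, which is the deprecation described above.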

@roger-on-github
Author

Another approach which would not require any change to the preload spec would be to make use of the HTTP Link Hints proposal, https://mnot.github.io/I-D/link-hint/ and specifically its accept-ranges hint. In that approach, HTTP caches would take note of the range(s) in the hint and if a cached response contains a rel=preload Link header with such a hint, the edge would H2 push that set of ranges to the client.

(I get that some of you wish that preload had never been tied to push, but I think that ship has sailed. It's already been widely disseminated and adopted.)

@LPardue

LPardue commented Aug 8, 2019

Another approach which would not require any change to the preload spec would be to make use of the HTTP Link Hints proposal, https://mnot.github.io/I-D/link-hint/ and specifically its accept-ranges hint. In that approach, HTTP caches would take note of the range(s) in the hint and if a cached response contains a rel=preload Link header with such a hint, the edge would H2 push that set of ranges to the client.

I'm not familiar with that spec, but a skim suggests, in lay terms, that this is just a way for the Link header to include the Accept-Ranges header field, which can hold the units of range (e.g. bytes), not the range itself. So I don't think that works as proposed.

(I get that some of you wish that preload had never been tied to push, but I think that ship has sailed. It's already been widely disseminated and adopted.)

Citation please :) Seriously though, I think it would be interesting to get a view of CDN providers that implement H2 Server Push using link rel=preload. And of those: who implement the nopush attribute, who is happy with the status quo, and who is willing to add more complexity to preload vs who might be willing to change to an alternate relation. Balanced against those that want the ability to push ranges.

I'd also put to you that it is difficult for CDNs to determine if the presence of preload is intended for consumption by the CDN or the browser. Right now it is a bit of a guessing game.

Speaking from my own experience, changing push code to trigger from $other_thing instead of preload is pretty simple.

@roger-on-github
Author

Just to close the loop on this: we've decided to replace the use of HTTP/2 Push in LL-HLS with a blocking request made in advance on a hint URL. So we no longer need Push, and therefore we no longer need to specify Range in the link preload header. From my point of view, this issue can be closed.
