HLS manifest is fetched across origins #29
Comments
Let me try to summarize the available options.
Strict MIME type enforcement for DASH/HLS
One option on the table is to make ORB allow DASH/HLS MIME types like
This option is appealing from an ease-of-implementation perspective. Let me try to outline my understanding of web-compatibility of this option (i.e. will adopting this option break websites that used to work in browser X / on platform Y):
Detecting DASH/HLS by sniffing (including range responses)
Another option is to sniff DASH/HLS responses. Some sniffing ideas have been described by @sandersdan in whatwg/html#6468 (comment). I note that sniffing based on a single ASCII character (e.g. 0x47) seems insecure: if a sensitive resource (e.g. HTML, XML, or JSON) contains such a character, then an attacker might issue a range request that starts at that character. This approach also assumes that DASH/HLS implementations will only issue range requests that start at segment boundaries.
Parsing DASH/HLS manifest in ORB
Another option is to fully parse DASH/HLS manifests in ORB - this would help to:
This approach requires parsing the whole response body (not just the first 1024 bytes). This is quite similar to how the full response body might need to be parsed as JavaScript (and therefore maybe this is something that ORB implementations need to tackle anyway). This approach also tightly couples ORB implementations and DASH/XML standard (e.g. future changes to the manifest format would need to be reflected in ORB's parser).
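To make the sniffing concern above concrete, here is a minimal Python sketch. It relies on two facts: RFC 8216 requires an HLS playlist to begin with the `#EXTM3U` tag, and MPEG-TS packets begin with the sync byte 0x47 (ASCII "G"). The function names and the sample JSON body are hypothetical, and nothing here reflects what an actual ORB implementation does.

```python
TS_SYNC_BYTE = 0x47  # MPEG-TS sync byte, which is also ASCII "G"


def looks_like_hls_playlist(body: bytes) -> bool:
    """Return True if the first line of the body is exactly #EXTM3U."""
    first_line = body.split(b"\n", 1)[0].rstrip(b"\r")
    return first_line == b"#EXTM3U"


def naive_ts_sniff(body: bytes) -> bool:
    """Single-byte check: the weak form of sniffing discussed above."""
    return len(body) > 0 and body[0] == TS_SYNC_BYTE


# Hypothetical sensitive JSON response on another origin.
sensitive = b'{"user": "Grace", "token": "secret"}'

# An attacker-chosen range request can make the response start at any byte,
# for example at the first "G" in the body.
ranged = sensitive[sensitive.index(b"G"):]

print(naive_ts_sniff(sensitive))  # the full body starts with "{"
print(naive_ts_sniff(ranged))     # the ranged body starts with "G"
```

The playlist check is much harder to satisfy by accident than the single-byte check, but it only applies to the manifest and not to the media segments the manifest points to.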
The metrics will capture what MIME types are used for DASH/HLS manifests. The metrics will help make an informed decision on whether strict MIME type enforcement is feasible. This should help move forward with annevk/orb#29 (comment) Bug: 1178928 Change-Id: I6d8e38763658b316b5bda7aea5cef083ab83cd54 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3283540 Commit-Queue: Łukasz Anforowicz <[email protected]> Commit-Queue: Takashi Toyoshima <[email protected]> Reviewed-by: Takashi Toyoshima <[email protected]> Reviewed-by: Dan Sanders <[email protected]> Auto-Submit: Łukasz Anforowicz <[email protected]> Cr-Commit-Position: refs/heads/main@{#943658}
As discussed in whatwg/html#6468 it's not clear to me how HLS doesn't allow a complete bypass of ORB. And that's because the origins of the resources the HLS resource points to don't have to match the HLS resource origin. So an attacker can create an HLS resource and make it point to a variety of resources across the web they want to read. I suppose an option there might be that decoding happens in another process, but doesn't decoding sometimes work for somewhat arbitrary inputs which would then result in information leakage?
There are demonstrated exploits that use concatenation; {known header} + {injected bytes} can be used to reliably leak data. This was possible to do in Chrome in the distant past by replying to a request with partial data, then redirecting the followup range request. I'm not sure if/how a similar approach could be applied to HLS. It has powerful range primitives, but is also expecting the media data to be complete within each chunk (rather than allowing arbitrary concatenation). I would say that it's theoretically possible for such a thing to occur, but without some exciting new technique it's not a feasible attack in practice.
This much seems to be true. HLS somewhat explicitly is allowed to access cross-origin content in a way that is incompatible with ORB, because it allows the Content-Type to be arbitrary. Maybe there is a hybrid request mode we could use? Something like: start with no-cors, then if the Content-Type isn't safe switch to making cors requests (or perhaps the reverse to encourage CORS). This might work around the issue of few sites setting the |
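The hybrid request mode floated above could be sketched roughly as follows in Python. This is only an illustration of the control flow: the set of "safe" MIME types is an assumption (the thread does not define one), `fetch` is a hypothetical callable standing in for the network stack, and `fake_fetch` simulates a server that mislabels a segment.

```python
# Assumed for illustration only; the thread does not define which types are safe.
SAFE_MEDIA_TYPES = {"application/vnd.apple.mpegurl", "video/mp2t", "video/mp4"}


def essence(content_type: str) -> str:
    """Strip parameters and normalize case, e.g. 'Video/MP4; codecs=x' -> 'video/mp4'."""
    return content_type.split(";", 1)[0].strip().lower()


def hybrid_fetch(url, fetch):
    """Try no-cors first; if the Content-Type is not recognized, retry with CORS.

    `fetch(url, mode)` is a hypothetical callable returning (content_type, body).
    Returns (mode_used, body).
    """
    content_type, body = fetch(url, "no-cors")
    if essence(content_type) in SAFE_MEDIA_TYPES:
        return "no-cors", body
    # Discard the opaque response and repeat the request in CORS mode.
    _, body = fetch(url, "cors")
    return "cors", body


def fake_fetch(url, mode):
    # Hypothetical server that labels a media segment as application/octet-stream.
    return "application/octet-stream", b"segment-bytes"


print(hybrid_fetch("https://media.example/seg1.ts", fake_fetch))
```

The reverse ordering mentioned above (CORS first, falling back to no-cors) would swap the two `fetch` calls and would need a way to detect the CORS failure.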
Can you elaborate on what this means? Does each "complete" piece of media data have some kind of identifiable container?
Interesting idea. Let's see:
I think it would be simpler to allow HLS resources to bypass ORB (assuming we can identify them through sniffing or MIME type) and require CORS for HLS subresources. Or is what you're saying that even HLS subresources could often be identified as media and so we wouldn't have to fall back as early? An alternative on that might be that if an origin hosts an HLS resource and that points to subresources on the same origin, we'd request those with no-cors (as well as those on the same origin as the requestor; although that is a bit of a confused deputy situation, I'm guessing it is okay given that it's media), but any resources on other origins go with CORS.
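The origin-based alternative described above can be sketched as a small Python policy function. The function names and example URLs are hypothetical, and the origin comparison is a simplified (scheme, host, port) tuple rather than the full origin definition used by browsers.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str):
    """Simplified origin: (scheme, host, port), with default ports filled in."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def subresource_mode(requestor_url: str, manifest_url: str, subresource_url: str) -> str:
    """Pick a request mode for a resource referenced by an HLS manifest.

    Same origin as the manifest or as the requesting page: no-cors.
    Any other origin: cors.
    """
    target = origin(subresource_url)
    if target == origin(manifest_url) or target == origin(requestor_url):
        return "no-cors"
    return "cors"


page = "https://site.example/watch"
manifest = "https://cdn.example/live/index.m3u8"
print(subresource_mode(page, manifest, "https://cdn.example/live/seg1.ts"))
print(subresource_mode(page, manifest, "https://victim.example/private.json"))
```

Under this policy, a manifest that points at a third origin can only read the response if that origin opts in through CORS.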
Each segment is a sequence of frames in a media container, in the sense that demuxing can start at the first byte. It should also end demuxing at the last byte, but I wouldn't recommend relying on implementations to check that before using any of the data. Unlike the concatenation example above, HLS demuxer state does not carry over and resume in the next segment. Each segment is fully independent. The container for media data in HLS is either MPEG TS or MP4. MPEG TS is a sequence of header+data chunks that can be parsed sequentially; MP4 is a tree structure that is parsed completely before extracting media data. (Note: in practice MP4 is typically ordered such that streaming is possible once the important metadata is parsed.) Either of those can be easily verified given all of the bytes, but detecting them from only a short window at the start is challenging. A valid MP4 can start with essentially arbitrary bytes unless we make restrictions that go beyond the specification requirements (e.g. we could require that a known box type occurs within that window).
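As a rough illustration of "easily verified given all of the bytes", the Python sketch below checks whole segments structurally. It assumes the standard layouts: MPEG-TS as fixed 188-byte packets that each start with the sync byte 0x47, and MP4 as a sequence of boxes with a 4-byte big-endian size followed by a 4-byte type (size 1 means a 64-bit size follows, size 0 means the box runs to the end). The function names are hypothetical, and the checks are deliberately shallow: they do not validate box types or payloads.

```python
import struct

TS_PACKET_SIZE = 188
TS_SYNC_BYTE = 0x47


def is_complete_mpeg_ts(segment: bytes) -> bool:
    """True if the segment is a whole number of packets, each starting with 0x47."""
    if not segment or len(segment) % TS_PACKET_SIZE != 0:
        return False
    return all(segment[i] == TS_SYNC_BYTE
               for i in range(0, len(segment), TS_PACKET_SIZE))


def top_level_mp4_boxes(segment: bytes):
    """Return the top-level box types, or None if the boxes do not tile the segment."""
    types = []
    offset = 0
    while offset < len(segment):
        if offset + 8 > len(segment):
            return None  # not enough bytes left for a box header
        size, box_type = struct.unpack_from(">I4s", segment, offset)
        header = 8
        if size == 1:  # a 64-bit "largesize" field follows the type
            if offset + 16 > len(segment):
                return None
            (size,) = struct.unpack_from(">Q", segment, offset + 8)
            header = 16
        elif size == 0:  # box extends to the end of the data
            size = len(segment) - offset
        if size < header or offset + size > len(segment):
            return None
        types.append(box_type)
        offset += size
    return types or None


mp4 = struct.pack(">I4s", 16, b"ftyp") + b"isom" + b"\x00" * 4 + struct.pack(">I4s", 8, b"moov")
print(top_level_mp4_boxes(mp4))
print(top_level_mp4_boxes(b'{"user": "Grace", "token": "secret"}'))
```

A stricter variant could additionally require known box types such as `ftyp`, `moov`, `moof`, or `mdat`, which is the kind of restriction beyond the specification mentioned above.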
I'm offering that if sniffing isn't working out then a hybrid request strategy may be a sufficient workaround. We know that many HLS resources are being served with inaccurate The combination of inaccurate (Edit: It's possible to combine everything together: if |
Given how desktop Chromium and Gecko browsers work, I'm starting to lean toward the 'Strict MIME type enforcement for DASH/HLS' approach.
I've updated the linked bug with stable results. In short, the non-stable results were roughly accurate.
And those 1.5% are without CORS? Or is it just checking the MIME type? (It seems a bit surprising for the non-WebView case that one would use cross-origin without CORS, if desktop browsers require CORS through fetch(). But perhaps sites do have very specific code paths for mobile.)
I do not have data that is able to correlate MIME type with CORS state.
@annevk I wonder if Apple has any data on what kinds of requests the HLS implementation ends up doing.
@smaug---- apologies for responding so late. What kind of data are you looking for exactly? The request mode of the requests made based on the data in the HLS manifest?
Which request mode and which MIME type are being used with HLS.
As discussed in whatwg/html#6468 we need to make adjustments to handle HTTP Live Streaming correctly.