[sproutvideo] Add new extractor (closes #7935, replaces #21962) #27685
I pulled this and ran it with verbose output:
./youtube-dl "http://videos.sproutvideo.com/embed/e89bddb01f1be3cf60/0d7fb4d67f328c8b" -v
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'http://videos.sproutvideo.com/embed/e89bddb01f1be3cf60/0d7fb4d67f328c8b', u'-v']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2021.01.03
[debug] Python version 2.7.16 (CPython) - Darwin-19.6.0-x86_64-i386-64bit
[debug] exe versions: ffmpeg 4.3.1, ffprobe 4.3.1, rtmpdump 2.4
[debug] Proxy map: {}
[SproutVideo] e89bddb01f1be3cf60: Downloading webpage
[SproutVideo] e89bddb01f1be3cf60: Downloading m3u8 information
WARNING: Failed to download m3u8 information: HTTP Error 403: Forbidden
ERROR: No video formats found; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type youtube-dl -U to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
File "./youtube-dl/youtube_dl/YoutubeDL.py", line 803, in wrapper
return func(self, *args, **kwargs)
File "./youtube-dl/youtube_dl/YoutubeDL.py", line 824, in __extract_info
ie_result = ie.extract(url)
File "./youtube-dl/youtube_dl/extractor/common.py", line 532, in extract
ie_result = self._real_extract(url)
File "./youtube-dl/youtube_dl/extractor/sproutvideo.py", line 63, in _real_extract
self._sort_formats(formats)
File "./youtube-dl/youtube_dl/extractor/common.py", line 1367, in _sort_formats
raise ExtractorError('No video formats found')
ExtractorError: No video formats found; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type youtube-dl -U to update. Be sure to call youtube-dl with the --verbose flag and include its complete output. |
I usually test with I will look into it, even though I'm pretty much out of ideas now. |
2nd request is getting 403. If you set you'll see:
detail
$ python test/test_download.py TestDownload.test_SproutVideo
[SproutVideo] 4c9dddb01910e3c9c4: Downloading webpage
send: u'GET /embed/4c9dddb01910e3c9c4/0fc24387c4f24ee3 HTTP/1.1\r\nAccept-Language: en-us,en;q=0.5\r\nAccept-Encoding: gzip, deflate\r\nConnection: close\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\nUser-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3532.7 Safari/537.36\r\nAccept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7\r\nHost: videos.sproutvideo.com\r\n\r\n'
reply: 'HTTP/1.1 200 OK\r\n'
header: Access-Control-Allow-Headers: Origin, X-Requested-With, Content-Type, Accept
header: Access-Control-Allow-Methods: GET
header: Access-Control-Allow-Origin: *
header: Content-Encoding: gzip
header: Content-Type: text/html; charset=utf-8
header: Date: Sun, 10 Jan 2021 22:53:20 GMT
header: ETag: "-1947696748"
header: p3p: CP="NOI CURa ADMa DEVa TAIa OUR BUS IND UNI COM NAV INT"
header: Referrer-Policy: no-referrer-when-downgrade
header: Set-Cookie: svid=c13f97e9-6320-4676-921c-471827ea0e05; max-age=31556952000; path=/; SameSite=None; Secure
header: Vary: Accept-Encoding
header: X-Powered-By: Express
header: X-XSS-Protection: 0
header: transfer-encoding: chunked
header: Connection: Close
[SproutVideo] 4c9dddb01910e3c9c4: Downloading m3u8 information
send: u'GET /49baec34e9983ed24492919e974bd436/86f37744278646b0dc7c2f60483e8382/video/index.m3u8?Policy=eyJTdGF0ZW1lbnQiOlt7IlJlc291cmNlIjoiaHR0cHM6Ly9obHMyLnZpZGVvcy5zcHJvdXR2aWRlby5jb20vNDliYWVjMzRlOTk4M2VkMjQ0OTI5MTllOTc0YmQ0MzYvODZmMzc3NDQyNzg2NDZiMGRjN2MyZjYwNDgzZTgzODIvKi5tM3U4P3Nlc3Npb25JRD0xZmFlOGY0MC0yNzJiLTRlMjQtYjgwNS00ZTlkNTBiYzliNDYiLCJDb25kaXRpb24iOnsiRGF0ZUxlc3NUaGFuIjp7IkFXUzpFcG9jaFRpbWUiOjE2MTAzNDA4MDF9fX1dfQ__&sessionID=1fae8f40-272b-4e24-b805-4e9d50bc9b46&Signature=qIo4NQw-hMPyMFsJ2RLvuEzC92PPPX%7E0iWBL7BnfzPsgQ0%7EoQR--pfa4162wk5IZ-2gMgc3mMC57jMF2fYJwfstFzBCLooF3JFakiWifs%7Exn3dukag381CQquBdaSpObHf8baZsv1Vzgf8zF%7EeAmpzE4W4m9QojVAuDp212Gfqp9lVN6P0kQRe%7EJXkdfsCBkaFaKD7-x7MXM1huVz8K9r9qPgIK9KH8%7EQxwSbRknMeYFU7RkHxtgZ6ISrvVHHlWJ5Orgg71g6-WeHRgmBJ-xG4wyPsnaxdSLvawVqzPZpGjB8R0uyCxMj0j-7UpgBFupHmqbEQQOUMYRabgvO1H3WA__&Key-Pair-Id=APKAIB5DGCGAQJ4GGIUQ HTTP/1.1\r\nOrigin: https://videos.sproutvideo.com\r\nAccept-Language: en-us,en;q=0.5\r\nAccept-Encoding: gzip, deflate\r\nHost: hls2.videos.sproutvideo.com\r\nAccept: */*\r\nUser-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3532.7 Safari/537.36\r\nAccept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7\r\nConnection: close\r\nCookie: svid=c13f97e9-6320-4676-921c-471827ea0e05\r\nReferer: https://videos.sproutvideo.com/embed/4c9dddb01910e3c9c4/0fc24387c4f24ee3\r\n\r\n'
reply: 'HTTP/1.1 403 Forbidden\r\n'
header: Server: CloudFront
header: Date: Sun, 10 Jan 2021 22:53:20 GMT
header: Content-Type: text/html
header: Content-Length: 919
header: Connection: close
header: X-Cache: Error from cloudfront
header: Via: 1.1 56eff4217adb539e7a42fbab3eee2d4d.cloudfront.net (CloudFront)
header: X-Amz-Cf-Pop: MAD51-C2
header: X-Amz-Cf-Id: LeV99Qe3awkY47TGjk1EkCvEwFnQ5arksen-TCT6CXz2zRg1bScwOQ==
ERROR: Failed to download m3u8 information: HTTP Error 403: Forbidden; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see https://yt-dl.org/update on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
File "/Users/gpalumbo/temp/youtube-dl/youtube_dl/extractor/common.py", line 632, in _request_webpage
return self._downloader.urlopen(url_or_request)
File "/Users/gpalumbo/temp/youtube-dl/youtube_dl/YoutubeDL.py", line 2248, in urlopen
return self._opener.open(req, timeout=self._socket_timeout)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 435, in open
response = meth(req, response)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 548, in http_response
'http', request, response, code, msg, hdrs)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 473, in error
return self._call_chain(*args)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 407, in _call_chain
result = func(*args)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 556, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 403: Forbidden
Traceback (most recent call last):
File "/Users/gpalumbo/temp/youtube-dl/youtube_dl/YoutubeDL.py", line 803, in wrapper
return func(self, *args, **kwargs)
File "/Users/gpalumbo/temp/youtube-dl/youtube_dl/YoutubeDL.py", line 824, in __extract_info
ie_result = ie.extract(url)
File "/Users/gpalumbo/temp/youtube-dl/youtube_dl/extractor/common.py", line 532, in extract
ie_result = self._real_extract(url)
File "/Users/gpalumbo/temp/youtube-dl/youtube_dl/extractor/sproutvideo.py", line 62, in _real_extract
headers=custom_headers)
File "/Users/gpalumbo/temp/youtube-dl/youtube_dl/extractor/common.py", line 1636, in _extract_m3u8_formats
fatal=fatal, data=data, headers=headers, query=query)
File "/Users/gpalumbo/temp/youtube-dl/youtube_dl/extractor/common.py", line 665, in _download_webpage_handle
urlh = self._request_webpage(url_or_request, video_id, note, errnote, fatal, data=data, headers=headers, query=query, expected_status=expected_status)
File "/Users/gpalumbo/temp/youtube-dl/youtube_dl/extractor/common.py", line 652, in _request_webpage
self._downloader.report_warning(errmsg)
File "/Users/gpalumbo/temp/youtube-dl/test/helper.py", line 271, in _report_warning
real_warning(w)
File "test/test_download.py", line 52, in report_warning
raise ExtractorError(message)
ExtractorError: Failed to download m3u8 information: HTTP Error 403: Forbidden; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see https://yt-dl.org/update on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
E
======================================================================
ERROR: test_SproutVideo (__main__.TestDownload):
----------------------------------------------------------------------
Traceback (most recent call last):
File "test/test_download.py", line 159, in test_template
force_generic_extractor=params.get('force_generic_extractor', False))
File "/Users/gpalumbo/temp/youtube-dl/youtube_dl/YoutubeDL.py", line 796, in extract_info
return self.__extract_info(url, ie, download, extra_info, process)
File "/Users/gpalumbo/temp/youtube-dl/youtube_dl/YoutubeDL.py", line 812, in wrapper
self.report_error(compat_str(e), e.format_traceback())
File "/Users/gpalumbo/temp/youtube-dl/youtube_dl/YoutubeDL.py", line 625, in report_error
self.trouble(error_message, tb)
File "/Users/gpalumbo/temp/youtube-dl/youtube_dl/YoutubeDL.py", line 595, in trouble
raise DownloadError(message, exc_info)
DownloadError: ERROR: Failed to download m3u8 information: HTTP Error 403: Forbidden; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see https://yt-dl.org/update on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
----------------------------------------------------------------------
Ran 1 test in 1.353s
FAILED (errors=1)
For some reason, the order of the query parameters is important. If you move the
This doesn't work:
This does:
|
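The observation about query parameter order can be tested with a small helper. This is only a sketch: the thread does not preserve which parameter had to move or where, so the parameter name is an argument and `move_param_last` is a hypothetical name, not part of youtube-dl. It works on the raw query string, so percent-encoded values such as the `%7E` sequences in `Signature` are not re-encoded.

```python
from urllib.parse import urlsplit, urlunsplit


def move_param_last(url, name):
    """Return url with every 'name=...' pair moved to the end of the query.

    The query is split on '&' as raw text, so values keep their original
    percent-encoding and all other pairs keep their relative order.
    """
    parts = urlsplit(url)
    pairs = parts.query.split('&') if parts.query else []
    moved = [p for p in pairs if p.split('=', 1)[0] == name]
    kept = [p for p in pairs if p.split('=', 1)[0] != name]
    return urlunsplit(parts._replace(query='&'.join(kept + moved)))


if __name__ == '__main__':
    # example.invalid and the short values are placeholders, not real data.
    sample = ('https://example.invalid/index.m3u8'
              '?Policy=a&sessionID=b&Signature=c%7Ed&Key-Pair-Id=e')
    print(move_param_last(sample, 'sessionID'))
```

Requesting the original and the reordered URL with identical headers would show whether ordering alone changes the response.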
It seems more like a CloudFront caching problem. In the requests that don't work you get a |
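To narrow down a CloudFront 403, it may help to look inside the `Policy` query parameter of the traced request. The sketch below assumes the value uses CloudFront's URL-safe base64 variant, in which `+`, `=` and `/` are replaced by `-`, `_` and `~`. The function name `decode_cloudfront_policy` is hypothetical.

```python
import base64
import json
from urllib.parse import unquote


def decode_cloudfront_policy(value):
    """Decode a CloudFront 'Policy' query value into a Python dict.

    Assumes CloudFront's substitution of '-' for '+', '_' for '=' and
    '~' for '/'. The value is URL-unquoted first in case '~' arrived
    as '%7E'.
    """
    raw = unquote(value)
    std = raw.replace('-', '+').replace('_', '=').replace('~', '/')
    return json.loads(base64.b64decode(std).decode('utf-8'))


if __name__ == '__main__':
    # A tiny made-up policy, not the one from the log above.
    print(decode_cloudfront_policy('eyJhIjoxfQ__'))
```

Running it on a real `Policy` value shows the `Resource` pattern and the expiry time the signature was issued for, which can then be compared against the URL actually requested.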
Also getting 403, but the test with |
Commenting to support this request. I'd really like a new extractor for this site. |
Description of your pull request and other information
This PR adds support for SproutVideo and every website/service that uses it as a video provider.
This PR closes #7935, #16994, #16996 and #21333.
This PR also replaces #21962 since I cannot push to that PR anymore (due to the DMCA blockage).
Now the extractor adds the correct Accept, Origin and Referer HTTP headers to avoid 403 errors.
For the full PR description refer to #21962. Thanks.
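As an illustration of the header handling described here, the sketch below derives the three headers from an embed URL, mirroring the values visible in the traced playlist request earlier in this conversation. It is not the extractor's actual code: `build_playlist_headers` is a hypothetical name. The traceback only shows that the extractor passes `headers=custom_headers` to `_extract_m3u8_formats`.

```python
from urllib.parse import urlsplit


def build_playlist_headers(embed_url):
    """Build Accept, Origin and Referer headers for the m3u8 request.

    Origin is the scheme and host of the embed page, and Referer is the
    embed page URL itself, as seen in the traced request.
    """
    parts = urlsplit(embed_url)
    return {
        'Accept': '*/*',
        'Origin': '%s://%s' % (parts.scheme, parts.netloc),
        'Referer': embed_url,
    }


if __name__ == '__main__':
    url = ('https://videos.sproutvideo.com/embed/'
           '4c9dddb01910e3c9c4/0fc24387c4f24ee3')
    for key, value in sorted(build_playlist_headers(url).items()):
        print('%s: %s' % (key, value))
```

Note that the trace above shows a request carrying these same headers still receiving a 403, so the headers appear to be necessary but may not be sufficient on their own.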