[roosterteeth] Added new extractor #6536
Broken on Python 2:
py26yt "http://roosterteeth.com/show/red-vs-blue#;season=.* 1$" -v
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'http://roosterteeth.com/show/red-vs-blue#;season=.* 1$', u'-v']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2015.08.09
[debug] Git HEAD: 5e879ff
[debug] Python version 2.6.6 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-73993-g8a17335, ffprobe N-73993-g8a17335, rtmpdump 2.4
[debug] Proxy map: {}
Traceback (most recent call last):
File "youtube_dl/__main__.py", line 19, in <module>
youtube_dl.main()
File "C:\Dev\git\youtube-dl\master\youtube_dl\__init__.py", line 410, in main
_real_main(argv)
File "C:\Dev\git\youtube-dl\master\youtube_dl\__init__.py", line 400, in _real_main
retcode = ydl.download(all_urls)
File "C:\Dev\git\youtube-dl\master\youtube_dl\YoutubeDL.py", line 1653, in download
url, force_generic_extractor=self.params.get('force_generic_extractor', False))
File "C:\Dev\git\youtube-dl\master\youtube_dl\YoutubeDL.py", line 655, in extract_info
ie_result = ie.extract(url)
File "C:\Dev\git\youtube-dl\master\youtube_dl\extractor\common.py", line 286, in extract
return self._real_extract(url)
File "C:\Dev\git\youtube-dl\master\youtube_dl\extractor\roosterteeth.py", line 51, in _real_extract
ep_filter = compat_urllib_parse.parse_qs(params)
AttributeError: 'module' object has no attribute 'parse_qs'
Doesn't work for me at all:
py34yt "http://roosterteeth.com/show/red-vs-blue#;season=.* 1$" -v
[debug] System config: []
[debug] User config: []
[debug] Command-line args: ['http://roosterteeth.com/show/red-vs-blue#;season=.* 1$', '-v']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2015.08.09
[debug] Git HEAD: 5e879ff
[debug] Python version 3.4.3 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-73993-g8a17335, ffprobe N-73993-g8a17335, rtmpdump 2.4
[debug] Proxy map: {}
[RoosterteethShow] red-vs-blue: Downloading webpage
ERROR: Unable to download webpage: HTTP Error 403: Forbidden (caused by HTTPError()); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see https://yt-dl.org/update on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
File "C:\Dev\git\youtube-dl\master\youtube_dl\extractor\common.py", line 325, in _request_webpage
return self._downloader.urlopen(url_or_request)
File "C:\Dev\git\youtube-dl\master\youtube_dl\YoutubeDL.py", line 1860, in urlopen
return self._opener.open(req, timeout=self._socket_timeout)
File "C:\python\python343\lib\urllib\request.py", line 469, in open
response = meth(req, response)
File "C:\python\python343\lib\urllib\request.py", line 579, in http_response
'http', request, response, code, msg, hdrs)
File "C:\python\python343\lib\urllib\request.py", line 507, in error
return self._call_chain(*args)
File "C:\python\python343\lib\urllib\request.py", line 441, in _call_chain
result = func(*args)
File "C:\python\python343\lib\urllib\request.py", line 587, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
UPD: serves me with a 403 captcha.
I don't think the URL is a good place for custom extractor-specific filters that are not supported by the website itself. Such a filter should either be generic or not exist at all. There is a
I've fixed the first error and removed my own filter. I'm not sure what caused the 403 error in your case. Can you try again with
<html>
<head>
<title>Rooster Teeth · Argh!</title>
<style type='text/css'>body,td{cursor:default;}body{background:#fff;color:#000;font:12px Arial , Helvetica , sans-serif;}h2{color:#222;}td{font-size:11px;line-height:150%;vertical-align:top;}a{font-size:11px;text-decoration:none;line-height:150%;color:#c2262b;cursor:pointer;}a:hover{text-decoration:underline;}.secret{font-size:11px;color:#eee;}</style>
</head>
<body>
<center>
<img width="300" height="194" src="">
<br />
<div>
<form class="challenge-form" id="challenge-form" action="/cdn-cgi/l/chk_captcha" method="get">
<script type="text/javascript" src="/cdn-cgi/scripts/cf.challenge.js" data-type="custom" data-ray="214e099963600ca7" async></script>
<noscript id="cf-captcha-bookmark" class="cf-captcha-info">
<iframe src="//www.google.com/recaptcha/api/noscript?k=6LeT6gcAAAAAAAZ_yDmTMqPH57dJQZdQcu6VFqog" height="300" width="500" frameborder="0"></iframe>
<input type="hidden" name="recaptcha_response_field" value="manual_challenge">
<label for="manual_recaptcha_challenge_field">Enter confirmation code after solving challenge above</label>
<textarea id="manual_recaptcha_challenge_field" name="recaptcha_challenge_field" rows="3" cols="40"></textarea>
<button type="submit" class="cf-captcha-submit">Submit</button>
</noscript>
</form>
</div>
<br />
<br />
</center>
</body>
</html>
By the way, technically you are free to build titles in any way.
That captcha page is generated by CloudFlare. Strange, I've never encountered it myself. I'd like to add the season to an episode's title, but the video page doesn't include the required information and I don't see a way to pass the season from
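For reference, a 403 response like the one pasted above could be recognised before reporting a generic download error. The following is only a rough sketch: the helper name `looks_like_captcha_challenge` is hypothetical, and the two markers are copied from the quoted page. Treating them as a stable signature of a CloudFlare challenge is an assumption.

```python
# Markers copied from the challenge page quoted in this thread. Whether they
# stay stable across CloudFlare deployments is an assumption.
CHALLENGE_MARKERS = (
    'id="challenge-form"',
    '/cdn-cgi/l/chk_captcha',
)


def looks_like_captcha_challenge(status_code, body):
    """Return True if a 403 response body resembles the quoted challenge page."""
    if status_code != 403:
        return False
    # Require every marker so an ordinary 403 page is not misclassified
    return all(marker in body for marker in CHALLENGE_MARKERS)
```

A check like this can only improve the error message. It does not get past the captcha.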
Yes, I can browse it in a browser after solving the captcha. A workaround is to pass cookies exported from the browser to youtube-dl, but nothing can be done in the extractor itself.
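To illustrate the cookie workaround: youtube-dl's `--cookies FILE` option reads a Netscape-format cookie file, which is the format browser export tools usually produce. The sketch below shows the same idea with only the standard library. The helper name `opener_from_cookie_file` is hypothetical, and whether the exported cookies actually satisfy the captcha check depends on the site.

```python
import http.cookiejar
import urllib.request


def opener_from_cookie_file(cookie_path):
    """Build a urllib opener that sends cookies from a Netscape-format file."""
    jar = http.cookiejar.MozillaCookieJar()
    # Keep session cookies and expired entries too, since browser exports
    # do not always carry usable expiry dates
    jar.load(cookie_path, ignore_discard=True, ignore_expires=True)
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar))
    return opener, jar
```

On the command line the equivalent would be along the lines of `youtube-dl --cookies cookies.txt URL`, where `cookies.txt` is whatever file the browser export produced.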
Alright, now you can use The extractor now uses the native HLS implementation. Anything else I should change?
youtube_dl/extractor/roosterteeth.py
if 'youtubeKey' not in meta:
    raise ExtractorError('Invalid metadata for youtube video!')

res = self.url_result('https://youtube.com/watch?v=' + meta['youtubeKey'])
You should directly do:
res = {
    '_type': 'url_transparent',
    'url': 'https://youtube.com/watch?v=' + meta['youtubeKey'],
    'id': video_id,
}
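Put together with the metadata check from the diff, the suggestion might look roughly like the standalone sketch below. The function name `build_youtube_result` is hypothetical, and `ValueError` stands in for `ExtractorError` so the snippet runs without youtube-dl installed. As I understand it, a `url_transparent` result hands extraction to whichever extractor matches `url`, while fields set here (such as `id`) take precedence over what that extractor returns.

```python
def build_youtube_result(meta, video_id):
    """Return a url_transparent result for the YouTube video named in meta."""
    if 'youtubeKey' not in meta:
        # The PR raises ExtractorError here. ValueError is a stand-in so this
        # sketch stays self-contained.
        raise ValueError('Invalid metadata for youtube video!')
    return {
        '_type': 'url_transparent',
        'url': 'https://youtube.com/watch?v=' + meta['youtubeKey'],
        'id': video_id,
    }
```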
…he season name to episodes when downloading from a show page.
I've updated the extractor and rebased the fork. Anything else I should change?
Since #8497 landed, please move changes in
Still waiting for the fixes? I still can't download from, for example, https://roosterteeth.com/episode/death-battle-season-5-doctor-strange-vs-doctor-fate-marvel-vs-dc ... :(
This extractor was for the old site. The video embeds have changed and this code won't work anymore.
This extractor allows you to download single videos or whole seasons from roosterteeth.com, achievementhunter.com and fun.haus.
The RoosterteethShowIE allows you to filter videos using a simple regex filter (the second test case contains an example). I've added that feature since I found no other way to do this with YTDL's own filters. I hope you don't mind.
This resolves #6371.