[dropout] Add new extractor #19296

Closed
wants to merge 379 commits into from

Contributor

@tsia tsia commented Feb 21, 2019

In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under Unlicense. Check one of the following options:

  • I am the original author of this code and I am willing to release it under Unlicense
  • I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

  • Bug fix
  • Improvement
  • New extractor
  • New feature

Description of your pull request and other information

this adds support for intl.dropout.tv (logging in etc).
the videos themselves are hosted on vhx, so the video download itself is already part of youtube-dl

unfortunately i wasn't able to get tests to use my credentials for the site. if someone can help me out i'm happy to add them.

(see #19146)

@tsia tsia closed this Feb 21, 2019
@tsia tsia reopened this Feb 21, 2019
@tsia
Contributor Author

tsia commented Feb 21, 2019

well... turns out, when you log in too many times dropout blocks you temporarily

@tsia
Contributor Author

tsia commented Feb 22, 2019

and there are my tests finally

@pisto

pisto commented Mar 13, 2019

Just pinging this.

@dstftw
Collaborator

dstftw commented Mar 17, 2019

For any further work and review on this you must provide account credentials.

@tsia
Contributor Author

tsia commented Mar 18, 2019

I have sent you an email with the credentials. I hope this is OK.

@Romern

Romern commented Apr 19, 2019

Single videos work:

youtube-dl -u USER -p PASSWORD https://intl.dropout.tv/um-actually/season:1/videos/c-3po-s-origins-hp-lovecraft-the-food-album-with-weird-al-yankovic
[intldropout] Downloading login page
[intldropout] Logging in
[intldropout] None: Downloading webpage
[vhx:embed] 397785: Downloading webpage
[vhx:embed] 397785: Downloading JSON metadata
[vhx:embed] 397785: Downloading m3u8 information
[vhx:embed] 397785: Downloading MPD manifest
[vhx:embed] 397785: Downloading JSON metadata
[dashsegments] Total fragments: 220
[download] Destination: C-3PO's Origins, HP Lovecraft, the Food Album (with Weird Al Yankovic)-397785.fdash-video-1168827972.mp4
[download] 100% of 682.88MiB in 03:28
[dashsegments] Total fragments: 220
[download] Destination: C-3PO's Origins, HP Lovecraft, the Food Album (with Weird Al Yankovic)-397785.fdash-audio-1168798321.m4a
[download] 100% of 21.69MiB in 00:52
[ffmpeg] Merging formats into "C-3PO's Origins, HP Lovecraft, the Food Album (with Weird Al Yankovic)-397785.mp4"
Deleting original file C-3PO's Origins, HP Lovecraft, the Food Album (with Weird Al Yankovic)-397785.fdash-video-1168827972.mp4 (pass -k to keep)
Deleting original file C-3PO's Origins, HP Lovecraft, the Food Album (with Weird Al Yankovic)-397785.fdash-audio-1168798321.m4a (pass -k to keep)

but playlists are throwing errors:

youtube-dl --cookie cookiefile https://intl.dropout.tv/um-actually --verbose
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'--cookie', u'cookiefile', u'https://intl.dropout.tv/um-actually', u'--verbose']
[debug] Encodings: locale UTF-8, fs UTF-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2019.03.01
[debug] Python version 2.7.15rc1 (CPython) - Linux-4.19.21-041921-generic-x86_64-with-Ubuntu-18.04-bionic
[debug] exe versions: ffmpeg 3.4.4, ffprobe 3.4.4, phantomjs 2.1.1, rtmpdump 2.4
[debug] Proxy map: {}
[intldropout:playlist] um-actually: Downloading webpage
[download] Downloading playlist: Um, Actually
[intldropout:playlist] playlist Um, Actually: Collected 24 video ids (downloading 24 of them)
[download] Downloading video 1 of 24
[intldropout] None: Downloading webpage
ERROR: Unable to extract embed; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/youtube_dl/YoutubeDL.py", line 794, in extract_info
    ie_result = ie.extract(url)
  File "/usr/local/lib/python2.7/dist-packages/youtube_dl/extractor/common.py", line 508, in extract
    ie_result = self._real_extract(url)
  File "/usr/local/lib/python2.7/dist-packages/youtube_dl/extractor/intldropout.py", line 84, in _real_extract
    video = self._html_search_regex(r'<iframe[^>]*"(?P<embed>https://embed.vhx.tv/videos/[0-9]+[^"]*)"[^>]*>', webpage, 'embed')
  File "/usr/local/lib/python2.7/dist-packages/youtube_dl/extractor/common.py", line 992, in _html_search_regex
    res = self._search_regex(pattern, string, name, default, fatal, flags, group)
  File "/usr/local/lib/python2.7/dist-packages/youtube_dl/extractor/common.py", line 983, in _search_regex
    raise RegexNotFoundError('Unable to extract %s' % _name)
RegexNotFoundError: Unable to extract embed; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

EDIT: Also, after the process finishes, youtube-dl should log out of the account, because there is a 3-device limit that is based on the number of logins, not on whether you are actually watching.
EDIT²: Is there actually such a functionality in youtube-dl for doing stuff after finishing?
EDIT³: Also some videos work, while others do not:

youtube-dl --cookie cookiefile https://intl.dropout.tv/um-actually/season:1/videos/incantations-immortans-vulcans-in-heat --verbose
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'--cookie', u'cookiefile', u'https://intl.dropout.tv/um-actually/season:1/videos/incantations-immortans-vulcans-in-heat', u'--verbose']
[debug] Encodings: locale UTF-8, fs UTF-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2019.03.01
[debug] Python version 2.7.15rc1 (CPython) - Linux-4.19.21-041921-generic-x86_64-with-Ubuntu-18.04-bionic
[debug] exe versions: ffmpeg 3.4.4, ffprobe 3.4.4, phantomjs 2.1.1, rtmpdump 2.4
[debug] Proxy map: {}
[intldropout] None: Downloading webpage
[vhx:embed] 397937: Downloading webpage
[vhx:embed] 397937: Downloading JSON metadata
[vhx:embed] 397937: Downloading m3u8 information
[vhx:embed] 397937: Downloading MPD manifest
[vhx:embed] 397937: Downloading JSON metadata
[debug] Default format spec: bestvideo+bestaudio/best
[debug] Invoking downloader on u'https://164skyfiregce-vimeo.akamaized.net/exp=1558469364~acl=%2F304872563%2F%2A~hmac=a71ae801a73cb1d0baf404077eb40064cc576a1a73cd848f7beae199d3943021/304872563/sep/video/1168802399,1168826090,1168826089,1168806310/../'
[dashsegments] Total fragments: 227
[download] Destination: Incantations, Immortans, Vulcans in Heat-397937.fdash-video-1168826090.mp4
[download]   0.3% of ~359.17MiB at  3.05MiB/s ETA 02:47^C
...
youtube-dl --cookie cookiefile https://intl.dropout.tv/um-actually/season:1/videos/ganondorf-gremlins-geography-of-westeros --verbose
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'--cookie', u'cookiefile', u'https://intl.dropout.tv/um-actually/season:1/videos/ganondorf-gremlins-geography-of-westeros', u'--verbose']
[debug] Encodings: locale UTF-8, fs UTF-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2019.03.01
[debug] Python version 2.7.15rc1 (CPython) - Linux-4.19.21-041921-generic-x86_64-with-Ubuntu-18.04-bionic
[debug] exe versions: ffmpeg 3.4.4, ffprobe 3.4.4, phantomjs 2.1.1, rtmpdump 2.4
[debug] Proxy map: {}
[intldropout] None: Downloading webpage
ERROR: Unable to extract embed; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/youtube_dl/YoutubeDL.py", line 794, in extract_info
    ie_result = ie.extract(url)
  File "/usr/local/lib/python2.7/dist-packages/youtube_dl/extractor/common.py", line 508, in extract
    ie_result = self._real_extract(url)
  File "/usr/local/lib/python2.7/dist-packages/youtube_dl/extractor/intldropout.py", line 84, in _real_extract
    video = self._html_search_regex(r'<iframe[^>]*"(?P<embed>https://embed.vhx.tv/videos/[0-9]+[^"]*)"[^>]*>', webpage, 'embed')
  File "/usr/local/lib/python2.7/dist-packages/youtube_dl/extractor/common.py", line 992, in _html_search_regex
    res = self._search_regex(pattern, string, name, default, fatal, flags, group)
  File "/usr/local/lib/python2.7/dist-packages/youtube_dl/extractor/common.py", line 983, in _search_regex
    raise RegexNotFoundError('Unable to extract %s' % _name)
RegexNotFoundError: Unable to extract embed; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

@tsia
Contributor Author

tsia commented Apr 25, 2019

i added a small check to prevent double logins. i am using the cookie file which seems to cause additional logins for some reason.

my account is currently at its device limit so as soon as i can use it again i will double check the videos that don't seem to work.

i would also like to know if there is a way to log out after youtube-dl is finished.

@tsia
Contributor Author

tsia commented Apr 25, 2019

@RomanKarwacik i checked your examples and they seem to work for me. maybe you too ran into the device limit? can you try again? with my latest change you should at least get a sensible error message in case you run into the limit

@Romern

Romern commented Apr 26, 2019

I tried to force the device limit by downloading the working link a couple times without cookie, and it worked as intended:

$ youtube-dl -u PRIVATE -p PRIVATE https://intl.dropout.tv/um-actually/season:1/videos/incantations-immortans-vulcans-in-heat --verbose

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-u', 'PRIVATE', '-p', 'PRIVATE', 'https://intl.dropout.tv/um-actually/season:1/videos/incantations-immortans-vulcans-in-heat', '--verbose']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2019.04.24
[debug] Python version 3.6.7 (CPython) - Linux-4.19.21-041921-generic-x86_64-with-Ubuntu-18.04-bionic
[debug] exe versions: ffmpeg 3.4.4, ffprobe 3.4.4, phantomjs 2.1.1, rtmpdump 2.4
[debug] Proxy map: {}
[intldropout] Downloading login page
[intldropout] Logging in
[intldropout] None: Downloading webpage
[vhx:embed] 397937: Downloading webpage
[vhx:embed] 397937: Downloading JSON metadata
[vhx:embed] 397937: Downloading m3u8 information
[vhx:embed] 397937: Downloading MPD manifest
[vhx:embed] 397937: Downloading JSON metadata
[debug] Default format spec: bestvideo+bestaudio/best
[debug] Invoking downloader on 'https://player.vimeo.com/skyfire/redirect/304872563.mpd?allow_fastly_packager=0&expires=1558971663&hevc=0&sig=e6d75afc1d1b361088e58ab4a4620db433184637'
[dashsegments] Total fragments: 227
[download] Destination: Incantations, Immortans, Vulcans in Heat-397937.fdash-video-1168826090.mp4
[download]   0.0% of ~359.17MiB at 395.64KiB/s ETA 02:01:31^C
ERROR: Interrupted by user
$ youtube-dl -u PRIVATE -p PRIVATE https://intl.dropout.tv/um-actually/season:1/videos/incantations-immortans-vulcans-in-heat --verbose

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-u', 'PRIVATE', '-p', 'PRIVATE', 'https://intl.dropout.tv/um-actually/season:1/videos/incantations-immortans-vulcans-in-heat', '--verbose']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2019.04.24
[debug] Python version 3.6.7 (CPython) - Linux-4.19.21-041921-generic-x86_64-with-Ubuntu-18.04-bionic
[debug] exe versions: ffmpeg 3.4.4, ffprobe 3.4.4, phantomjs 2.1.1, rtmpdump 2.4
[debug] Proxy map: {}
[intldropout] Downloading login page
[intldropout] Logging in
[intldropout] None: Downloading webpage
[vhx:embed] 397937: Downloading webpage
[vhx:embed] 397937: Downloading JSON metadata
[vhx:embed] 397937: Downloading m3u8 information
[vhx:embed] 397937: Downloading MPD manifest
[vhx:embed] 397937: Downloading JSON metadata
[debug] Default format spec: bestvideo+bestaudio/best
[debug] Invoking downloader on 'https://player.vimeo.com/skyfire/redirect/304872563.mpd?allow_fastly_packager=0&expires=1558971672&hevc=0&sig=b8935100e975c6d288b1755c35213e5ddf9180c0'
[dashsegments] Total fragments: 227
[download] Destination: Incantations, Immortans, Vulcans in Heat-397937.fdash-video-1168826090.mp4
[download]   2.6% of ~623.18MiB at  8.39MiB/s ETA 02:11^C
ERROR: Interrupted by user

$ youtube-dl -u PRIVATE -p PRIVATE https://intl.dropout.tv/um-actually/season:1/videos/incantations-immortans-vulcans-in-heat --verbose

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-u', 'PRIVATE', '-p', 'PRIVATE', 'https://intl.dropout.tv/um-actually/season:1/videos/incantations-immortans-vulcans-in-heat', '--verbose']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2019.04.24
[debug] Python version 3.6.7 (CPython) - Linux-4.19.21-041921-generic-x86_64-with-Ubuntu-18.04-bionic
[debug] exe versions: ffmpeg 3.4.4, ffprobe 3.4.4, phantomjs 2.1.1, rtmpdump 2.4
[debug] Proxy map: {}
[intldropout] Downloading login page
[intldropout] Logging in
[intldropout] None: Downloading webpage
[vhx:embed] 397937: Downloading webpage
[vhx:embed] 397937: Downloading JSON metadata
[vhx:embed] 397937: Downloading m3u8 information
[vhx:embed] 397937: Downloading MPD manifest
[vhx:embed] 397937: Downloading JSON metadata
[debug] Default format spec: bestvideo+bestaudio/best
[debug] Invoking downloader on 'https://player.vimeo.com/skyfire/redirect/304872563.mpd?allow_fastly_packager=0&expires=1558971682&hevc=0&sig=82fea89392c7225522339c1d8281d5e9b571208c'
[dashsegments] Total fragments: 227
[download] Destination: Incantations, Immortans, Vulcans in Heat-397937.fdash-video-1168826090.mp4
[download]   3.7% of ~671.34MiB at  8.47MiB/s ETA 00:43^C
ERROR: Interrupted by user
$ youtube-dl -u PRIVATE -p PRIVATE https://intl.dropout.tv/um-actually/season:1/videos/incantations-immortans-vulcans-in-heat --verbose

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-u', 'PRIVATE', '-p', 'PRIVATE', 'https://intl.dropout.tv/um-actually/season:1/videos/incantations-immortans-vulcans-in-heat', '--verbose']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2019.04.24
[debug] Python version 3.6.7 (CPython) - Linux-4.19.21-041921-generic-x86_64-with-Ubuntu-18.04-bionic
[debug] exe versions: ffmpeg 3.4.4, ffprobe 3.4.4, phantomjs 2.1.1, rtmpdump 2.4
[debug] Proxy map: {}
[intldropout] Downloading login page
[intldropout] Logging in
[intldropout] None: Downloading webpage
ERROR: Device Limit reached
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/youtube_dl-2019.4.24-py3.6.egg/youtube_dl/YoutubeDL.py", line 796, in extract_info
    ie_result = ie.extract(url)
  File "/usr/local/lib/python3.6/dist-packages/youtube_dl-2019.4.24-py3.6.egg/youtube_dl/extractor/common.py", line 529, in extract
    ie_result = self._real_extract(url)
  File "/usr/local/lib/python3.6/dist-packages/youtube_dl-2019.4.24-py3.6.egg/youtube_dl/extractor/intldropout.py", line 88, in _real_extract
    raise ExtractorError('Device Limit reached', expected=True)
youtube_dl.utils.ExtractorError: Device Limit reached

The other links which did not work before still fail with the same error:

youtube-dl --cookie cookiefile https://intl.dropout.tv/um-actually/season:1/videos/ganondorf-gremlins-geography-of-westeros --verbose
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['--cookie', 'cookiefile', 'https://intl.dropout.tv/um-actually/season:1/videos/ganondorf-gremlins-geography-of-westeros', '--verbose']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2019.04.24
[debug] Python version 3.6.7 (CPython) - Linux-4.19.21-041921-generic-x86_64-with-Ubuntu-18.04-bionic
[debug] exe versions: ffmpeg 3.4.4, ffprobe 3.4.4, phantomjs 2.1.1, rtmpdump 2.4
[debug] Proxy map: {}
[intldropout] None: Downloading webpage
ERROR: Unable to extract embed; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/youtube_dl-2019.4.24-py3.6.egg/youtube_dl/YoutubeDL.py", line 796, in extract_info
    ie_result = ie.extract(url)
  File "/usr/local/lib/python3.6/dist-packages/youtube_dl-2019.4.24-py3.6.egg/youtube_dl/extractor/common.py", line 529, in extract
    ie_result = self._real_extract(url)
  File "/usr/local/lib/python3.6/dist-packages/youtube_dl-2019.4.24-py3.6.egg/youtube_dl/extractor/intldropout.py", line 89, in _real_extract
    video = self._html_search_regex(r'<iframe[^>]*"(?P<embed>https://embed.vhx.tv/videos/[0-9]+[^"]*)"[^>]*>', webpage, 'embed')
  File "/usr/local/lib/python3.6/dist-packages/youtube_dl-2019.4.24-py3.6.egg/youtube_dl/extractor/common.py", line 1013, in _html_search_regex
    res = self._search_regex(pattern, string, name, default, fatal, flags, group)
  File "/usr/local/lib/python3.6/dist-packages/youtube_dl-2019.4.24-py3.6.egg/youtube_dl/extractor/common.py", line 1004, in _search_regex
    raise RegexNotFoundError('Unable to extract %s' % _name)
youtube_dl.utils.RegexNotFoundError: Unable to extract embed; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

@tsia
Contributor Author

tsia commented Apr 26, 2019

that is so weird. for me this link is working fine.

/usr/local/src/youtube-dl# ./youtube-dl --cookie cookies.txt https://intl.dropout.tv/um-actually/season:1/videos/ganondorf-gremlins-geography-of-westeros --verbose
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'--cookie', u'cookies.txt', u'https://intl.dropout.tv/um-actually/season:1/videos/ganondorf-gremlins-geography-of-westeros', u'--verbose']
[debug] Encodings: locale UTF-8, fs UTF-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2019.04.24
[debug] Python version 2.7.13 (CPython) - Linux-4.9.0-8-amd64-x86_64-with-debian-9.8
[debug] exe versions: ffmpeg 3.2.12-1, ffprobe 3.2.12-1
[debug] Proxy map: {}
[intldropout] None: Downloading webpage
[vhx:embed] 397938: Downloading webpage
[vhx:embed] 397938: Downloading JSON metadata
[vhx:embed] 397938: Downloading m3u8 information
[vhx:embed] 397938: Downloading MPD manifest
[vhx:embed] 397938: Downloading JSON metadata
[debug] Default format spec: bestvideo+bestaudio/best
[debug] Invoking downloader on u'https://player.vimeo.com/skyfire/redirect/304872562.mpd?allow_fastly_packager=0&expires=1558740558&hevc=0&sig=da9bb9f7954001a55af525bfec524f0b9efe1145'
[dashsegments] Total fragments: 207
[download] Destination: Ganondorf, Gremlins, Geography of Westeros-397938.fdash-video-1168829462.mp4
[download] 100% of 636.46MiB in 01:44
[debug] Invoking downloader on u'https://player.vimeo.com/skyfire/redirect/304872562.mpd?allow_fastly_packager=0&expires=1558740558&hevc=0&sig=da9bb9f7954001a55af525bfec524f0b9efe1145'
[dashsegments] Total fragments: 207
[download] Destination: Ganondorf, Gremlins, Geography of Westeros-397938.fdash-audio-1168802403.m4a
[download] 100% of 20.47MiB in 01:04
[ffmpeg] Merging formats into "Ganondorf, Gremlins, Geography of Westeros-397938.mp4"
[debug] ffmpeg command line: ffmpeg -y -loglevel 'repeat+info' -i 'file:Ganondorf, Gremlins, Geography of Westeros-397938.fdash-video-1168829462.mp4' -i 'file:Ganondorf, Gremlins, Geography of Westeros-397938.fdash-audio-1168802403.m4a' -c copy -map '0:v:0' -map '1:a:0' 'file:Ganondorf, Gremlins, Geography of Westeros-397938.temp.mp4'
Deleting original file Ganondorf, Gremlins, Geography of Westeros-397938.fdash-video-1168829462.mp4 (pass -k to keep)
Deleting original file Ganondorf, Gremlins, Geography of Westeros-397938.fdash-audio-1168802403.m4a (pass -k to keep)

@Romern

Romern commented Apr 26, 2019

I used curl to download the pages manually:

$ curl -s -b cookiefile https://intl.dropout.tv/um-actually/season:1/videos/incantations-immortans-vulcans-in-heat | grep 'https://embed.vhx.tv/videos'
<iframe src="https://embed.vhx.tv/videos/397937?api=1&amp;autoplay=1&amp;back=Um%2C%20Actually%20%26ndash%3B%20Season%201&amp;color=feea3b&amp;context=https%3A%2F%2Fintl.dropout.tv%2Fum-actually%2Fseason%3A1&amp;live=0&amp;playsinline=1&amp;referrer=&amp;sharing=1&amp;title=0" allow="encrypted-media; autoplay; fullscreen" id="watch-embed" class="sticky-player-child embed-content border-reset" title="Video Player" frameborder="0" webkitAllowFullScreen mozallowfullscreen allowFullScreen></iframe>
      embed_url: "https://embed.vhx.tv/videos/397937?api=1&amp;autoplay=1&amp;back=Um%2C%20Actually%20%26ndash%3B%20Season%201&amp;color=feea3b&amp;context=https%3A%2F%2Fintl.dropout.tv%2Fum-actually%2Fseason%3A1&amp;live=0&amp;playsinline=1&amp;referrer=&amp;sharing=1&amp;title=0",
$ curl -s -b cookiefile https://intl.dropout.tv/um-actually/season:1/videos/ganondorf-gremlins-geography-of-westeros | grep 'https://embed.vhx.tv/videos'
    <div class="embed" data-trailer-url="https://embed.vhx.tv/videos/405679"></div>
      embed_url: "https://embed.vhx.tv/videos/397938?api=1&amp;autoplay=1&amp;back=Um%2C%20Actually%20%26ndash%3B%20Season%201&amp;color=feea3b&amp;context=https%3A%2F%2Fintl.dropout.tv%2Fum-actually%2Fseason%3A1&amp;live=0&amp;playsinline=1&amp;referrer=&amp;sharing=1&amp;title=0",
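
The two curl outputs above show the embed URL in two places: an `<iframe src="...">` attribute (only on some responses) and an `embed_url:` field in inline JavaScript. Below is a minimal standalone sketch (Python 3) of a lookup that tries both. The patterns and the `find_vhx_embed` name are illustrative assumptions, not the extractor's actual code, and finding an `embed_url` does not by itself mean the session is authorized to play the video.

```python
import re
from html import unescape

# Illustrative patterns based only on the two HTML shapes shown in this thread.
_IFRAME_RE = re.compile(
    r'<iframe[^>]+\bsrc="(?P<embed>https://embed\.vhx\.tv/videos/\d+[^"]*)"')
_EMBED_URL_RE = re.compile(
    r'embed_url:\s*"(?P<embed>https://embed\.vhx\.tv/videos/\d+[^"]*)"')


def find_vhx_embed(webpage):
    """Return the HTML-unescaped VHX embed URL, or None if none is found.

    Hypothetical helper: tries the iframe first, then the inline
    `embed_url:` field. The trailer div (data-trailer-url) is ignored.
    """
    for pattern in (_IFRAME_RE, _EMBED_URL_RE):
        match = pattern.search(webpage)
        if match:
            return unescape(match.group('embed'))
    return None
```

A caller would still need to decide what to do when only the `embed_url:` form is present, since in this thread that shape appeared on a page fetched with a stale session.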

@tsia
Contributor Author

tsia commented Apr 26, 2019

i get this:

# curl -s -b cookies.txt https://intl.dropout.tv/um-actually/season:1/videos/ganondorf-gremlins-geography-of-westeros | grep 'https://embed.vhx.tv/videos'
<iframe src="https://embed.vhx.tv/videos/397938?api=1&amp;auth-user-token=eyJhbGciOiJIUzI1NiJ9.eyJ1c2VyX2lkIjo1Mzk0NDEwLCJleHAiOjE1NTYyNzUyMTN9.HU-FmyyCEx5UR5lZSkewsecyLupytzTq0xoEG5hNHGc&amp;autoplay=1&amp;back=Um%2C%20Actually%20%26ndash%3B%20Season%201&amp;color=feea3b&amp;context=https%3A%2F%2Fintl.dropout.tv%2Fum-actually%2Fseason%3A1&amp;live=0&amp;playsinline=1&amp;referrer=&amp;sharing=1&amp;title=0" allow="encrypted-media; autoplay; fullscreen" id="watch-embed" class="sticky-player-child embed-content border-reset" title="Video Player" frameborder="0" webkitAllowFullScreen mozallowfullscreen allowFullScreen></iframe>
      embed_url: "https://embed.vhx.tv/videos/397938?api=1&amp;auth-user-token=eyJhbGciOiJIUzI1NiJ9.eyJ1c2VyX2lkIjo1Mzk0NDEwLCJleHAiOjE1NTYyNzUyMTN9.HU-FmyyCEx5UR5lZSkewsecyLupytzTq0xoEG5hNHGc&amp;autoplay=1&amp;back=Um%2C%20Actually%20%26ndash%3B%20Season%201&amp;color=feea3b&amp;context=https%3A%2F%2Fintl.dropout.tv%2Fum-actually%2Fseason%3A1&amp;live=0&amp;playsinline=1&amp;referrer=&amp;sharing=1&amp;title=0",

what i noticed: when i'm not logged in and go to https://intl.dropout.tv/um-actually/season:1/videos/ganondorf-gremlins-geography-of-westeros in my browser and check the source i get the same html as you do. so somehow the session is broken

have you tried to also add -c cookiefile? Maybe the cookies change and need to be updated in the file

@Romern

Romern commented Apr 26, 2019

I imported my browser cookie file and it seems to be working fine, both the playlists and the videos. The videos which worked with the (probably) expired cookie file were those which did not need any login at all. Maybe an additional check for "START YOUR FREE TRIAL" or similar in the web page could help prevent this.

@Romern

Romern commented Apr 27, 2019

My subscription has now expired, unfortunately, but I just noticed that the playlist functionality only considers the first page. For this page, https://intl.dropout.tv/um-actually, it only shows 24 videos, even though there are 29 in total, with 5 on a second page.
Also, when logged in with an expired subscription, it shows that I am not logged in. This could be fixed by checking whether "Sign out" is on the page.
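
The two page markers suggested in this thread ("Sign out" and "START YOUR FREE TRIAL") could be combined into one small check. The following is only a sketch under the assumption that those strings appear in the page as quoted here; the `session_hints` name and the returned keys are hypothetical.

```python
def session_hints(webpage):
    """Report which of the marker strings from this thread occur in the page.

    Assumption: the site prints "Sign out" for logged-in users and a
    "START YOUR FREE TRIAL" banner when the viewer has no active subscription.
    Matching is case-insensitive to tolerate styling differences.
    """
    lowered = webpage.lower()
    return {
        'logged_in': 'sign out' in lowered,
        'paywalled': 'start your free trial' in lowered,
    }
```

An extractor could use such hints to raise a clear "not logged in" or "no subscription" error instead of failing later with "Unable to extract embed".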

@tsia
Contributor Author

tsia commented May 8, 2019

i just added support for multiple pages on playlists (and updated some tests because dropout moved some shows around)

@Qazerowl

Are requests 4&5 still being evaluated, or are further changes required?

@tsia tsia requested a review from dstftw November 1, 2019 18:08
@Qazerowl

@tsia I just want to thank you and @dstftw for working on this. I've been following this addition for 6 months and I'm looking forward to it being included in the main branch so I don't have to keep juggling the versions.

@tsia
Contributor Author

tsia commented Dec 2, 2019

@Qazerowl Thanks. I try to keep my fork up to date regularly as i am using it myself until this is merged.

I noticed that Dropout videos are currently being downloaded with the title "Untitled", but I'm not sure if this is an issue in my code or in the VHX extractor.

@putnam

putnam commented Jan 27, 2020

I've been playing with this the last couple days and have some feedback.

First, there's no reason to single out intl.dropout.tv -- the US-based site is identical and works just fine, so it'd be better to match on www, intl, and dropout.tv. And accordingly it might be better to just call this IE "dropout" instead of intldropout.
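
As a rough illustration of matching all three hosts, here is a standalone sketch. The `_VALID_URL` pattern and the `is_dropout_url` helper are hypothetical and are not taken from the PR; the real extractor would also need to tell video pages from playlist pages.

```python
import re

# Hypothetical pattern: optional "www." or "intl." prefix, then dropout.tv.
_VALID_URL = r'https?://(?:(?:www|intl)\.)?dropout\.tv/(?P<path>[^?#]+)'


def is_dropout_url(url):
    """Return True if the URL is on www.dropout.tv, intl.dropout.tv or dropout.tv."""
    return re.match(_VALID_URL, url) is not None
```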

The "Untitled" title issue is because while the initial dropout extractor is getting the correct video title from the dropout.tv web page, the title is just lost when the VHX embed IE runs and a new JSON config is applied:
https://github.com/tsia/youtube-dl/blob/master/youtube_dl/extractor/vimeo.py#L1149

The JSON metadata pulled from the VHX embed page clobbers all the existing metadata pulled from Dropout. I don't know why the JSON metadata at Vimeo has "Untitled" for most of the videos at Dropout (not all, seems to be newer ones) but it's likely they just didn't fill the info in properly on their end and it didn't matter because the embeds don't need it. This IE needs a way to pass the title along to the VHX Embed IE so it's maintained, as the JSON data cannot be trusted from VHX.

I spent a couple hours getting up to speed on this code base to try to figure out the best method to fix the title issue but I'm not entirely sure. The url_result functionality provides a video_title argument but it seems to not be carried through in this case. It's clear the dropout IE is pretty much a simple wrapper around the vhx:embed IE but it needs some additional work to pull titles properly.

It's also likely Dropout is gone very soon because of the massive layoffs within IAC: https://www.thewrap.com/collegehumor-hit-layoffs-iac-stops-funding/

Edit: Yeah, the video_title parameter doesn't seem to be used at all in youtube-dl. When an IE of type url is encountered, those info parameters are ignored and discarded: https://github.com/tsia/youtube-dl/blob/master/youtube_dl/YoutubeDL.py#L863

It kind of seems like this stanza should check for the presence of id and video_title parameters in the IE and populate extra_info with them. But ultimately it gets clobbered anyway because the JSON data is prioritized (add_extra_info ignores keys that already exist).
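
The "ignores keys that already exist" behaviour described above can be reproduced in a few lines. This is a standalone re-creation for illustration only, with a hypothetical function name; it is not copied from youtube-dl.

```python
def fill_missing(info_dict, extra_info):
    """Copy keys from extra_info into info_dict only where they are absent.

    Mirrors the behaviour described in this comment: existing keys win,
    so a title already set (even to "Untitled") is never replaced.
    """
    for key, value in extra_info.items():
        info_dict.setdefault(key, value)
    return info_dict
```

With this merge rule, a title of "Untitled" coming from the embed's JSON survives, and only keys the JSON did not provide are taken from the wrapping extractor.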

@tsia
Contributor Author

tsia commented Jan 27, 2020

Thanks for the feedback!

i wasn't sure if www.dropout.tv was exactly the same as intl.dropout.tv. It looked like both are using Vimeo's VHX platform, but i wasn't able to verify how similar the frontend was.

It's very unclear what will happen to Dropout / CH. intl subscribers recently got an email saying intl.dropout.tv will be discontinued and merged into www.dropout.tv. I would like to wait until this has happened and when i have some spare time i will try to move everything into the generic VHX Extractor.

@putnam

putnam commented Jan 27, 2020

I was able to fix everything with this diff, which I'll just paste for you to take a look and adopt however you like. It's purely in the vimeo extractor. But also I'd go ahead and change intldropout to support www so everyone can use it. At the moment US-based users are unable to use your fork without patching that.

I have not tested this with VHX directly. I expect it will work fine though.

diff --git a/youtube_dl/extractor/vimeo.py b/youtube_dl/extractor/vimeo.py
index df92ab687..83c55d6c5 100644
--- a/youtube_dl/extractor/vimeo.py
+++ b/youtube_dl/extractor/vimeo.py
@@ -1149,10 +1149,21 @@ class VHXEmbedIE(VimeoBaseInfoExtractor):
     def _real_extract(self, url):
         video_id = self._match_id(url)
         webpage = self._download_webpage(url, video_id)
-        config_url = self._parse_json(self._search_regex(
+        vhx_config_obj = self._parse_json(self._search_regex(
             r'window\.OTTData\s*=\s*({.+})', webpage,
-            'ott data'), video_id, js_to_json)['config_url']
-        config = self._download_json(config_url, video_id)
-        info = self._parse_config(config, video_id)
+            'ott data'), video_id, js_to_json)
+        vimeo_player_config_url = vhx_config_obj['config_url']
+
+        vimeo_player_config = self._download_json(vimeo_player_config_url, video_id)
+        info = self._parse_config(vimeo_player_config, video_id)
         self._vimeo_sort_formats(info['formats'])
+
+        # The above _parse_config call has replaced title and id with the title and ID from Vimeo.
+        # VHX uses Vimeo as its backend. Video IDs on Vimeo's backend will differ from VHX, and sometimes titles do, too.
+        # Prioritizing the title and ID from VHX will guarantee the right title and video ID are given to the client.
+        # This applies as well to downstream services that use VHX/VimeoOTT, like Dropout.
+        info['title'] = vhx_config_obj['video']['title']
+        info['id'] = compat_str(vhx_config_obj['video']['id'])
+
         return info
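
The override at the end of the diff can be looked at in isolation. The sketch below is a standalone variant, not the patch itself: it assumes the `window.OTTData` object has the `video.title` and `video.id` fields the diff reads, and unlike the diff it falls back to the Vimeo values when those fields are missing. The `prefer_vhx_metadata` name is hypothetical.

```python
def prefer_vhx_metadata(info, ott_data):
    """Return a copy of info with title and id taken from VHX when available.

    info:     dict built from the Vimeo player config (may say "Untitled").
    ott_data: dict parsed from window.OTTData on the VHX embed page
              (assumed shape: {'video': {'title': ..., 'id': ...}}).
    """
    video = ott_data.get('video') or {}
    merged = dict(info)
    if video.get('title'):
        merged['title'] = video['title']
    if video.get('id') is not None:
        # IDs are kept as strings, matching the compat_str call in the diff.
        merged['id'] = str(video['id'])
    return merged
```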

@putnam

putnam commented Jan 27, 2020

Oh, also, the regex for the "Show More" playlist expansion is no longer correct. The URL can look like ?html=1&page=2, so we need to cover that use case in intldropout.py:

-            next_page_url = self._search_regex(r'href="(/[^\?]+\?page=\d+)"', webpage, 'next page url', default=None)
+            next_page_url = self._search_regex(r'href="(/[^\?]+\?.*?page=\d+)"', webpage, 'next page url', default=None)
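
The difference between the two patterns can be checked outside youtube-dl with plain `re`. In this sketch the sample HTML and the `find_next_page` helper are made up for illustration; only the two regular expressions come from the lines above.

```python
import re

OLD_NEXT_PAGE_RE = r'href="(/[^\?]+\?page=\d+)"'
NEW_NEXT_PAGE_RE = r'href="(/[^\?]+\?.*?page=\d+)"'


def find_next_page(pattern, webpage):
    """Return the first next-page path matched by pattern, or None."""
    match = re.search(pattern, webpage)
    return match.group(1) if match else None


# Hypothetical "Show More" link with an extra query parameter before page=.
sample = '<a href="/um-actually?html=1&page=2">Show More</a>'
old_result = find_next_page(OLD_NEXT_PAGE_RE, sample)
new_result = find_next_page(NEW_NEXT_PAGE_RE, sample)
```

One caveat with the looser pattern: `.*?` can also run past the closing quote into a later attribute on the same line, so a stricter variant such as `[^"]*?` in its place may be worth considering.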

@tsia tsia changed the title from "[intldropout] Add new extractor" to "[dropout] Add new extractor" Mar 1, 2020
dstftw and others added 27 commits May 23, 2021 16:01
@tsia tsia closed this May 23, 2021
@tsia tsia deleted the branch ytdl-org:master May 23, 2021 14:03
@tsia tsia deleted the master branch May 23, 2021 14:03