[dropout] Add new extractor #19296
well... turns out, when you log in too many times dropout blocks you temporarily |
and there are my tests finally |
Just pinging this. |
For any further work and review on this you must provide account credentials. |
I have sent you an email with the credentials. I hope this is OK. |
Single videos work:
but playlists are throwing errors:
EDIT: Also, after finishing the process, youtube-dl should log out of the account, because there is a 3-device limit which is based on the number of logins, not on whether you are actually watching.
|
i added a small check to prevent double logins. i am using the cookie file which seems to cause additional logins for some reason. my account is currently at its device limit so as soon as i can use it again i will double check the videos that don't seem to work. i would also like to know if there is a way to log out after youtube-dl is finished. |
@RomanKarwacik i checked your examples and they seem to work for me. maybe you too ran into the device limit? can you try again? with my latest change you should at least get a sensible error message in case you run into the limit |
I tried to force the device limit by downloading the working link a couple of times without a cookie file, and it worked as intended:
The other links which did not work before still do not work:
|
that is so weird. for me this link is working fine.
|
I used curl to download the pages manually:
|
i get this:
what i noticed: when i'm not logged in and go to https://intl.dropout.tv/um-actually/season:1/videos/ganondorf-gremlins-geography-of-westeros in my browser and check the source, i get the same html as you do. so somehow the session is broken. have you tried to also add |
I imported my browser cookie file and it seems to be working fine, both the playlists and the videos. The videos which worked with the (probably) expired cookie file were those which did not need any login at all. Maybe an additional check for "START YOUR FREE TRIAL" or similar in the web page could help prevent this. |
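The check suggested above could look roughly like the following. This is a minimal sketch, not code from the PR: the helper names are hypothetical, and it assumes the "START YOUR FREE TRIAL" banner only appears on pages served without a valid subscription session.

```python
import re

# Assumption: the free-trial banner is only present on logged-out pages.
_LOGGED_OUT_RE = re.compile(r'start\s+your\s+free\s+trial', re.IGNORECASE)


def looks_logged_out(webpage):
    """Return True if the page HTML contains the free-trial banner."""
    return _LOGGED_OUT_RE.search(webpage) is not None


def ensure_logged_in(webpage):
    """Raise a readable error instead of silently parsing a teaser page."""
    if looks_logged_out(webpage):
        raise RuntimeError(
            'Page looks like the logged-out version; '
            'the session or cookie file may have expired')
```

Calling `ensure_logged_in(webpage)` right after downloading a video page would turn an expired cookie file into an explicit error rather than a confusing extraction failure.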
My subscription has now expired unfortunately, but I just noticed that the playlist functionality only considers the first page. For this page: https://intl.dropout.tv/um-actually , it only shows 24 videos, even though there are 29 in total, with 5 on a second page. |
i just added support for multiple pages on playlists (and updated some tests because dropout moved some shows around) |
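For readers following along, multi-page collection can be sketched as below. This is not the code from the PR: the markup patterns, the `fetch_page` callback and the `max_pages` guard are all assumptions made for illustration.

```python
import re

# Hypothetical markup: a "Show More" link whose href carries the next page.
_NEXT_PAGE_RE = re.compile(
    r'href="[^"]*[?&]page=(\d+)[^"]*"[^>]*>\s*Show More')
# Hypothetical markup: links to individual video pages.
_VIDEO_RE = re.compile(r'href="(/[^"]+/videos/[^"]+)"')


def collect_playlist_urls(fetch_page, max_pages=50):
    """Follow "Show More" links and return video paths from every page.

    fetch_page(n) is a caller-supplied function returning the HTML of page n.
    """
    urls = []
    page = 1
    while page is not None and page <= max_pages:
        html = fetch_page(page)
        for path in _VIDEO_RE.findall(html):
            if path not in urls:  # keep order, drop duplicates
                urls.append(path)
        match = _NEXT_PAGE_RE.search(html)
        next_page = int(match.group(1)) if match else None
        # Stop when there is no link or the link does not advance.
        page = next_page if next_page and next_page > page else None
    return urls
```

The `max_pages` cap and the "must advance" check are there so a malformed page cannot send the loop around forever.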
Are requests 4&5 still being evaluated, or are further changes required? |
@Qazerowl Thanks. I try to keep my fork up to date regularly as i am using it myself until this is merged. I noticed that Dropout videos are currently being downloaded with the title "Untitled", but I'm not sure if this is an issue in my code or in the VHX extractor. |
I've been playing with this the last couple days and have some feedback.
First, there's no reason to single out intl.dropout.tv: the US-based site is identical and works just fine, so it'd be better to match on www.dropout.tv, intl.dropout.tv, and dropout.tv. Accordingly, it might be better to just call this IE "dropout" instead of intldropout.
The "Untitled" title issue arises because, while the initial dropout extractor gets the correct video title from the dropout.tv web page, that title is lost when the VHX embed IE runs and a new JSON config is applied: the JSON metadata pulled from the VHX embed page clobbers all the existing metadata pulled from Dropout. I don't know why the JSON metadata at Vimeo has "Untitled" for most of the videos at Dropout (not all; it seems to be the newer ones), but it's likely they just didn't fill the info in properly on their end and it didn't matter because the embeds don't need it. This IE needs a way to pass the title along to the VHX embed IE so it's maintained, as the JSON data from VHX cannot be trusted.
I spent a couple hours getting up to speed on this code base to try to figure out the best method to fix the title issue, but I'm not entirely sure. The url_result functionality provides a video_title argument, but it seems not to be carried through in this case. It's clear the dropout IE is pretty much a simple wrapper around the vhx:embed IE, but it needs some additional work to pull titles properly.
It's also likely Dropout is gone very soon because of the massive layoffs within IAC: https://www.thewrap.com/collegehumor-hit-layoffs-iac-stops-funding/
Edit: Yeah, the video_title parameter doesn't seem to be used at all in youtube-dl. When a result of type url is encountered, those info parameters are ignored and discarded: https://github.com/tsia/youtube-dl/blob/master/youtube_dl/YoutubeDL.py#L863 It kind of seems like this stanza should check for the presence of id and video_title parameters in the result and populate extra_info with them. But ultimately it gets clobbered anyway because the JSON data is prioritized (add_extra_info ignores keys that already exist). |
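The clobbering described above can be illustrated with a small standalone sketch. The first function is a stand-in modeled on the behavior stated in the comment ("ignores keys that already exist"), not code copied from youtube-dl; `apply_trusted_overrides` is a hypothetical alternative, and the dictionaries are made-up sample data.

```python
def add_extra_info(info_dict, extra_info):
    """Stand-in: fill in missing keys only; existing keys are kept."""
    for key, value in extra_info.items():
        info_dict.setdefault(key, value)
    return info_dict


def apply_trusted_overrides(info_dict, overrides):
    """Hypothetical alternative: outer values replace inner ones."""
    for key, value in overrides.items():
        if value is not None:
            info_dict[key] = value
    return info_dict


# Made-up sample data: what the embed JSON yields vs. what the outer
# extractor scraped from the site page.
embed_info = {'id': '123', 'title': 'Untitled'}
outer_info = {'title': 'Title from the Dropout page'}

kept = add_extra_info(dict(embed_info), outer_info)
fixed = apply_trusted_overrides(dict(embed_info), outer_info)
print(kept['title'])   # the embed's 'Untitled' survives
print(fixed['title'])  # the outer title wins
```

Under the "fill missing keys only" rule, passing the title through `extra_info` cannot help once the embed JSON has already supplied a `title` key, which is why the fix has to happen where the embed metadata is read.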
Thanks for the feedback! i wasn't sure if www.dropout.tv was exactly the same as intl.dropout.tv. It looked like both are using Vimeo's VHX platform but i wasn't able to verify how similar the frontend was. It's very unclear what will happen to Dropout / CH. intl subscribers recently got an email saying intl.dropout.tv will be discontinued and merged into www.dropout.tv. I would like to wait until this has happened, and when i have some spare time i will try to move everything into the generic VHX extractor. |
I was able to fix everything with this diff, which I'll just paste for you to take a look and adopt however you like. It's purely in the vimeo extractor. But also I'd go ahead and change intldropout to support www so everyone can use it. At the moment US-based users are unable to use your fork without patching that. I have not tested this with VHX directly. I expect it will work fine though.
|
Oh, also, the regex for the "Show More" playlist expansion is no longer correct. The URL can look like ?html=1&page=2, so the regex in intldropout.py needs to cover that case:
|
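The snippet originally posted with the comment above is not preserved here. As a rough sketch of the idea only (the helper name and pattern are assumptions, not the PR's code), a page-number pattern that accepts `page` as either the first or a later query parameter could look like this:

```python
import re

# Accept "page" whether it follows "?" (first parameter) or "&" (later one).
_PAGE_RE = re.compile(r'[?&]page=(\d+)')


def next_page_number(show_more_url):
    """Return the page number from a "Show More" URL, or None if absent."""
    match = _PAGE_RE.search(show_more_url)
    return int(match.group(1)) if match else None
```

Requiring `?` or `&` directly before `page=` avoids accidentally matching a differently named parameter that merely ends in "page".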
Co-authored-by: Sergey M. <[email protected]>
No longer exists.
[ci skip]
Description of your pull request and other information
this adds support for intl.dropout.tv (logging in etc).
the videos themselves are hosted on vhx, so the video download itself is already part of youtube-dl.
unfortunately i wasn't able to get the tests to use my credentials for the site. if someone can help me out i'm happy to add them.
(see #19146)