[youtube] fix: playlist #150
base: master
OK, so the good news is that it seems to be working with all playlist types. There's also a new fix offered here:
This works for me 👍
I can confirm that this works, at least for normal playlists and channels. Edit: Never mind, I see that it has already been reviewed :)
Strange, this commit isn't showing up here: insaneracist@b2a462a. Edit: that superfluous commit woke it up.
it's never ending 🤣
converting this to draft for now.
a779979 to 29e9c94
That's because those are mix playlists; they start with a prefix of
It does end, actually. I ran 3 tests and the results are: 384, 402, 415
You might not need this. Playlists with certain prefixes (known as mixes) can sometimes contain a lot of pages. My suggestion would be to check whether it's a mix, fetch just the first page by default, and add an argument that sets the maximum number of fetches for a mix playlist.
See
yt-dlc/youtube_dlc/extractor/youtube.py
Lines 2847 to 2877 in 29e9c94
    def _extract_mix(self, playlist_id):
        # The mixes are generated from a single video
        # the id of the playlist is just 'RD' + video_id
        ids = []
        yt_initial = None
        last_id = playlist_id[-11:]
        for n in itertools.count(1):
            url = 'https://www.youtube.com/watch?v=%s&list=%s' % (last_id, playlist_id)
            webpage = self._download_webpage(
                url, playlist_id, 'Downloading page {0} of Youtube mix'.format(n))
            new_ids = orderedSet(re.findall(
                r'''(?xs)data-video-username=".*?".*?
                           href="/watch\?v=([0-9A-Za-z_-]{11})&[^"]*?list=%s''' % re.escape(playlist_id),
                webpage))
            # if no ids in html of page, try using embedded json
            if (len(new_ids) == 0):
                yt_initial = self._get_yt_initial_data(playlist_id, webpage)
                if yt_initial:
                    new_ids = self._extract_mix_ids_from_yt_initial(yt_initial)
            # Fetch new pages until all the videos are repeated, it seems that
            # there are always 51 unique videos.
            new_ids = [_id for _id in new_ids if _id not in ids]
            if not new_ids:
                break
            ids.extend(new_ids)
            last_id = ids[-1]
        url_results = self._ids_to_results(ids)
yt-dlc/youtube_dlc/extractor/youtube.py
Lines 3051 to 3059 in 29e9c94
        if playlist_id.startswith(('RD', 'UL', 'PU')):
            if not playlist_id.startswith(self._YTM_PLAYLIST_PREFIX):
                # Mixes require a custom extraction process,
                # Youtube Music playlists act like normal playlists (with randomized order)
                return self._extract_mix(playlist_id)
        has_videos, playlist = self._extract_playlist(playlist_id)
        if has_videos or not video_id:
            return playlist
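To make the suggestion concrete, here is a rough sketch of what a capped mix fetch could look like. It is not the extractor's actual code: `fetch_page`, `max_pages` and `ytm_prefix` are hypothetical names made up for illustration, and `fetch_page` stands in for the download-and-parse step in `_extract_mix`. Only the `RD`/`UL`/`PU` prefixes and the "stop when a page adds nothing new" rule come from the excerpts above.

```python
import itertools

# Prefixes checked in the excerpt above before calling _extract_mix.
MIX_PREFIXES = ('RD', 'UL', 'PU')


def is_mix(playlist_id, ytm_prefix=None):
    """Return True if the id looks like a mix rather than a normal playlist.

    ytm_prefix is a hypothetical stand-in for self._YTM_PLAYLIST_PREFIX;
    ids with that prefix are treated as normal playlists.
    """
    if ytm_prefix and playlist_id.startswith(ytm_prefix):
        return False
    return playlist_id.startswith(MIX_PREFIXES)


def collect_mix_ids(playlist_id, fetch_page, max_pages=1):
    """Collect unique video ids from a mix, fetching at most max_pages pages.

    fetch_page(last_id, playlist_id) is a hypothetical callable that returns
    the video ids found on one page. max_pages=None means "no cap", in which
    case the loop only stops once a page contributes no new ids.
    """
    ids = []
    last_id = playlist_id[-11:]  # a mix id ends with the seed video id
    for n in itertools.count(1):
        if max_pages is not None and n > max_pages:
            break
        added = False
        for video_id in fetch_page(last_id, playlist_id):
            if video_id not in ids:
                ids.append(video_id)
                added = True
        if not added:
            break  # every id on this page was already seen
        last_id = ids[-1]
    return ids
```

Defaulting `max_pages` to 1 matches the "fetch just the first page" idea, and a command-line option could raise the cap for anyone who wants the longer mix.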
Just don't give up on this yet.
As I am super tired, I will merge #151 now so that there is at least a working version out there. Will take another look tomorrow.
cf38793 to 2fd8290
@blackjack4494, thanks, I was about to give up. The problem was not sending enough client information, so it kept returning the initial piece of the playlist (but only for some types).
@SoneeJohn, the playlists starting with
So what's the state of this one, @insaneracist?
Fixes #148
Quick hack that needs testing