[SpankBangPlaylist] Add new extractor #19145
youtube_dl/extractor/spankbang.py
Outdated
@@ -12,7 +12,7 @@
 class SpankBangIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:(?:www|m|[a-z]{2})\.)?spankbang\.com/(?P<id>[\da-z]+)/video'
+    _VALID_URL = r'https?://(?:(?:www|m|[a-z]{2})\.)?spankbang\.com/(?P<id>[\da-z-]+)/(?:video|playlist)'
This should not match playlist URLs.
This was meant to match single playlist videos, but I changed it back.
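The point of this exchange can be checked with a minimal sketch that uses only Python's `re` module: the original single-video pattern from the diff accepts `/video` URLs and rejects `/playlist` URLs. The helper name `match_video_id` and both sample URLs are hypothetical and exist only for illustration.

```python
import re

# Original single-video pattern, as quoted in the diff above.
VIDEO_URL_RE = (
    r'https?://(?:(?:www|m|[a-z]{2})\.)?spankbang\.com/(?P<id>[\da-z]+)/video'
)


def match_video_id(url):
    """Return the video id if url looks like a single-video URL, else None."""
    m = re.match(VIDEO_URL_RE, url)
    return m.group('id') if m else None


# Hypothetical sample URLs, for illustration only.
video_url = 'https://spankbang.com/3vvn/video/some+title'
playlist_url = 'https://spankbang.com/ug0k/playlist/some+name'
```

With the pattern left unchanged, playlist URLs fall through to whichever extractor is responsible for playlists instead of being claimed by the video extractor.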
youtube_dl/extractor/spankbang.py
Outdated
        'http://www.%s/%s' % ('spankbang.com', video_url),
        SpankBangIE.ie_key())
    for video_url in re.findall(
        r'href="/?([\da-z-]+/playlist/[^"]+)', webpage)
- This is too broad, regex should be restricted by playlist id.
- This captures duplicates.
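One way to address both points, sketched with the standard library only: anchor the pattern on the playlist id and drop repeated matches while keeping page order. The function name `playlist_entry_paths` and the sample markup are hypothetical. The `<playlist_id>-<item_id>/playlist/...` link shape is taken from the snippets in this review, not from the live site.

```python
import re


def playlist_entry_paths(webpage, playlist_id):
    """Return unique entry paths for one playlist, in page order."""
    # Anchor the pattern on the escaped playlist id so that links
    # belonging to other playlists on the same page are ignored.
    pattern = (
        r'href="/?(%s-[\da-z]+/playlist/[^"]+)' % re.escape(playlist_id)
    )
    # dict.fromkeys drops repeats and keeps first-seen order (Python 3.7+).
    return list(dict.fromkeys(re.findall(pattern, webpage)))


# Hypothetical markup, for illustration only.
sample = (
    '<a href="/ug0k-aaa1/playlist/mylist">one</a>'
    '<a href="/ug0k-aaa1/playlist/mylist">one again</a>'
    '<a href="/zz9-bbb2/playlist/other">other playlist</a>'
    '<a href="/ug0k-ccc3/playlist/mylist">two</a>'
)
```

`dict.fromkeys` preserves order only on Python 3.7 and later. Code that must also run on Python 2 would need a different order-preserving helper, such as the `orderedSet` utility in `youtube_dl.utils`.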
…ate duplicate video capture, playlist title is no longer fatal, used _match_id
youtube_dl/extractor/spankbang.py
Outdated
        r'href="/?(' + id + '-[\da-z]+/playlist/[^"]+)', div, 'page url', default=None)

    if page_url:
        page = self._download_webpage(urljoin('http://spankbang.com', page_url), id)
No. It's the job of the video extractor.
This is needed to get the canonical URL for the single-video extractor [SpankBang] to use.
Again: video URLs are already available on playlist page.
These URLs are in the form /playlist_id-playlist_item_id/playlist/playlistname.
Even if you replace playlist with video and use only the playlist_item_id, it does point to the actual video URL.
*does not
I'm not talking about these URLs.
Where are they?
[SpankBangPlaylist] New extractor
New extractor to pull SpankBang playlists.
Adjusted the SpankBang regex to recognize single playlist videos.