[AsianCrush] fix extractor, add support for yuyutv and Midnight Pulp #21290

ealgase · 2019-06-03T12:12:21Z

Please follow the guide below

You will be asked some questions, please read them carefully and answer honestly
Put an x into all the boxes [ ] relevant to your pull request (like that [x])
Use the Preview tab to see what your pull request will actually look like

Before submitting a pull request make sure you have:

At least skimmed through adding new extractor tutorial and youtube-dl coding conventions sections
Searched the bugtracker for similar pull requests
Checked the code with flake8

In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under Unlicense. Check one of the following options:

I am the original author of this code and I am willing to release it under Unlicense
I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

Bug fix
Improvement
New extractor
New feature

Description of your pull request and other information

The AsianCrush extractor was partially broken (it didn't extract the description properly). This pull request fixes that and also adds support for the sister sites yuyutv and Midnight Pulp (closing #21281).

ealgase · 2019-06-05T13:54:46Z

@remitamine sorry to bother you, but could you please take a look at this? (I think I understand the coding conventions better now, so hopefully you won't have to leave as much feedback on this.)

dstftw

Merge the video and playlist extractors into a single video extractor and a single playlist extractor.

ealgase · 2019-06-07T19:51:22Z

I can do that for the main AsianCrushIE, but the playlist IE currently requires an additional site-specific variable (the _SITE_TITLE).

dstftw · 2019-06-07T19:55:09Z

Nothing stops you from rewriting it to use re.sub.

ealgase · 2019-06-07T20:06:02Z

I don't understand. I'm referring to the last bit of this:

        title = remove_end(
            self._html_search_regex(
                r'(?s)<h1\b[^>]\bid=["\']movieTitle[^>]+>(.+?)</h1>', webpage,
                'title', default=None) or self._og_search_title(
                webpage, default=None) or self._html_search_meta(
                'twitter:title', webpage, 'title',
                default=None) or self._search_regex(
                r'<title>([^<]+)</title>', webpage, 'title', fatal=False),
            ' | %s' % self._SITE_TITLE)

I don't see how re.sub would allow that last bit to work without knowing the title of the site.

dstftw · 2019-06-07T20:10:20Z

re.sub(r'\s*\|\s*.+?$', '', title).
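A minimal sketch of how the suggested substitution behaves, using only the standard `re` module. The sample titles and the `strip_site_suffix` helper name are made up for illustration and are not taken from the extractor.

```python
import re

# Pattern from the review comment: optional whitespace, a literal pipe,
# optional whitespace, then everything up to the end of the string.
SITE_SUFFIX_RE = r'\s*\|\s*.+?$'


def strip_site_suffix(title):
    """Drop a trailing ' | <site name>' without knowing the site name."""
    return re.sub(SITE_SUFFIX_RE, '', title)


# Hypothetical page titles, one per site.
samples = [
    'Some Movie | AsianCrush',
    'Some Movie | YuYu TV',
    'Some Movie | Midnight Pulp',
]
cleaned = [strip_site_suffix(t) for t in samples]
print(cleaned)
```

One caveat of this approach: a title that itself contains a pipe is cut at the first pipe, so it trades knowledge of the site name for a slightly greedier match.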

ealgase · 2019-06-08T03:59:07Z

OK, I've made the requested changes.

dstftw · 2019-07-13T19:08:44Z

youtube_dl/extractor/asiancrush.py

-            'kaltura:%s:%s' % (partner_id, kaltura_id),
-            ie=KalturaIE.ie_key(), video_id=kaltura_id,
-            video_title=title)
+        description = self._html_search_regex(r'<div class="description">(.+?)</div>', webpage, 'description', fatal=False, flags=re.DOTALL)


Move flags into regex. Carry long lines.

ealgase · 2019-07-14

I'm confused about "Move flags into regex": I can't find a difference between this and the way it is implemented in other extractors.

dstftw · 2019-07-14

What difference are you even talking about?

ealgase · 2019-07-14

You said "Move flags into regex", and I don't understand what you're asking me to do.

dstftw · 2019-07-15

Are you aware of what flags are at all?
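For readers following the exchange, "moving flags into the regex" most likely refers to Python's inline flag syntax, where `(?s)` at the start of a pattern has the same effect as passing `flags=re.DOTALL`. The sketch below uses plain `re.search` and a made-up HTML fragment rather than the extractor's own helper.

```python
import re

# Made-up sample: a description that spans two lines.
html = '<div class="description">First line\nSecond line</div>'

# Flag passed as a separate argument, as in the diff under review.
with_argument = re.search(
    r'<div class="description">(.+?)</div>', html, flags=re.DOTALL)

# The same flag written inside the pattern as the inline modifier (?s).
with_inline = re.search(
    r'(?s)<div class="description">(.+?)</div>', html)

# With neither form, '.' does not match the newline and the search fails.
without_flag = re.search(
    r'<div class="description">(.+?)</div>', html)

print(with_argument.group(1) == with_inline.group(1))
print(without_flag)
```

Keeping the modifier inside the pattern also matches the style of the existing title regex quoted earlier in the thread, which starts with `(?s)`.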

dstftw · 2019-07-14T05:25:12Z

youtube_dl/extractor/asiancrush.py



 class AsianCrushIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?asiancrush\.com/video/(?:[^/]+/)?0+(?P<id>\d+)v\b'
+    IE_NAME = 'asiancrush'
+    _VALID_URL = r'https?://(?:www\.)?(?P<host>(?:asiancrush\.com|yuyutv\.com|midnightpulp\.com))/video/(?:[^/]+/)?0+(?P<id>\d+)v\b'


Move .com part outside the inner group.
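One way to read this suggestion is sketched below: the shared `\.com` follows the alternation instead of being repeated in every branch. This is an illustration of the idea checked with the standard `re` module, not necessarily the pattern that was eventually merged; the `VALID_URL` name is local to the example.

```python
import re

# Sketch of the suggested shape: only the site names vary inside the
# inner group, and the common '\.com' suffix is written once after it.
VALID_URL = (
    r'https?://(?:www\.)?'
    r'(?P<host>(?:asiancrush|yuyutv|midnightpulp)\.com)'
    r'/video/(?:[^/]+/)?0+(?P<id>\d+)v\b'
)

# URL taken from the debug log later in this thread.
url = 'https://www.yuyutv.com/video/013886v/the-act-of-killing/'
mobj = re.match(VALID_URL, url)
print(mobj.group('host'), mobj.group('id'))
```

The `host` group still captures the full domain, so code that relies on it does not need to change.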

dstftw · 2019-07-14T05:26:40Z

youtube_dl/extractor/asiancrush.py

@@ -96,15 +148,16 @@ def _real_extract(self, url):
                entries.append(self.url_result(
                    mobj.group('url'), ie=AsianCrushIE.ie_key()))

-        title = remove_end(
+        title = re.sub(


Breaks on None title.
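The failure mode is easy to reproduce outside the extractor: `re.sub` raises `TypeError` when the string argument is `None`, which is what happens if every title lookup falls through to its `default=None`. Below is a minimal sketch of one possible guard; the `clean_title` helper is hypothetical and not part of youtube-dl.

```python
import re


def clean_title(title):
    """Strip a trailing ' | <site>' suffix, passing None through untouched."""
    if title is None:
        return None
    return re.sub(r'\s*\|\s*.+?$', '', title)


# re.sub() itself rejects None, which is the breakage the review points at.
try:
    re.sub(r'\s*\|\s*.+?$', '', None)
except TypeError as exc:
    print('re.sub on None raised:', type(exc).__name__)

print(clean_title(None))
print(clean_title('Some Playlist | AsianCrush'))
```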

dstftw · 2019-07-15T16:33:08Z

Does not work:

> py -3.7 .\youtube_dl\__main__.py https://www.yuyutv.com/video/013886v/the-act-of-killing/ -v --proxy 127.0.0.1:8118
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['https://www.yuyutv.com/video/013886v/the-act-of-killing/', '-v']
[debug] Encodings: locale cp1251, fs utf-8, out utf-8, pref cp1251
[debug] youtube-dl version 2019.07.14
[debug] Git HEAD: 1ef4607
[debug] Python version 3.7.0 (CPython) - Windows-10-10.0.10240-SP0
[debug] exe versions: ffmpeg N-85653-gb4330a0, ffprobe N-85653-gb4330a0, phantomjs 2.1.1, rtmpdump 2.4
[asiancrush] 13886: Downloading webpage
[asiancrush] 13886: Downloading webpage
[Kaltura] 1_66x4rg7o: Downloading video info JSON
[Kaltura] 1_66x4rg7o: Downloading m3u8 information
[debug] Default format spec: bestvideo+bestaudio/best
[debug] Invoking downloader on 'http://cdnapi.kaltura.com/p/513551/sp/51355100/playManifest/entryId/1_66x4rg7o/format/url/protocol/http/flavorId/1_hgns56wd'
ERROR: unable to download video data: HTTP Error 400: Bad Request
Traceback (most recent call last):
  File "C:\Dev\youtube-dl\master\youtube_dl\YoutubeDL.py", line 1915, in process_info
    success = dl(filename, info_dict)
  File "C:\Dev\youtube-dl\master\youtube_dl\YoutubeDL.py", line 1854, in dl
    return fd.download(name, info)
  File "C:\Dev\youtube-dl\master\youtube_dl\downloader\common.py", line 366, in download
    return self.real_download(filename, info_dict)
  File "C:\Dev\youtube-dl\master\youtube_dl\downloader\http.py", line 341, in real_download
    establish_connection()
  File "C:\Dev\youtube-dl\master\youtube_dl\downloader\http.py", line 109, in establish_connection
    ctx.data = self.ydl.urlopen(request)
  File "C:\Dev\youtube-dl\master\youtube_dl\YoutubeDL.py", line 2227, in urlopen
    return self._opener.open(req, timeout=self._socket_timeout)
  File "C:\Python\Python37\lib\urllib\request.py", line 531, in open
    response = meth(req, response)
  File "C:\Python\Python37\lib\urllib\request.py", line 641, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python\Python37\lib\urllib\request.py", line 563, in error
    result = self._call_chain(*args)
  File "C:\Python\Python37\lib\urllib\request.py", line 503, in _call_chain
    result = func(*args)
  File "C:\Python\Python37\lib\urllib\request.py", line 755, in http_error_302
    return self.parent.open(new, timeout=req.timeout)
  File "C:\Python\Python37\lib\urllib\request.py", line 531, in open
    response = meth(req, response)
  File "C:\Python\Python37\lib\urllib\request.py", line 641, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python\Python37\lib\urllib\request.py", line 569, in error
    return self._call_chain(*args)
  File "C:\Python\Python37\lib\urllib\request.py", line 503, in _call_chain
    result = func(*args)
  File "C:\Python\Python37\lib\urllib\request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 400: Bad Request

dstftw reviewed Jun 7, 2019

dstftw added the pending-fixes label Jun 7, 2019

[asiancrush] fix extractor, add support for yuyutv and midnightpulp

8861b4b

ealgase force-pushed the asiancrush-clones branch from ed1b7b0 to 8861b4b Compare June 8, 2019 03:58

dstftw requested changes Jul 13, 2019

[asiancrush] improve coding conventions

5aaa26a

dstftw requested changes Jul 14, 2019

pull bot referenced this pull request in Vikash-Kothary/youtube-dl Jul 15, 2019

[kaltura] Check source format URL (#21290)

799756a

dstftw closed this in f614968 Jul 15, 2019

Lamieur referenced this pull request in Lamieur/youtube-dl Aug 3, 2019

[kaltura] Check source format URL (#21290)

8421fda

Lamieur referenced this pull request in Lamieur/youtube-dl Aug 3, 2019

[asiancrush] Add support for yuyutv.com, midnightpulp.com and cocoro.tv (closes #21281, closes #21290)

a136b6e

meunierd referenced this pull request in meunierd/youtube-dl Feb 13, 2020

[kaltura] Check source format URL (#21290)

d1ae620

meunierd referenced this pull request in meunierd/youtube-dl Feb 13, 2020

[asiancrush] Add support for yuyutv.com, midnightpulp.com and cocoro.tv (closes #21281, closes #21290)

63d8368

Lamieur referenced this pull request in Lamieur/youtube-dl Apr 20, 2020

Revert "[asiancrush] Add support for yuyutv.com, midnightpulp.com and cocoro.tv (closes #21281, closes #21290)"

45131ce

This reverts commit a136b6e.

Lamieur referenced this pull request in Lamieur/youtube-dl Apr 20, 2020

Revert "[kaltura] Check source format URL (#21290)"

db72409

This reverts commit 8421fda.

Lamieur referenced this pull request in Lamieur/youtube-dl Apr 20, 2020

Revert "[asiancrush] Add support for yuyutv.com, midnightpulp.com and cocoro.tv (closes #21281, closes #21290)"

67edfc0

This reverts commit a136b6e.

Lamieur referenced this pull request in Lamieur/youtube-dl Apr 20, 2020

Revert "[kaltura] Check source format URL (#21290)"

005b96d

This reverts commit 8421fda.

pareronia referenced this pull request in pareronia/youtube-dl Jun 22, 2020

[kaltura] Check source format URL (#21290)

2dd7665

pareronia referenced this pull request in pareronia/youtube-dl Jun 22, 2020

[asiancrush] Add support for yuyutv.com, midnightpulp.com and cocoro.tv (closes #21281, closes #21290)

b47796d
