Add experimental lazy loading of info extractors #8497

jaimeMF · 2016-02-10T13:14:43Z

Inspired by @remitamine's comment (#3029 (comment))
On my computer the difference is not spectacular: youtube-dl --version goes from 0.6s to 0.3-0.4s; with the zipped executable it goes from 1s to 0.4s; and youtube-dl 'http://youtube.com/watch?v=BaW_jenozKcj' goes from 2.1s to 1.6s. On slower devices like the Raspberry Pi (#3029) the difference may be more noticeable.

Since a lot of things may break, the feature is opt-in: you need to run make lazy-extractors first.

kidol · 2016-02-10T13:51:37Z

ARMv7 Processor rev 2 (v7l)
4 x 1,3Ghz

Without patch

time youtube-dl --version
real 0m0.953s
user 0m0.820s
sys 0m0.110s

time youtube-dl --get-url http://youtube.com/watch?v=BaW_jenozKcj
real 0m3.058s
user 0m2.440s
sys 0m0.180s

With patch

time youtube-dl --version
real 0m0.610s
user 0m0.550s
sys 0m0.040s

time youtube-dl --get-url http://youtube.com/watch?v=BaW_jenozKcj
real 0m2.859s
user 0m2.100s
sys 0m0.210s

I guess disk speed / IOPS is more important than cpu power?

jaimeMF · 2016-02-10T18:18:43Z

Thanks for checking. It seems you get an improvement similar to mine, although your initial times aren't as bad as those in #3029.

> I guess disk speed / IOPS is more important than cpu power?

I don't completely understand what you say, could you elaborate?

kidol · 2016-02-10T19:02:41Z

Yes, similar results. Not sure how they get to 10+ seconds for the --version command. I doubt it has to do with the CPU after seeing my results on an ARM CPU.

> I don't completely understand what you say, could you elaborate?

I have no clue about Python, but in the strace results someone posted in the original issue, I've noticed a lot of repetitive file system calls like:

open("/usr/local/bin/youtube-dl", O_RDONLY|O_LARGEFILE)

So I assume these are all the extractors being read? If that's the case, a slow file system could be the bottleneck and that's why I did not see a big difference in my test (fast file system).

remitamine · 2016-02-11T10:55:26Z

with python2 i get this error:

python2 __main__.py --version
Traceback (most recent call last):
  File "__main__.py", line 16, in <module>
    import youtube_dl
  File "/home/amine/youtube-dl/youtube_dl/__init__.py", line 43, in <module>
    from .extractor import gen_extractors, list_extractors
  File "/home/amine/youtube-dl/youtube_dl/extractor/__init__.py", line 4, in <module>
    from .lazy_extractors import *
  File "/home/amine/youtube-dl/youtube_dl/extractor/lazy_extractors.py", line 1763
SyntaxError: Non-ASCII character '\xc3' in file /home/amine/youtube-dl/youtube_dl/extractor/lazy_extractors.py on line 1763, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details

remitamine · 2016-02-11T11:13:50Z

This is the result I get (netbook with a 1 GHz AMD CPU and 2 GB RAM).
For the YouTube extraction test I modified the extractor to return immediately during initialization, because downloading web pages can affect the timing.
Without patch:

time python __main__.py --version
2016.02.10

real    0m1.987s
user    0m1.817s
sys 0m0.157s

time python __main__.py --simulate 'http://youtube.com/watch?v=BaW_jenozKcj'

real    0m3.683s
user    0m3.517s
sys 0m0.163s

With patch:

time python __main__.py --version
2016.02.09.1

real    0m1.032s
user    0m0.963s
sys 0m0.060s

time python __main__.py --simulate 'http://youtube.com/watch?v=BaW_jenozKcj'

real    0m1.909s
user    0m1.817s
sys 0m0.067s

jaimeMF · 2016-02-11T13:51:25Z

@remitamine It should be fixed with jaimeMF/youtube-dl@eee1aca

SharkWipf · 2016-02-15T11:48:11Z

Apologies for my earlier, now removed comment, I failed to RTFM and built youtube-dl without the lazy loading support.

I've tested your patch on one of the Raspberries that had the problem to begin with, results are as follows:

"Vanilla" youtube-dl:

time ./youtube-dl-vanilla --version
2016.01.29

real    0m6.404s
user    0m6.140s
sys     0m0.240s
time ./youtube-dl-vanilla --get-url http://youtube.com/watch?v=BaW_jenozKcj

real    0m23.432s
user    0m8.110s
sys     0m0.490s

"Lazy-load" youtube-dl (this time built correctly):

time ./youtube-dl-lazy --version
2016.02.09.1

real    0m1.929s
user    0m1.780s
sys     0m0.150s
time ./youtube-dl-lazy --get-url http://youtube.com/watch?v=BaW_jenozKcj

real    0m13.758s
user    0m4.170s
sys     0m0.290s

My "butchered" version of youtube-dl from #3029, supporting only Youtube, Soundcloud and Bandcamp:

time ./youtube-dl-butchered --version
2016.01.29

real    0m2.358s
user    0m2.180s
sys     0m0.160s
time ./youtube-dl-butchered --get-url http://youtube.com/watch?v=BaW_jenozKcj

real    0m18.967s
user    0m3.940s
sys     0m0.420s

Note, the --get-url results should be taken lightly as this Pi is currently having some network problems. That said, the results seem to be reproducible in concurrent tests.

Seems there's a significant performance improvement in this version, even more than with just removing 900+ lines of downloaders in my butchered version. I haven't tested actually downloading videos with it yet but at least in these tests the difference seems very impressive.

This was tested with the zipped versions of all 3, vanilla downloaded directly from the official download site, the other 2 built with (make lazy-extractors;) make youtube-dl.

yan12125 · 2016-02-20T15:08:29Z

Fails with python 2.6:

$ PYTHON=python2.6 make lazy-extractors
python2.6 devscripts/make_lazy_extractors.py youtube_dl/extractor/lazy_extractors.py
WARNING: Lazy loading extractors is an experimental feature that may not always work
Traceback (most recent call last):
  File "devscripts/make_lazy_extractors.py", line 53, in <module>
    src = build_lazy_ie(ie, name)
  File "devscripts/make_lazy_extractors.py", line 47, in build_lazy_ie
    s += make_valid_template.format(ie._make_valid_url())
ValueError: zero length field name in format
Makefile:99: recipe for target 'youtube_dl/extractor/lazy_extractors.py' failed
make: *** [youtube_dl/extractor/lazy_extractors.py] Error 1
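
The ValueError above is what Python 2.6 raises for auto-numbered `{}` fields; str.format only accepts them from Python 2.7 / 3.1 onwards. A minimal sketch of the portable spelling follows. The template text and the render_make_valid_url helper are hypothetical stand-ins, not the actual contents of devscripts/make_lazy_extractors.py.

```python
# Hypothetical template; the real make_valid_template in the devscript may differ.
# Explicit field indexes ({0}) work on Python 2.6 as well as on newer versions,
# whereas a bare {} raises "zero length field name in format" on 2.6.
MAKE_VALID_TEMPLATE = '''
    @classmethod
    def _make_valid_url(cls):
        return {0!r}
'''


def render_make_valid_url(valid_url):
    """Return source code for a _make_valid_url classmethod (illustrative helper)."""
    return MAKE_VALID_TEMPLATE.format(valid_url)


snippet = render_make_valid_url('yvsearch(?P<prefix>|[1-9][0-9]*|all):(?P<query>.+)')
print(snippet)
```

Named fields such as `{valid_url}` are equally safe on 2.6; only the empty `{}` form is the problem.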

yan12125 · 2016-02-20T15:48:23Z

devscripts/make_lazy_extractors.py

+        valid_url=valid_url,
+        module=ie.__module__)
+    if ie.suitable.__func__ is not InfoExtractor.suitable.__func__:
+        s += getsource(ie.suitable)


This can be more PEP8 with a '\n':

class YahooSearchIE(LazyLoadExtractor):
    _VALID_URL = None
    _module = 'youtube_dl.extractor.yahoo'
    @classmethod
    def suitable(cls, url):
        return re.match(cls._make_valid_url(), url) is not None

    @classmethod
    def _make_valid_url(cls):
        return 'yvsearch(?P<prefix>|[1-9][0-9]*|all):(?P<query>[\\s\\S]+)'

yan12125 · 2016-02-20T16:07:35Z

Error when downloading multiple URLs of the same InfoExtractor:

$ youtube-dl -vs test:youtube test:youtube_1
[debug] System config: []
[debug] User config: []
[debug] Command-line args: ['-vs', 'test:youtube', 'test:youtube_1']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2016.02.09.1
[debug] Git HEAD: dccd778
[debug] Python version 3.5.1 - Linux-4.4.1-2-ARCH-x86_64-with-arch-Arch-Linux
[debug] exe versions: avconv v12_dev0-2370-gab9068c, avprobe v12_dev0-2370-gab9068c, ffmpeg 3.0, ffprobe 3.0, rtmpdump 2.4
[debug] Proxy map: {}
[TestURL] Test URL: http://www.youtube.com/watch?v=BaW_jenozKcj&t=1s&end=9
[youtube] BaW_jenozKc: Downloading webpage
[youtube] BaW_jenozKc: Downloading video info webpage
[youtube] BaW_jenozKc: Extracting video information
[youtube] BaW_jenozKc: Downloading MPD manifest
[TestURL] Test URL: http://www.youtube.com/watch?v=UxxajLWwzqY
[youtube] UxxajLWwzqY: Downloading webpage
[youtube] UxxajLWwzqY: Downloading video info webpage
[youtube] UxxajLWwzqY: Extracting video information
[youtube] {22} signature length 40.41, html5 player en_US-vfldIygzk
ERROR: Signature extraction failed: Traceback (most recent call last):
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/extractor/youtube.py", line 894, in _decrypt_signature
    if player_id not in self._player_cache:
AttributeError: 'YoutubeIE' object has no attribute '_player_cache'
 (caused by AttributeError("'YoutubeIE' object has no attribute '_player_cache'",)); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/extractor/youtube.py", line 894, in _decrypt_signature
    if player_id not in self._player_cache:
AttributeError: 'YoutubeIE' object has no attribute '_player_cache'
Traceback (most recent call last):
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/extractor/youtube.py", line 894, in _decrypt_signature
    if player_id not in self._player_cache:
AttributeError: 'YoutubeIE' object has no attribute '_player_cache'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/YoutubeDL.py", line 668, in extract_info
    ie_result = ie.extract(url)
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/extractor/common.py", line 315, in extract
    return self._real_extract(url)
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/extractor/youtube.py", line 1395, in _real_extract
    encrypted_sig, video_id, player_url, age_gate)
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/extractor/youtube.py", line 906, in _decrypt_signature
    'Signature extraction failed: ' + tb, cause=e)
youtube_dl.utils.ExtractorError: Signature extraction failed: Traceback (most recent call last):
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/extractor/youtube.py", line 894, in _decrypt_signature
    if player_id not in self._player_cache:
AttributeError: 'YoutubeIE' object has no attribute '_player_cache'
 (caused by AttributeError("'YoutubeIE' object has no attribute '_player_cache'",)); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

Another minor suggestion: print information about whether lazy extractors are used or not in the verbose log.
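
One plausible reading of the traceback above, offered as an assumption rather than something the thread confirms: Python only calls __init__ automatically when __new__ returns an instance of the class being constructed. A stub whose __new__ hands back an instance of a different class therefore skips __init__, and attributes set there (such as _player_cache) never exist. The class names in this sketch are hypothetical.

```python
class RealExtractor(object):
    """Stand-in for a full extractor class (hypothetical)."""

    def __init__(self):
        self._player_cache = {}


class BrokenStub(object):
    """Returns a RealExtractor but never runs its __init__."""

    def __new__(cls, *args, **kwargs):
        # The result is not an instance of BrokenStub, so Python does NOT
        # call __init__ on it afterwards.
        return RealExtractor.__new__(RealExtractor)


class FixedStub(object):
    """Same idea, but initializes the real instance explicitly."""

    def __new__(cls, *args, **kwargs):
        instance = RealExtractor.__new__(RealExtractor)
        instance.__init__(*args, **kwargs)
        return instance


broken = BrokenStub()
fixed = FixedStub()
print(hasattr(broken, '_player_cache'))  # False
print(hasattr(fixed, '_player_cache'))   # True
```

This would also be consistent with the log: only the second video reaches the signature step, which is the first place _player_cache is read.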

yan12125 · 2016-02-20T16:40:39Z

Patch for some of my ideas:

diff --git a/devscripts/make_lazy_extractors.py b/devscripts/make_lazy_extractors.py
index 8627d0b..5506335 100644
--- a/devscripts/make_lazy_extractors.py
+++ b/devscripts/make_lazy_extractors.py
@@ -41,14 +41,16 @@ def build_lazy_ie(ie, name):
         valid_url=valid_url,
         module=ie.__module__)
     if ie.suitable.__func__ is not InfoExtractor.suitable.__func__:
-        s += getsource(ie.suitable)
+        s += '\n' + getsource(ie.suitable)
     if hasattr(ie, '_make_valid_url'):
         # search extractors
         s += make_valid_template.format(ie._make_valid_url())
     return s

 names = []
-for ie in _ALL_CLASSES:
+sorted_ies = sorted(_ALL_CLASSES, key=lambda c: c.__name__[:-2] if c.__name__ != 'GenericIE' else '')
+sorted_ies = sorted_ies[1:] + [sorted_ies[0]]
+for ie in sorted_ies:
     name = ie.ie_key() + 'IE'
     src = build_lazy_ie(ie, name)
     module_contents.append(src)
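
The two-step ordering in this patch (sort with GenericIE keyed as '' so it lands first, then rotate it to the end) can also be written as a single sort with a tuple key. This is only an alternative sketch using toy classes, not what the PR ended up using.

```python
class GenericIE(object):
    pass


class YoutubeIE(object):
    pass


class ArteTVPlus7IE(object):
    pass


# Toy stand-in for _ALL_CLASSES.
ALL_CLASSES = [GenericIE, YoutubeIE, ArteTVPlus7IE]


def sort_key(ie_class):
    name = ie_class.__name__
    # False sorts before True, so GenericIE (the catch-all) always goes last;
    # everything else is ordered by its name without the trailing 'IE'.
    return (name == 'GenericIE', name[:-2])


sorted_ies = sorted(ALL_CLASSES, key=sort_key)
print([c.__name__ for c in sorted_ies])
# ['ArteTVPlus7IE', 'YoutubeIE', 'GenericIE']
```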

jaimeMF · 2016-02-21T11:49:32Z

@yan12125 thanks a lot for your comments, I think I have addressed all of them.

> Fails with python 2.6:

Fixed in jaimeMF/youtube-dl@a4126fd

Style fixes in jaimeMF/youtube-dl@dd20d6e.

> Another minor suggestion: print information about whether lazy extractors are used or not in the verbose log.

jaimeMF/youtube-dl@8c96085

> Error when downloading multiple URLs of the same InfoExtractor:

jaimeMF/youtube-dl@29d8ba4

jaimeMF · 2016-02-21T13:12:30Z

@SharkWipf thanks for testing it, I'm glad it improves the time.

jaimeMF · 2016-02-21T13:23:47Z

Something "bad" about this change is that every pull request that adds an extractor will need to be updated, so I want to explain the rationale for the change:

I initially started by creating a new youtube_dl/lazy_extractors.py file and changing every from .extractor import <something> to:

try:
    from .lazy_extractors import <something>
except ImportError:
    from .extractor import <something>

Apart from the need to change all imports, the main problem is that when you do __import__('youtube_dl.extractor.<somemodule>'), youtube_dl/extractor/__init__.py is also run, so all extractors are loaded anyway, which makes the change useless. That's why I decided to modify youtube_dl/extractor/__init__.py instead (which also allows reusing the functions defined there). Maybe it would be easier to handle merge conflicts if the extractors were loaded in the except ImportError: part, but I personally find 900 indented imports a bit ugly.
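
The point about __import__ can be checked in isolation: importing any submodule first executes the package's __init__.py. The sketch below builds a throwaway package in a temporary directory and confirms that. The name demo_pkg and its contents are hypothetical.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package: demo_pkg/__init__.py and demo_pkg/single.py.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, 'demo_pkg')
os.mkdir(pkg_dir)

with open(os.path.join(pkg_dir, '__init__.py'), 'w') as f:
    # Stands in for an __init__.py that eagerly imports every extractor.
    f.write('INIT_RAN = True\n')
with open(os.path.join(pkg_dir, 'single.py'), 'w') as f:
    f.write('VALUE = 1\n')

sys.path.insert(0, root)
single = importlib.import_module('demo_pkg.single')

# Only the submodule was requested, yet the package __init__ has run too.
init_ran = sys.modules['demo_pkg'].INIT_RAN
print(single.VALUE, init_ran)
```

So any eager imports in the package's __init__.py are paid for even when a single submodule is requested, which is why the lazy classes have to be wired in at the __init__.py level.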

dstftw · 2016-02-22T18:57:59Z

Currently, it's impossible to make a py2exe Windows build with lazy extractors enabled, since devscripts/make_lazy_extractors.py is only called via the Makefile, which is not used in the Windows build. Ideally a single python setup.py py2exe should still be enough for a py2exe build.

dstftw · 2016-02-22T21:07:10Z

Here are some of my measurements.

Command: youtube-dl -v:

Linux:

python 2.7.11, non-lazy:
real 0m0.259s
user 0m0.203s
sys 0m0.033s

python 3.5.1, non-lazy:
real 0m0.369s
user 0m0.307s
sys 0m0.050s

python 2.7.11, lazy:
real 0m0.161s
user 0m0.117s
sys 0m0.030s

python 3.5.1, lazy:
real 0m0.216s
user 0m0.160s
sys 0m0.043s

Windows:

python 2.7.10, non-lazy
0.59+s

python 2.7.10, lazy
0.40+s

For realistic use cases I got nearly identical measurements for lazy and non-lazy, with the lazy build even slower in some cases (probably due to the influence of network I/O).

jaimeMF · 2016-03-06T18:34:38Z

> Currently, it's impossible to make a py2exe Windows build with lazy extractors enabled, since devscripts/make_lazy_extractors.py is only called via the Makefile, which is not used in the Windows build. Ideally a single python setup.py py2exe should still be enough for a py2exe build.

Do you want me to add a new distutils command? I'm playing with it, but currently you need to run python setup.py build_lazy_extractors py2exe. Is that what you want? (I would avoid doing it by default until we test it more.)

dstftw · 2016-03-06T20:21:14Z

> I would avoid doing it by default until we test it more

Ok then.

jaimeMF · 2016-03-07T13:14:19Z

Added in jaimeMF/youtube-dl@a4e1733.

remitamine · 2016-03-16T13:41:46Z

I think this method could also be used to improve the generic extractor's load time if we add the _extract_url.
The problem is that many extractors use other names for embed URL extraction, and _extract_url is not generic. I think it's better to convert them to _extract_urls (sometimes a page contains multiple embeds, but _extract_url only extracts one).

jaimeMF · 2016-03-17T12:57:53Z

@remitamine I think it's better to merge this PR first and then work on that. Note that there is a similar implementation for what you want in #6216. Note that the problem would be that the _extract_url(s) could try to access some property or global variable, so we must be careful on how we do it.

yan12125 · 2016-04-06T17:28:48Z

I guess this PR can be merged?

Here are my tests:
Without lazy load:

time python3.6 -m youtube_dl -v url
[debug] System config: []
[debug] User config: []
[debug] Command-line args: ['-v', 'url']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2016.02.09.1
[debug] Python version 3.6.0a0 - Linux-3.10.49-perf-g4186cc1-aarch64-with-libc
[debug] exe versions: none
[debug] Proxy map: {}
ERROR: You've asked youtube-dl to download the URL "url". That doesn't make any sense. Simply remove the parameter in your command or configuration.
Traceback (most recent call last):
  File "/data/local/tmp/youtube-dl-lazy-load/youtube_dl/YoutubeDL.py", line 668, in extract_info
    ie_result = ie.extract(url)
  File "/data/local/tmp/youtube-dl-lazy-load/youtube_dl/extractor/common.py", line 315, in extract
    return self._real_extract(url)
  File "/data/local/tmp/youtube-dl-lazy-load/youtube_dl/extractor/commonmistakes.py", line 29, in _real_extract
    raise ExtractorError(msg, expected=True)
youtube_dl.utils.ExtractorError: You've asked youtube-dl to download the URL "url". That doesn't make any sense. Simply remove the parameter in your command or configuration.

    0m2.09s real     0m1.93s user     0m0.07s system

With lazy-load:

time python3.6 -m youtube_dl -v url
[debug] System config: []
[debug] User config: []
[debug] Command-line args: ['-v', 'url']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2016.02.09.1
[debug] Lazy loading extractors enabled
[debug] Python version 3.6.0a0 - Linux-3.10.49-perf-g4186cc1-aarch64-with-libc
[debug] exe versions: none
[debug] Proxy map: {}
ERROR: You've asked youtube-dl to download the URL "url". That doesn't make any sense. Simply remove the parameter in your command or configuration.
Traceback (most recent call last):
  File "/data/local/tmp/youtube-dl-lazy-load/youtube_dl/YoutubeDL.py", line 668, in extract_info
    ie_result = ie.extract(url)
  File "/data/local/tmp/youtube-dl-lazy-load/youtube_dl/extractor/common.py", line 315, in extract
    return self._real_extract(url)
  File "/data/local/tmp/youtube-dl-lazy-load/youtube_dl/extractor/commonmistakes.py", line 29, in _real_extract
    raise ExtractorError(msg, expected=True)
youtube_dl.utils.ExtractorError: You've asked youtube-dl to download the URL "url". That doesn't make any sense. Simply remove the parameter in your command or configuration.

    0m1.19s real     0m1.07s user     0m0.09s system

Test environment: My Android phone with my patched Python build.

An incredible improvement! Many thanks @jaimeMF.

jaimeMF · 2016-04-06T18:40:31Z

I'll need to rebase against the current HEAD, I'll try to do it on the weekend.

'make lazy-extractors' creates youtube_dl/extractor/lazy_extractors.py (imported by youtube_dl/extractor/__init__.py), which contains simplified classes that only have the 'suitable' class method and that load the appropriate class with the '__new__' method when an instance is created.
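
A rough sketch of the mechanism described here, using the standard json.decoder module as a stand-in for a heavy extractor module. LazyLoader, the 'json:' pattern, and the stub class are illustrative assumptions; the generated lazy_extractors.py may differ in detail.

```python
import importlib
import re


class LazyLoader(object):
    """Cheap stand-in: matches URLs without importing the real module."""
    _module = None
    _VALID_URL = None

    @classmethod
    def suitable(cls, url):
        return re.match(cls._VALID_URL, url) is not None

    def __new__(cls, *args, **kwargs):
        # Import the real module only when an instance is requested.
        module = importlib.import_module(cls._module)
        real_cls = getattr(module, cls.__name__)
        instance = real_cls.__new__(real_cls)
        # __init__ is not called automatically because the instance is not
        # a LazyLoader, so call it explicitly.
        instance.__init__(*args, **kwargs)
        return instance


class JSONDecoder(LazyLoader):
    # Same name as the real class in json.decoder (used here as a stand-in).
    _module = 'json.decoder'
    _VALID_URL = r'json:'


print(JSONDecoder.suitable('json:[1, 2]'))
decoder = JSONDecoder()  # an instance of the real json.decoder.JSONDecoder
print(type(decoder).__module__)
print(decoder.decode('[1, 2]'))
```

The stub only needs the URL pattern and the module path, so matching a URL never triggers the import; the cost is paid once, when the first instance is created.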

When building with python3 the unicode characters are not escaped, python2 needs to know the encoding.
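
The encoding issue can be sketched as follows: a generator running under Python 3 writes non-ASCII characters literally, so the generated file needs a PEP 263 declaration on its first or second line for Python 2 to parse it. The file name and the generated content below are made up for illustration.

```python
# coding: utf-8
import io
import os
import tempfile

# Hypothetical generated content containing a non-ASCII character.
module_body = u"TITLE = u'vidéo'\n"

out_path = os.path.join(tempfile.mkdtemp(), 'generated_module.py')
with io.open(out_path, 'w', encoding='utf-8') as f:
    # PEP 263 declaration: must be on line 1 or 2 of the file.
    f.write(u'# coding: utf-8\n')
    f.write(module_body)

with io.open(out_path, encoding='utf-8') as f:
    first_line = f.readline()
    second_line = f.readline()
print(first_line.strip())
```

The alternative is to escape every non-ASCII character in the generated source, which keeps the file pure ASCII and makes the declaration unnecessary.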

remitamine · 2016-04-15T17:03:25Z

Not sure why this happens, but it should be related to this change.
The URL works without lazy extractors, but with lazy extractors it uses GenericIE instead of YoutubeUserIE.

[amine@amine youtube-dl]$ make clean 
rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish youtube_dl/extractor/lazy_extractors.py *.dump *.part *.info.json *.mp4 *.flv *.mp3 *.avi CONTRIBUTING.md.tmp ISSUE_TEMPLATE.md.tmp youtube-dl youtube-dl.exe
find . -name "*.pyc" -delete
find . -name "*.class" -delete
[amine@amine youtube-dl]$ make
zip --quiet youtube-dl youtube_dl/*.py youtube_dl/*/*.py
zip --quiet --junk-paths youtube-dl youtube_dl/__main__.py
echo '#!/usr/bin/env python' > youtube-dl
cat youtube-dl.zip >> youtube-dl
rm youtube-dl.zip
chmod a+x youtube-dl
pandoc -f markdown -t plain README.md -o README.txt
/usr/bin/env python devscripts/prepare_manpage.py >youtube-dl.1.temp.md
pandoc -s -f markdown -t man youtube-dl.1.temp.md -o youtube-dl.1
rm -f youtube-dl.1.temp.md
/usr/bin/env python devscripts/bash-completion.py
/usr/bin/env python devscripts/zsh-completion.py
/usr/bin/env python devscripts/fish-completion.py
/usr/bin/env python devscripts/make_supportedsites.py docs/supportedsites.md
[amine@amine youtube-dl]$ ./youtube-dl -f best -o '%(playlist)s/%(playlist_index)s - %(title)s.%(ext)s' https://www.youtube.com/user/AlltimeConspiracies/videos
[youtube:user] AlltimeConspiracies: Downloading channel page
^C
ERROR: Interrupted by user
[amine@amine youtube-dl]$ make clean 
rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish youtube_dl/extractor/lazy_extractors.py *.dump *.part *.info.json *.mp4 *.flv *.mp3 *.avi CONTRIBUTING.md.tmp ISSUE_TEMPLATE.md.tmp youtube-dl youtube-dl.exe
find . -name "*.pyc" -delete
find . -name "*.class" -delete
[amine@amine youtube-dl]$ make lazy-extractors 
/usr/bin/env python devscripts/make_lazy_extractors.py youtube_dl/extractor/lazy_extractors.py
WARNING: Lazy loading extractors is an experimental feature that may not always work
[amine@amine youtube-dl]$ make
zip --quiet youtube-dl youtube_dl/*.py youtube_dl/*/*.py
zip --quiet --junk-paths youtube-dl youtube_dl/__main__.py
echo '#!/usr/bin/env python' > youtube-dl
cat youtube-dl.zip >> youtube-dl
rm youtube-dl.zip
chmod a+x youtube-dl
COLUMNS=80 /usr/bin/env python youtube_dl/__main__.py --help | /usr/bin/env python devscripts/make_readme.py
/usr/bin/env python devscripts/make_contributing.py README.md CONTRIBUTING.md
pandoc -f markdown -t plain README.md -o README.txt
/usr/bin/env python devscripts/prepare_manpage.py >youtube-dl.1.temp.md
pandoc -s -f markdown -t man youtube-dl.1.temp.md -o youtube-dl.1
rm -f youtube-dl.1.temp.md
/usr/bin/env python devscripts/bash-completion.py
/usr/bin/env python devscripts/zsh-completion.py
/usr/bin/env python devscripts/fish-completion.py
/usr/bin/env python devscripts/make_supportedsites.py docs/supportedsites.md
[amine@amine youtube-dl]$ ./youtube-dl -f best -o '%(playlist)s/%(playlist_index)s - %(title)s.%(ext)s' https://www.youtube.com/user/AlltimeConspiracies/videos
[generic] videos: Requesting header
WARNING: Falling back on generic information extractor.
[generic] videos: Downloading webpage

jaimeMF · 2016-04-15T17:10:31Z

@remitamine it's the suitable method for YoutubeUserIE:

    @classmethod
    def suitable(cls, url):
        # Don't return True if the url can be extracted with other youtube
        # extractor, the regex is too permissive and it would match.
        other_ies = iter(klass for (name, klass) in globals().items() if name.endswith('IE') and klass is not cls)
        if any(ie.suitable(url) for ie in other_ies):
            return False
        else:
            return super(YoutubeUserIE, cls).suitable(url)

It's also picking up GenericIE, and since that matches all URLs, suitable returns False. I'm not sure what the cleanest fix would be; I can think of just changing it in youtube_dl/extractor/youtube.py to look like

iter(klass for (name, klass) in globals().items() if name.endswith('IE') and not name == 'GenericIE' and klass is not cls)

Do you have a better suggestion?

remitamine · 2016-04-15T17:23:21Z

As I understand it, the suitable method here checks whether the URL matches other YouTube extractors; maybe we can check only extractors whose names start with Youtube.

jaimeMF · 2016-04-15T17:33:46Z

> As I understand it, the suitable method here checks whether the URL matches other YouTube extractors; maybe we can check only extractors whose names start with Youtube.

That sounds better.
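
The suggestion above can be sketched with toy classes. The regexes are deliberately simplified and the class bodies are hypothetical; only the filtering on the Youtube name prefix reflects the idea discussed in the thread.

```python
import re


class InfoExtractor(object):
    _VALID_URL = None

    @classmethod
    def suitable(cls, url):
        return re.match(cls._VALID_URL, url) is not None


class YoutubeIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?youtube\.com/watch\?v=[\w-]+'


class GenericIE(InfoExtractor):
    _VALID_URL = r'.*'  # catch-all: matches every URL


class YoutubeUserIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?youtube\.com/(?:user/)?[\w-]+'

    @classmethod
    def suitable(cls, url):
        # Only consult sibling extractors whose name starts with 'Youtube',
        # so a catch-all class in the same namespace is ignored.
        other_ies = [
            klass for name, klass in globals().items()
            if name.startswith('Youtube') and name.endswith('IE')
            and klass is not cls
        ]
        if any(ie.suitable(url) for ie in other_ies):
            return False
        return super(YoutubeUserIE, cls).suitable(url)


user_url = 'https://www.youtube.com/user/AlltimeConspiracies/videos'
watch_url = 'https://www.youtube.com/watch?v=BaW_jenozKc'
print(YoutubeUserIE.suitable(user_url))   # True
print(YoutubeUserIE.suitable(watch_url))  # False
```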

remitamine · 2016-04-15T18:13:30Z

Thanks for the help.
It now works with f3a58d4.

remitamine · 2016-06-21T14:20:51Z

An error occurs when I use lazy extractors:

python __main__.py -v -F 'http://www.youtube.com/watch?v=BaW_jenozKc'
[debug] System config: []
[debug] User config: ['--external-downloader', 'aria2c', '--sub-lang', 'en,ar', '--write-sub', '--sub-format', 'ass/vtt/srt/best', '-f', 'best[height<=720]/bestvideo[height<=720]+bestaudio', '--hls-prefer-native']
[debug] Command-line args: ['-v', '-F', 'http://www.youtube.com/watch?v=BaW_jenozKc']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2016.06.20
[debug] Lazy loading extractors enabled
[debug] Git HEAD: 1ac5705
[debug] Python version 3.5.1 - Linux-4.6.2-1-ARCH-i686-with-arch
[debug] exe versions: ffmpeg 3.0.2, ffprobe 3.0.2, rtmpdump 2.4
[debug] Proxy map: {}
Traceback (most recent call last):
  File "__main__.py", line 19, in <module>
    youtube_dl.main()
  File "/home/amine/youtube-dl/youtube_dl/__init__.py", line 420, in main
    _real_main(argv)
  File "/home/amine/youtube-dl/youtube_dl/__init__.py", line 410, in _real_main
    retcode = ydl.download(all_urls)
  File "/home/amine/youtube-dl/youtube_dl/YoutubeDL.py", line 1740, in download
    url, force_generic_extractor=self.params.get('force_generic_extractor', False))
  File "/home/amine/youtube-dl/youtube_dl/YoutubeDL.py", line 667, in extract_info
    if not ie.suitable(url):
  File "/home/amine/youtube-dl/youtube_dl/extractor/lazy_extractors.py", line 213, in suitable
    return False if ArteTVPlaylistIE.suitable(url) else super(ArteTVPlus7IE, cls).suitable(url)
TypeError: super(type, obj): obj must be an instance or subtype of type

jaimeMF · 2016-06-22T17:23:54Z

@remitamine should be fixed in 169d836. Thanks for pointing it out. For future problems, feel free to open a new issue and ping me.
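
A generic illustration of how this kind of TypeError can arise when a method written with two-argument super() ends up on a class outside the original hierarchy. This is an assumption about the cause, not a description of the actual fix in 169d836, and the class names are hypothetical.

```python
class InfoExtractor(object):
    @classmethod
    def suitable(cls, url):
        return url.startswith('http')


class RealIE(InfoExtractor):
    @classmethod
    def suitable(cls, url):
        # Two-argument super(): cls must be RealIE or a subclass of it.
        return super(RealIE, cls).suitable(url)


class LazyStubIE(InfoExtractor):
    """Reuses RealIE's suitable without inheriting from RealIE."""
    suitable = classmethod(RealIE.suitable.__func__)


try:
    LazyStubIE.suitable('http://example.com/')
    error = None
except TypeError as exc:
    error = str(exc)
print(error)
```

The copied method still names RealIE in its super() call, so it only works when cls belongs to RealIE's hierarchy.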

royale1223 · 2016-07-28T15:03:13Z

@jaimeMF Hi, how stable is this patch? Close to production ready?

jaimeMF · 2016-07-29T10:59:20Z

It works, but some details may not work correctly and sometimes it breaks.

royale1223 · 2016-07-29T12:47:58Z

@jaimeMF anything specific I should worry about?

jaimeMF · 2016-07-29T13:09:46Z

> @jaimeMF anything specific I should worry about?

No. If you find something that doesn't work, open a new issue and we will fix it.

yan12125 · 2016-09-01T07:07:39Z

@royale1223 It's not caused by this patch. This URL redirects to https://vimeo.com/ondemand/castlesinthesky/89677808, and the latter works fine with youtube-dl. Nevertheless the original URL should be supported as well. Could you open a new issue?

royale1223 · 2016-12-14T03:06:21Z

@yan12125 Happens because it's not a free video.

alejomendoza · 2016-12-28T15:44:29Z

Hi there! Just wondering where the documentation on how to lazy load the info extractors is. Is there a flag I can pass when getting the URL info, as in youtube-dl -g --youtube-skip-dash-manifest https://www.youtube.com/watch?v=V0Ll64U-FuY ?

yan12125 · 2016-12-28T16:17:49Z

@alejomendoza clone this repository and run the following two commands:

make lazy-extractors
make youtube-dl

And replace the official youtube-dl with the newly generated one.

tobimensch · 2017-07-03T12:52:06Z

If this performance improvement is working without issues by now, why not make it the default?

yan12125 · 2017-07-03T13:35:31Z

Actually, lazy extractors break testing (#13554). That should be fixed for developers before making it the default for users.

yan12125 reviewed Feb 20, 2016

jaimeMF added 5 commits April 8, 2016 21:43

Delay initialization of InfoExtractors until they are needed

e52d7f8

Move the extractors import to youtube_dl/extractor/extractors.py

1b3d5e0

lazy extractors: specify the encoding

0d778b1

When building with python3 the unicode characters are not escaped, python2 needs to know the encoding.

lazy extractors: Fix building with python2.6

c1ce6ac

This was referenced Apr 9, 2016

Add support for cbc.ca #6342

Closed

Support memritv.org #6210

Closed

Add support for YLE (Closes: #1574) #5805

Closed

[joemonster] Add new extractor #5177

Closed

jaimeMF deleted the lazy-load branch April 9, 2016 08:32

yan12125 mentioned this pull request Dec 28, 2016

Youtube stream link generation speed #3483

Closed

This was referenced Feb 10, 2017

YouTube-DL Slow on RPi (Suggestion) #12059

Closed

Slow on Raspberry Pi #3029

Closed

fluxw42 referenced this pull request in fluxw42/youtube-dl Apr 11, 2017

Merge branch 'master' into nieuwsblad

f9dfc76

Move import of nieuwsblad extractor from __init__.py to extractors.py #8497

yan12125 mentioned this pull request May 18, 2017

Why is youtube-dl so slow to get urls compared to alternatives? #13122

Closed

ids1024 mentioned this pull request Dec 29, 2017

give an option to improve performance on low-powered devices mps-youtube/yewtube#747

Open

vn-ki mentioned this pull request Apr 1, 2018

Installation problems mpsyt not found mps-youtube/yewtube#818

Closed

nicolaasjan mentioned this pull request Apr 30, 2023

Build youtube-dl from source #32076

Closed
