Support connection reuse / keep-alive #14523

Open
1 of 5 tasks
lilydjwg opened this issue Oct 17, 2017 · 15 comments · Fixed by yt-dlp/yt-dlp#3668

@lilydjwg

What is the purpose of your issue?

  • Bug report (encountered problems with youtube-dl)
  • Site support request (request for adding support for a new site)
  • Feature request (request for a new functionality)
  • Question
  • Other

Many YouTube videos consist of a lot of small video chunks. Creating one connection per chunk slows the download down significantly: I only get up to 200-500KiB/s for a chunk before the connection is closed and another is made for the next chunk, whereas when downloading a single big video file from YouTube I can get 5-6MiB/s. (The actual speed difference depends on the available bandwidth and round-trip time.)

TCP slow start significantly limits the speed at the start of each connection. With connection reuse / HTTP keep-alive we can avoid repeating that ramp-up phase for every chunk.
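
To make the difference concrete, here is a minimal sketch using only the Python standard library (it is not youtube-dl's code). It serves a few fake chunks from a local HTTP/1.1 server and counts how many TCP connections the server accepts when the client opens one connection per chunk versus reusing a single connection. The handler, the chunk paths and all helper names are made up for illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

accepted = []  # one entry per TCP connection accepted by the server


class ChunkHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # HTTP/1.1 keeps connections alive by default

    def setup(self):
        # setup() runs once per accepted connection, not once per request.
        super().setup()
        accepted.append(self.client_address)

    def do_GET(self):
        body = b"x" * 1024  # stand-in for a video chunk
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass


def fetch_with_new_connections(host, port, paths):
    """Open and close a fresh connection for every chunk."""
    for path in paths:
        conn = http.client.HTTPConnection(host, port)
        conn.request("GET", path)
        conn.getresponse().read()
        conn.close()


def fetch_with_one_connection(host, port, paths):
    """Send every request over the same keep-alive connection."""
    conn = http.client.HTTPConnection(host, port)
    for path in paths:
        conn.request("GET", path)
        conn.getresponse().read()  # drain the body before the next request
    conn.close()


def count_connections(fetch, n_chunks=5):
    server = ThreadingHTTPServer(("127.0.0.1", 0), ChunkHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    accepted.clear()
    try:
        host, port = server.server_address
        fetch(host, port, [f"/chunk{i}.ts" for i in range(n_chunks)])
    finally:
        server.shutdown()
        server.server_close()
    return len(accepted)


if __name__ == "__main__":
    print("new connection per chunk:", count_connections(fetch_with_new_connections))
    print("one reused connection:  ", count_connections(fetch_with_one_connection))
```

On loopback there is no measurable slow-start penalty, so the sketch only counts connections; over a real link each extra connection also pays for a new handshake and a new congestion-window ramp-up.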

@remitamine
Collaborator

Previous discussion related to this: #13734 (comment).

@yan12125
Collaborator

Thanks for the reminder. I don't even remember that ticket :/

As that ticket is closed, let's continue here. I've just done some experiments with requests, and the results look decent. With https://github.com/yan12125/youtube-dl/tree/twitch-connection-reuse-via-requests, twitch HLS streams download fine. I saw no major performance improvement on my side; maybe @lilydjwg can give it a try?

@lilydjwg
Author

lilydjwg commented Oct 17, 2017

@yan12125 I tried that with YouTube, but the connection is dropped: the server sets Connection: close. And it doesn't support HTTP/2 either :-(
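
A quick way to check whether a server will allow reuse is to look at the Connection header of its response. The following is a minimal standard-library sketch, assuming an HTTP/1.1 server; the local test server, the handler and the `server_allows_reuse` helper are hypothetical stand-ins, not part of youtube-dl or of the branch above.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class ClosingHandler(BaseHTTPRequestHandler):
    """Local stand-in for a server that refuses keep-alive."""

    protocol_version = "HTTP/1.1"

    def do_GET(self):
        body = b"chunk"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.send_header("Connection", "close")  # ask the client not to reuse
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass


def server_allows_reuse(host, port, path="/"):
    """Return False if the response carries 'Connection: close'.

    Assumes an HTTP/1.1 response, where keep-alive is the default
    unless the server says otherwise.
    """
    conn = http.client.HTTPConnection(host, port)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        resp.read()
        header = resp.getheader("Connection") or ""
        tokens = [token.strip().lower() for token in header.split(",")]
        return "close" not in tokens
    finally:
        conn.close()


def demo():
    server = ThreadingHTTPServer(("127.0.0.1", 0), ClosingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        host, port = server.server_address
        return server_allows_reuse(host, port)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print("server allows reuse:", demo())
```

When the server answers this way, a client-side connection pool cannot help: every chunk still needs a new connection.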

@dstftw
Collaborator

dstftw commented Oct 17, 2017

For me, twitch-connection-reuse-via-requests performs identically to current master on twitch and slower on nrk. Here are the best of 5 measurements.

twitch-connection-reuse-via-requests (5 reqs, 36-41 sec):

PS C:\Dev\youtube-dl\master> Measure-Command {py -3.6 .\youtube_dl\__main__.py http://www.nrk.no/video/PS*150533 --fixup never}


Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 36
Milliseconds      : 528
Ticks             : 365284261
TotalDays         : 0,000422782709490741
TotalHours        : 0,0101467850277778
TotalMinutes      : 0,608807101666667
TotalSeconds      : 36,5284261
TotalMilliseconds : 36528,4261

master (5 reqs, 31-36 sec):

PS C:\Dev\youtube-dl\master> Measure-Command {py -3.6 .\youtube_dl\__main__.py http://www.nrk.no/video/PS*150533 --fixup never}


Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 31
Milliseconds      : 341
Ticks             : 313410096
TotalDays         : 0,000362743166666667
TotalHours        : 0,008705836
TotalMinutes      : 0,52235016
TotalSeconds      : 31,3410096
TotalMilliseconds : 31341,0096

@yan12125
Collaborator

That video is a good test case. I'd like to share my testing results, too. (removed irrelevant logs)

I've updated that branch with some more bug fixes. There shouldn't be any performance changes in comparison with the previous version.

On twitch-connection-reuse-via-requests

$ for i in $(seq 1 5) ; do ; time python -m youtube_dl "http://www.nrk.no/video/PS*150533" --fixup never ; rm -f *.mp4 ; done
python -m youtube_dl "http://www.nrk.no/video/PS*150533" --fixup never  2.86s user 1.23s system 18% cpu 21.801 total
python -m youtube_dl "http://www.nrk.no/video/PS*150533" --fixup never  2.09s user 1.14s system 20% cpu 15.466 total
python -m youtube_dl "http://www.nrk.no/video/PS*150533" --fixup never  2.08s user 1.17s system 21% cpu 15.125 total
python -m youtube_dl "http://www.nrk.no/video/PS*150533" --fixup never  1.95s user 1.07s system 18% cpu 15.885 total
python -m youtube_dl "http://www.nrk.no/video/PS*150533" --fixup never  1.91s user 1.04s system 18% cpu 15.853 total

On master

$ for i in $(seq 1 5) ; do ; time python -m youtube_dl "http://www.nrk.no/video/PS*150533" --fixup never ; rm -f *.mp4 ; done
python -m youtube_dl "http://www.nrk.no/video/PS*150533" --fixup never  2.85s user 1.67s system 23% cpu 19.657 total
python -m youtube_dl "http://www.nrk.no/video/PS*150533" --fixup never  1.99s user 1.48s system 18% cpu 18.920 total
python -m youtube_dl "http://www.nrk.no/video/PS*150533" --fixup never  2.01s user 1.54s system 22% cpu 15.676 total
python -m youtube_dl "http://www.nrk.no/video/PS*150533" --fixup never  1.94s user 1.45s system 20% cpu 16.851 total
python -m youtube_dl "http://www.nrk.no/video/PS*150533" --fixup never  1.97s user 1.50s system 20% cpu 17.027 total

Environment:

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-v']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2017.10.15.1
[debug] Git HEAD: fa4bc6e71
[debug] Python version 3.6.3 - Darwin-16.7.0-x86_64-i386-64bit
[debug] exe versions: ffmpeg 3.3.4, ffprobe 3.3.4
[debug] Proxy map: {}

Over 5 trials, master takes 17.6262 seconds and twitch-connection-reuse-via-requests 16.826 seconds on average. That's not a big reduction, and I'm not sure whether it's related to connection reuse at all. Anyway, if requests can add a 5-second delay, that's an issue.
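
For reference, the averages can be reproduced from the "total" column of the two runs above. The medians are an extra figure added here, not something reported in the thread; they are less sensitive to the slower first run of each loop.

```python
from statistics import mean, median

# "total" wall-clock seconds copied from the two runs above
branch_times = [21.801, 15.466, 15.125, 15.885, 15.853]  # twitch-connection-reuse-via-requests
master_times = [19.657, 18.920, 15.676, 16.851, 17.027]  # master

branch_avg = round(mean(branch_times), 4)
master_avg = round(mean(master_times), 4)
branch_med = median(branch_times)
master_med = median(master_times)

print(f"branch: mean {branch_avg} s, median {branch_med} s")  # mean 16.826, median 15.853
print(f"master: mean {master_avg} s, median {master_med} s")  # mean 17.6262, median 17.027
```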

@lilydjwg: I got "Connection: keep-alive" for https://go.twitch.tv/videos/182470407. Are you testing with Twitch VODs?

@lilydjwg
Author

lilydjwg commented Oct 18, 2017

I tried with YouTube in the previous comment.

That Twitch video is huge, so I only tested a small percentage of it.

1min47s to 2.0% with twitch-connection-reuse-via-requests, with speed at 6-8MiB/s.
1min49s to 1.0% with 2017.10.07, with speed at 2-3MiB/s.

Neither run is the first invocation, to avoid the startup delay. The speed is repainted on a new line on every request due to logging.

@lilydjwg
Author

For the nrk video, it's 00:20 vs 00:54. The chunk size (2.8M) is smaller than the Twitch ones (4.7M), and I only get up to 2.4MiB/s with 2017.10.07.

@lilydjwg
Author

I noticed that I can download Twitch videos without using a proxy (I'm in China). Here are the results with a direct connection:

56s to 0.5% at ~1.7MiB/s with twitch-connection-reuse-via-requests
57s to 0.5% with 2017.10.07. No log output interrupts the progress bar, so I can only see the speed go up and then drop. It showed 1.93MiB/s when I pressed Ctrl-C.

The proxy I'm using runs bbr, and that seems to make a difference.

@yan12125
Collaborator

Hmm, what's a bbr proxy?

@lilydjwg
Author

@yan12125 it's a shadowsocks proxy. I mean the server is using the BBR TCP congestion control algorithm.

@yan12125
Collaborator

Thanks for the info and testing results. requests appears to be helpful in some use cases. As it's far from really usable, I'd like to keep that branch as-is. If you have some more ideas, I'll be glad to hear them.

@remitamine
Collaborator

ffmpeg added support for both HTTP persistent connections (FFmpeg/FFmpeg@b7d6c0c) and HTTP pipelining (simulated) (FFmpeg/FFmpeg@1f0eaa0); both are enabled by default.

@dreness

dreness commented Dec 28, 2017

@remitamine Yeah! I think it's working. To verify this, I installed ffmpeg-20171227-8f9024f-win64-static and started capture of a twitch stream with:

youtube-dl -f best \
        --hls-use-mpegts \
        --output "%(title)s-%(id)s.ts" \
        --restrict-filenames \
        --hls-prefer-ffmpeg \
        --no-part \
        --skip-unavailable-fragments \
        "${URL}"

Then I fired up Wireshark and set the display filter to ip.src_host contains video, which matches the pair of hostnames currently used by Twitch (video-weaver* and video-edge*). I let it run for a bit, then stopped the capture, opened the 'Conversations' statistics, and sorted by source port (i.e. the ephemeral high port on my end of the connection). The relative start and duration fields (both in seconds) show that while the video-edge connections do seem to rotate, each one clearly serves far more than one HTTP request, and the video-weaver connection (which hosts the m3u8 files) persists for the entire duration.

[Screenshot: ffmpeg-http-keepalive]

yan12125 referenced this issue Feb 28, 2018
Tested with

1. Twitch VOD https://go.twitch.tv/videos/182470407
2. Trailing garbage in gzipped content; see the new test in test_http.py
@srussel

srussel commented Nov 3, 2019

I rebased the reuse commit onto current master and tested with https://www.youtube.com/watch?v=Fhs_H9hgXwM. It did not seem to make much difference. I enabled logging with

logging.basicConfig(level=logging.DEBUG, format="%(message)s")

and kept getting this message after each chunk.

Resetting dropped connection: r4---sn-u2bpouxgoxu-5qal.googlevideo.com

@daspri

daspri commented Jan 13, 2020

How can this unconnected commit be accessed with git? I would like to try it, but cherry-pick cannot find the hash.

I use youtube-dl through a proxy. Reconnecting to an HTTP CONNECT proxy, which must then reconnect to the server for each HLS fragment, is very slow and would benefit from persistent connections. I have tried --hls-prefer-ffmpeg, but it also disconnects and reconnects for each HTTPS fragment.

Edit: I applied it manually. It does not respect the --proxy argument, but with the "https_proxy" environment variable set, the download from ITV is much, much faster using a single persistent connection.
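
The environment-variable workaround is consistent with how requests behaves by default: it picks up proxy settings such as https_proxy from the environment unless told otherwise. The sketch below uses the standard library's `urllib.request.getproxies()` to show the same lookup convention; the proxy address is a made-up placeholder, and this is not the code from the branch.

```python
import os
import urllib.request

# Hypothetical proxy address, for illustration only.
os.environ["https_proxy"] = "http://127.0.0.1:3128"

# getproxies() collects the *_proxy variables from the environment
# (falling back to system settings on some platforms when none are set).
proxies = urllib.request.getproxies()
print(proxies.get("https"))
```

A command-line option like --proxy, by contrast, only takes effect if the code passes it on to the HTTP library explicitly.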

coletdjnz added a commit to yt-dlp/yt-dlp that referenced this issue Oct 13, 2023
Adds support for HTTPS proxies and persistent connections (keep-alive)

Closes #1890
Resolves #4070
Resolves ytdl-org/youtube-dl#32549
Resolves ytdl-org/youtube-dl#14523
Resolves ytdl-org/youtube-dl#13734

Authored by: coletdjnz, Grub4K, bashonly
aalsuwaidi pushed a commit to aalsuwaidi/yt-dlp that referenced this issue Apr 21, 2024
Adds support for HTTPS proxies and persistent connections (keep-alive)

Closes yt-dlp#1890
Resolves yt-dlp#4070
Resolves ytdl-org/youtube-dl#32549
Resolves ytdl-org/youtube-dl#14523
Resolves ytdl-org/youtube-dl#13734

Authored by: coletdjnz, Grub4K, bashonly