Made it possible to choose which client to use when sending innertube requests and define new clients #132

Merged
merged 24 commits into from
Oct 8, 2024

Contributor

@iexavl iexavl commented Sep 28, 2024

YouTube currently appears to be changing its APIs, so requests made with certain clients might fail.
With this change, if a request fails, you can switch to a different client when the current one is broken. This can be either a client registered in Clients or a custom one that has not been added to the registered ones yet. All methods that use innertube have been updated to use the client specified in the corresponding request. At present, those are the requests for video info, search continuation, and playlists. Using one of the registered clients is as simple as:

        String videoId = "abc12345";
        YoutubeDownloader downloader = new YoutubeDownloader();
        RequestVideoInfo request = new RequestVideoInfo(videoId).client(ClientType.MWEB);
        Response<VideoInfo> response = downloader.getVideoInfo(request);
        VideoInfo video = response.data();
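
The fallback idea described above (try another client when the current one fails) could be sketched roughly as follows. This is a self-contained illustration and does not use the library's classes: `ClientFallback`, `firstSuccessful`, and the client names passed as plain strings are hypothetical stand-ins. In real code the `attempt` function would wrap the request from the snippet above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

final class ClientFallback {

    /**
     * Tries each client in order and returns the first non-null result.
     * A thrown RuntimeException or a null result counts as a failure for that client.
     */
    static <C, R> Optional<R> firstSuccessful(List<C> clients, Function<C, R> attempt) {
        for (C client : clients) {
            try {
                R result = attempt.apply(client);
                if (result != null) {
                    return Optional.of(result);
                }
            } catch (RuntimeException ex) {
                // This client is broken right now; move on to the next one.
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical stand-in: pretend only the last client currently answers.
        List<String> order = Arrays.asList("ANDROID_VR", "WEB", "MWEB");
        Optional<String> info = firstSuccessful(order,
                client -> client.equals("MWEB") ? "video info via " + client : null);
        System.out.println(info.orElse("all clients failed"));
    }
}
```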

@sealedtx
Owner

sealedtx commented Sep 30, 2024

@iexavl what do you think about removing ClientTraits and keeping only ClientType, with only the priority in Clients? As you already mentioned, we don't have a clear definition of which min/max qualities each client supports, and YouTube might change this at any time, so someone might get confused if this library isn't updated as well. Other than this, it looks good to me. Thank you

@iexavl
Contributor Author

iexavl commented Sep 30, 2024

Yeah, that's fine. Initially I thought the downloader could potentially use some information about the clients, but I later scrapped that, because realistically the only thing the downloader cares about is the client body. I'll add some more clients at some point and maybe write some documentation on what these clients actually are. If a client breaks and it's not written anywhere that switching clients might fix things, there will probably be a fair number of open issues about it.

@sealedtx
Owner

@iexavl ping me when it's ready to be merged

@iexavl
Contributor Author

iexavl commented Oct 1, 2024

@sealedtx It's pretty much functionally ready. I added all the clients that were in the list and gave them default priorities. The priorities were determined by sending 5 requests with every client and checking how many of those requests succeeded and how long they took. Of course, this is not super accurate because of the small data set, but I think it's enough for now. I have manually set the default client to ANDROID_VR, because all requests sent with it succeeded, it was pretty fast, and it has the biggest quality range of the fastest ones. Here are the test results if you are interested:
results.txt
They are sorted first by success rate (as a percentage) and then by measured average response time over the 5 requests, which is also how the clients are prioritized.
One thing I found interesting is that pretty much all of the fastest clients with a 100% success rate were 144-720p, the only exception being ANDROID_VR.
Also, the clients got either 100%, 40% or 0%, nothing in between. That means the clients that failed did so on exactly 3 of the videos every time, or on all of them.
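
The ordering rule described above (success rate first, then average response time) can be expressed as a two-key comparator. The sketch below is only an illustration: `ClientPriority`, `Stats`, and the placeholder client names and numbers are made up for the example and are not the measurements from results.txt or the PR's actual code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class ClientPriority {

    /** One client's measurements; the type and its fields are illustrative. */
    static final class Stats {
        final String client;
        final int successPercent; // share of successful requests, 0-100
        final double avgMillis;   // mean response time over the test requests

        Stats(String client, int successPercent, double avgMillis) {
            this.client = client;
            this.successPercent = successPercent;
            this.avgMillis = avgMillis;
        }
    }

    /** Highest success rate first; ties are broken by the lower average time. */
    static List<Stats> prioritize(List<Stats> measured) {
        List<Stats> sorted = new ArrayList<>(measured);
        sorted.sort(Comparator
                .comparingInt((Stats s) -> s.successPercent).reversed()
                .thenComparingDouble(s -> s.avgMillis));
        return sorted;
    }

    public static void main(String[] args) {
        // Placeholder numbers, chosen only to exercise both sort keys.
        List<Stats> measured = Arrays.asList(
                new Stats("CLIENT_A", 40, 300.0),
                new Stats("CLIENT_B", 100, 450.0),
                new Stats("CLIENT_C", 100, 380.0));
        for (Stats s : prioritize(measured)) {
            System.out.println(s.client + " " + s.successPercent + "% " + s.avgMillis + " ms");
        }
    }
}
```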

@sealedtx
Owner

sealedtx commented Oct 2, 2024

@iexavl is it difficult to run this test again? I'd also like to see the list of available itags (quality formats) for each client

@iexavl
Contributor Author

iexavl commented Oct 2, 2024

byAudioQuality.txt
byCumulativeAV.txt
bySuccessAndSpeed.txt
byVideoQuality.txt
fullResults.txt

Seems I was wrong about ANDROID_TV. I ran the tests multiple times and TV_UNPLUGGED_CAST always came out on top.
I ran it with five cherry-picked videos: four at 2160p60 and one at 4320p. The files are sorted as follows:

fullResults.txt: the full test results, not sorted in any particular order
byCumulativeAV.txt: sorted by the ordinals of the video qualities added to the ordinals of the audio qualities for each video
byAudioQuality.txt: sorted by audio quality
byVideoQuality.txt: sorted by video quality
bySuccessAndSpeed.txt: sorted by the client's success rate first and its speed second
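
One possible reading of the "cumulative AV" value is a plain sum of enum ordinals over the video and audio qualities a client returned for a video. The sketch below follows that reading. The `VideoQ` and `AudioQ` enums and their constants are hypothetical stand-ins ordered from lowest to highest, not the library's own quality enums.

```java
import java.util.Arrays;
import java.util.List;

final class CumulativeAv {

    // Hypothetical quality enums, declared from lowest to highest so that
    // a larger ordinal means a better quality.
    enum VideoQ { P144, P360, P720, P1080, P2160 }
    enum AudioQ { LOW, MEDIUM, HIGH }

    /** Adds the ordinals of all video qualities to the ordinals of all audio qualities. */
    static int score(List<VideoQ> video, List<AudioQ> audio) {
        int sum = 0;
        for (VideoQ v : video) {
            sum += v.ordinal();
        }
        for (AudioQ a : audio) {
            sum += a.ordinal();
        }
        return sum;
    }

    public static void main(String[] args) {
        // P720 (2) + P2160 (4) + MEDIUM (1)
        int score = score(Arrays.asList(VideoQ.P720, VideoQ.P2160), Arrays.asList(AudioQ.MEDIUM));
        System.out.println(score); // prints 7
    }
}
```

A sum like this rewards clients that return many formats as well as high ones, which may or may not be what a ranking should favor.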

The cumulative AV, video and audio files are sorted solely by their quality values; reliability and speed aren't taken into account there.
I can throw that into the equation and perhaps add a couple of functions for getting clients based on cumulative AV, video and audio quality? It's also important to keep in mind that the data set is still relatively small. I did find this though: https://docs.invidious.io/installation/#docker-compose-method-production
Perhaps if I let the test run for a while until I get flagged as a bot, I can run this and send all requests along with the po_token. After that I can maybe just take a playlist with around 50 videos and let the test run.
Oh and by the way I saw these error messages:
Error parsing format: unknown itag 701
Error parsing format: unknown itag 700
Error parsing format: unknown itag 699
Error parsing format: unknown itag 698
Error parsing format: unknown itag 697
Error parsing format: unknown itag 696
Error parsing format: unknown itag 695
Error parsing format: unknown itag 694

So there are some unregistered itags, possibly for the ultra-high resolutions. For now I am going to bed, but maybe tomorrow I will check what these itags are and add them to the Itag enum.
Also, I just noticed that in byCumulativeAV.txt I put the audio qualities along with the video qualities. Whoops.
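
Until the missing values are registered, a lookup that tolerates unknown ids keeps one unrecognized format from breaking the rest of the parse. The following is a minimal sketch of that pattern only. `ItagLookup` and `KnownItag` are hypothetical, hold just two illustrative ids, and are not the library's Itag enum.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class ItagLookup {

    // Illustrative subset; a real enum would carry many more ids plus format details.
    enum KnownItag {
        I18(18), I140(140);

        final int id;

        KnownItag(int id) {
            this.id = id;
        }
    }

    private static final Map<Integer, KnownItag> BY_ID = new HashMap<>();

    static {
        for (KnownItag itag : KnownItag.values()) {
            BY_ID.put(itag.id, itag);
        }
    }

    /** Returns the matching constant, or empty when the id has not been registered yet. */
    static Optional<KnownItag> find(int id) {
        return Optional.ofNullable(BY_ID.get(id));
    }

    public static void main(String[] args) {
        for (int id : new int[] {18, 701}) {
            Optional<KnownItag> itag = find(id);
            if (itag.isPresent()) {
                System.out.println("itag " + id + " -> " + itag.get());
            } else {
                // Skip the format instead of aborting the whole response.
                System.out.println("unknown itag " + id + ", skipping this format");
            }
        }
    }
}
```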

@sealedtx
Owner

sealedtx commented Oct 3, 2024

@iexavl greatly appreciate your work! Can you share the results in this format?

client_name, success_rate, avg_speed, total_formats_count, [vid_480p, vid_1080p, ..., audio_medium, audio_low..., audio+vid_1080p]

It would also be good to add definitions for the unknown itags

@iexavl
Contributor Author

iexavl commented Oct 3, 2024

byRequestedFormatting.txt

I added the missing itags. Well, the ones that were reported to me anyway. The results are formatted exactly as you specified
(the avg_speed is in milliseconds) and are sorted first by success rate, then by cumulative AV, then by speed.
EDIT: I also re-attached the file, which now uses the VideoQuality/AudioQuality .name() instead of the qualityLabel for easier programmatic parsing.

@sealedtx
Owner

sealedtx commented Oct 3, 2024

@iexavl seems like WEB needs to be top priority?

@iexavl
Contributor Author

iexavl commented Oct 3, 2024

Nope. I took a playlist and ran the test for longer. The good news is that the po_token works and I don't get flagged. The bad news, aside from the fact that this took about 40 minutes to run, is that pretty much all the WEB clients are almost completely busted. They all give this error:
Error deciphering is required but no js url parsing format: {"itag":18,"xtags":"Cg8KB2hlYXVkaW8SBHRydWU","fps":24,"projectionType":"RECTANGULAR","bitrate":347704,"mimeType":"video/mp4; codecs=\"avc1.42001E, mp4a.40.2\"","audioQuality":"AUDIO_QUALITY_LOW","approxDurationMs":"270883","audioSampleRate":"22050","quality":"medium","qualityLabel":"360p","audioChannels":2,"signatureCipher":"s=GsJGsJAJfQdSKwRgIhAKPU5Kl1BSDc2BsoVVtEjRlxUzt3JjdUggfYIz7Lp7FNAiEA1LJ5AawKvHFWr9mn1GHchfA7w1y01lDebjtglbggvgk%3Dgk%3D&sp=sig&url=https://rr2---sn-jvhoxucb-cvws.googlevideo.com/videoplayback%3Fexpire%3D1727998469%26ei%3DpdX-Zs71ItG9mLAP79uN-Q8%26ip%3D217.174.52.171%26id%3Do-AORzc01jJPXHB3vhxPtRhvhPA9AVMQWW3OO3fzz9Bgue%26itag%3D18%26source%3Dyoutube%26requiressl%3Dyes%26xpc%3DEgVo2aDSNQ%253D%253D%26mh%3DGt%26mm%3D31%252C29%26mn%3Dsn-jvhoxucb-cvws%252Csn-nv47zn7r%26ms%3Dau%252Crdu%26mv%3Dm%26mvi%3D2%26pl%3D24%26gcr%3Dbg%26initcwndbps%3D1117500%26bui%3DAXLXGFR31GZqNEDsnuXeUjHnWZccabXlEcfWLO3eN6Sa-VPytQQC2dWQFkq3h2dPkEdlYd1qS6aMJmjV%26spc%3D54Mbxd6m8kPOE8QQUsEx5vimQrPPl8n9XGyDdrohr4c04S2FnItjEYcZB-kS%26vprv%3D1%26svpuc%3D1%26xtags%3Dheaudio%253Dtrue%26mime%3Dvideo%252Fmp4%26ns%3D37Slort1pj-CAWjelRZgsu0Q%26rqh%3D1%26cnr%3D14%26ratebypass%3Dyes%26dur%3D270.883%26lmt%3D1706223005421924%26mt%3D1727976535%26fvip%3D4%26fexp%3D51300760%26c%3DWEB%26sefc%3D1%26txp%3D4530434%26n%3DCzADrAdsHB0R1O9_8%26sparams%3Dexpire%252Cei%252Cip%252Cid%252Citag%252Csource%252Crequiressl%252Cxpc%252Cgcr%252Cbui%252Cspc%252Cvprv%252Csvpuc%252Cxtags%252Cmime%252Cns%252Crqh%252Ccnr%252Cratebypass%252Cdur%252Clmt%26lsparams%3Dmh%252Cmm%252Cmn%252Cms%252Cmv%252Cmvi%252Cpl%252Cinitcwndbps%26lsig%3DACJ0pHgwRQIgM6bpNp21o5sckxaCU2VolAXRlCu5zsn3SUJP7ztplmsCIQDTrwRlOIwx4JpQmUkAEr3i562FFBi12i9B4yUYqiifOg%253D%253D","width":640,"lastModified":"1706223005421924","height":360}

many times over. I think last time I happened to land on videos that worked, but the problem showed up with the playlist. I had noticed this error in the WEB clients before and was surprised that I wasn't seeing it anymore. Well, I was wrong.
Here are the test results for more videos:
byRequestedFormatting.txt
In addition to this, I also ran into a problem with requesting playlist info. That one isn't related to innertube; something is going wrong in the HTML parsing, and it doesn't happen with all playlists. If we want to prioritize the web clients, we should probably take a look at those cipher errors first.
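
For context on the error above: the `signatureCipher` value in the logged format is itself a URL-encoded query string carrying an `s` field (the scrambled signature), an `sp` field (the name of the query parameter that should carry the deciphered signature), and a `url` field (the stream URL). A minimal sketch of splitting it into its fields follows, assuming that layout. `SignatureCipherFields` is a hypothetical helper with a shortened, made-up input, and it does not perform the deciphering itself, which needs the player JS.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class SignatureCipherFields {

    /** Splits a signatureCipher string into its decoded key/value pairs. */
    static Map<String, String> parse(String signatureCipher) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String pair : signatureCipher.split("&")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                continue; // not a key=value pair, ignore it
            }
            String key = pair.substring(0, eq);
            String value = pair.substring(eq + 1);
            try {
                fields.put(key, URLDecoder.decode(value, "UTF-8"));
            } catch (UnsupportedEncodingException ex) {
                throw new IllegalStateException(ex); // UTF-8 is always available
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        // Shortened, made-up example in the same shape as the logged value.
        String cipher = "s=ABC%3D&sp=sig&url=https://example.com/videoplayback%3Fitag%3D18";
        Map<String, String> fields = parse(cipher);
        System.out.println("s   = " + fields.get("s"));
        System.out.println("sp  = " + fields.get("sp"));
        System.out.println("url = " + fields.get("url"));
    }
}
```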

@iexavl
Contributor Author

iexavl commented Oct 3, 2024

@sealedtx is there a reason why the jsUrl needs to be extracted from the HTML response? I looked at the requests YouTube was sending and saw a constant JS address. I changed the parseFormats call in parseVideoAndroid to this:
formats = parseFormats(playerResponse, "https://www.youtube.com/s/player/96d06116/player_ias.vflset/en_US/base.js", clientVersion);

and it looks like it's working. With that change, formats that need a cipher no longer produce those errors, and the web clients seem to be working fine.

@sealedtx
Owner

sealedtx commented Oct 4, 2024

I can't remember exactly; I believe it is updated from time to time. You can try adding a default value for jsUrl that is used if it fails to find one, that would be OK.

Please make this change, run a final test to get the top-priority client, and finalize the code, and I'll merge it
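
The suggested fallback (use a default jsUrl only when extraction fails) might look roughly like the sketch below. `JsUrlFallback` and `resolve` are hypothetical names, the extractor is modelled as a plain `Supplier`, and the default is the URL quoted earlier in this thread, which may go stale since the player script is updated from time to time.

```java
import java.util.Optional;
import java.util.function.Supplier;

final class JsUrlFallback {

    // URL quoted in the discussion above; treat it as a last resort, not a constant truth.
    static final String DEFAULT_JS_URL =
            "https://www.youtube.com/s/player/96d06116/player_ias.vflset/en_US/base.js";

    /** Prefers the extracted URL and falls back to the default when extraction fails. */
    static String resolve(Supplier<Optional<String>> extractor) {
        try {
            return extractor.get().orElse(DEFAULT_JS_URL);
        } catch (RuntimeException ex) {
            // Extraction blew up; the default keeps deciphering possible.
            return DEFAULT_JS_URL;
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve(() -> Optional.of("https://example.com/extracted/base.js")));
        System.out.println(resolve(Optional::empty));
    }
}
```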

@iexavl
Contributor Author

iexavl commented Oct 4, 2024

I have changed the ordering according to the latest test results, which are these:
byRequestedFormatting.txt

I still set the default client to ANDROID_VR. The main reason is that it doesn't require a cipher, unlike pretty much all of the WEB clients. Also, the youtubei player requests never return a jsUrl, at least in my experience. It has to be obtained from the HTML page https://www.youtube.com/watch?v. So when ciphering is required, a request is now made to that URL with the current videoId to obtain the jsUrl. After that the cipher is cached, and the request isn't sent again for requests made via clients. That way, if the URL changes, no source update is needed.
I also may or may not want this commit history to be swept under the rug.
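
The "fetch once, then reuse" behaviour described above can be sketched as a small lazy cache. This is not the PR's code: `CachedJsUrl` is a hypothetical class, the watch-page request is modelled as a `Function` from videoId to jsUrl, and for simplicity the sketch caches the looked-up URL rather than a cipher object.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

final class CachedJsUrl {

    private final Function<String, String> fetchFromWatchPage;
    private volatile String cached;

    CachedJsUrl(Function<String, String> fetchFromWatchPage) {
        this.fetchFromWatchPage = fetchFromWatchPage;
    }

    /** Performs the expensive lookup on first use only; later calls reuse the result. */
    String get(String videoId) {
        String url = cached;
        if (url == null) {
            synchronized (this) {
                url = cached;
                if (url == null) {
                    url = fetchFromWatchPage.apply(videoId);
                    cached = url;
                }
            }
        }
        return url;
    }

    public static void main(String[] args) {
        AtomicInteger requests = new AtomicInteger();
        // Stand-in for the HTTP request to the watch page.
        CachedJsUrl cache = new CachedJsUrl(videoId -> {
            requests.incrementAndGet();
            return "https://example.com/player/base.js";
        });
        cache.get("abc12345");
        cache.get("abc12345");
        System.out.println("watch page requests: " + requests.get()); // prints 1
    }
}
```

A cache like this never notices when the player script changes during a long-running process, so a real implementation may want to drop the cached value when deciphering starts failing.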

String jsUrl;
try {
jsUrl = extractor.extractJsUrlFromConfig(playerConfig, videoId);
} catch (YoutubeException ex) {
Owner

Why do we need another catch here? It will just trigger callback.onError(ex); twice with the same exception

Owner

@iexavl what about this? It seems like an unnecessary call to callback.onError(ex);
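
To illustrate the concern, here is a self-contained sketch of how a nested catch can notify a callback twice for one exception. It assumes the inner catch reports the error and rethrows into an outer handler that reports again. The body of the catch under review isn't shown in this thread, so that structure, along with `SingleErrorReport`, `Callback`, and `fail`, is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class SingleErrorReport {

    interface Callback {
        void onError(Exception ex);
    }

    static void fail() {
        throw new IllegalStateException("no js url");
    }

    /** The inner catch reports and rethrows, so the outer catch reports the same exception again. */
    static void nestedCatch(Callback callback) {
        try {
            try {
                fail();
            } catch (IllegalStateException ex) {
                callback.onError(ex); // first notification
                throw ex;
            }
        } catch (IllegalStateException ex) {
            callback.onError(ex); // second notification for the same exception
        }
    }

    /** A single handler reports each failure exactly once. */
    static void singleCatch(Callback callback) {
        try {
            fail();
        } catch (IllegalStateException ex) {
            callback.onError(ex);
        }
    }

    public static void main(String[] args) {
        List<Exception> nested = new ArrayList<>();
        nestedCatch(nested::add);
        List<Exception> single = new ArrayList<>();
        singleCatch(single::add);
        System.out.println("nested: " + nested.size() + ", single: " + single.size()); // nested: 2, single: 1
    }
}
```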

@sealedtx
Owner

sealedtx commented Oct 5, 2024

@iexavl left comments to fix

@sealedtx
Owner

sealedtx commented Oct 7, 2024

@iexavl I'll merge this as it is if you don't respond by tomorrow, and will fix the error callback myself. Thank you very much for your efforts

@iexavl
Contributor Author

iexavl commented Oct 7, 2024

I suppose I'll look forward to it. I am just not sure what you mean, so it's probably for the best that you do it. Feel free to merge whenever.

@sealedtx sealedtx merged commit 32e994a into sealedtx:master Oct 8, 2024