[YouTube] Add more parameters to InnerTube requests, use the iOS client for livestreams and fix extraction of embeddable age-restricted videos and contents with a warning before playback #780
You could just have something that seeds the random number generator used by that parameter before running the tests.
In // at the top
// requires: import java.security.SecureRandom;
private static final SecureRandom random = new SecureRandom();
// ...
// then add these methods
/**
 * Generates a random string using the secure random device {@link #random}.
 * {@link #setRandomSeed(long)} might be useful when mocking tests.
 * @param alphabet which characters to use
 * @param length the length of the returned string
 * @return a random string of the requested length made of only characters from the provided alphabet
 */
public static String randomStringFromAlphabet(final String alphabet, final int length) {
    final StringBuilder stringBuilder = new StringBuilder();
    for (int i = 0; i < length; ++i) {
        stringBuilder.append(alphabet.charAt(random.nextInt(alphabet.length())));
    }
    return stringBuilder.toString();
}
/**
 * Seeds the random device used for {@link #randomStringFromAlphabet(String, int)}.
 * Use this in tests so that they can be mocked, as the same random numbers are always generated.
 * This is not intended to be used outside of tests.
 * @param seed the seed to pass to {@link SecureRandom#setSeed(long)}
 */
public static void setRandomSeed(final long seed) {
    random.setSeed(seed);
}
Tests run fine with the mock downloader on my computer (I updated the mocks) but not in the CI. I applied what Stypox suggested, but it does not really seem to work. Does anyone have an idea why?
@TiA4f8R My guess is that setSeed() does not have the same effect as passing the seed in through the constructor. See https://docs.oracle.com/javase/8/docs/api/java/security/SecureRandom.html#setSeed-byte:A-
You could check locally whether it always returns the same values.
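A minimal sketch of the local check suggested above, assuming a helper shaped like the randomStringFromAlphabet suggestion (the class name SeededRandomDemo and the alphabet are made up for illustration). java.util.Random is documented to produce identical sequences for identical seeds, whereas the SecureRandom documentation says the seed given to setSeed supplements, rather than replaces, the existing seed, so the second result may differ between providers and platforms.

```java
import java.security.SecureRandom;
import java.util.Random;

final class SeededRandomDemo {
    // Illustrative alphabet, not taken from the PR.
    private static final String ALPHABET =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

    // Same logic as the suggested helper, but the generator is passed in
    // so that different Random implementations can be compared.
    static String randomString(final Random random, final String alphabet, final int length) {
        final StringBuilder stringBuilder = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            stringBuilder.append(alphabet.charAt(random.nextInt(alphabet.length())));
        }
        return stringBuilder.toString();
    }

    public static void main(final String[] args) {
        // java.util.Random: two instances with the same seed and the same
        // sequence of calls return identical values.
        final String first = randomString(new Random(42L), ALPHABET, 16);
        final String second = randomString(new Random(42L), ALPHABET, 16);
        System.out.println("Random reproducible: " + first.equals(second));

        // SecureRandom: setSeed supplements the existing seed, so whether the
        // output is reproducible depends on the provider in use.
        final SecureRandom secureFirst = new SecureRandom();
        secureFirst.setSeed(42L);
        final SecureRandom secureSecond = new SecureRandom();
        secureSecond.setSeed(42L);
        System.out.println("SecureRandom reproducible: "
                + randomString(secureFirst, ALPHABET, 16)
                        .equals(randomString(secureSecond, ALPHABET, 16)));
    }
}
```

If the second line prints false on the CI machine but true locally, that would be consistent with the mocks matching on one machine only.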
Oh, I assumed |
Code looks good to me (except the comments, but you probably have a reason for writing them like that, so don't make changes; I just wanted to open some other discussions) :-)
@litetex should we proceed with merging this? It is ready in my opinion, and related tests succeed.
Minor changes
return JsonObject.builder()
        .object("context")
        .object("client")
        .value("clientName", "IOS")
Shouldn't this value be in a constant?
Probably not right now, but we should refactor how clients are managed in the extractor later, to deduplicate the code involved.
…VersionAndKey method The boolean keyAndVersionExtracted in YoutubeParsingHelper was not set to false when resetting the client version and the key, which made the extractor use null the next time the client version or the key was requested, if they had been extracted before. Also update client versions.
…ter the Android client The cpn param, aka the content playback nonce param, is a parameter sent by the YouTube web client in videoplayback requests and, for some of them, in the player request body. This PR adds it everywhere. For the desktop/WEB client, some params were missing from the playbackContext object, which seemed (or not) to make YouTube throttle streams extracted from the WEB client. This PR adds them. Fingerprinting of the WEB client based on the client version used is no longer possible, because the latest client version is extracted on the first YouTube request of a session, which requires the extractor to fetch the website again (this may unfortunately bring back the reCaptcha issues, but there seems to be no other way to get it). For the Android client, the video id is now also sent as a query parameter, as is a 12-character string in the t query parameter, in order to better spoof this client. More research is needed on this parameter, which is unique to each request, and on how clients generate it. This commit also fixes a small bug with the Android User-Agent string. Some code improvements have also been made.
…and key from YouTube and YouTube Music This is done by fetching https://www.youtube.com/sw.js for YouTube and https://music.youtube.com/sw.js for YouTube Music. Two new methods have been added to the Utils class; they try to match an array of regular expressions, given as strings or as Pattern objects, against some content and return the match at a specific index, or at index 0. Some code in this class has also been refactored.
…bled the Android client for livestreams By default, the iOS client is now only enabled for livestreams and the Android client is now only enabled for videos. Two new static methods in YoutubeStreamExtractor make it possible to force, or not, the fetching of both clients.
…arameter with the false value InnerTube returns pretty-printed responses, which needlessly increases their size. Setting the prettyPrint parameter to false on requests disables pretty printing, which reduces response size and therefore data transfer and processing times. YouTube recently deployed this on its own websites.
…er agents Also provide the ability to get the mobile user agents used for mobile InnerTube requests, and deduplicate related code.
…ns and key Also move the iPhone device machine id to a constant, explain how it is used, move the licence to the header of the file, and fix missing imports in YoutubeStreamExtractor (due to a rebase issue).
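One of the commits above describes helpers that try several regular expressions against some content until one matches. The following is only a sketch of that idea, not the code from the PR: the class name, the method name firstMatch, the two patterns and the sample content are all made up for illustration, and the "index" is assumed here to be a capture group index.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FirstMatchSketch {
    // Tries each pattern in order and returns the requested group of the
    // first pattern that matches, or an empty Optional if none does.
    static Optional<String> firstMatch(final Pattern[] patterns,
                                       final String content,
                                       final int group) {
        for (final Pattern pattern : patterns) {
            final Matcher matcher = pattern.matcher(content);
            if (matcher.find() && group <= matcher.groupCount()) {
                final String value = matcher.group(group);
                if (value != null) {
                    return Optional.of(value);
                }
            }
        }
        return Optional.empty();
    }

    public static void main(final String[] args) {
        // Hypothetical patterns; the real ones used on sw.js may differ.
        final Pattern[] versionPatterns = {
                Pattern.compile("\"clientVersion\":\"([0-9.]+)\""),
                Pattern.compile("client\\.version=([0-9.]+)")
        };
        // Made-up content standing in for a fetched service worker script.
        final String content = "foo&client.version=2.20220101.00.00&bar";
        System.out.println(firstMatch(versionPatterns, content, 1).orElse("not found"));
    }
}
```

Returning an Optional rather than throwing keeps the fallback to the older, heavier extraction path explicit at the call site; the PR itself may signal a failed match differently.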
Not finished yet, but no time left for today.
Here is my current review:
I had a quick peek :)
*/
private static final String IOS_DEVICE_MODEL = "iPhone14,5";

private static Random numberGenerator = new SecureRandom();
Why not just use a regular Random instance? I don't think we're doing anything of cryptographic importance
If you remember correctly, I said in several places that the JavaScript clients use window.crypto.getRandomValues, so I used this to mimic the official clients as closely as possible.
From some basic reverse engineering, the Android app seems to use it as well, although I am not sure.
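To illustrate the point, here is a minimal sketch of how a 16-character cpn value (and a 12-character value like the one sent in the t parameter) could be generated from a cryptographically strong source in Java. The class and method names are hypothetical, and the 64-character URL-safe alphabet is an assumption; the PR may use a different one.

```java
import java.security.SecureRandom;

final class ContentPlaybackNonceSketch {
    // Assumed alphabet (URL-safe base64 characters); not confirmed by the PR.
    private static final String ALPHABET =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

    // SecureRandom plays the role that window.crypto.getRandomValues
    // plays in the JavaScript clients.
    private static final SecureRandom RANDOM = new SecureRandom();

    // Builds a string of the given length by picking characters of the
    // alphabet at random.
    static String generateNonce(final int length) {
        final StringBuilder stringBuilder = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            stringBuilder.append(ALPHABET.charAt(RANDOM.nextInt(ALPHABET.length())));
        }
        return stringBuilder.toString();
    }

    public static void main(final String[] args) {
        System.out.println("cpn: " + generateNonce(16)); // 16 characters, as described for cpn
        System.out.println("t:   " + generateNonce(12)); // 12 characters, as described for t
    }
}
```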
        innertubeClientName,
        innertubeClientVersion
};
youtubeMusicKey = new String[] {musicKey, musicClientName, musicClientVersion};
Could we potentially introduce an object/class for this rather than using a String[] array? (To remove the need to guess what each index is.)
This should be done when we refactor clients (along with the building of request bodies, see #780 (comment)).
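For reference, a rough sketch of what such a class could look like. The name InnertubeClientInfo, its fields and the placeholder values are hypothetical and only illustrate replacing positional String[] access with named accessors; the eventual refactoring may look quite different.

```java
import java.util.Objects;

// Hypothetical immutable holder replacing a String[] {key, clientName, clientVersion}.
final class InnertubeClientInfo {
    private final String key;
    private final String clientName;
    private final String clientVersion;

    InnertubeClientInfo(final String key, final String clientName, final String clientVersion) {
        this.key = Objects.requireNonNull(key, "key");
        this.clientName = Objects.requireNonNull(clientName, "clientName");
        this.clientVersion = Objects.requireNonNull(clientVersion, "clientVersion");
    }

    String getKey() {
        return key;
    }

    String getClientName() {
        return clientName;
    }

    String getClientVersion() {
        return clientVersion;
    }

    public static void main(final String[] args) {
        // Placeholder values, not real keys or versions.
        final InnertubeClientInfo music =
                new InnertubeClientInfo("EXAMPLE_KEY", "EXAMPLE_CLIENT", "1.0");
        // Named accessors instead of music[0], music[1], music[2].
        System.out.println(music.getClientName() + " " + music.getClientVersion());
    }
}
```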
Also revert indentation in Utils.mixedNumberWordToLong.
…raction of contents with warnings and more Use the TV embedded client technique to get streams of embeddable age-restricted videos. This client doesn't provide the playerMicroFormatRenderer object in the player response, but it is still returned in the WEB player response, even for unavailable (but non-private) contents, so we now need to store it, as we are replacing the player response from the WEB client with the TV embedded one. Otherwise, some metadata such as the unlisted property, the category, and the uploadDate and publishDate properties would no longer be available. The outdated code for these contents has been removed. Add the racyCheckOk and contentCheckOk parameters to player and next requests to the InnerTube API. The first doesn't seem to make any difference when used anonymously, but the second one is needed to get streams of contents with a warning before they can be played. Also apply some requested changes, fixes and improvements in YoutubeParsingHelper and YoutubeStreamExtractor.
…MixTest Mixes no longer seem to be returned by YouTube if you use a PENDING consent cookie value. As the mocks need to be updated, the test always fails because of this change.
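As a rough illustration of the racyCheckOk and contentCheckOk parameters mentioned in the commits above, the sketch below builds a player request body by hand. It is not the PR's code: the extractor uses a JSON builder (JsonObject.builder()), while this version avoids external libraries; the class and method names are made up, and both the placement of the two flags at the top level of the body and their value of true are assumptions.

```java
final class PlayerBodySketch {
    // Minimal JSON string quoting: escapes backslashes and double quotes only.
    // Control characters are not handled, which is enough for this sketch.
    static String quote(final String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Builds a player request body containing the two check flags.
    static String buildPlayerBody(final String clientName,
                                  final String clientVersion,
                                  final String videoId) {
        return "{"
                + "\"context\":{\"client\":{"
                + "\"clientName\":" + quote(clientName) + ","
                + "\"clientVersion\":" + quote(clientVersion)
                + "}},"
                + "\"videoId\":" + quote(videoId) + ","
                // Assumed placement and value of the two flags.
                + "\"racyCheckOk\":true,"
                + "\"contentCheckOk\":true"
                + "}";
    }

    public static void main(final String[] args) {
        // Placeholder version and video id, for illustration only.
        System.out.println(buildPlayerBody("IOS", "0.0.0", "EXAMPLE_ID"));
    }
}
```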
Code LGTM now
- Please create followup issues/PRs
APK seems to work fine :)
- Age-restricted videos work again (for now)
- You can rewind YT live streams now, nice 👍
Good work
PS: I think I have to rebase my invidious PR after this 😆
…ctorRelatedMixTest.testRelatedItems test disabled
This PR adds more parameters to InnerTube requests:
- the playbackContext JSON object sent in player requests of the desktop web client, only for the web client (it seems it was needed to avoid some throttling);
- a parameter not sent by all clients in player request bodies (I saw it on the web client and I do not see it right now): cpn, aka contentPlaybackNonce. It seems to be a sort of client authenticity value sent by all official clients on videoplayback URLs: it uses strong random values to generate a 16-character string. This PR adds it in both places (request bodies of player requests and videoplayback URLs) for all clients;
- the id of the video as a URL parameter of the InnerTube requests and a t parameter, on which we need to do more research (a 12-character string which is also unique to each player request), only sent for mobile clients (and only done by the YouTube apps);
- for player and next requests: racyCheckOk, for age-restricted contents (it doesn't seem to do anything when used anonymously), and contentCheckOk, to allow playback of contents which show a warning before playing because of the sensitive topics they contain;
- a new query parameter, set to false: prettyPrint. YouTube was returning pretty-printed responses before but that's no longer the case, because they added this parameter, which greatly reduces response sizes (but not really transfer size): take a look at the following screenshot:

(Where Transfert means Transfer and Taille means Size in French.)

This PR also supersedes #732 and fixes most of its problems:
YoutubeStreamExtractor;

As said in #732, fetching the iOS client (with a deviceModel field in the JSON payload which matches a recent Apple device model (see https://gist.github.com/adamawolf/3048717) to get 60 fps streams, and an iOS user agent to get a single HLS manifest with a regular streamingData JSON object instead of an hlsFormats object for livestreams) makes it possible to get an HLS manifest for regular videos and an HLS manifest with separated video and audio for livestreams.

This PR fixes some bugs with the extraction of the client version and the key, and uses a much lighter way than the current one (which is still used as a fallback) to find the client version and key of YouTube and YouTube Music, as used by their service workers, by fetching https://www.youtube.com/sw.js and https://music.youtube.com/sw.js respectively. A new method in the Utils class has been added to decrease code duplication and increase the readability of my changes.

Hardcoded client versions and mocks have also been updated to a more recent version.

Finally, this PR fixes the extraction of embeddable age-restricted contents (the ones available before), by using the newly discovered way to get streams of them, as written in TeamNewPipe/NewPipe#8102.

NewPipe Debug APK to test the changes (source code): app-debug.zip

Fixes TeamNewPipe/NewPipe#8102, fixes TeamNewPipe/NewPipe#8103, closes #680 (for the extractor implementation).
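A small sketch of how the prettyPrint query parameter described above could be appended to a request URL. The class and method names are hypothetical, the URL and key in main are placeholders, and the helper assumes the URL has no fragment part.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class QueryParameterSketch {
    // Appends name=value to the URL, using '?' or '&' depending on whether
    // the URL already has a query string.
    static String withQueryParameter(final String url, final String name, final String value) {
        final String separator = url.contains("?") ? "&" : "?";
        try {
            return url + separator
                    + URLEncoder.encode(name, "UTF-8") + "="
                    + URLEncoder.encode(value, "UTF-8");
        } catch (final UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException("UTF-8 is not supported", e);
        }
    }

    // Asks the API not to pretty print the response.
    static String disablePrettyPrint(final String url) {
        return withQueryParameter(url, "prettyPrint", "false");
    }

    public static void main(final String[] args) {
        // Placeholder endpoint and key, for illustration only.
        System.out.println(disablePrettyPrint("https://example.com/v1/player?key=EXAMPLE_KEY"));
    }
}
```

The same helper could be reused for the cpn value on videoplayback URLs, since it only deals with appending one parameter.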