
Feature: Select info methods #950

Closed
konsumer opened this issue Jun 24, 2021 · 2 comments

konsumer commented Jun 24, 2021

I am making a very light proxy, where I need the raw stream URLs for "regular" videos, meaning not age-restricted or otherwise weird. As it is, this library makes 3 requests for info in getBasicInfo:

let info = await pipeline([id, options], validate, retryOptions, [
    getWatchHTMLPage,
    getWatchJSONPage,
    getVideoInfoPage,
  ]);

This data is needed for special videos and for collecting other important info, but not all use cases need it.

I am thinking just exporting getWatchHTMLPage would do the trick for me, and I could skip the other stuff. Alternatively, an option for getBasicInfo to make fewer requests would also support my use case.
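
For illustration only, such an option might look like this (the watchPageOnly name is made up, not an existing ytdl-core option):

// Hypothetical: only fetch the watch HTML page, skip the JSON/video-info fallbacks.
const info = await ytdl.getBasicInfo(id, { watchPageOnly: true })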

I am happy to PR for the feature, if interested. I would much prefer to share the effort of keeping getWatchHTMLPage up-to-date, rather than maintain my own copy of something very similar.

In my current code, I am using this, and filtering URLs to get stream links that work:

import fetch from 'node-fetch'

// ytInitialPlayerResponse is embedded as a JSON blob in the watch-page HTML.
// No `g` flag: a stateful lastIndex would break repeated exec() calls.
const regex = /var ytInitialPlayerResponse = (.+);<\/script>/

export async function getWatchHTMLPage (id) {
  const r = await fetch(`https://www.youtube.com/watch?v=${id}`)
  const str = await r.text()
  const m = regex.exec(str)
  if (m && m.length === 2) {
    return JSON.parse(m[1])
  }
}

// test
getWatchHTMLPage('K-281doxOMc')
  .then(console.log)
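
To give a rough idea of the filtering step, here is a sketch only; it assumes the usual playerResponse.streamingData shape, where formats without a signatureCipher carry a direct url:

// Sketch: collect formats that expose a plain url (nothing to decipher).
const player = await getWatchHTMLPage('K-281doxOMc')
const formats = [
  ...(player?.streamingData?.formats || []),
  ...(player?.streamingData?.adaptiveFormats || [])
]
const playable = formats.filter(f => f.url && !f.signatureCipher)
console.log(playable.map(f => f.url))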

But like I said, exporting getWatchHTMLPage here would be preferred.

This is somewhat related to #945, as I started working in this direction when that started failing (which in my case is resolved by simply not calling the other data functions).

gatecrasher777 (Contributor) commented Jun 25, 2021

The library only executes those extra requests if the previous one failed.

If you want something really light and efficient, the innertube player?key= POST requests return the player response already in JSON format. These requests use much less bandwidth and overhead than fetching the watch pages, but you need to maintain a session in which you get all the info needed to make those calls from an initial call to the YouTube home page (which also gives you the html5player to be used for deciphering URLs).

The player?key= POST requests don't work for age-restricted videos, though. For those you still need to use the watch page with a logged-in token/cookie.

The player requests also work in an unlimited, anonymous way, i.e. without cookies and without 429s.
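
A rough sketch of what such a call can look like, assuming the API key and client version have already been scraped from the home-page HTML (the parameter names and exact request shape here are illustrative, not taken from this library):

import fetch from 'node-fetch'

// Sketch only: apiKey and clientVersion are assumed to come from the
// home-page session described above.
async function getPlayerResponse (videoId, apiKey, clientVersion) {
  const r = await fetch(`https://www.youtube.com/youtubei/v1/player?key=${apiKey}`, {
    method: 'POST',
    headers: { 'content-type': 'application/json' },
    body: JSON.stringify({
      context: { client: { clientName: 'WEB', clientVersion } },
      videoId
    })
  })
  return r.json() // same playerResponse JSON that the watch page embeds
}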

konsumer (Author) commented Jun 25, 2021

> The library only executes those extra requests if the previous one failed.

Ah, yes, sorry, I just had a look at pipeline. Cool!

> But you need to maintain a session

I use redirects in a lambda, so maintaining a session would probably be problematic without drastically changing how it works.

I think exporting them might be helpful for other things (like more atomic testing, or working around failures that crash data collection), but I can totally live with it as is. Feel free to close this issue.
