[extractor/arte]: switch from api/v1 to api/v2 #29640

Closed
3 tasks done
casaper opened this issue Jul 24, 2021 · 3 comments · May be fixed by #29653

casaper commented Jul 24, 2021

Checklist

  • I'm asking a question
  • I've looked through the README and FAQ for similar questions
  • I've searched the bugtracker for similar questions including closed ones

Question

I've noticed that the Arte.tv extractor uses the older V1 API to fetch the extraction data.
For example:

https://api.arte.tv/api/player/v1/config/de/086124-001-A

Currently, this works just fine. But the website itself uses the newer V2 API:

https://api.arte.tv/api/player/v2/config/de/086124-001-A

This one, however, requires bearer token authentication to return the data.
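A minimal sketch of what a V2 request might look like, assuming the token is sent in a standard `Authorization: Bearer` header. The thread only says that bearer token authentication is required, so the header handling is an assumption. `build_config_request` and the placeholder token are hypothetical names for illustration, and the request is only built, never sent.

```python
import urllib.request

API_BASE = "https://api.arte.tv/api/player"


def build_config_request(video_id, lang="de", version="v2", token=None):
    """Build (but do not send) a request for the player config endpoint."""
    url = "%s/%s/config/%s/%s" % (API_BASE, version, lang, video_id)
    request = urllib.request.Request(url)
    if token is not None:
        # Assumption: the token is passed as a standard bearer credential.
        request.add_header("Authorization", "Bearer " + token)
    return request


# Hypothetical usage with a placeholder token (not a real credential).
req = build_config_request("086124-001-A", token="PLACEHOLDER_TOKEN")
print(req.full_url)
print(req.get_header("Authorization"))
```

Passing `version="v1"` without a token yields the older URL shown earlier, with no `Authorization` header attached.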

As long as the old V1 API is still up and running, there is no actual need to change anything.

But I'm afraid that some day the folks at arte.tv will shut down the old API, and the extractor will then stop working.

Refactor the Arte.tv extractor to API V2

I have already found out where and how the browser obtains the authentication token.
It is only lightly hidden behind a Next.js lazy-loading script, so the level of obfuscation is minor. I presume other sites do far worse than that.

So, would such a refactor of the extractor be approved?

I'm asking because I don't want to invest time in creating a PR that is then ignored or rejected for whatever reason.

casaper commented Jul 24, 2021

curl -s 'https://static-cdn.arte.tv/guide/manifest.js?ver=210715143858' | egrep -oh '"default":{"token":"[a-zA-Z0-9_-]*"'
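The same extraction could be sketched in Python, working on script text that has already been downloaded. The pattern mirrors the `egrep` expression above with a capture group added. `extract_token` is a hypothetical helper name, and the sample string is made up to match the pattern's shape rather than copied from the real script.

```python
import re

# Same pattern as the egrep expression, with a capture group for the token.
TOKEN_RE = re.compile(r'"default":\{"token":"([a-zA-Z0-9_-]*)"')


def extract_token(script_text):
    """Return the first token found in the script text, or None."""
    match = TOKEN_RE.search(script_text)
    return match.group(1) if match else None


# Made-up sample shaped like the text the pattern expects.
sample = 'x={"default":{"token":"abc_DEF-123"},"other":1}'
print(extract_token(sample))
```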

dirkf commented Oct 23, 2021

> curl -s 'https://static-cdn.arte.tv/guide/manifest.js?ver=210715143858' | egrep -oh '"default":{"token":"[a-zA-Z0-9_-]*"'

Or now:

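# Print the API token for the arte.tv video page URL given as $1.
get_token() {
    # Find the Next.js chunk URL in the page, fetch that script, then extract the token.
    curl -s "$(curl -s "$1" |
                        grep -Eo 'https?://static-cdn\.arte\.tv/replay/_next/static/chunks/224\.[[:alnum:]-]+\.js')" |
        sed -rn -e 's/^.+"default":\{"token":"([a-zA-Z0-9_-]+)".*$/\1/p'
}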
$ get_token 'https://www.arte.tv/en/videos/088501-000-A/mexico-stealing-petol-to-survive/'
MzYyZDYyYmM1Y2Q3ZWRlZWFjMmIyZjZjNTRiMGY4MzY4NzBhOWQ5YjE4MGQ1NGFiODJmOTFlZDQwN2FkOTZjMQ
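The two-step lookup in `get_token` could be sketched in Python roughly as follows. This is an illustration under assumptions, not the extractor's actual code: `find_chunk_url` and the Python `get_token` are hypothetical names, and `fetch` stands for any callable that maps a URL to its text, so the logic can be tried without network access. Unlike the `sed` expression, it returns the first token match in the script.

```python
import re

# Chunk URL pattern taken from the shell function above.
CHUNK_RE = re.compile(
    r'https?://static-cdn\.arte\.tv/replay/_next/static/chunks/224\.[A-Za-z0-9-]+\.js')
TOKEN_RE = re.compile(r'"default":\{"token":"([a-zA-Z0-9_-]+)"')


def find_chunk_url(page_html):
    """Return the first matching chunk script URL in the page, or None."""
    match = CHUNK_RE.search(page_html)
    return match.group(0) if match else None


def get_token(page_html, fetch):
    """Locate the chunk script, fetch it via `fetch`, and extract the token."""
    chunk_url = find_chunk_url(page_html)
    if chunk_url is None:
        return None
    match = TOKEN_RE.search(fetch(chunk_url))
    return match.group(1) if match else None


# Made-up page and script contents standing in for real downloads.
fake_url = "https://static-cdn.arte.tv/replay/_next/static/chunks/224.abc123-def.js"
fake_files = {fake_url: 'e.exports={"default":{"token":"tok_EN-1"}}'}
fake_page = '<script src="%s"></script>' % fake_url
print(get_token(fake_page, fake_files.__getitem__))
```

Keeping the download behind a `fetch` argument is a design choice for this sketch only; it lets the regular expressions be checked against saved copies of the page and script.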

dirkf commented Nov 30, 2022

Continued in #30878.
