YouTube changed transcript timing. Can we use the CC file instead? #40
reaper-sid
started this conversation in
Ideas
YouTube recently started merging multiple CC lines into each single line of the transcript, which makes youtube2Anki much less useful. I found that the CC file can be pulled as XML. The links to the various CC files are in the HTML of the video page, inside a section that looks like "captions":{"playerCaptionsTracklistRenderer":{"captionTracks":.
After replacing \u0026 with &, the URLs look like this:
https://www.youtube.com/api/timedtext?v=[video_id]&caps=asr&xoaf=5&hl=en&ip=0.0.0.0&ipbits=0&expire=[expire_code]&sparams=ip,ipbits,expire,v,caps,xoaf&signature=[signature_code]&key=yt8&lang=en
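Here is a minimal sketch of how those links could be pulled out of the page HTML. It assumes the value after "captionTracks": is a JSON array and that each entry stores its link under a "baseUrl" key. The key name is my assumption, and the function names are made up for illustration. Decoding the array as JSON also turns \u0026 back into &, so no manual replacement is needed.

```python
import json

MARKER = '"captionTracks":'


def extract_caption_tracks(page_html: str) -> list:
    """Return the JSON array that follows the captionTracks marker, or []."""
    start = page_html.find(MARKER)
    if start == -1:
        return []
    idx = start + len(MARKER)
    # raw_decode does not skip leading whitespace, so do it by hand.
    while idx < len(page_html) and page_html[idx].isspace():
        idx += 1
    # raw_decode parses exactly one JSON value and ignores the rest of the page.
    tracks, _end = json.JSONDecoder().raw_decode(page_html, idx)
    return tracks


def caption_urls(page_html: str, url_key: str = "baseUrl") -> list:
    """Collect the caption URLs; url_key is an assumed field name."""
    return [t[url_key] for t in extract_caption_tracks(page_html) if url_key in t]
```

Calling caption_urls(html_of_video_page) should give one URL per caption track. The signature and expire parameters in those URLs suggest the links are short-lived, so they probably have to be fetched soon after the page is loaded.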
Would it be possible to rewrite youtube2Anki to use the CC file from those links instead of the transcript? That would give more granular text and timing data.
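As a rough idea of what reading the CC file might look like, here is a sketch that assumes the XML is a flat list of <text start="..." dur="..."> elements with HTML-escaped caption text inside. That layout is an assumption on my part and the actual file may differ. parse_caption_xml is a hypothetical helper, not something youtube2Anki has today.

```python
import html
import xml.etree.ElementTree as ET


def parse_caption_xml(xml_text: str) -> list:
    """Return (start_seconds, duration_seconds, text) for each caption line.

    Assumes <text start="..." dur="..."> elements; adjust if the file differs.
    """
    root = ET.fromstring(xml_text)
    cues = []
    for node in root.iter("text"):
        start = float(node.get("start", "0"))
        dur = float(node.get("dur", "0"))
        # Caption text may carry a second layer of HTML entities (e.g. &#39;).
        text = html.unescape(node.text or "")
        cues.append((start, dur, text))
    return cues
```

Each tuple would map to one card, with start and start + dur as the audio clip boundaries.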