Youtube changed transcript timing. Can we use the CC file instead? #40
reaper-sid
started this conversation in
Ideas
Replies: 0 comments
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
-
Youtube recently decided to merge multiple lines of the CC into each single line of the transcript. This makes youtube2Anki much less useful. I found that the CC file can be pulled as XML. You can find the links to the various CC files in the HTML of the video page below a section that looks like "captions":{"playerCaptionsTracklistRenderer":{"captionTracks":.
After replacing \u0026 with &, the URLs look like this:
https://www.youtube.com/api/timedtext?v=[video_id]&caps=asr&xoaf=5&hl=en&ip=0.0.0.0&ipbits=0&expire=[expire_code]&sparams=ip,ipbits,expire,v,caps,xoaf&signature=[signature_code]&key=yt8&lang=en
Would it be possible to rewrite to use the CC file from those links instead of the transcript for a more granular set of data and timing?
Beta Was this translation helpful? Give feedback.
All reactions