Support text tracks #108

Open
benwiley4000 opened this issue Jun 5, 2017 · 12 comments

@benwiley4000
Owner

benwiley4000 commented Jun 5, 2017

More info here.

For text tracks:

  1. We should support a textTracks property on the playlist track object which is a list of objects describing the properties in an HTMLTrackElement. We will dynamically add tracks to the video element (this article has good advice).
  2. We need a way to select which text tracks are active. Captions need to be toggled on and off. For other kinds of text tracks (subtitles, description, chapters, metadata) I'm not sure whether or how they are displayed. How does the browser handle them? Do we need to do anything special since we're not displaying the browser's default video controls?
  3. We need a way to select which language the user prefers. Which language is default? Does the developer need to hardcode this or can it be detected from the user's system?
  4. We can have a boolean prop called displayNativeCaptions (or something) that is true by default but can be turned off to disable the browser's native text tracks display inside the video element.
  5. We'll want to make the cues available via playerContext, which means making most or all of the data from the VTTCue interface available. We'll want to expose all the active cues, and possibly all the cues for the active track(s).
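
As a rough illustration of item 1, here is a minimal sketch of adding track elements to a media element from a list of plain objects. The helper name `appendTextTracks`, the injectable `doc` parameter, and the descriptor fields (`kind`, `label`, `srclang`, `src`, `isDefault`) are assumptions made for this example, not a settled API.

```javascript
// Hypothetical helper: the name and the descriptor shape are assumptions.
// `doc` is injectable so the function can be exercised without a browser.
function appendTextTracks(mediaElement, textTracks, doc = document) {
  return textTracks.map(function (descriptor) {
    const trackElement = doc.createElement('track');
    // These assignments map onto standard HTMLTrackElement properties.
    trackElement.kind = descriptor.kind || 'subtitles';
    trackElement.label = descriptor.label || '';
    trackElement.srclang = descriptor.srclang || '';
    trackElement.src = descriptor.src;
    trackElement.default = Boolean(descriptor.isDefault);
    mediaElement.appendChild(trackElement);
    return trackElement;
  });
}
```

In a browser this could be called as `appendTextTracks(videoElement, track.textTracks)`, where `track` is the playlist track object.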

We should check out how related/similar libraries have handled APIs for text tracks. Most importantly we should see what options popular media libraries have for manipulating available text tracks from the user angle.

@benwiley4000
Owner Author

Here are a few videos from archive.org we could use for text track examples (we'll need to use YouTube to autogenerate subtitles).

https://archive.org/details/Computer1984

https://archive.org/details/commodore-64-training-tape-with-jim-butterfield

https://archive.org/details/TelephoneEtiquette

@benwiley4000
Owner Author

benwiley4000 commented Feb 24, 2019

We can run this in the console of a YouTube video page to generate some WebVTT text:

var AVOID_CONCURRENT_CAPTIONS = true;

// Shared handler: the try/catch below only sees synchronous errors,
// so the promise chain needs its own .catch().
function warnCaptionFailure(err) {
  console.warn(err);
  console.warn('Unable to get WebVTT caption data from YouTube video.');
}

try {
  var captionData = JSON.parse(ytplayer.config.args.player_response)
    .captions
    .playerCaptionsTracklistRenderer
    .captionTracks[0];
  fetch(captionData.baseUrl)
    .then(function(res) {
      return res.text();
    })
    .then(function(xmlString) {
      return new window.DOMParser().parseFromString(xmlString, 'text/xml');
    })
    .then(function(xmlParsed) {
      function pad2(number) {
        // thanks https://www.electrictoolbox.com/pad-number-two-digits-javascript/
        return (number < 10 ? '0' : '') + number;
      }

      function pad3(number) {
        return number >= 100 ? String(number) : ('0' + pad2(number));
      }

      // Convert seconds to a WebVTT timestamp (hh:mm:ss.ttt). Rounding to
      // whole milliseconds first prevents a ".1000" fraction when the
      // remainder rounds up to a full second.
      function formatTime(time) {
        var totalMs = Math.round(time * 1000);
        var hours = Math.floor(totalMs / 3600000);
        var minutes = Math.floor((totalMs % 3600000) / 60000);
        var seconds = Math.floor((totalMs % 60000) / 1000);
        var milliseconds = totalMs % 1000;
        return pad2(hours) + ':' + pad2(minutes) + ':' + pad2(seconds) + '.' + pad3(milliseconds);
      }

      var webVttContent = 'WEBVTT\n';
      var tempNode = document.createElement('div');
      // .children skips whitespace text nodes, which have no getAttribute().
      var textElements = xmlParsed.querySelector('transcript').children;
      Array.prototype.forEach.call(textElements, function(text, index) {
        webVttContent += '\n' + (index + 1) + '\n';
        var start = Number(text.getAttribute('start'));
        var end = start + Number(text.getAttribute('dur'));
        if (AVOID_CONCURRENT_CAPTIONS && index + 1 < textElements.length) {
          // Clamp the end time so this cue never overlaps the next one.
          var nextStart = Number(textElements[index + 1].getAttribute('start'));
          end = Math.min(end, nextStart);
        }
        webVttContent += formatTime(start) + ' --> ' + formatTime(end) + '\n';
        // Strip inline tags, then decode HTML entities via a scratch element.
        var content = text.textContent.replace(/<[^>]+>/g, '');
        tempNode.innerHTML = content;
        webVttContent += tempNode.textContent + '\n';
      });
      console.log(webVttContent);
    })
    .catch(warnCaptionFailure);
} catch (err) {
  warnCaptionFailure(err);
}

@benwiley4000
Owner Author

Telephone Etiquette: https://hastebin.com/uzibatapis.sql

Commodore 64 video: https://hastebin.com/lebuyaxifo.sql

Computer Music: https://hastebin.com/sidubidudu.sql

@benwiley4000
Owner Author

Some more notes:

  1. Browser language can be detected with navigator.languages (or if that's not available, navigator.language, or in Internet Explorer, navigator.browserLanguage... double-check compatibility). There should be a defaultLanguage prop to PlayerContextProvider, but if it's missing, we will auto-detect.
  2. The user can select a selectedTextTrack (final name TBD) which is null by default and can be one of the subtitle or caption tracks, but not anything else. Media players generally don't seem to make much of a distinction between captions and subtitles, even if there are subtle differences, and importantly, no one seems to want to allow playing both at once.
  3. The autoloadCaptions (final name TBD) boolean prop will cause selectedTextTrack to be initially set automatically based on the set or detected language. Could the isDefault property be used to break ties if there are multiple tracks in the same language?
  4. If displayNativeCaptions is true and selectedTextTrack is not null then:
  • We will display the selectedTextTrack
  • If a chapters track is available in the same language it will be shown, or else if one with the same language prefix (e.g. en for en-US) is available it will be shown, or else we will choose the first available. In all cases we can use isDefault for specificity.
  • Same as above for description
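
The fallback order described for chapters and description tracks could be sketched roughly as follows. `pickTrackForLanguage` is a hypothetical name, and the `language` and `isDefault` fields on each track object are assumed for illustration.

```javascript
// Hypothetical helper: exact language match first, then language prefix,
// then any track, preferring a track marked isDefault at every step.
function pickTrackForLanguage(tracks, language) {
  if (tracks.length === 0) {
    return null;
  }
  const wanted = language.toLowerCase();
  const wantedPrefix = wanted.split('-')[0];

  function normalized(track) {
    return (track.language || '').toLowerCase();
  }
  function preferDefault(candidates) {
    return candidates.find((track) => track.isDefault) || candidates[0];
  }

  const exact = tracks.filter((track) => normalized(track) === wanted);
  if (exact.length > 0) {
    return preferDefault(exact);
  }
  const samePrefix = tracks.filter(
    (track) => normalized(track).split('-')[0] === wantedPrefix
  );
  if (samePrefix.length > 0) {
    return preferDefault(samePrefix);
  }
  return preferDefault(tracks);
}
```

The same function could be run once over the chapters tracks and once over the description tracks, passing the language of the selectedTextTrack.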

I think this resolves all of the questions above except for:

5. We'll want to make the cues available via playerContext, which means making most or all of the data from the VTTCue interface available. We'll want to expose all the active cues, and possibly all the cues for the active track(s).

I'd say that's still TBD, but we can go ahead and implement everything else. Perhaps for Cassette v2.0 we can skip displayNativeCaptions and the text track info in context, and spin that stuff out into a post-2.0 issue.
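
For the still-open question of exposing cue data, one possible direction is to copy VTTCue properties into plain objects before placing them in context. This is only a sketch: `serializeCue` and `serializeCueList` are hypothetical names, and which fields to include is exactly the part that remains TBD.

```javascript
// Hypothetical helpers. The properties read here all exist on VTTCue;
// the selection of fields is an assumption, not a decided API.
function serializeCue(cue) {
  return {
    id: cue.id,
    startTime: cue.startTime,
    endTime: cue.endTime,
    text: cue.text,
    align: cue.align,
    line: cue.line,
    position: cue.position,
    size: cue.size,
    vertical: cue.vertical
  };
}

function serializeCueList(cueList) {
  // TextTrackCueList is array-like rather than a real array, and
  // track.cues / track.activeCues can be null for a disabled track.
  return Array.prototype.map.call(cueList || [], serializeCue);
}
```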

@benwiley4000 benwiley4000 added this to the 2.0 beta release milestone Feb 27, 2019
@benwiley4000
Owner Author

benwiley4000 commented Feb 27, 2019

Implementation outline for 2.0 beta:

  • textTracks prop for setting track elements in media element.
  • Implement autoloadCaptions to select the first subtitles/captions available
  • New @cassette/player control for selecting the selectedTextTrack
  • defaultLanguage prop and autodetection of language for use with autoloadCaptions
  • Display chapters and description tracks taking language into account

@benwiley4000
Owner Author

I thought about whether we should support props to override how descriptions, chapters and metadata text tracks are chosen for a given subtitles or captions track. I decided we can leave the door open for this in the future, but it's a real edge case that might not need support.

@benwiley4000
Owner Author

The CaptionsSelector component has been added. I need to fix a bug (in Chrome, but not Firefox it seems) where showing one captions text track shows all the captions text tracks on load.
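
For context, a common way to keep only one subtitles/captions track visible is to set the mode of every such track explicitly rather than relying on browser defaults. The sketch below is illustrative only and is not necessarily how the bug is addressed in Cassette; `applySelectedTextTrack` is a hypothetical name.

```javascript
// Hypothetical helper: show the selected subtitles/captions track and
// disable every other one. Other kinds (chapters, metadata, ...) are
// left untouched.
function applySelectedTextTrack(textTrackList, selectedTrack) {
  for (let i = 0; i < textTrackList.length; i++) {
    const track = textTrackList[i];
    if (track.kind !== 'subtitles' && track.kind !== 'captions') {
      continue;
    }
    // 'showing' and 'disabled' are standard TextTrack mode values.
    track.mode = track === selectedTrack ? 'showing' : 'disabled';
  }
}
```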

@benwiley4000
Owner Author

That issue is fixed now!

@benwiley4000
Owner Author

I think we're ok to handle displayNativeCaptions actually. The big thing to handle is to make sure we update activeTextTracks[?].activeCues appropriately.
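
One way to keep activeCues current while native rendering is turned off is to put the track in 'hidden' mode, which keeps it active and firing cuechange events without drawing cues. A minimal sketch, assuming a hypothetical `watchActiveCues` helper and callback shape:

```javascript
// Hypothetical helper: keeps the track active and reports its active
// cues on every cuechange. Returns a function that removes the listener.
function watchActiveCues(track, displayNativeCaptions, onActiveCues) {
  // 'hidden' still loads cues and fires cuechange, but renders nothing.
  track.mode = displayNativeCaptions ? 'showing' : 'hidden';

  function handleCueChange() {
    // activeCues is an array-like TextTrackCueList; copy it to an array.
    onActiveCues(Array.prototype.slice.call(track.activeCues || []));
  }

  track.addEventListener('cuechange', handleCueChange);
  handleCueChange(); // report the initial state

  return function stopWatching() {
    track.removeEventListener('cuechange', handleCueChange);
  };
}
```

Calling the returned function on unmount removes the listener again.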

@benwiley4000
Owner Author

Three more items to check off:

  • displayNativeCaptions prop
  • Update activeCues appropriately
  • Persist selected text tracks info in snapshot somehow
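
For the snapshot item, one possible approach is to store a small serializable descriptor of the selected track and look the track up again on restore. The helper names and the descriptor shape below are assumptions, and this does not reflect the actual snapshot format.

```javascript
// Hypothetical helpers. kind, language and label are standard TextTrack
// properties; using them together as an identity key is an assumption.
function describeTextTrack(track) {
  if (!track) {
    return null;
  }
  return { kind: track.kind, language: track.language, label: track.label };
}

function findTextTrack(textTrackList, descriptor) {
  if (!descriptor) {
    return null;
  }
  for (let i = 0; i < textTrackList.length; i++) {
    const track = textTrackList[i];
    if (
      track.kind === descriptor.kind &&
      track.language === descriptor.language &&
      track.label === descriptor.label
    ) {
      return track;
    }
  }
  return null;
}
```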

@benwiley4000
Owner Author

  • Make sure CC menu doesn't flow off screen when opened.

@benwiley4000
Owner Author

  • getMenuItemLabelForTextTrack prop for CaptionsSelector
  • Make sure all child listeners (texttracks, sources) are removed on component unmount
