
<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <title>Web Media Application Developer Guidelines</title>
    <script src="https://www.w3.org/Tools/respec/respec-w3c" class="remove" defer></script>
    <script class="remove">
      var respecConfig = {
        specStatus: "CG-DRAFT",
        editors: [{
          name: "Joel Korpi",
          company: "AppNexus",
          companyURL: "http://www.appnexus.com"
        },{
          name: "Thasso Griebel",
          company: "CastLabs",
          companyURL: "http://www.castlabs.com"
        }],
        formerEditors: [{
          name: "Jeff Burtoft",
          company: "Microsoft",
          companyURL: "http://www.microsoft.com"
        }],
        group: "webmediaapi",
        wgURI: "https://www.w3.org/community/webmediaapi/",
        github: "https://github.com/w3c/webmediaguidelines",
        copyrightStart: 2017,
        additionalCopyrightHolders: "Consumer Technology Association",
        logos: [{
          src: 'https://www.w3.org/StyleSheets/TR/2016/logos/W3C',
          href: "https://www.w3.org",
          alt: "World Wide Web Consortium (W3C)",
          width: 72,
          height: 48,
          id: 'w3c-logo',
        }, {
          src: 'https://cdn.cta.tech/cta/media/media/home/cta-logo.png',
          href: "https://cta.tech/Research-Standards.aspx",
          alt: "Consumer Technology Association (CTA)",
          width: 100,
          height: 65,
          id: 'cta-logo',
        }]
      };
    </script>
    <style>
      .head > a.logo {
        float: right;
        margin: 0.4rem 0 0.2rem .4rem;
      }
      .head > a.logo {
        border: none;
        text-decoration: none;
        background: transparent;
      }

      table {
        border-spacing: 0;
        border-collapse: collapse;
        width: 100%;
      }
      td {
        border: 1px solid gray;
        padding: 4px;
        vertical-align: top;
      }
    </style>
  </head>
  <body class="informative">
    <section id="abstract">
      <p> This specification is a companion guide to the <a
          href="https://w3c.github.io/webmediaapi/" target="_blank">Web
          Media API spec</a>. While the Web Media API spec is targeted
        at device implementations to support media web apps in 2018,
        this specification will outline best practices and developer
        guidance for implementing web media apps. This specification
        should be updated at least annually to keep pace with the
        evolving web platform. The target devices will include any
        device that runs a modern HTML user agent, including
        televisions, game machines, set-top boxes, mobile devices and
        personal computers. </p>
    </section>
    <section id="sotd">
      <p style='border: 5px solid red; border-radius: 10px; padding: 1em; margin: 1em; text-align: center; font-weight: 300; font-size: 120%;'>🚩
        <br>This document is deprecated and <strong>MUST NOT</strong> be used for further technical work.</p>
        <p style='position: fixed; bottom: 1em; right: 1em; padding: 0.5em; border-radius: 5px; background-color: red; color: white;font-weight: bold; font-size: 150%;writing-mode: vertical-rl;
        text-orientation: mixed;'><span style='background-color: white;padding: 4px; border-radius: 3px;z-index: 1000;'>🚩</span> Deprecated document. Do not use.</p>
      <p>This is a deprecated document and is no longer updated. It is inappropriate to cite this document as other than deprecated work.</p>
    </section>
    <section>
      <h2>Introduction</h2>
      <ol class="ednote" title="Notes on v1 draft specification:">
        <li>This document is directed towards application developers.
          It contains best practices for building media
          applications across devices; it does not provide direction to device
          manufacturers or User Agent implementers. </li>
        <li>This is a companion spec put forth by the Web Media API
          Community Group.</li>
      </ol>
      <section>
      <h3>Scope</h3>
      <p>The scope of this document includes general guidelines, best
        practices, and examples for building media applications across
        web browsers and devices. </p>
      <p>The target audience for these guidelines is software
        developers and engineers focused on building cross-platform,
        cross-device, HTML5-based applications that contain
        media-specific use-cases.</p>
      <p>The focus of this document is on HTML5-based applications;
        however, the use-cases and principles described in the guidelines
        can be applied to native applications (applications that have
        been developed for use on a particular platform or device). The
        examples in this document provide a starting point to build
        your media application and include example implementations from
        various providers and vendors. This document also includes
        sample content and <a>manifests</a> as well as <a>encoding</a> guidelines to
        provide hints on achieving the best quality and
        efficiency for your media applications.</p>
      </section>
      <section>
      <h3>Accessibility</h3>
      <p>These guidelines will cover making your applications compliant
        with accessibility requirements from the perspective of
        delivering and consuming media. However, to make sure your
        entire application is accessibility-friendly, please see the W3C Web Content
          Accessibility Guidelines [[WAI-WEBCONTENT]].</p>
      </section>
      <section>
      <h3>Glossary of Terms</h3>
      <p>The following lexicon provides the common language used to build this document and communicate concepts to the end reader.<br>
      </p>
        <table>
            <tr>
              <td><b>Term</b></td>
              <td><b>Definition</b></td>
            </tr>
            <tr>
              <td><b>360 Video</b></td>
              <td>Video content where a view in every direction is
                recorded at the same time, shot using an
                omni-directional camera or a collection of cameras.
                During playback the viewer has control of the viewing
                direction like a spherical panorama.</td>
            </tr>
            <tr>
              <td><dfn>AAC</dfn><br>
              </td>
              <td>Advanced Audio Coding is an ISO/IEC
                audio coding standard for lossy digital audio
                compression. Designed to be the successor of the MP3
                format, AAC generally achieves better sound quality than
                MP3 at the same <a>Bit rate</a>.<br>
              </td>
            </tr>
            <tr>
              <td><dfn>ABR</dfn><br>
              </td>
              <td>Adaptive Bitrate Streaming is a method to optimize
                video playback by automatically selecting a specific bitrate rendition
                of a video file within the video player. Some ABR algorithms use client bandwidth throughput to determine
                which <a>rendition</a> to play. For example, if a client tries to play a video on a slow internet connection, the
                player could automatically select a lower bitrate <a>rendition</a> to avoid stalls and keep playback smooth.<br>
              </td>
            </tr>
            <tr>
              <td><b>AVOD</b></td>
              <td>Advertising-supported Video on Demand. AVOD services
                monetize their content by serving ads to users, as
                opposed to other business models such as paid
                subscription or pay-per-title.</td>
            </tr>
            <tr>
              <td><dfn>Bit rate</dfn><br>
              </td>
              <td> Bit rate (also known as data rate) is
                the amount of data used for each second of video. In the
                world of video, this is generally measured in kilobits
                per second (kbps), and can be constant or variable. </td>
            </tr>
            <tr>
              <td><dfn>codec</dfn><br>
              </td>
              <td>A codec is an algorithm defining the compression
                and decompression of digital video or audio. Most codecs
                employ proprietary coding algorithms for data
                compression. MP3, <a>H.264</a>, and <a>HEVC</a> are examples of
                codecs.<br>
              </td>
            </tr>
            <tr>
              <td><dfn>CDN</dfn></td>
              <td>A content delivery network (CDN) is a system of
                distributed servers (network) that deliver pages and
                other media content to a user, based on the geographic
                locations of the user, the origin of the webpage and the
                content delivery server.</td>
            </tr>
            <tr>
              <td><dfn>Chunk</dfn> or <b>Chunking</b></td>
              <td><a>ABR</a> technologies typically break a video or audio
                stream into chunks to make transmission more compact and
                efficient. These chunks are typically 2-10 seconds in
                length and contain at least one <a>I-Frame</a> so that the
                video player has complete information from which to
                render the video from that point in the stream. </td>
            </tr>
            <tr>
              <td><dfn>Closed Captions</dfn></td>
              <td>Used for accessibility and usability, closed captions
                are a visual representation of the audio content of a
                media file stored as a metadata file or track and
                displayed as an overlay in the video player in
                synchronization with the video/audio track. Typical
                formats for closed captioning are WebVTT and SRT.<br>
              </td>
            </tr>
            <tr>
              <td><dfn>DRM</dfn></td>
              <td>A combination of encryption, authentication and access
                control technologies that are used to prevent
                unauthorized users from consuming copyrighted media
                content. The most widely used DRM solutions today are
                Microsoft PlayReady, Google Widevine, and Apple FairPlay
                Streaming.<br>
                <br>
                There are also open technologies that use standards such
                as Common Encryption [[MPEGCENC]] and Media Source
                Extensions (MSE) [[MEDIA-SOURCE]] to create a secure environment without
                third-party products.<br>
              </td>
            </tr>
            <tr>
              <td><dfn>Embed</dfn></td>
              <td>Most video players in a web page use what's called an
                "Embed" or "Embed Code" that is placed in the code of
                your HTML page or app that will render the video
                playback environment. </td>
            </tr>
            <tr>
              <td><dfn>Encoding</dfn></td>
              <td>Also referred to as compression, this is the process
                of removing redundancies from raw video and audio data,
                thereby reducing the amount of data required to deliver
                the media across a network, on a physical disc, etc. The
                process can result in reduced visual or auditory quality
                of the encoded media, but the loss is usually
                imperceptible to the user. <a>H.264</a> and <a>HEVC</a> are examples
                of <a>codec</a>s that use compression during <a>transcoding</a>. </td>
            </tr>
            <tr>
              <td><dfn>EME</dfn> (Encrypted Media Extensions)</td>
              <td>Encrypted Media Extensions (EME) is a W3C
                Recommendation that provides a communication channel
                between web browsers and digital rights management (<a>DRM</a>)
                agent software. This allows the use of HTML5 video to
                play back <a>DRM</a>-wrapped content such as streaming video
                services without the need for third-party media plugins
                like Adobe Flash or Microsoft Silverlight. The use of a
                third-party key management system may be required,
                depending on whether the publisher chooses to scramble
                the keys.<br>
              </td>
            </tr>
            <tr>
              <td><dfn>Container</dfn> (Format)<br>
              </td>
              <td>A format or "container format" is used
                to bind together video and audio information, along with
                other information such as metadata or even subtitles.
                For example, .MP4, .MOV, and .WMV are all container
                formats that can hold audio, video, and metadata in
                a single file. Formats contain tracks that are encoded
                using <a>codec</a>s. For example, an .MP4 might use the <a>AAC</a>
                audio <a>codec</a> together with the <a>H.264</a> video <a>codec</a>.
              </td>
            </tr>
            <tr>
              <td><dfn>H.264</dfn></td>
              <td>Also known as MPEG-4 AVC (Advanced Video Coding), it is
                now one of the most commonly used recording formats for
                high definition video. It offers significantly greater
                compression than previous formats. </td>
            </tr>
            <tr>
              <td><dfn>HEVC</dfn><br>
              </td>
              <td>High Efficiency Video Coding is a
                newer-generation video <a>codec</a> that can
                achieve roughly twice the compression efficiency of <a>H.264</a>, but it
                requires an HEVC decoder to be present on the device.
                HEVC typically takes considerably more processing power
                to encode and decode than older <a>codec</a>s such as <a>H.264</a>.<br>
              </td>
            </tr>
            <tr>
              <td><dfn>HLS</dfn></td>
              <td>HTTP Live Streaming (HLS) is an adaptive streaming
                technology created by Apple that allows the player to
                automatically adjust the quality of a video delivered to
                a web page based on changing network conditions to
                ensure the best possible viewer experience. </td>
            </tr>
            <tr>
              <td><dfn>I-Frame</dfn> (video)</td>
              <td>In video compression, an I-Frame is an independent
                frame that is not dependent on any future or previous
                frames to present a complete picture. I-Frames are
                necessary to provide full key frames for the
                encoder/decoder. </td>
            </tr>
            <tr>
              <td><dfn>In-stream</dfn> (Advertisement)</td>
              <td>A video advertisement that accompanies the main video content. Examples include pre-rolls, mid-rolls, and post-rolls which play before, during, or after a main piece of content, respectively.</td>
            </tr>
            <tr>
              <td><dfn>Live Streaming</dfn></td>
              <td>Live streaming is a type of broadcast that is
                delivered over the Internet where the source content is
                typically a live event such as a sporting event,
                religious service, etc. Unlike <a>VOD</a>, viewers of a live
                stream all watch the same broadcast at the same time. </td>
            </tr>
            <tr>
              <td><dfn>Manifest</dfn></td>
              <td>A manifest is a playlist file for adaptive streaming
                technologies (such as <a>HLS</a>, DASH, etc.) that provides the
                metadata information for where to locate each segment of
                a video or audio source. Depending on the configuration,
                some technologies have two types of manifests: one
                "master" manifest that contains the location of each
                <a>rendition</a> manifest and one <a>rendition</a> manifest for each
                <a>rendition</a> that contains the location (relative or
                absolute) of each <a>chunk</a> of a video or audio source. </td>
            </tr>
            <tr>
              <td><dfn>Media Renderer</dfn></td>
              <td>A process that takes as input file-based video bytes and renders those bytes
              to the client display, such as a computer monitor or television display. The media renderer is
              responsible for ensuring that the video is displayed accurately, per the encoded profile.</td>
            </tr>
            <tr>
              <td><b>Metadata Track</b></td>
              <td><a>ABR</a> streaming technologies can carry
                not only video and audio tracks within the
                stream, but also metadata tracks for
                applications such as <a>Closed Captions</a>, advertising cues,
                etc.<br>
              </td>
            </tr>
            <tr>
              <td><dfn>Mezzanine File</dfn></td>
              <td>An intermediate file typically created after post-production that is used for distribution or
                for input into another workflow process, such as transcoding. Mezzanine files
                are typically very high quality to ensure minimal
                quality degradation. Often, a lossless <a>codec</a> is used
                to achieve this high quality.<br>
              </td>
            </tr>
            <tr>
              <td><dfn>MPEG-DASH</dfn></td>
              <td>Dynamic Adaptive Streaming over HTTP (DASH), also
                known as MPEG-DASH, is an adaptive bitrate streaming
                technique that enables high quality streaming of media
                content over the Internet delivered from conventional
                HTTP web servers. </td>
            </tr>
             <tr>
              <td><b>Outstream Advertisement</b></td>
              <td>A video advertisement that stands alone, without accompanying main video
              content. Examples include a video ad that is <a>embed</a>ded within a text news article page that contains text and images as the main content medium.</td>
            </tr>
            <tr>
              <td><dfn>Player</dfn><br>
              </td>
              <td>A video or audio media player that is used to render a
                media stream in your application environment. There are
                commercially available and open source video players and
                SDKs for almost every platform. </td>
            </tr>
            <tr>
              <td><dfn>Rendition</dfn></td>
              <td>A specific video and audio stream for a target quality
                level or bitrate in an adaptive streaming set or
                manifest. For example, most <a>HLS</a> or DASH adaptive
                streaming manifests contain multiple renditions for
                different quality/bitrate targets so that the viewer
                automatically views the best quality content for their
                specific Internet connection speed. </td>
            </tr>
            <tr>
              <td><dfn>SDK</dfn></td>
              <td>A Software Development Kit is a set of tools that
                allows the creation of applications for a certain
                framework or development platform. Typically they are
                the implementation of one or more APIs to interface to a
                particular programming language, and include debugging
                utilities, sample code, and documentation. </td>
            </tr>
            <tr>
              <td><dfn>Streaming</dfn></td>
              <td>Delivering media content to a viewer via Internet protocols such as HTTP or
                RTMP.</td>
            </tr>
            <tr>
              <td><b>SVOD</b></td>
              <td>Subscription-supported Video on Demand. SVOD services
                monetize their content through paid subscriptions from
                users, as opposed to other business models such as
                advertising or pay-per-title.</td>
            </tr>
            <tr>
              <td><dfn>Transcoding</dfn></td>
              <td>The process of converting a media asset from one <a>codec</a> to another.</td>
            </tr>
            <tr>
              <td><dfn>Trick Play/Mode</dfn></td>
              <td>Trick mode, sometimes called trick play, is a feature of digital video systems including Digital Video Recorders and Video on Demand systems that mimics the visual feedback given during fast-forward and rewind operations that were provided by analogue systems such as VCRs.</td>
            </tr>
            <tr>
              <td><b>URL Signing</b></td>
              <td>URL signing is a mechanism for securing access to
                content. When URL signing enforcement is enabled on a
                property, requests to a content server or content
                delivery network must be signed using a secure token.
                The signing also includes an expiration time after which
                the link will no longer work.</td>
            </tr>
            <tr>
              <td><dfn>VOD</dfn></td>
              <td>Video on demand is video that is served at the request
                of a user. Delivery is typically through a content
                delivery network to optimize delivery speed and
                efficiency.<br>
              </td>
            </tr>
            <tr>
              <td><b>VR</b></td>
              <td>Virtual Reality is a realistic, immersive,
                three-dimensional environment, created using interactive
                software and hardware, and experienced or controlled by
                movement of the body. VR experiences typically require
                wearing a head mounted display (HMD).</td>
            </tr>
          </tbody>
        </table>
      <p>For a more detailed list of relevant terms, please see the
        glossary of terms for the vendors or technologies you use in
        your workflow. </p>

      </section>
    </section>
    <section>
      <h2>Media Playback Use Cases</h2>


        <section>
        <h3>General Description</h3>
        <p>Material (typically video or audio content) is made
          available by a content provider via a web-enabled application
          and delivered by a content distribution network. There are
          three distinct interlocking processes: generation, delivery
          and consumption / playback. </p>

        <section>

        <h4>Content Generation</h4>
        <p> <a>Mezzanine File</a> content is normally delivered to the service
          provider as a file with near lossless compression.
          At frame rates of 24, 30, 50, or
          60 frames per second, the bitrate of the original content is
          typically 25 Mbps to 150 Mbps for 1080p HD and 150 Mbps to 300 Mbps for 4K UHD. </p>
        <p><a>ABR</a> <a>streaming</a> content is generated by <a>transcoding</a> the <a>Mezzanine File</a> against a <a>transcoding</a> profile. First, the mezzanine file is transcoded into a set of versions, each with its own average bitrate. Second, each version is packaged into segments of a specified duration. </p>
        <p> The first process is defined by a <a>transcoding</a> profile. A
          profile describes the set of constraints to be used when video
          is being prepared for consumption by a range of video
          applications. The description includes the different bitrates
          to be generated during the <a>transcoding</a> process that will allow
          for the same content to be consumed on a wide variety of
          devices on different networks from cellular to LAN. </p>
        <p> The second process is performed by a packager, which segments
          each of the transcoded versions. Next, the output is packaged into a
          transport <a href='#dfn-container'>format</a> such as transport streams (.ts) or fragmented
          MP4s (.m4s). Lastly, the segments are optionally encrypted with a <a>DRM</a> system that is
          suitable for the environment where the content is going to be
          played out. The packager is also responsible for the
          generation of a <a>manifest</a> file, typically a DASH (.mpd), <a>HLS</a>
          (.m3u8) or possibly Smooth (.ism) or HDS (.f4m), that
          specifies the location of the media and its <a href='#dfn-container'>format</a>. </p>
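        <p>The two steps above can be sketched as data plus a little arithmetic. The following is only an illustration: the profile shape, the rendition names, the bitrates, and the helper functions are assumptions made for this example and are not taken from any particular encoder or packager.</p>

```javascript
// Hypothetical transcoding profile. All names and numbers are illustrative
// assumptions, not values recommended by this document.
const exampleProfile = {
  segmentDurationSeconds: 6,
  renditions: [
    { name: "low", width: 640, height: 360, bitrateKbps: 800 },
    { name: "medium", width: 1280, height: 720, bitrateKbps: 2500 },
    { name: "high", width: 1920, height: 1080, bitrateKbps: 5000 },
  ],
};

// Number of segments a packager would emit for one rendition.
// The last segment may be shorter than the others, hence the rounding up.
function segmentCount(contentDurationSeconds, segmentDurationSeconds) {
  return Math.ceil(contentDurationSeconds / segmentDurationSeconds);
}

// Approximate size of one full segment in kilobytes (kilobits divided by 8),
// assuming the rendition holds its average bitrate.
function approximateSegmentSizeKB(bitrateKbps, segmentDurationSeconds) {
  return (bitrateKbps * segmentDurationSeconds) / 8;
}

// Total number of segment files across every rendition in the profile.
function totalSegmentFiles(profile, contentDurationSeconds) {
  const perRendition = segmentCount(
    contentDurationSeconds,
    profile.segmentDurationSeconds
  );
  return perRendition * profile.renditions.length;
}
```

        <p>For a ten minute asset, this hypothetical profile would produce the same number of segments for each of its three renditions, which is what lets a player switch between them at segment boundaries.</p>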
        </section>
        <section>
        <h4>Content Delivery</h4>
        <p> After the content has been generated the resulting segments
          of video and corresponding <a>manifest</a> files are pushed to an
          origin server. However, the assets are rarely delivered
          directly from the origin. </p>
        <p> At this stage, the control in the chain switches to the
          client's web video application. The content provider supplies the client with the URL of a manifest file located on a <a>CDN</a> rather than the origin. The manifest URL is typically passed to a video <a>player</a> for playback. The player makes an HTTP GET request for the manifest from the <a>CDN</a> edge. If the <a>CDN</a> edge does not currently have the manifest available, the <a>CDN</a> requests a copy of the file from the origin and the file is cached at the edge for later use. The CDN edge then returns the manifest to the requesting player. <i>(NOTE: There are different levels of caching. Popular <a>VOD</a> content is kept closer to the edge of the <a>CDN</a> network so that it can be delivered to customers faster than from a server that isn't tuned for high volume delivery.)</i> </p>
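        <p>The cache-miss behavior described above can be modeled in a few lines. This is a deliberately simplified, synchronous sketch: <code>createEdgeCache</code> and <code>fetchFromOrigin</code> are hypothetical names, and real CDNs add expiry, tiered caches, and cache-control handling that are left out here.</p>

```javascript
// Hypothetical in-memory model of a single CDN edge node.
// fetchFromOrigin is a caller-supplied function that returns the file body
// for a path; it stands in for the request the edge makes to the origin.
function createEdgeCache(fetchFromOrigin) {
  const cache = new Map();
  return {
    get(path) {
      if (cache.has(path)) {
        // Cache hit: the origin is not contacted.
        return { body: cache.get(path), servedFrom: "edge" };
      }
      // Cache miss: fetch from the origin and keep a copy for later requests.
      const body = fetchFromOrigin(path);
      cache.set(path, body);
      return { body, servedFrom: "origin" };
    },
  };
}
```

        <p>In this model the first request for a manifest reaches the origin, while every later request for the same path is answered from the edge copy.</p>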
        </section>
        <section>

        <h4>Content Playback</h4>
        <p> As mentioned in the Content Generation section, the <a>player</a> uses a tag within the <a>manifest</a> to determine the playout type. In <a>HLS</a>, if there is a type with the value of <a>VOD</a>, the player will not reload the manifest. This has important consequences if there are changes in availability after the session has commenced. In DASH, the difference between a live and VOD playlist is more subtle. At this point, the behavior
          differs between players on different devices
          depending on the transport formats. Broadly, however, a player
          behaves in the following way: <br>
        </p>
        <ol>
            <li>While the player's <a>ABR</a> algorithm determines which
            <a>rendition</a> to use during playback, for instance based on
            the currently measured network throughput or the current buffer,
            the algorithm might not have enough information at playback start
            to use such metrics.  The initial decision depends on the player's
            <a>ABR</a> algorithm but is often based on prior knowledge. This
            can for instance be previous measurements of network throughput, a
            simple default value that is used if not enough data is available
            to estimate the current network throughput, or a custom algorithm to
            determine the initial bandwidth estimation.</li>
          <li>Once the player has this information it can then compare
            this with the metadata from the manifest that describes the
            different qualities that the content provider is supplying.
            It picks the quality level with an average bitrate that is
            as close as possible to the calculated throughput without exceeding it. This
            avoids a situation where playback is
            interrupted because the player requires more data than the
            current network bandwidth can supply, that is, where the player’s
            buffer is emptying faster than it is being filled. </li>
          <li>It then requests a segment from a location on the edge
            server that is typically relative to the location of the
            manifest. </li>
          <li>The downloaded segment is then added to the video decoder
              pipeline for playback. For players that are based on Media
              Source Extensions [[MEDIA-SOURCE]] the segment is appended to the
              corresponding media source buffer. Note that the segment is
              appended as is even if the content is encrypted. The
              underlying <a>DRM</a> system is usually integrated with the
              decoder pipeline and will decrypt the samples prior to decoding.
          </li>
          <li>The <a>media renderer</a> pulls the video data from the buffer and
            passes it to the video surface where it is rendered. </li>
          <li>If the available bandwidth remains constant, the player
            will continue to request segments from the same bitrate
            stream, pulling down chunks and filling its video buffer. In
            the event of a change in network availability, the player will
            decide whether to drop to a lower
            <a>rendition</a> or request a higher bitrate. In the case of <a>HLS</a>,
            switching the <a>rendition</a> will trigger a download of a new
            child manifest prior to the download of the associated segment. In the case
            of <a>MPEG-DASH</a>, the player can start downloading the associated
            segment directly.</li>
        </ol>
        <i>NOTE: This workflow may have additional steps if <a>DRM</a> is used.</i>
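        <p>The rendition selection in the steps above can be sketched as follows. This is one simple throughput-based rule, not the algorithm of any specific player: the function names, the rendition shape, and the 0.8 safety factor are assumptions chosen for illustration.</p>

```javascript
// Estimate throughput from one downloaded segment.
// Bits per millisecond is numerically equal to kilobits per second.
function measureThroughputKbps(bytesDownloaded, downloadMilliseconds) {
  return (bytesDownloaded * 8) / downloadMilliseconds;
}

// Pick the highest-bitrate rendition that fits within a safety margin of the
// estimated throughput. If none fits, fall back to the lowest rendition so
// that playback can still start.
function selectRendition(renditions, estimatedThroughputKbps, safetyFactor = 0.8) {
  const budgetKbps = estimatedThroughputKbps * safetyFactor;
  const sorted = [...renditions].sort((a, b) => a.bitrateKbps - b.bitrateKbps);
  let chosen = sorted[0];
  for (const rendition of sorted) {
    if (budgetKbps >= rendition.bitrateKbps) {
      chosen = rendition;
    }
  }
  return chosen;
}

// Illustrative renditions, as a player might read them from a manifest.
const exampleRenditions = [
  { name: "high", bitrateKbps: 5000 },
  { name: "low", bitrateKbps: 800 },
  { name: "medium", bitrateKbps: 2500 },
];
```

        <p>The safety factor keeps the chosen bitrate below the measured throughput, which leaves headroom so that the buffer fills at least as fast as it drains. Players that also consider buffer level would combine this rule with additional inputs.</p>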
        </section>
        </section>
        <section>
        <h3>On-Demand Streaming (VOD)</h3>

        <p> Although the mechanics of content generation, delivery, and playout are almost identical for <a>VOD</a> and live, a large organization will typically maintain two distinct workflows, as there are subtle but important ways in which they differ. </p>

        <section>
        <h4>Content Generation</h4>
        <p>For VOD, the source is typically a static file rather than a feed. The
          <a>encoding</a> profiles are also subtly different. A greater
          priority can be placed on high quality because low latency (the time
          it takes for content to go live) is not a requirement. There are important
          differences in the <a>manifests</a> created. In <a>HLS</a> there is a tag
          that tells the <a>player</a> whether the playlist is describing on
          demand material: <code>#EXT-X-PLAYLIST-TYPE:VOD</code>. As we will see
          shortly, this is used by the player. There can also be client-side
          restrictions, where certain profiles are blocked due to rights
          restrictions, or where bitrates are capped to limit network
          consumption for movies and entertainment while being allowed for sports content.
          Fragment size will also affect playout, as a player's <a>ABR</a> can
          be more responsive if the <a>chunk</a> is smaller, e.g. 2 seconds
          rather than 10 seconds. </p>
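        <p>As a minimal sketch of how a player might read the playlist type tag, the following scans the playlist text line by line. The function names and the sample playlist are hypothetical, and a production parser would handle many more tags.</p>

```javascript
// Returns the value of the EXT-X-PLAYLIST-TYPE tag (for example "VOD" or
// "EVENT"), or null when the tag is absent.
function getHlsPlaylistType(playlistText) {
  const prefix = "#EXT-X-PLAYLIST-TYPE:";
  for (const rawLine of playlistText.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line.startsWith(prefix)) {
      return line.slice(prefix.length).trim();
    }
  }
  return null;
}

// Simplified reload decision based only on the playlist type tag.
// Real players also stop reloading when the playlist ends with the
// EXT-X-ENDLIST tag; that check is omitted to keep the sketch short.
function shouldReloadPlaylist(playlistText) {
  return getHlsPlaylistType(playlistText) !== "VOD";
}

// Illustrative media playlist; the segment names are made up.
const sampleVodPlaylist = `#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:6
#EXT-X-PLAYLIST-TYPE:VOD
#EXTINF:6.0,
segment0.ts
#EXTINF:6.0,
segment1.ts
#EXT-X-ENDLIST`;
```
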
        </section>
        <section>

        <h4>Content Delivery</h4>
        <p> The <a>CDN</a> configuration and topology for delivering <a>VOD</a>
          content also differ from live. There are different
          levels of caching: popular <a>VOD</a> content is kept closer to
          the edge of the <a>CDN</a> network so that it can be delivered to
          customers faster than from a server that isn't tuned for high
          volume delivery. Older and less popular content is retained in
          mid-tier caching, while the long tail content is relegated to
          a lower tier. </p>
        </section>
        <section>
        <h4>Content Playback</h4>
        <p> As mentioned in the content generation section, the <a>player</a>
          uses a tag within the <a>manifest</a> to determine the playout type.
          In <a>HLS</a>, if the playlist type has the value <code>VOD</code>, the player will not
          reload the manifest. This has important consequences if there
          are changes in availability after the session has commenced. In
          DASH the differences between a live and a <a>VOD</a> manifest are more
          subtle, but one of the primary identifiers for live content in <a>MPEG-DASH</a>
          is the <code>profiles</code> attribute of the <code>MPD</code> element, which could
          for example reference the <code>urn:mpeg:dash:profile:isoff-live:2011</code> profile
          to indicate live content.
        </p>
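        <p>The <a>HLS</a> reload decision described above can be sketched as follows. This is a minimal illustration rather than a complete playlist parser; the function name <code>shouldReloadPlaylist</code> is hypothetical, and the sketch additionally assumes that a playlist containing <code>#EXT-X-ENDLIST</code> will not change any more.</p>
        <pre class="example javascript" title="Deciding whether to reload an HLS playlist (sketch)">
        // Decide whether a player should keep reloading an HLS media playlist.
        function shouldReloadPlaylist(playlistText) {
          const lines = playlistText.split(/\r?\n/).map((line) => line.trim());
          // An explicit VOD type means the playlist is complete and static.
          const isVod = lines.includes("#EXT-X-PLAYLIST-TYPE:VOD");
          // Assumption: an end marker means no further segments will be added.
          const hasEnded = lines.includes("#EXT-X-ENDLIST");
          return !(isVod || hasEnded);
        }
        </pre>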
        <p> There are other differences in playout as well. Unlike
          live, a VOD asset has a predefined duration. Information
          about duration and playback position can be used to update the UI
          and give the user feedback on the proportion
          of the asset watched. </p>
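        <p>As a small sketch of this idea, the proportion watched can be derived from the <code>currentTime</code> and <code>duration</code> properties of the media element. The helper name <code>watchedFraction</code> is hypothetical. The sketch returns <code>null</code> when no finite duration is known, which is the case for live streams and before metadata has loaded.</p>
        <pre class="example javascript" title="Computing the proportion of a VOD asset watched (sketch)">
        // Returns a value between 0 and 1, or null if the duration is unknown.
        function watchedFraction(currentTime, duration) {
          // duration is NaN before metadata is loaded and Infinity for live streams.
          if (!Number.isFinite(duration) || !(duration > 0)) return null;
          return Math.min(1, Math.max(0, currentTime / duration));
        }
        </pre>
        <p>An application could call this from a <code>timeupdate</code> event handler, for example <code>watchedFraction(video.currentTime, video.duration)</code>, and use the result to size a progress bar.</p>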
        <p>Additionally, the UX requirements are different between <a>VOD</a> and live. Unlike live content, which lends itself to being browsed within an EPG (electronic program guide), VOD content is typically navigated using tiles or poster galleries. There is also a <a href='#dfn-trick-play-mode'>trick play</a> bar to view the current playback position and seek to other points within the content. </p>
        </section>
       <section>
        <h4>VOD Pre-caching</h4>
        <p>In the previous sections, we outlined the use cases
          associated with video <a>streaming</a>. In this section, we give some
          examples of use-cases that are specific to on-demand streaming
          and are mainly related to strategies employed on the clients
          to improve performance in some way. </p>

        <p>Pre-caching is a strategy used in on-demand streaming and is important to ensure the best user experience.
          By buffering the beginning segments of a video, you can make playback as close to instantaneous as possible.
          This is important because buffering (the state
          of the video application when the <a>player</a> has insufficient
          content within its buffer to play continuously)
          has a direct relationship to engagement and
          retention. For every second of buffering
          within a session, a proportion of users abandon the video stream.
          Web video application developers use points within an
          application's UX to pre-cache content. For example, when the user enters a mezzanine/synopsis page, the application might preemptively cache the content by filling the player’s video buffer or, alternatively, storing the <a>chunk</a>s locally. Pre-caching allows the video to commence playing without buffering and provides the user with a more responsive initial playback experience. This technique is used by Netflix, Sky
          and the BBC in the case of on-demand content being watched in
          an in-home context. This technique is not used for cellular sessions, where the user’s mobile data would be consumed by content that they might not watch. </p>
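        <p>One possible way to apply the cellular restriction is sketched below. It assumes the Network Information API (<code>navigator.connection</code>), which is not available in every browser. The function name <code>shouldPrecache</code> and the policy of allowing pre-caching when the API is missing are assumptions made for illustration.</p>
        <pre class="example javascript" title="Skipping pre-caching on cellular connections (sketch)">
        // connection is expected to be navigator.connection, or undefined
        // in browsers that do not implement the Network Information API.
        function shouldPrecache(connection) {
          // Assumption: without the API, pre-caching is considered acceptable.
          if (!connection) return true;
          // Respect an explicit request from the user to reduce data usage.
          if (connection.saveData === true) return false;
          // Avoid spending mobile data on content that may never be watched.
          return connection.type !== "cellular";
        }
        </pre>
        <p>On a synopsis page the application might then run its pre-caching logic only if <code>shouldPrecache(navigator.connection)</code> returns <code>true</code>.</p>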
        </section>
      </section>
        <section>
        <h3>VOD – further considerations</h3>
        <section>
        <h4>Byte range requests in context of web video application</h4>

        <p>
        An HTTP range request allows a client to request a portion of a larger
        file. In the <a>VOD</a> scenario where a user wants to access a specific
        location, a range request allows the <a>player</a> to access content at
        that location without downloading the entire transport <a>chunk</a>.
        </p>
        <p>
        Both the player and the server need to support range requests. The
        player needs to be able to play back content from a source buffer, and the server must
        be configured to serve ranges. The exchange begins when a client makes
        an HTTP HEAD request. If range requests are supported, the server
        responds with a header that includes <code>Accept-Ranges:
            bytes</code>, and the client can issue subsequent requests for
        partial content. The returned bytes are received as an
        <code>ArrayBuffer</code> and appended to a
        <code>SourceBuffer</code>. The <code>SourceBuffer</code> belongs to a
        <code>MediaSource</code> object, which in turn is attached to the HTML5
        video element through its <code>src</code> attribute.
        </p>
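        <p>A range request for partial content could look like the following sketch. The function names are hypothetical, and the <code>fetchImpl</code> parameter exists only so that the example can be exercised without a network; in a browser the global <code>fetch</code> is used. The sketch assumes that the server answers a satisfied range request with status 206.</p>
        <pre class="example javascript" title="Requesting a byte range (sketch)">
        // Build the value of the HTTP Range header for an inclusive byte range.
        function byteRangeHeader(firstByte, lastByte) {
          return `bytes=${firstByte}-${lastByte}`;
        }

        // Fetch part of a resource and return the bytes as an ArrayBuffer.
        async function fetchByteRange(url, firstByte, lastByte, fetchImpl = fetch) {
          const response = await fetchImpl(url, {
            headers: { Range: byteRangeHeader(firstByte, lastByte) },
          });
          // 206 Partial Content confirms that the server honoured the range.
          if (response.status !== 206) {
            throw new Error(`Expected 206 Partial Content, got ${response.status}`);
          }
          return response.arrayBuffer();
        }
        </pre>
        <p>The resulting <code>ArrayBuffer</code> could then be passed to <code>SourceBuffer.appendBuffer()</code>.</p>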
        </section>
        <section>
        <h4>HDCP security requirement for HDMI</h4>
        <p>
        Some content has limitations placed on its distribution by the
        owners. There are different ways of protecting the owner’s rights. The
        most common is <a>DRM</a> (digital rights management) which prevents the
        content being watched on a client device if the user cannot ‘unlock’ it
        with a key. However, once the content is unlocked the service provider
        should make every effort to prevent the user from re-distributing this
        content.
        </p>
        <p>
        A technical solution to prevent interception as a signal travels from a
        device to a television or projector is High-bandwidth Digital Content
        Protection (HDCP). If implemented by a device manufacturer, HDCP takes
        the form of a key exchange between devices. If the exchange fails, the
        signal is not transmitted to the display and the device is responsible
        for notifying the end user of the error.
        </p>
        </section>
        <section>
        <h4>Watermarking</h4>
        <p>
        Support for real-time watermarking is becoming an important consideration
        for distributors of high-value content. A service provider’s right to
        distribute content is linked to their ability to protect it, with studios and
        sports channels insisting that a service provider be able to detect a
        breach of copyright on a stream-by-stream basis. A service provider can
        include a vendor's <a>SDK</a> in the client that adds a watermark at run-time.
        Normally, the watermark is invisible to the consumer and takes the form of a
        digital signal; this is referred to as forensic watermarking. Part of the
        service provided by the watermarking vendor is to monitor the black
        market for re-distributed material. Illicit streams or files are intercepted and
        screened for the digital fingerprint inserted on the client. If a suspect
        stream is found, the vendor directs the service provider(s) to the source
        of the misappropriated stream.
        </p>

        <p>
        While watermarking is not an issue that developers will often be faced
        with, the processing requirements can dictate the choice of platform
        for the content distribution. The watermarking process might require
        processing power and low-level functionality through ring-fenced
        resources that platforms such as Android, iOS or Roku provide. The
        options for watermarking in the browser are limited to overlays and
        this limitation is then reflected in a distributor’s choice of platform
        for their web video application.
        </p>
        </section>
        </section>
        <section>
        <h3>Live Streaming</h3>
            <p>
              Even though Live and On-Demand <a>streaming</a> scenarios have a lot in
              common, there are a few distinct differences.
            </p>
            <section>
            <h4>Content Generation</h4>
            <p>
            In contrast to <a>VOD</a> content generation, the typical input is a live
            feed of data. Usually there is also a higher priority on
            low-latency and time to live, which is in turn reflected in smaller
            segments.
            </p>
            <p>
            A big difference between <a>VOD</a> and Live content generation can be
            found in the differences between <a>manifests</a> for the two content
            types.
            Besides a general profile that tells the <a>player</a> whether the manifest
            represents live content, there are a few other important properties
            that define how a player will be able to play back the live content,
            when updates will be fetched, and how close to the live edge playback
            starts.
            These properties are pre-defined and expressed in the manifest during
            content preparation, but play an important role in content playback
            and the viewing experience.
            </p>
            </section>
            <section>
            <h4>Content Delivery</h4>
            <p>
            The content delivered through a <a>CDN</a> is generally very similar to the
            <a>VOD</a> playback use-case. There might be slight differences when it
            comes to caching and distribution of segments, and you might
            observe higher load on the origin when playback is configured to be
            very close to the live edge.
            </p>
            </section>
            <section>
            <h4>Content Playback</h4>
            <p>
            Even though playback and playback parameters between Live and <a>VOD</a>
            playback are very similar, there are critical differences. Perhaps
            the biggest difference is the playback start position. While a <a>VOD</a>
            playback session usually starts at the beginning of the stream, a
            Live playout usually starts close to the live edge of the stream,
            with only a few segments between the playback start position and
            the end of the current <a>manifest</a>.
            </p>
            <p>
            The other main difference is that the live content changes over
            time and the <a>player</a> needs to be aware of these changes. In both
            DASH and <a>HLS</a>, this results in regular manifest updates. Depending
            on the format, it is possible to define the interval for the
            manifest updates.
            </p>
            <p>
            What happens during a typical playback session is:
            </p>
            <ol>
                <li>The player loads the manifest for the first time.</li>
                <li>Playback starts at the live edge defined in the manifest.</li>
                <li>The player reloads the manifest at regular, pre-defined intervals.</li>
                <li>The new manifest is used to update the playback state information and the segment list.</li>
            </ol>
            <p>
            This is a typical loop that the player goes through on each
            manifest update cycle.
            </p>
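            <p>
            The loop above can be sketched as follows. This is an illustration
            under stated assumptions, not a player implementation: the manifest
            is assumed to be already parsed into an object with a hypothetical
            <code>ended</code> flag, and <code>loadManifest</code>,
            <code>onManifest</code> and <code>wait</code> are callbacks supplied
            by the caller.
            </p>
            <pre class="example javascript" title="Live manifest update loop (sketch)">
            // Default delay helper; a caller may supply its own for testing.
            const waitMs = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

            async function followLiveManifest(loadManifest, onManifest, options) {
              const { intervalMs, wait = waitMs } = options;
              // Step 1: load the manifest for the first time.
              let manifest = await loadManifest();
              // Step 2: let the player start playback close to the live edge.
              onManifest(manifest, { initial: true });
              // Assumption: the parsed manifest exposes an "ended" flag.
              while (!manifest.ended) {
                // Step 3: reload the manifest after the pre-defined interval.
                await wait(intervalMs);
                manifest = await loadManifest();
                // Step 4: update playback state and the segment list.
                onManifest(manifest, { initial: false });
              }
            }
            </pre>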
            </section>
            <section>
            <h4>Potential Issues</h4>
            <p>
            One of the most common cases that an application needs to be
            prepared for is the <a>player</a> falling behind the live window. Assume
            for example that you have a buffer window of 10 seconds on your
            live stream. Even if the user interface does not allow explicit
            seeking, buffering events can still stall the playout and push the
            playback position towards the border of the live window and even
            beyond it. Depending on the player implementation the actual error
            might be different, but implementations should be aware of the
            scenario and be prepared to recover from it.
            </p>
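            <p>
            One possible recovery strategy is sketched below. It assumes that
            the <code>seekable</code> ranges of the media element reflect the
            live window, which depends on the player in use. The function name
            <code>recoverLivePosition</code> and the default margin of 10
            seconds are assumptions made for illustration.
            </p>
            <pre class="example javascript" title="Recovering after falling behind the live window (sketch)">
            // Returns true if the playback position had to be corrected.
            function recoverLivePosition(video, safetyMarginSeconds = 10) {
              const ranges = video.seekable;
              if (ranges.length === 0) return false;
              const windowStart = ranges.start(0);
              const liveEdge = ranges.end(ranges.length - 1);
              // Still inside the live window: nothing to do.
              if (video.currentTime >= windowStart) return false;
              // Jump back into the window, a safety margin behind the live edge.
              video.currentTime = Math.max(windowStart, liveEdge - safetyMarginSeconds);
              return true;
            }
            </pre>
            <p>
            An application could call this function after a stall, for example
            in response to a <code>waiting</code> event.
            </p>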
            <p>
            Another common problem that can occur in both Live and <a>VOD</a> playback
            sessions is missing segment data or timestamp alignment issues. It
            might happen that a segment for a given quality is not available on
            the server. Misalignments in segment timestamps
            might also occur. Although both of these issues might be critical, how
            they are handled depends on the player implementation. A player can
            try to jump over missing segments or alignment gaps. A player can
            also try to work around missing segments by downloading the segment
            from a different <a>rendition</a> and allow a temporary quality switch.
            At the same time, a player implementation might also decide to
            treat such issues as fatal errors and abort playback. Applications
            can try to manually restart playback sessions in such scenarios.
            </p>
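            <p>
            The rendition fallback mentioned above might be sketched as follows.
            The function name, the shape of the
            <code>segmentUrlsByRendition</code> map and the
            <code>fetchImpl</code> parameter are hypothetical. The sketch only
            handles HTTP error responses; network failures that cause the
            request itself to fail are not caught.
            </p>
            <pre class="example javascript" title="Falling back to another rendition for a missing segment (sketch)">
            // segmentUrlsByRendition maps a rendition id to the URL of the same
            // segment in that rendition, for example { high: "...", low: "..." }.
            async function loadSegmentWithFallback(segmentUrlsByRendition, preferredId, fetchImpl = fetch) {
              const otherIds = Object.keys(segmentUrlsByRendition).filter((id) => id !== preferredId);
              // Try the preferred rendition first, then every other rendition.
              for (const renditionId of [preferredId, ...otherIds]) {
                const response = await fetchImpl(segmentUrlsByRendition[renditionId]);
                if (response.ok) {
                  return { renditionId, data: await response.arrayBuffer() };
                }
              }
              throw new Error("Segment is missing in every rendition");
            }
            </pre>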
            </section>
      </section>
      <section>
          <h3>Thumbnail Navigation</h3>
          <p>
          Thumbnails are small images of the content taken at regular time
          intervals. They are an effective way to visualize scrubbing and
          seeking through content.
          </p>
          <p>
          There is currently no standard way to add thumbnail support to a playback
          application, and there is no out-of-the-box browser support. However,
          since thumbnails are just image data, the browser has all the capabilities
          a client needs to implement thumbnail navigation in an application.
          </p>
          <p>
          The most common way to generate thumbnails is to render a set of images out of
          the main content in a regular time interval, for example every 10 seconds.
          The information about the location of these images then needs to be passed
          down to the client, which can then request and load an image for a given
          playback position.  For more efficient loading, images
          are often merged into larger grids (sometimes called sprites). This way, the client
          only needs to make a single request to load a set of thumbnails
          instead of a request per image.
          </p>
          <p>
          Unfortunately, neither DASH nor <a>HLS</a> currently specifies a way to
          reference thumbnail images directly from <a>manifests</a>. However, the
          DASH-IF Guidelines [[DASHIFIOP]] describe an extension to
          reference thumbnail images. The thumbnails are exposed as single
          or gridded images. All parameters required to load and display the
          thumbnail images are contained in the manifest. This approach
          also works for live manifests that are updated regularly by the <a>player</a>.
          The following example shows how thumbnails can be referenced according
          to the DASH-IF Guidelines:
          </p>
          <pre class="example xml" title="Thumbnail Reference in DASH-Manifest">
          &lt;AdaptationSet id="3" mimeType="image/jpeg" contentType="image"&gt;
            &lt;SegmentTemplate media="$RepresentationID$/tile$Number$.jpg" duration="125" startNumber="1"/&gt;
            &lt;Representation bandwidth="10000" id="thumbnails" width="6400" height="180"&gt;
                &lt;EssentialProperty schemeIdUri="http://dashif.org/guidelines/thumbnail_tile" value="25x1"/&gt;
            &lt;/Representation&gt;
          &lt;/AdaptationSet&gt;
          </pre>
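          <p>
          A client could map a playback position to a thumbnail in such a grid
          as sketched below. The function name and the shape of the
          <code>track</code> object are hypothetical; its values are taken from
          the manifest example above. The sketch assumes that the segment
          duration is expressed in seconds and that the timeline starts at zero.
          </p>
          <pre class="example javascript" title="Locating a thumbnail within a tiled image (sketch)">
          // Returns the tile image URL and the rectangle of the thumbnail in it.
          function thumbnailForTime(timeSeconds, track) {
            const { tileDuration, startNumber, columns, rows, width, height, representationId } = track;
            const thumbsPerTile = columns * rows;
            const thumbDuration = tileDuration / thumbsPerTile;
            // Which tile image covers the requested time.
            const tileIndex = Math.floor(timeSeconds / tileDuration);
            // Which thumbnail inside that tile, counted row by row.
            const offsetInTile = timeSeconds - tileIndex * tileDuration;
            const thumbIndex = Math.min(thumbsPerTile - 1, Math.floor(offsetInTile / thumbDuration));
            const thumbWidth = width / columns;
            const thumbHeight = height / rows;
            return {
              url: `${representationId}/tile${startNumber + tileIndex}.jpg`,
              x: (thumbIndex % columns) * thumbWidth,
              y: Math.floor(thumbIndex / columns) * thumbHeight,
              width: thumbWidth,
              height: thumbHeight,
            };
          }
          </pre>
          <p>
          The returned rectangle can be used, for example, as a negative
          <code>background-position</code> offset on an element whose size
          matches a single thumbnail.
          </p>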
      </section>
    </section>
    <section>
      <h2>Media Playback Methods</h2>
      <section>
        <h3>Device Identification</h3>
        <p>Device identification is required both at the level of device
          type and family and also to uniquely identify a device.
          Different techniques are used in different environments.</p>
        <section>
        <h4>Device Type</h4>
        <p>In the context of a mobile client the broad device type is
          already known - you can’t install an Android client on an iOS
          device. However, operating systems evolve and are extended.
          Within the application layer you will still need to
          determine the level of support for feature X and branch your
          code accordingly.</p>
          <p>
          If the application is hosted, or the same application is
          deployed via different app stores, then the client's runtime
          could be more ambiguous. The classic approach is to use
          <code>navigator.userAgent</code>. While this returns a string that does
          not explicitly state the device by name, a regular expression
          match can be used to look for patterns that confirm the
          device family. As an example, the name ‘Tizen’ can be found in
          the user agent strings for Samsung devices and ‘netcast’ in those for LG
          devices. In the Chrome browser the strings will be different
          on Windows, Mac, Android and webview.</p>
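          <p>A regular expression match of this kind might look like the
          following sketch. The function name and the returned labels are
          hypothetical, and only the two patterns mentioned above are checked;
          a real application would need a longer and regularly maintained list.</p>
          <pre class="example javascript" title="Detecting a device family from the user agent (sketch)">
          // Map a user agent string to a coarse device family label.
          function deviceFamilyFromUserAgent(userAgent) {
            if (/tizen/i.test(userAgent)) return "samsung-tizen";
            if (/netcast/i.test(userAgent)) return "lg-netcast";
            return "unknown";
          }
          </pre>
          <p>In a browser the function would be called as
          <code>deviceFamilyFromUserAgent(navigator.userAgent)</code>.</p>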
          <p>
          Another method, similar to feature detection, is to look for
          available APIs, many of which are unique to a device, for
          example <code>if (typeof tizenGetUser === "function") { /* do X */ }</code>.</p>
          <p>Besides the mentioned feature switches, device type identification
          can be crucial to determine which content should be loaded on a
          client device. A typical scenario for DRM protected content is to
          have different content types for different platforms based on the
          platform's playback capabilities. One might, for instance, package
          <a>HLS</a> and <a>MPEG-DASH</a> variants of the same content but
          using different encryption and DRM schemes. In that case the client
          application needs to decide which content to load based on the device
          type and its capabilities.</p>
        </section>
        <section>
        <h4>Unique device Identifier</h4>
        <p>After you have determined what type of device you are on, you
          may need to identify that device uniquely. Most device
          manufacturers provide an API for this, so once you have discovered what
          type of device you are on, you know which APIs will be
          available to you. Each manufacturer provides a string
          constructed in a different way: some use the serial number of
          the device itself, while others use lower-level unique
          identifiers taken from the hardware. In each case the
          application layer should attempt to namespace the identifier in some way
          to avoid an unlikely, but theoretically possible, clash with
          another user in any server-side database of users and devices.</p>
        <p>In a classic PC/browser environment, rather than a mobile or STB
          (set-top-box) environment, there are technical issues with using unique
          identifiers: a user can access identifiers stored locally and
          either delete them or reuse them in another context, which
          could allow them to watch content on more devices than their
          user account allows. For this reason some vendors prefer to
          use <a>DRM</a> solutions that both create and encrypt unique
          identifiers in a way that obscures them from the user.</p>
          <p>As mentioned above one typical use case where a unique device
          identifier is needed is to restrict the number of devices that a user
          can use. This could refer to overall devices or the number of devices
          a user can use in parallel. In both cases a unique device identifier
          will be required.</p>
         </section>
      </section>
      <section>
        <h3>Device Content Protection Capabilities</h3>
        <p>A web video application needs to determine the content
          protection capabilities of the device that is being used to
          play back the content. The method of doing this will vary from
          device to device and between <a>DRM</a> systems.
          </p>

          <p>
          Regardless of a device's capability to play back a stream
          encrypted with a specific <a>DRM</a>, it is worth noting that a
          content provider will be aware of this capability in advance
          and consequently encrypt the streams to target a specific
          device. Beyond technical feasibility (can device X play back
          content encrypted with <a>DRM</a> Y?), the provider will have a number
          of considerations when choosing a <a>DRM</a> for playback on a
          specific device: cost, utility, complexity and content value
          (the last consideration being mapped to contractual
          obligations). As a consequence, in most situations the API
          call that requests the stream from a back end service will
          either be to a service that is only configured to return
          streams with a suitable encryption or the server will use data
          from the request headers or key-value pairs in the request
          payload to determine which streams to return.</p>

          <p>
          In the context of the browser, the major vendors broadly
          dictate the <a>DRM</a>s available. Apple’s Safari browser and Apple
          TV support Apple’s FairPlay <a>DRM</a>. Microsoft's Internet Explorer
          and Edge browsers only support Microsoft's PlayReady out of
          the box, while Google’s Widevine Modular is supported by
          Google’s Chrome browser and is also included by browser
          developers not tied to a significant hardware provider, such as Opera
          and Firefox.</p>
          <p>
          Browser <a>DRM</a> capabilities are detected via the <a>EME</a>
          (Encrypted Media Extensions) API [[ENCRYPTED-MEDIA]]. The API is an extension to
          the HTMLMediaElement. The process of determining the available
          system is broadly as follows: </p>
        <ol>
          <li>The stream is passed to the VIDEO or AUDIO HTML5 tag/media element.</li>
          <li>The browser detects that the stream is encrypted.</li>
          <li>The media event <code>encrypted</code> is fired.</li>
          <li>A check is made to see if there are already MediaKeys
            associated with the element.</li>
          <li>If there are no keys already associated with the element,
            use the <a>EME</a> API <code>navigator.requestMediaKeySystemAccess()</code>
            to determine which <a>DRM</a> system is available. This is done by
            passing the string identifier of a key system, together with a list
            of supported configurations, to the above method, which in turn
            returns a promise. The promise resolves with a
            <code>MediaKeySystemAccess</code> object if the key system is
            supported and is rejected otherwise.</li>
          <li>Use <code>MediaKeySystemAccess.createMediaKeys()</code> to create a
            new MediaKeys object.</li>
          <li>Then use <code>HTMLMediaElement.setMediaKeys(mediaKeys)</code> to
            associate the keys with the media element.</li>
        </ol>
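        <p>
        The key system detection step can be sketched as follows. The function
        name <code>findSupportedKeySystem</code> is hypothetical, and the
        <code>nav</code> parameter stands in for <code>navigator</code> so that
        the example can be exercised outside a browser. The list of key system
        strings and the configuration passed in are choices of the application.
        </p>
        <pre class="example javascript" title="Finding a supported key system with EME (sketch)">
        // Try each key system in order and return the first access object
        // that the browser grants, or null if none is supported.
        async function findSupportedKeySystem(nav, keySystems, configurations) {
          for (const keySystem of keySystems) {
            try {
              return await nav.requestMediaKeySystemAccess(keySystem, configurations);
            } catch (error) {
              // Not supported (or the API is missing): try the next candidate.
            }
          }
          return null;
        }

        // Example inputs an application might use.
        const candidateKeySystems = ["com.widevine.alpha", "com.microsoft.playready", "com.apple.fps.1_0"];
        const exampleConfigurations = [{
          initDataTypes: ["cenc"],
          videoCapabilities: [{ contentType: 'video/mp4; codecs="avc1.42E01E"' }],
        }];
        </pre>
        <p>
        If an access object is returned, its <code>keySystem</code> property
        names the selected system, and <code>createMediaKeys()</code> followed by
        <code>setMediaKeys()</code> completes the remaining steps of the list above.
        </p>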
        <p>
        Please note that the browser will fire the <code>encrypted</code> event if it
        detects <a>DRM</a> init data or identifies an encrypted stream by other means.
        That does not mean that a client can establish encrypted playback only
        once the <code>encrypted</code> event has been triggered. An encrypted session can also
        be established manually, without relying on the browser to detect
        encrypted content first.
        </p>

        <p>The remaining steps are covered in a later section on using EME,
        but it’s worth noting at this stage that
        <code>navigator.requestMediaKeySystemAccess()</code> is not uniformly
        implemented across all modern browsers that support EME. As an
        example, Chrome reports support for <code>com.widevine.alpha</code>, whereas IE
        and Safari throw errors and Firefox returns null. A possible
        solution to this is offered <a href="https://stackoverflow.com/questions/35086625/determine-drm-system-supported-by-browser">here</a>.
        </p>
        </section>

      <section>
        <h3>Ad Insertions</h3>
        <p>There are many ways to incorporate media advertisements into an application. The most
        	common types are:</p>
      	<ol>
      		<li><a>In-stream</a>: Video advertisement that is played before, during, or after the main content
      		(such as a clip or episode)</li>
      		<li>Outstream: Video advertisement stands alone (without other video content) and is
      		placed natively within other parts of the application</li>
      	</ol>
        <p>For the purposes of these guidelines, we will focus on <a>in-stream</a>.</p>
        <p>There are two primary ways to insert advertisements into an <a>in-stream</a> media playback
        session: <i>Client-side ad insertion (CSAI)</i> and <i>Server-side
        ad insertion (SSAI)</i>. The difference is that server-side
        insertions are <a>embed</a>ded into the playout directly, while client-side
        insertions are added dynamically by the <a>player</a> and are handled in
        parallel to the main playout.</p>

        <section>
        <h4>Client-side Ad Insertion</h4>
        <p>
        Client-side ad insertions typically involve using a script
        available in the client runtime, JavaScript in the case of the web,
        to insert an advertisement during the user’s session. CSAI can be used for any type
        of content, including on-demand and live.
        </p>
        <p>
        For <a>Live Streaming</a> use cases, in-video-stream metadata (for example, using ID3 in <a>HLS</a>) can be used to enable all viewers to see an ad at the same time, synced with the content at a specific time offset. Another approach is to use web sockets to signal a <a>player</a> via a pushed event to play an advertisement.
        </p>
        <p>
        Additionally, for live TV (broadcast TV), many broadcasters use SCTE-104/35 in their workflows to signal ads. When the content is prepared for distribution over the Internet, these SCTE-104/35 signals are usually converted to ID3 or EMSG messages for <a>HLS</a> and DASH respectively. This approach keeps the media workflow for live ad insertion simple, because SCTE to ID3/EMSG conversion is supported by most encoders and media servers used by live content generators. It is also easy to consume in media players, most of which support parsing ID3/EMSG metadata. Finally, it ensures synchronization with the content and avoids the need for alternative communication channels, such as web sockets, that add complexity to the solution.
        </p>
        <p>
        As an example workflow: an ad serving vendor typically provides a client-side script that a publisher
    	can <a>embed</a> in their application. These scripts create a bridge between the advertising demand source and the publisher's video player. Close
        to the player’s initiation the client library makes its API
        available. The web application listens to events associated with
        playback, for example the video element's media event ‘playing’. The
        web application then calls the DOM pause method on the video
        element and then calls the play method provided by the client-side
        ad library, passing it the ‘id’ of the video asset. This is then
        returned to the vendor, possibly along with other identifiers that
        can be used to target the audience with a specific ad. At this
        stage an auction is performed with business logic at the ad vendor
        determining which provider supplies the ad (this is a complex topic
        and outside the scope of this document). The vendor responds with a
        VAST (Video Ad Serving Template) payload that includes the URI of
        the ad content appropriate for the playback environment. In some
        cases there is no ad. If so, the user is presented
        with the content they originally requested and control is passed
        back to the web video application.
    	</p>
    	<p>If there is an ad targeted
        against the content, the library performs DOM manipulation and
        injects a new video element into the document. This is typically
        accompanied by a further script that provides the vendor with
        insights based on the current session. The ad plays. The ad object
        will conform to VPAID (Video Player-Ad Interface Definition) and
        present a standardized interface to the player for possible
        interaction. It will issue a standard set of events which the web
        application can listen to. In response to an ‘adEnded’ event the
        local library will tear down the injected DOM elements and in turn
        issue an event that the web application can use to trigger a return
        to playing the original content.
        </p>
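        <p>
        The application side of this workflow might be wired up as in the
        following sketch. The <code>adLibrary</code> object and its
        <code>play(assetId)</code> method are hypothetical stand-ins for a
        vendor's client-side script; real libraries expose their own APIs and
        event names. The sketch assumes that <code>play(assetId)</code> returns a
        promise that settles when the ad has ended or when no ad is available.
        </p>
        <pre class="example javascript" title="Pausing content for a client-side ad (sketch)">
        // video is an HTMLVideoElement; adLibrary is a hypothetical vendor API.
        function attachPreroll(video, adLibrary, assetId) {
          let adHandled = false;
          video.addEventListener("playing", async () => {
            // Resuming the content fires "playing" again: handle the ad only once.
            if (adHandled) return;
            adHandled = true;
            video.pause();
            try {
              // Hand control to the ad library, passing the id of the asset.
              await adLibrary.play(assetId);
            } catch (error) {
              // No ad was returned or the ad failed: continue with the content.
            }
            video.play();
          });
        }
        </pre>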
        </section>

        <section>
        <h4>Server Side Ad Insertion</h4>
        <p>
        In live playback environments in particular, ad insertions
        are usually not handled on the client side. One of the reasons is that the live feed continues
        independently of the client-side ad insertion, so the
        <a>player</a> can easily fall behind the live window. Server-side ad
        insertions allow the player to play a continuous feed, independent
        of any ad insertions, which are "injected" on the server side,
        directly into the playout feed.
        </p>
        <p>
        One of the requirements of the underlying <a>streaming</a> format such as
        DASH or <a>HLS</a> is that the <a>encoding</a> does not change during the playout
        of a single rendition. To still be able to insert advertisements,
        <a>HLS</a> and DASH propose different strategies.
        </p>
        <p>
        <a>HLS</a> allows the packager to add a "discontinuity" tag. This is
        expressed with a <code>#EXT-X-DISCONTINUITY</code> entry before any
        potential format change.  See Section 4.3.2.3. in [[HLS]] for more
        detail. This tag then usually appears before the ad starts and
        again before the main content continues.
        </p>
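        <p>
        To illustrate how the discontinuity tag partitions a playlist, the
        following sketch groups segment URIs into runs that share the same
        format. It is not a complete playlist parser, and the function name
        <code>splitAtDiscontinuities</code> is hypothetical.
        </p>
        <pre class="example javascript" title="Grouping HLS segments between discontinuities (sketch)">
        // Returns an array of groups; each group lists the segment URIs that
        // appear between two discontinuity tags.
        function splitAtDiscontinuities(playlistText) {
          let current = [];
          const groups = [current];
          for (const rawLine of playlistText.split(/\r?\n/)) {
            const line = rawLine.trim();
            if (line === "#EXT-X-DISCONTINUITY") {
              // A potential format change follows: start a new group.
              current = [];
              groups.push(current);
            } else if (/^[^#]/.test(line)) {
              // A non-empty line that is not a tag or comment is a segment URI.
              current.push(line);
            }
          }
          return groups.filter((group) => group.length > 0);
        }
        </pre>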
        <p>
        In contrast, DASH uses multiple "periods" to split the content into
        main and advertisement sections.  See Section 3.1.31 in [[MPEGDASH]]
        for more details. Each switch between periods is a trigger for the
        player to potentially reset the decoder.
        </p>
        <p>
        Where the player usually receives dedicated events in CSAI such as
        'adStarted' or 'adEnded', server-side insertions require a
        different setup to trigger such events.  Typically, in-band
        notifications are used to trigger time-based events. For example,
        SCTE-35 markers are a common format for inserting time-based
        metadata into the feed. These markers are read and interpreted
        by the player and can be used to trigger events or carry additional
        metadata such as callback URLs.
        </p>
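        <p>
        Once the markers have been extracted from the stream, turning them into
        application events is a matter of comparing their times with the
        playback position. The sketch below assumes that the markers have
        already been parsed into plain objects with a <code>time</code> in
        seconds; that object shape and the function name
        <code>markersCrossed</code> are hypothetical, and parsing SCTE-35 itself
        is out of scope here.
        </p>
        <pre class="example javascript" title="Triggering events from time-based markers (sketch)">
        // Returns the markers whose time lies after previousTime and at or
        // before currentTime, that is, the markers crossed since the last check.
        function markersCrossed(markers, previousTime, currentTime) {
          return markers
            .filter((marker) => marker.time > previousTime)
            .filter((marker) => currentTime >= marker.time);
        }
        </pre>
        <p>
        A player integration could call this function from a
        <code>timeupdate</code> handler and dispatch an event such as
        'adStarted' or 'adEnded' for each marker returned.
        </p>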
        </section>
      </section>
    </section>
      <section>
      <h2>Content Encoding Guidelines</h2>
      <p><a>Encoding</a> video for playback can be a challenge given the variety of ways users may watch your content.  In order to provide quality video for your audience, there are multiple aspects to consider such as <a>codec</a> support, resolutions, frame-rates,  browser versions, bandwidth availability and device compatibilities.  Fortunately there are some steps you can follow that will help you produce content that is playable for all audiences.  Preparing your source content and planning your outputs ahead of time are important activities to complete prior to actual transcoding.  Time spent here will help reduce trial and error later, and ensures maximum quality throughout your workflow, resulting in an excellent viewer experience for your entire audience.  If these steps are rushed, then there will certainly be inefficiencies in video processing and wasted bandwidth, and the outputs may not be sufficient to cover your target audience, leading to higher cost and missed opportunities.  There are a few key steps to get your content ready for web delivery.</p>

      <p>These steps are best done in the following order:</p>

      <ol>
        <li>Create a <a>mezzanine file</a></li>
        <li>Decide on <a>rendition</a> set</li>
        <li>Decide on delivery formats to support</li>
        <li>Create <a>encoding</a> profiles</li>
        <li>Transcode mezzanine into specific rendition output formats</li>
      </ol>

      <section>
      <h3>Create a Mezzanine File</h3>

      <p>The first step in preparing your content is to create a high quality encoding master file from your source footage. This <a>mezzanine file</a> will be used as the source file to create all downstream outputs.  Content producers typically export files from a non-linear editor using the highest resolution input files available, whether from a master magnetic tape, optical media, or a digital source.  This ensures all downstream outputs have the necessary quality to work with, without needing to re-export from the editor every time. </p>

      <p>In order to maximize the quality of the mezzanine, your export settings should use a lossless or visually lossless intermediate <a>codec</a> such as Apple ProRes, <a>H.264</a> Intra-frame, or Motion JPEG 2000.  A truly lossless <a>codec</a> guarantees that all data from the original exists in the export, while a visually lossless codec such as Apple ProRes applies light compression that is designed to be imperceptible. For detailed information on how to configure your output settings, see the recommended settings documentation from the provider of your <a>codec</a> of choice.</p>

      <p>Mezzanines are typically rendered in the native
        resolution and frame-rate that the source material was captured with. A common
        configuration using Apple ProRes would be as follows:</p>
        <table>
          <tbody>
            <tr>
              <td><a>codec</a></td>
              <td>Resolution</td>
              <td>Bitrate</td>
              <td>Framerates</td>
            </tr>
            <tr>
              <td rowspan="4">Apple ProRes 422</td>
              <td>3840x2160</td>
              <td>340-1650 Mbps</td>
              <td>23.98, 24, 25, 29.97, 30, 50, 59.94, 60</td>
            </tr>
            <tr>
              <td>1920x1080</td>
              <td>145-220 Mbps</td>
              <td>23.98, 24, 25, 29.97, 30, 50, 59.94, 60</td>
            </tr>
            <tr>
              <td>1280x720</td>
              <td>20-60 Mbps</td>
              <td>23.98, 24, 25, 29.97, 30, 50, 59.94, 60</td>
            </tr>
            <tr>
              <td>640x480</td>
              <td>8-12 Mbps</td>
              <td>25, 29.97, 50, 59.94</td>
            </tr>
          </tbody>
        </table>
      </section>
      <section>
      <h3>Decide on Rendition Set</h3>
      <p>In modern video <a>streaming</a> use-cases, adaptive bitrate (<a>ABR</a>) streaming technologies use a <a>rendition</a> set to ensure every playback environment has an appropriate file to render.  A rendition set is simply a grouping of the different transcodes of the same mezzanine file. During playback, <a>ABR</a> technologies detect the user's playback environment, available <a>codec</a>s, resolution and bandwidth, then select the right segment from one of the video renditions to send to the video <a>player</a>.  This reduces the probability of buffering, as only the segments of the matching video <a>rendition</a> are loaded.  To create a proper rendition set, you should list the <a>codec</a>s and resolutions that exist for your target audience. Then you need to ascertain the proper bitrate for each <a>codec</a> and resolution: high enough to meet your quality goals, but not so high that it exceeds the available bandwidth.</p>

      <p>In order to select optimum bitrates, determine your target user based upon
          target devices and user experience. For example, for over-the-top (OTT)
        applications on smart televisions, a higher bitrate <a>rendition</a> is typically
        set as the default since resolution and bandwidth are reliably available. The table
          below shows an example of some possible renditions, resolutions, and frame rates. This is not an exhaustive list, nor is it meant to be used as a best practice.</p>

        <table>
          <tbody>
            <tr>
              <td><a>codec</a></td>
              <td>Example Resolutions</td>
              <td>Example Bitrates</td>
              <td>Example Framerates</td>
            </tr>
            <tr>
              <td rowspan="4"><a>codec</a> A</td>
              <td>2160p</td>
              <td>10 Mbps</td>
              <td>23.98, 24, 25, 29.97, 30, 50, 59.94, 60</td>
            </tr>
            <tr>
              <td>1080p</td>
              <td>5 Mbps<br>
                3 Mbps</td>
              <td>23.98, 24, 25, 29.97, 30, 50, 59.94, 60</td>
            </tr>
            <tr>
              <td>720p</td>
              <td>2 Mbps<br>
                1 Mbps</td>
              <td>23.98, 24, 25, 29.97, 30, 50, 59.94, 60</td>
            </tr>
            <tr>
              <td>480p</td>
              <td>768 Kbps</td>
              <td>23.98, 24, 25, 29.97, 30, 50, 59.94, 60</td>
            </tr>
            <tr>
              <td rowspan="4"><a>codec</a> B</td>
              <td>2160p</td>
              <td>18 Mbps</td>
              <td>23.98, 24, 25, 29.97, 30, 50, 59.94, 60</td>
            </tr>
            <tr>
              <td>1080p</td>
              <td>9 Mbps<br>
                5 Mbps</td>
              <td>23.98, 24, 25, 29.97, 30, 50, 59.94, 60</td>
            </tr>
            <tr>
              <td>720p</td>
              <td>4 Mbps<br>
                2.5 Mbps</td>
              <td>23.98, 24, 25, 29.97, 30, 50, 59.94, 60</td>
            </tr>
            <tr>
              <td>480p</td>
              <td>1.5 Mbps<br>
                768 Kbps<br>
                380 Kbps</td>
              <td>25, 29.97, 50, 59.94</td>
            </tr>
          </tbody>
        </table>
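        <p>As a rough illustration of the selection step described above, the following sketch picks a rendition from a set, given a measured bandwidth and a maximum display height. The <code>renditions</code> array, the <code>selectRendition</code> function, and the 0.8 safety factor are assumptions made for this example; real ABR players typically use more elaborate heuristics.</p>
        <pre class="example js" title="rendition selection sketch">
          // Hypothetical rendition set. Bitrates are in bits per second
          // and mirror the "codec A" rows of the example table.
          const renditions = [
            { height: 2160, bitrate: 10000000 },
            { height: 1080, bitrate: 5000000 },
            { height: 1080, bitrate: 3000000 },
            { height: 720, bitrate: 2000000 },
            { height: 720, bitrate: 1000000 },
            { height: 480, bitrate: 768000 }
          ];

          // Pick the highest-bitrate rendition that fits within a fraction
          // (safetyFactor) of the measured bandwidth and within the display
          // height. Falls back to the lowest-bitrate rendition if none fits.
          function selectRendition(set, bandwidthBps, maxHeight, safetyFactor = 0.8) {
            const budget = bandwidthBps * safetyFactor;
            const sorted = [...set].sort((a, b) => b.bitrate - a.bitrate);
            const fits = sorted
              .filter((r) => budget >= r.bitrate)
              .filter((r) => maxHeight >= r.height);
            return fits.length > 0 ? fits[0] : sorted[sorted.length - 1];
          }

          // Usage: a 4 Mbps connection on a 1080p display.
          const chosen = selectRendition(renditions, 4000000, 1080);
        </pre>
        <p>The safety factor leaves headroom for bandwidth fluctuation, which is one simple way to reduce the chance of rebuffering.</p>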
        </section>
        <section>
    <h3>Decide on Delivery Formats to Support</h3>
        <p>In addition to defining resolutions and bitrates in your rendition set, the delivery format should also be considered.  For most audiences, an MP4 <a>container</a> with <a>H.264</a> video and <a>AAC</a> audio is sufficient for SD and HD videos.  However, if your video content warrants advanced features, you should also consider adding <a>rendition</a>s to your set that use higher profile <a>H.264</a>, H.265, VP9, or HE-<a>AAC</a> formats, so that users with environments supporting those features can enjoy them, while still providing a baseline experience via the <a>H.264</a>/AAC MP4 <a>rendition</a> set.</p>
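        <p>One way to offer advanced formats while keeping a baseline is to test formats in order of preference at playback time. The sketch below assumes a hypothetical <code>formats</code> list and a <code>pickDeliveryFormat</code> helper; the codec strings are illustrative examples only, and the support check is passed in as a function so that it can be backed by <code>MediaSource.isTypeSupported</code> in a browser.</p>
        <pre class="example js" title="delivery format fallback sketch">
          // Delivery formats in order of preference. The last entry is the
          // baseline H.264/AAC MP4 rendition set. Codec strings are
          // illustrative examples, not recommendations.
          const formats = [
            { name: "hevc-mp4", mime: 'video/mp4; codecs="hvc1.1.6.L93.B0, mp4a.40.2"' },
            { name: "vp9-webm", mime: 'video/webm; codecs="vp09.00.10.08, opus"' },
            { name: "h264-mp4", mime: 'video/mp4; codecs="avc1.42E01E, mp4a.40.2"' }
          ];

          // Return the first format the environment reports as supported,
          // or null when none of the candidates is supported.
          function pickDeliveryFormat(candidates, isSupported) {
            const match = candidates.find((f) => isSupported(f.mime));
            return match === undefined ? null : match;
          }

          // In a browser with Media Source Extensions this could be called as:
          //   pickDeliveryFormat(formats, (mime) => MediaSource.isTypeSupported(mime));
        </pre>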
        </section>
        <section>
    <h3>Create Encoding Profiles</h3>
        <p>After you have listed all the renditions you need, create <a>transcoding</a> profiles (or templates, depending on your tool) for each <a>rendition</a> in your <a>transcoding</a> software of choice (for example FFmpeg, Compressor, or Vantage). </p>
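        <p>As a minimal sketch, assuming FFmpeg is the chosen tool, profiles can be kept as plain data and turned into command-line arguments. The <code>profiles</code> array, the <code>ffmpegArgs</code> helper, the file names, and the bitrates below are hypothetical; a production profile would normally set many more options.</p>
        <pre class="example js" title="encoding profiles as data (sketch)">
          // Hypothetical encoding profiles, one per rendition.
          const profiles = [
            { name: "1080p_5000k", height: 1080, videoBitrate: "5000k", audioBitrate: "128k" },
            { name: "720p_2000k", height: 720, videoBitrate: "2000k", audioBitrate: "128k" },
            { name: "480p_768k", height: 480, videoBitrate: "768k", audioBitrate: "96k" }
          ];

          // Build the FFmpeg argument list for one profile:
          // H.264 video and AAC audio in an MP4 container.
          function ffmpegArgs(mezzanine, profile) {
            return [
              "-i", mezzanine,
              "-c:v", "libx264",
              "-b:v", profile.videoBitrate,
              // Scale to the target height; -2 keeps the aspect ratio
              // and rounds the width to an even number.
              "-vf", "scale=-2:" + profile.height,
              "-c:a", "aac",
              "-b:a", profile.audioBitrate,
              profile.name + ".mp4"
            ];
          }

          // One argument list per profile. In Node.js each list could be
          // passed to child_process.spawn("ffmpeg", args).
          const jobs = profiles.map((p) => ffmpegArgs("mezzanine.mov", p));
        </pre>
        <p>Keeping profiles as data makes it easier to re-run every transcode whenever the rendition set changes.</p>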
        </section>
        <section>
    <h3>Transcode</h3>
        <p>If all the previous steps have been carefully thought out, then actual <a>transcoding</a> is comparatively easy.  It is the act of taking the mezzanine file and sending it to each <a>transcoding</a> profile.  To save time, this step should eventually be scripted if possible, since you may need to revisit it each time you adjust your <a>rendition</a> set.</p>

    <p>At the end of this process you should have a set of files that can be hosted for web delivery.  You will need to prepare these files so that your chosen <a>ABR</a> technology can utilize them. This process is known as packaging.</p>
    </section>


    <section>
        <h3>Packaging Your Video Files for <a>ABR</a> Delivery</h3>
    <p>In order for an <a>ABR</a> technology to utilize your <a>rendition</a> set, each file needs to be properly segmented and exposed to the <a>ABR</a> technology according to its specification.  Segmentation refers to the breakdown of each file into timed <a>chunk</a>s so that <a>ABR</a> can smoothly switch <a>rendition</a>s without causing a break in the video playback.  The <a>rendition</a> set and its segments are exposed through a <a>manifest</a> file: .m3u8 for <a>HLS</a> and .mpd for DASH.  The manifest is simply a text file that serves as a directory for the <a>rendition</a>s and includes details (resolution, <a>codec</a>, bitrate, etc.) about each file available.  This is necessary so that the <a>ABR</a> technology can find the correct segment for the user's playback environment. A manifest can also contain references to other files that may be necessary for specific video features, such as .vtt files for subtitles and additional audio tracks for multi-lingual video. </p>
    <p>Information about authoring a manifest for <a>HLS</a> can be found
      <a href="https://tools.ietf.org/html/draft-pantos-http-live-streaming-23">here</a>,
      and information about authoring for DASH can be found <a href="http://dashif.org/guidelines/">here</a>.
    </p>
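    <p>To make the role of the manifest more concrete, the following sketch generates a minimal HLS master playlist from a rendition set. The <code>hlsRenditions</code> array, the <code>buildMasterPlaylist</code> function, the URIs, and the codec strings are assumptions for illustration; packaging tools normally generate this file for you and include further attributes.</p>
    <pre class="example js" title="generating an HLS master playlist (sketch)">
      // Hypothetical rendition set. BANDWIDTH is expressed in bits per second.
      const hlsRenditions = [
        { bandwidth: 5000000, width: 1920, height: 1080,
          codecs: "avc1.640028,mp4a.40.2", uri: "1080p/index.m3u8" },
        { bandwidth: 2000000, width: 1280, height: 720,
          codecs: "avc1.64001f,mp4a.40.2", uri: "720p/index.m3u8" }
      ];

      // Emit one EXT-X-STREAM-INF tag per rendition, each followed by
      // the URI of that rendition's media playlist.
      function buildMasterPlaylist(set) {
        const lines = ["#EXTM3U"];
        for (const r of set) {
          lines.push(
            "#EXT-X-STREAM-INF:BANDWIDTH=" + r.bandwidth +
            ",RESOLUTION=" + r.width + "x" + r.height +
            ',CODECS="' + r.codecs + '"'
          );
          lines.push(r.uri);
        }
        return lines.join("\n") + "\n";
      }

      const masterPlaylist = buildMasterPlaylist(hlsRenditions);
    </pre>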
    </section>
    </section>
    <section>
      <h2>Web App Structure</h2>
      <p>
        Media companies are increasingly making their content and channels available wherever users are consuming media.  More often than not this includes the web and native app platforms.  As we see a rise in media companies using web technologies to build cross-platform apps (that stretch across the web and native app platforms), it becomes more important to clarify the different approaches you can take to building "apps" with web technologies.
      </p>
      <section>
        <h3>Web App Content Management Approaches</h3>
        <p>The easiest way to delineate between the types of web apps is by how the content inside them is managed. In the case of traditional applications and mobile apps, the entire application is downloaded from the store and then run locally. In contrast, a website keeps all of its content on the server and brings it down into the browser when the page loads. The same approaches are used with web content inside of the app space, where they are often known as packaged web apps and hosted web apps.
        </p>
        <section>
          <h4>The Packaged App</h4>
          <p>The Packaged App is a type of web app that is distributed in its entirety to a user through a single download. Generally, we see packaged apps within app markets. Packaged apps have these characteristics:</p>
          <ul>
            <li>Downloaded in entirety through a store or market</li>
            <li>Generally offline by default, as all the code lives on the user's device</li>
            <li>Retrieve data from the Internet via services, similar to native apps, generally called via AJAX</li>
          </ul>
          <p>Some markets, such as the Windows Store and Amazon App Store, enable you to submit packaged apps directly to the market without the need for any external packaging tools. These markets also expose additional APIs for packaged apps that aren’t available to apps via the browser. Other markets, such as the iOS App Store and Google Play, require a third-party utility such as Cordova or Crosswalk to enable packaged web apps.
          </p>
        </section>
      <section>
        <h4>The Hosted App</h4>
        <p>Hosted Apps are web apps submitted to app markets that point to web content that remains on the web server, as opposed to being packaged and submitted to the store. Hosted apps have these characteristics:
        </p>
        <ul>
          <li>Have a thin layer of native code, usually containing a webview and/or an app manifest that is submitted to the market</li>
          <li>Can run dynamic web content from a web server (think ASP.NET, PHP, etc.)</li>
          <li>By default have no functionality when the device is offline (just like the web)</li>
          <li>Are updated on the web server instead of by pushing packages to the app store</li>
        </ul>
      </section>
      <section>
        <h4>Progressive Web Apps</h4>
        <p>
          Progressive web apps, or PWAs, have become the de facto app format for the web. Just like hosted web apps, progressive web apps load their content from the server when the app is opened. However, progressive web apps can also store the web app locally through a technology called service workers, which allows them to combine the benefits of packaged web apps and hosted web apps.  Some platforms run PWAs in the browser while others (like Android and Windows 10) run PWAs in a standalone app container.
        </p>
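        <p>
          A service worker has to be registered by the page before it can store content locally. The sketch below shows one hedged way to do this; the <code>registerServiceWorker</code> helper, the script URL, and the scope are hypothetical, and the navigator object is passed in so that the function degrades gracefully where service workers are unavailable.
        </p>
        <pre class="example js" title="registering a service worker (sketch)">
          // Register a service worker when the environment supports it.
          // nav is a navigator-like object; url and scope are example values.
          function registerServiceWorker(nav, url, scope) {
            if (!nav || !("serviceWorker" in nav)) {
              // No service worker support: the app keeps working as a
              // hosted app, loading its content from the server.
              return Promise.resolve(null);
            }
            return nav.serviceWorker.register(url, { scope: scope });
          }

          // In a page this could be called as:
          //   registerServiceWorker(navigator, "/racer/sw.js", "/racer/");
        </pre>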
        <p>
          PWAs are built around a common universal <a href="http://www.w3.org/TR/appmanifest/">web app manifest</a> that is being standardized by the W3C.  The web manifest contains metadata about your application such as presentation preferences, start URL, and even ratings and categories for store listings.
        </p>
        <pre class="example json" title="common manifest">
          {
            "lang": "en",
            "dir": "ltr",
            "name": "Donate App",
            "description": "This app helps you donate to worthy causes.",
            "short_name": "Donate",
            "icons": [{
              "src": "icon/lowres.webp",
              "sizes": "64x64",
              "type": "image/webp"
            },{
              "src": "icon/lowres.png",
              "sizes": "64x64"
            }, {
              "src": "icon/hd_hi",
              "sizes": "128x128"
            }],
            "scope": "/racer/",
            "start_url": "/racer/start.html",
            "display": "fullscreen",
            "orientation": "landscape",
            "theme_color": "aliceblue",
            "background_color": "red",
            "serviceworker": {
              "src": "sw.js",
              "scope": "/racer/",
              "use_cache": false
            },
            "screenshots": [{
              "src": "screenshots/in-game-1x.jpg",
              "sizes": "640x480",
              "type": "image/jpeg"
            },{
              "src": "screenshots/in-game-2x.jpg",
              "sizes": "1280x920",
              "type": "image/jpeg"
            }]
          }

        </pre>
        <p>
          You can learn more about building Progressive Web Apps <a href="https://developers.google.com/web/progressive-web-apps/">here</a>.
        </p>
      </section>
      </section>
      <section>
        <h3>Web Based Platforms</h3>
        <p>
          Some platforms, such as webOS, are themselves a web based environment, meaning all applications in that environment are built with web technologies and run as independent applications within the environment, just as a web page runs inside a web browser.  Device manufacturers (of all kinds) use web engines as the basis for a web based platform that allows web apps to run as standalone applications.
        </p>
        <p>
          The most common platform may be <a href="http://www.chromium.org/chromium-os">ChromeOS</a> itself.  ChromeOS is Chromium running on top of Linux and uses web apps for its primary app ecosystem (although it has since been augmented to run Android apps as well).  Another common example is webOS, which powers many LG TVs today and runs web content as standalone apps.
        </p>
      </section>
      <section>
        <h3>Web App Containers</h3>
        <p>
          Delivering a web app on a platform is not a trivial task, as each platform may take a different approach to how a web app is delivered to a user. Some platforms such as <a href="https://developer.microsoft.com/en-us/windows/bridges/hosted-web-apps">Xbox (Windows 10)</a> and <a href="https://www.playstation.com/develop/">PlayStation</a> give web apps their own standalone <a>container</a>, which allows the app to be downloaded from a store and run independently of a browser.  These apps are rendered with the web rendering engine native to the operating system, generally have lower overhead than running in a browser, and are often granted access to OS level APIs not available in the browser.
        </p>
        <section>
          <h4>Mobile App Container</h4>
          <p>
            For a web app to be distributed through the store on platforms like iOS and Android, an app <a>container</a> needs to be used. Popular app containers such as <a href="http://cordova.apache.org/">Cordova</a> and <a href="https://crosswalk-project.org/">Crosswalk</a> are used by web developers to deliver a web app through a store.  This approach often adds overhead to the application, as the web app is rendered in a webview or the app may contain the entire browser runtime within each app <a>container</a>.
          </p>
        </section>
        <section>
          <h4>Chromium Embedded Framework and Electron</h4>
          <p>
            In desktop environments, we see the use of <a href="https://electron.atom.io/">Electron</a> and <a href="https://code.google.com/archive/p/chromiumembedded/">Chromium Embedded Framework (CEF)</a> to deliver web apps as desktop applications.  CEF and Electron use Chromium, the open source project behind Chrome, as the runtime environment for the application. Much like on mobile platforms, this approach has additional overhead: instead of using the native rendering engine of the platform, Chromium is delivered inside each application to render the web app. This gives the web app additional privileges to access native APIs, but at the same time raises security concerns.
          </p>
        </section>
      </section>
    </section>
  </body>
</html>