This document specifies two profiles of [TTML1]: a text-only profile and an image-only profile. These profiles are intended to be used across subtitle and caption delivery applications worldwide, thereby simplifying interoperability, consistent rendering and conversion to other subtitling and captioning formats. The text profile is a superset of [SDPUS].

These profiles are intended for subtitle and caption delivery worldwide, including dialog language translation, content description, and captions for the deaf and hard of hearing.

In applications that require subtitle/caption content in image form to be simultaneously available in text form, two distinct subtitle documents, one conforming to the Text Profile and the other conforming to the Image Profile, SHOULD be offered. In addition, the Text Profile subtitle document SHOULD be associated with the Image Profile subtitle document such that, when image content is encountered, assistive technologies have access to its corresponding text form.

A subtitle document MAY contain elements and attributes that are neither specifically permitted nor forbidden by a profile. Such elements and attributes MAY be ignored by the presentation processor or transformation processor.

A subtitle document MAY be associated with a related video object, which SHALL consist of a sequence of image frames, each a rectangular array of pixels, and SHALL be considered the Related Media Object.

When mapping a media time expression M to a frame F of a related video object, e.g. for the purpose of rendering a subtitle document onto the related video object, the presentation processor SHALL map M to the frame F with the presentation time that is the closest to, but not less than, M.
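The frame-mapping rule can be sketched as follows. This is a minimal illustration, not a normative algorithm: it assumes a constant frame rate with frame k presented at k / frame_rate seconds, which the text above does not state, and the function name `media_time_to_frame` is hypothetical. Exact rational arithmetic is used so that times falling exactly on a frame boundary are not disturbed by floating-point rounding.

```python
from fractions import Fraction
import math


def media_time_to_frame(media_time: Fraction, frame_rate: Fraction) -> int:
    """Return the index of the frame whose presentation time is closest to,
    but not less than, media_time (in seconds).

    Assumption (not from the specification text): frame k of the related
    video object is presented at k / frame_rate seconds, starting at k = 0.
    """
    if media_time < 0:
        raise ValueError("media_time must be non-negative")
    # Smallest integer k such that k / frame_rate >= media_time.
    return math.ceil(media_time * frame_rate)


# 1.0 s at 30000/1001 fps lies between frames 29 and 30, so frame 30 is chosen.
print(media_time_to_frame(Fraction(1), Fraction(30000, 1001)))  # 30
# A time that coincides exactly with a frame maps to that frame.
print(media_time_to_frame(Fraction(1, 25), Fraction(25)))  # 1
```

Rounding up rather than to the nearest frame means a subtitle is never mapped to a frame presented before the media time M.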

Otherwise, the root container of a subtitle document SHALL be mapped to the related video object frame in its entirety. If tts:extent is present on the tt element, the extents of the root container SHALL be equal to the dimensions of the related video object frame.

A progressively decodable subtitle document is structured to facilitate presentation before the document is received in its entirety, and can be identified using the ittp:progressivelyDecodable attribute.

A subtitle document for which the computed value of ittp:progressivelyDecodable is "false" is neither asserted to be a progressively decodable subtitle document nor asserted not to be a progressively decodable subtitle document.
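As a rough sketch, a processor could read this attribute as shown below. The namespace URIs, the helper name `asserts_progressively_decodable`, the placement of the attribute on the tt element, and the fallback to "false" when the attribute is absent are assumptions made for illustration; they should be checked against the normative definition of ittp:progressivelyDecodable.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs for TTML and the ittp parameter vocabulary.
TT_NS = "http://www.w3.org/ns/ttml"
ITTP_NS = "http://www.w3.org/ns/ttml/profile/imsc1#parameter"


def asserts_progressively_decodable(document_text: str) -> bool:
    """Return True only if the document asserts that it is progressively decodable.

    A False result makes no assertion either way: the document may or may
    not be progressively decodable.
    """
    root = ET.fromstring(document_text)
    if root.tag != f"{{{TT_NS}}}tt":
        raise ValueError("root element is not a TTML tt element")
    # Assumption: an absent attribute is treated as "false".
    value = root.get(f"{{{ITTP_NS}}}progressivelyDecodable", "false")
    return value == "true"


SAMPLE = (
    '<tt xmlns="http://www.w3.org/ns/ttml" '
    'xmlns:ittp="http://www.w3.org/ns/ttml/profile/imsc1#parameter" '
    'ittp:progressivelyDecodable="true"><body/></tt>'
)
print(asserts_progressively_decodable(SAMPLE))  # True
```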

Annex B. Forced content (non-normative) illustrates the use of itts:forcedDisplay in an application in which a single document contains both hard-of-hearing captions and translated foreign-language subtitles, with itts:forcedDisplay ensuring that the translation subtitles are always displayed, regardless of whether the hard-of-hearing captions are displayed or hidden.

The purpose of the model is to limit subtitle document complexity. It is not intended as a specification of the processing requirements for implementations. For instance, while the model defines a glyph buffer for the purpose of limiting the number of glyphs displayed at any given point in time, it neither requires the implementation of such a buffer, nor models the sub-pixel character positioning and anti-aliased glyph rendering that can be used to produce text output.

The model operates on successive intermediate synchronic documents obtained from an input subtitle document, and uses a simple double buffering model: while an intermediate synchronic document E_n is being painted into Presentation Buffer P_n (the "back buffer" of the model), the previous intermediate synchronic document E_(n-1) is available for display in Presentation Buffer P_(n-1) (the "front buffer" of the model).
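The buffer alternation can be illustrated with a small sketch. It is deliberately simplified: painting is reduced to recording which intermediate synchronic document occupies a buffer, and the function name `run_double_buffer` and the labels "E0", "E1", "E2" are hypothetical. It models none of the timing or complexity limits of the model.

```python
from typing import Iterable, Iterator, Optional, Tuple


def run_double_buffer(isds: Iterable[str]) -> Iterator[Tuple[Optional[str], str]]:
    """Yield (displayed, being_painted) pairs, one per intermediate synchronic document.

    While document n is being painted into one presentation buffer, document
    n-1 remains available for display in the other. Before the first document
    has been painted, nothing is available for display (None).
    """
    displayed: Optional[str] = None
    for isd in isds:
        being_painted = isd  # stand-in for painting the document into its buffer
        yield displayed, being_painted
        # Painting is complete: the two buffers swap roles.
        displayed = being_painted


for shown, painting in run_double_buffer(["E0", "E1", "E2"]):
    print(f"displaying {shown}, painting {painting}")
```

Because only two buffers exist, a document cannot become visible until it has been painted in full, which is what allows the model to bound the work needed per document.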

Fig. 3 below illustrates the use of forced content, i.e. itts:forcedDisplay and displayForcedOnlyMode. The content with itts:forcedDisplay="true" is the French translation of the "High School" sign. The content with itts:forcedDisplay="false" consists of French subtitles capturing a voiceover.

When the user selects French as the playback language but does not select French subtitles, displayForcedOnlyMode is set to "true", causing the display of the sign translation, which is useful to any French speaker, but hiding the voiceover subtitles as the voiceover is heard in French.

If the user selects French as the playback language and also selects French subtitles, e.g. if the user is hard-of-hearing, displayForcedOnlyMode is set to "false", causing the display of both the sign translation and the voiceover subtitles.
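The two scenarios above reduce to a simple filter, sketched below. The `Subtitle` class, the function name `visible_subtitles` and the sample strings are hypothetical and stand in for content elements and their computed itts:forcedDisplay values. A real presentation processor would also apply the other visibility and timing rules of the profile.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Subtitle:
    text: str
    forced_display: bool  # computed value of itts:forcedDisplay


def visible_subtitles(subtitles: List[Subtitle], display_forced_only_mode: bool) -> List[str]:
    """Return the texts that produce visible rendering.

    When display_forced_only_mode is True, only content whose
    itts:forcedDisplay is "true" is shown; otherwise all content is shown.
    """
    return [
        s.text
        for s in subtitles
        if s.forced_display or not display_forced_only_mode
    ]


# Hypothetical sample content for the French example.
subs = [
    Subtitle("Lycée", forced_display=True),                  # sign translation
    Subtitle("(voiceover subtitle)", forced_display=False),  # voiceover
]

# French audio without French subtitles: only the sign translation is shown.
print(visible_subtitles(subs, display_forced_only_mode=True))
# French audio with French subtitles: both are shown.
print(visible_subtitles(subs, display_forced_only_mode=False))
```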

Guideline 1.1 of [WCAG20] recommends that an implementation provide text alternatives for all non-text content. In the context of this specification, this text alternative is intended primarily to support users of the subtitles who cannot see images. Since the images of an Image Profile subtitle document usually represent subtitle or caption text, the guidelines for authoring text equivalent strings given under "Images of text" in [HTML5] are appropriate.

Thus, for each subtitle in an Image Profile subtitle document, text equivalent content in a Text Profile subtitle document SHOULD be written so that it conveys all essential content and fulfills the same function as the corresponding subtitle image. In the context of subtitling and captioning, this content will be (as a minimum) the verbatim equivalent of the image without précis or summarization. However, the author MAY add extra information to the text equivalent string, as a functional replacement for the applied style, in cases where styling is applied to the text image with a deliberate connotation.

For instance, in subtitling and captioning, italics can be used to indicate an off-screen speaker context (for example, a voice from a radio). An author can choose to include this functional information in the text equivalent, for example by including the word "Radio: " before the image equivalent text. Note also that images in an Image Profile subtitle document that are intended for use as captions, i.e. intended for a hard-of-hearing audience, might already include this functional information in the rendered text.

