author     Tim-Philipp Müller <tim@centricular.com>   2016-12-08 22:59:58 +0000
committer  Tim-Philipp Müller <tim@centricular.com>   2016-12-08 23:00:45 +0000
commit     46138b1b1dc3bc9bfd4ce9102deddbf7f46de223 (patch)
tree       1682a049bc28ea95e38f0572cc3750295c156a24
parent     49653b058a2a4b2093f5ce34f43d567eba51f76b (diff)
docs: design: move most design docs to gst-docs module
-rw-r--r--  docs/design/Makefile.am                             13
-rw-r--r--  docs/design/design-audiosinks.txt                  138
-rw-r--r--  docs/design/design-decodebin.txt                   274
-rw-r--r--  docs/design/design-encoding.txt                    571
-rw-r--r--  docs/design/design-orc-integration.txt             204
-rw-r--r--  docs/design/draft-keyframe-force.txt                91
-rw-r--r--  docs/design/draft-subtitle-overlays.txt            546
-rw-r--r--  docs/design/part-interlaced-video.txt              107
-rw-r--r--  docs/design/part-mediatype-audio-raw.txt            76
-rw-r--r--  docs/design/part-mediatype-text-raw.txt             28
-rw-r--r--  docs/design/part-mediatype-video-raw.txt          1258
-rw-r--r--  docs/design/part-playbin.txt                        69
-rw-r--r--  docs/design/part-stereo-multiview-video.markdown   278
13 files changed, 1 insertion, 3652 deletions
diff --git a/docs/design/Makefile.am b/docs/design/Makefile.am
index 94dece6d3..bd55852c1 100644
--- a/docs/design/Makefile.am
+++ b/docs/design/Makefile.am
@@ -2,16 +2,5 @@ SUBDIRS =
EXTRA_DIST = \
- design-audiosinks.txt \
- design-decodebin.txt \
- design-encoding.txt \
- design-orc-integration.txt \
draft-hw-acceleration.txt \
- draft-keyframe-force.txt \
- draft-subtitle-overlays.txt\
- draft-va.txt \
- part-interlaced-video.txt \
- part-mediatype-audio-raw.txt\
- part-mediatype-text-raw.txt\
- part-mediatype-video-raw.txt\
- part-playbin.txt
+ draft-va.txt
diff --git a/docs/design/design-audiosinks.txt b/docs/design/design-audiosinks.txt
deleted file mode 100644
index e2dafad4d..000000000
--- a/docs/design/design-audiosinks.txt
+++ /dev/null
@@ -1,138 +0,0 @@
-Audiosink design
-----------------
-
-Requirements:
-
- - must operate chain based.
- Most simple playback pipelines will push audio from the decoders
- into the audio sink.
-
- - must operate getrange based
- Most professional audio applications will operate in a mode where
- the audio sink pulls samples from the pipeline. This is typically
- done in a callback from the audiosink requesting N samples. The
- callback is either scheduled from a thread or from an interrupt
- from the audio hardware device.
-
- - Exact sample accurate clocks.
- the audiosink must be able to provide a clock that is sample
- accurate even if samples are dropped or when discontinuities are
- found in the stream.
-
- - Exact timing of playback.
- The audiosink must be able to play samples at their exact times.
-
- - use DMA access when possible.
- When the hardware can do DMA we should use it. This should also
- work over bufferpools to avoid data copying to/from kernel space.
-
-
-Design:
-
- The design is based on a set of base classes and the concept of a
- ringbuffer of samples.
-
- +-----------+ - provide preroll, rendering, timing
- + basesink + - caps nego
- +-----+-----+
- |
- +-----V----------+ - manages ringbuffer
- + audiobasesink + - manages scheduling (push/pull)
- +-----+----------+ - manages clock/query/seek
- | - manages scheduling of samples in the ringbuffer
- | - manages caps parsing
- |
- +-----V------+ - default ringbuffer implementation with a GThread
- + audiosink + - subclasses provide open/read/close methods
- +------------+
-
- The ringbuffer is a contiguous piece of memory divided into segtotal
- segments. Each segment has segsize bytes.
-
- play position
- v
- +---+---+---+-------------------------------------+----------+
- + 0 | 1 | 2 | .... | segtotal |
- +---+---+---+-------------------------------------+----------+
- <--->
- segsize bytes = N samples * bytes_per_sample.
-
-
- The ringbuffer has a play position, which is expressed in
- segments. The play position is where the device is currently reading
- samples from the buffer.
-
- The ringbuffer can be put to the PLAYING or STOPPED state.
-
- In the STOPPED state no samples are played to the device and the play
- pointer does not advance.
-
- In the PLAYING state samples are written to the device and the ringbuffer
- should call a configurable callback after each segment is written to the
- device. In this state the play pointer is advanced after each segment is
- written.
-
- A write operation to the ringbuffer will put new samples in the ringbuffer.
- If there is not enough space in the ringbuffer, the write operation will
- block. The playback of the buffer never stops, even if the buffer is
- empty. When the buffer is empty, silence is played by the device.
-
- The ringbuffer is implemented with lockfree atomic operations, especially
- on the reading side so that low-latency operations are possible.
-
- Whenever new samples are to be put into the ringbuffer, the position of the
- read pointer is taken. The required write position is taken and the diff
- is made between the required and actual position. If the difference is <0,
- the sample is too late. If the difference is bigger than segtotal, the
- writing part has to wait for the play pointer to advance.
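
A minimal sketch of the positioning logic described above, assuming illustrative
field names (readseg, segtotal, samples_per_segment) rather than the actual
ringbuffer implementation:

    /* where should this sample go, relative to what the device is playing? */
    gint readseg  = g_atomic_int_get (&ringbuf->readseg);  /* play position */
    gint writeseg = sample_offset / samples_per_segment;   /* required position */
    gint diff     = writeseg - readseg;

    if (diff < 0) {
      /* sample is too late, drop (or clip) it */
    } else if (diff >= segtotal) {
      /* writer is too far ahead: wait for the play pointer to advance */
    } else {
      /* copy the samples into segment (writeseg % segtotal) */
    }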
-
-
-Scheduling:
-
- - chain based mode:
-
- In chain based mode, bytes are written into the ringbuffer. This operation
- will eventually block when the ringbuffer is filled.
-
- When no samples arrive in time, the ringbuffer will play silence. Each
- buffer that arrives will be placed into the ringbuffer at the correct
- times. This means that dropping samples or inserting silence is done
- automatically, very accurately, and independently of the play pointer.
-
- In this mode, the ringbuffer is usually kept as full as possible. When
- using a small buffer (small segsize and segtotal), the latency from when
- audio enters the sink to when it is played can be kept low, but at least
- one context switch has to be made between read and write.
-
- - getrange based mode
-
- In getrange based mode, the audiobasesink will use the callback function
- of the ringbuffer to get segsize samples from the peer element. These
- samples will then be placed in the ringbuffer at the next play position.
- It is assumed that the getrange function returns fast enough to fill the
- ringbuffer before the play pointer reaches the write pointer.
-
- In this mode, the ringbuffer is usually kept as empty as possible. There
- is no context switch needed between the elements that create the samples
- and the actual writing of the samples to the device.
-
-
-DMA mode:
-
- - Elements that can do DMA based access to the audio device have to subclass
- from the GstAudioBaseSink class and wrap the DMA ringbuffer in a subclass
- of GstRingBuffer.
-
- The ringbuffer subclass should trigger a callback after writing or playing
- each sample to the device. This callback can be triggered from a thread or
- from a signal from the audio device.
-
-
-Clocks:
-
- The GstAudioBaseSink class will use the ringbuffer to act as a clock provider.
- It can do this by using the play pointer and the delay to calculate the
- clock time.
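
As an illustration only (not the actual GstAudioBaseSink code), the clock time
could be derived roughly as follows, where 'processed' counts the segments
already handed to the device and 'delay' is the number of samples still queued
in the device:

    guint64 samples = (guint64) processed * samples_per_segment - delay;
    GstClockTime clock_time =
        gst_util_uint64_scale_int (samples, GST_SECOND, rate);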
-
-
-
diff --git a/docs/design/design-decodebin.txt b/docs/design/design-decodebin.txt
deleted file mode 100644
index ec8df9a5d..000000000
--- a/docs/design/design-decodebin.txt
+++ /dev/null
@@ -1,274 +0,0 @@
-Decodebin design
-
-GstDecodeBin
-------------
-
-Description:
-
- Autoplug and decode to raw media
-
- Input : single pad with ANY caps
- Output : Dynamic pads
-
-* Contents
-
- _ a GstTypeFindElement connected to the single sink pad
-
- _ optionally a demuxer/parser
-
- _ optionally one or more DecodeGroup
-
-* Autoplugging
-
- The goal is to reach 'target' caps (by default raw media).
-
- This is done by using the GstCaps of a source pad and finding the available
-demuxers/decoders GstElement that can be linked to that pad.
-
- The process starts with the source pad of typefind and stops when no more
-non-target caps are left. It is commonly done while pre-rolling, but can also
-happen whenever a new pad appears on any element.
-
- Once a target caps has been found, that pad is ghosted and the
-'pad-added' signal is emitted.
-
- If no compatible elements can be found for a GstCaps, the pad is ghosted and
-the 'unknown-type' signal is emitted.
-
-
-* Assisted auto-plugging
-
- When starting the auto-plugging process for a given GstCaps, two signals are
-emitted in the following way in order to allow the application/user to assist or
-fine-tune the process.
-
- _ 'autoplug-continue' :
-
- gboolean user_function (GstElement * decodebin, GstPad *pad, GstCaps * caps)
-
- This signal is fired at the very beginning with the source pad GstCaps. If
- the callback returns TRUE, the process continues normally. If the callback
- returns FALSE, then the GstCaps are considered as a target caps and the
- autoplugging process stops.
-
- - 'autoplug-factories' :
-
- GValueArray user_function (GstElement* decodebin, GstPad* pad,
- GstCaps* caps);
-
- Get a list of elementfactories for @pad with @caps. This function is used to
- tell decodebin2 which elements it should try to autoplug. The default
- behaviour when this function is not overridden is to get all elements that
- can handle @caps from the registry, sorted by rank.
-
- - 'autoplug-select' :
-
- gint user_function (GstElement* decodebin, GstPad* pad, GstCaps* caps,
- GValueArray* factories);
-
- This signal is fired once autoplugging has got a list of compatible
- GstElementFactory. The signal is emitted with the GstCaps of the source pad
- and a pointer to the GValueArray of compatible factories.
-
- The callback should return the index of the elementfactory in @factories
- that should be tried next.
-
- If the callback returns -1, the autoplugging process will stop as if no
- compatible factories were found.
-
- The default implementation of this function will try to autoplug the first
- factory of the list.
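
As a hedged example of how an application might hook into these signals (the
callback name and the AC-3 check are purely illustrative, not part of
decodebin):

    static gboolean
    autoplug_continue_cb (GstElement * dbin, GstPad * pad, GstCaps * caps,
        gpointer user_data)
    {
      /* treat compressed AC-3 as a target caps, e.g. for pass-through output */
      if (gst_structure_has_name (gst_caps_get_structure (caps, 0), "audio/x-ac3"))
        return FALSE;
      return TRUE;
    }

    ...
    g_signal_connect (decodebin, "autoplug-continue",
        G_CALLBACK (autoplug_continue_cb), NULL);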
-
-* Target Caps
-
- The target caps are a read/write GObject property of decodebin.
-
- By default the target caps are:
-
- _ Raw audio : audio/x-raw
-
- _ and raw video : video/x-raw
-
- _ and Text : text/plain, text/x-pango-markup
-
-
-* media chain/group handling
-
- When autoplugging, all streams coming out of a demuxer will be grouped in a
-DecodeGroup.
-
- All new source pads created on that demuxer after it has emitted the
-'no-more-pads' signal will be put in another DecodeGroup.
-
- Only one decodegroup can be active at any given time. If a new decodegroup is
-created while another one exists, that decodegroup will be set as blocking until
-the existing one has drained.
-
-
-
-DecodeGroup
------------
-
-Description:
-
- Streams belonging to the same group/chain of a media file.
-
-* Contents
-
- The DecodeGroup contains:
-
- _ a GstMultiQueue to which all streams of the media group are connected.
-
- _ the eventual decoders which are autoplugged in order to produce the
- requested target pads.
-
-* Proper group draining
-
- The DecodeGroup takes care that all the streams in the group are completely
-drained (EOS has come through all source ghost pads).
-
-* Pre-roll and block
-
- The DecodeGroup has a global blocking feature. If enabled, all the ghosted
-source pads for that group will be blocked.
-
- A method is available to unblock all blocked pads for that group.
-
-
-
-GstMultiQueue
--------------
-
-Description:
-
- Multiple input-output data queue
-
- The GstMultiQueue achieves the same functionality as GstQueue, with a few
-differences:
-
- * Multiple streams handling.
-
- The element handles queueing data on more than one stream at once. To
- achieve such a feature it has request sink pads (sink_%u) and 'sometimes' src
- pads (src_%u).
-
- When requesting a given sinkpad, the associated srcpad for that stream will
- be created. Ex: requesting sink_1 will generate src_1.
-
-
- * Non-starvation on multiple streams.
-
- If more than one stream is used with the element, the streams' queues will
- be dynamically grown (up to a limit), in order to ensure that no stream is
- risking data starvation. This guarantees that at any given time there are at
- least N bytes queued and available for each individual stream.
-
- If an EOS event comes through a srcpad, the associated queue should be
- considered as 'not-empty' in the queue-size-growing algorithm.
-
-
- * Non-linked srcpads graceful handling.
-
- A GstTask is started for all srcpads when going to GST_STATE_PAUSED.
-
- The tasks block against a GCond which will be signalled in two
- different cases:
-
- _ When the associated queue has received a buffer.
-
- _ When the associated queue was previously declared as 'not-linked' and the
- first buffer of the queue is scheduled to be pushed synchronously in
- relation to the order in which it arrived globally in the element (see
- 'Synchronous data pushing' below).
-
- When woken up by the GCond, the GstTask will try to push the next
- GstBuffer/GstEvent on the queue. If pushing the GstBuffer/GstEvent returns
- GST_FLOW_NOT_LINKED, then the associated queue is marked as 'not-linked'. If
- pushing the GstBuffer/GstEvent succeeded the queue will no longer be marked as
- 'not-linked'.
-
- If pushing on all srcpads returns GstFlowReturn different from GST_FLOW_OK,
- then all the srcpads' tasks are stopped and subsequent pushes on sinkpads will
- return GST_FLOW_NOT_LINKED.
-
- * Synchronous data pushing for non-linked pads.
-
- In order to better support dynamic switching between streams, the multiqueue
- (unlike the current GStreamer queue) continues to push buffers on non-linked
- pads rather than shutting down.
-
- In addition, to prevent a non-linked stream from very quickly consuming all
- available buffers and thus 'racing ahead' of the other streams, the element
- must ensure that buffers and inlined events for a non-linked stream are pushed
- in the same order as they were received, relative to the other streams
- controlled by the element. This means that a buffer cannot be pushed to a
- non-linked pad any sooner than buffers in any other stream which were received
- before it.
-
-
-=====================================
- Parsers, decoders and auto-plugging
-=====================================
-
-This section has DRAFT status.
-
-Some media formats come in different "flavours" or "stream formats". These
-formats differ in the way the setup data and media data is signalled and/or
-packaged. An example for this is H.264 video, where there is a bytestream
-format (with codec setup data signalled inline and units prefixed by a sync
-code and packet length information) and a "raw" format where codec setup
-data is signalled out of band (via the caps) and the chunking is implicit
-in the way the buffers were muxed into a container, to mention just two of
-the possible variants.
-
-Especially on embedded platforms it is common that decoders can only
-handle one particular stream format, and not all of them.
-
-Where there are multiple stream formats, parsers are usually expected
-to be able to convert between the different formats. This will, if
-implemented correctly, work as expected in a static pipeline such as
-
- ... ! parser ! decoder ! sink
-
-where the parser can query the decoder's capabilities even before
-processing the first piece of data, and configure itself to convert
-accordingly, if conversion is needed at all.
-
-In an auto-plugging context this is not so straight-forward though,
-because elements are plugged incrementally and not before the previous
-element has processed some data and decided what it will output exactly
-(unless the template caps are completely fixed, in which case it can continue
-right away; this is not always the case here though, see below). A
-parser will thus have to decide on *some* output format so auto-plugging
-can continue. It doesn't know anything about the available decoders and
-their capabilities though, so it's possible that it will choose a format
-that is not supported by any of the available decoders, or by the preferred
-decoder.
-
-If the parser had sufficiently concise but fixed source pad template caps,
-decodebin could continue to plug a decoder right away, allowing the
-parser to configure itself in the same way as it would with a static
-pipeline. This is not an option, unfortunately, because often the
-parser needs to process some data to determine e.g. the format's profile or
-other stream properties (resolution, sample rate, channel configuration, etc.),
-and there may be different decoders for different profiles (e.g. DSP codec
-for baseline profile, and software fallback for main/high profile; or a DSP
-codec only supporting certain resolutions, with a software fallback for
-unusual resolutions). So if decodebin just plugged the highest-ranking
-decoder, that decoder might not be able to handle the actual stream later
-on, which would yield an error (this is a data flow error then which would
-be hard to intercept and avoid in decodebin). In other words, we can't solve
-this issue by plugging a decoder right away with the parser.
-
-So decodebin needs to communicate to the parser the set of available decoder
-caps (which would contain the relevant capabilities/restrictions such as
-supported profiles, resolutions, etc.), after the usual "autoplug-*" signal
-filtering/sorting of course.
-
-This is done by plugging a capsfilter element right after the parser, and
-constructing a set of filter caps from the list of available decoders (one
-appends at the end just the name(s) of the caps structures from the parser
-pad template caps to function as an 'ANY other' caps equivalent). This lets
-the parser negotiate to a supported stream format in the same way as with
-the static pipeline mentioned above, but of course incurs some overhead
-through the additional capsfilter element.
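
A rough sketch of how such filter caps could be constructed from the sorted
factory list (names are illustrative, this is not the actual decodebin code):

    static GstCaps *
    build_decoder_filter_caps (GList * factories)
    {
      GstCaps *caps = gst_caps_new_empty ();
      GList *f;

      for (f = factories; f != NULL; f = f->next) {
        GstElementFactory *factory = GST_ELEMENT_FACTORY (f->data);
        const GList *t;

        for (t = gst_element_factory_get_static_pad_templates (factory);
             t != NULL; t = t->next) {
          GstStaticPadTemplate *templ = t->data;

          /* collect what the decoders accept on their sink pads */
          if (templ->direction == GST_PAD_SINK)
            gst_caps_append (caps, gst_static_pad_template_get_caps (templ));
        }
      }
      /* the caps structure names from the parser's source pad template would
       * still be appended here as the 'ANY other' equivalent */
      return caps;
    }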
-
diff --git a/docs/design/design-encoding.txt b/docs/design/design-encoding.txt
deleted file mode 100644
index b28de89d3..000000000
--- a/docs/design/design-encoding.txt
+++ /dev/null
@@ -1,571 +0,0 @@
-Encoding and Muxing
--------------------
-
-Summary
--------
- A. Problems
- B. Goals
- 1. EncodeBin
- 2. Encoding Profile System
- 3. Helper Library for Profiles
- I. Use-cases researched
-
-
-A. Problems this proposal attempts to solve
--------------------------------------------
-
-* Duplication of pipeline code for gstreamer-based applications
- wishing to encode and or mux streams, leading to subtle differences
- and inconsistencies across those applications.
-
-* No unified system for describing encoding targets for applications
- in a user-friendly way.
-
-* No unified system for creating encoding targets for applications,
- resulting in duplication of code across all applications,
- differences and inconsistencies that come with that duplication,
- and applications hardcoding element names and settings resulting in
- poor portability.
-
-
-
-B. Goals
---------
-
-1. Convenience encoding element
-
- Create a convenience GstBin for encoding and muxing several streams,
- hereafter called 'EncodeBin'.
-
- This element will only contain one single property, which is a
- profile.
-
-2. Define an encoding profile system
-
-3. Encoding profile helper library
-
- Create a helper library to:
- * create EncodeBin instances based on profiles, and
- * help applications to create/load/save/browse those profiles.
-
-
-
-
-1. EncodeBin
-------------
-
-1.1 Proposed API
-----------------
-
- EncodeBin is a GstBin subclass.
-
- It implements the GstTagSetter interface, by which it will proxy the
- calls to the muxer.
-
- Only two introspectable properties (i.e. usable without extra API):
- * A GstEncodingProfile*
- * The name of the profile to use
-
- When a profile is selected, encodebin will:
- * Add REQUEST sinkpads for all the GstStreamProfile
- * Create the muxer and expose the source pad
-
- Whenever a request pad is created, encodebin will:
- * Create the chain of elements for that pad
- * Ghost the sink pad
- * Return that ghost pad
-
- This allows reducing the code to the minimum for applications
- wishing to encode a source for a given profile:
-
- ...
-
- encbin = gst_element_factory_make ("encodebin", NULL);
- g_object_set (encbin, "profile", "N900/H264 HQ", NULL);
- gst_element_link (encbin, filesink);
-
- ...
-
- vsrcpad = gst_element_get_static_pad (source, "src1");
- vsinkpad = gst_element_get_request_pad (encbin, "video_%u");
- gst_pad_link(vsrcpad, vsinkpad);
-
- ...
-
-
-1.2 Explanation of the Various stages in EncodeBin
---------------------------------------------------
-
- This describes the various stages which can happen in order to end
- up with a multiplexed stream that can then be stored or streamed.
-
-1.2.1 Incoming streams
-
- The streams fed to EncodeBin can be of various types:
-
- * Video
- * Uncompressed (but maybe subsampled)
- * Compressed
- * Audio
- * Uncompressed (audio/x-raw)
- * Compressed
- * Timed text
- * Private streams
-
-
-1.2.2 Steps involved for raw video encoding
-
-(0) Incoming Stream
-
-(1) Transform raw video feed (optional)
-
- Here we modify the various fundamental properties of a raw video
- stream to be compatible with the intersection of:
- * The encoder GstCaps and
- * The specified "Stream Restriction" of the profile/target
-
- The fundamental properties that can be modified are:
- * width/height
- This is done with a video scaler.
- The DAR (Display Aspect Ratio) MUST be respected.
- If needed, black borders can be added to comply with the target DAR.
- * framerate
- * format/colorspace/depth
- All of this is done with a colorspace converter
-
-(2) Actual encoding (optional for raw streams)
-
- An encoder (with some optional settings) is used.
-
-(3) Muxing
-
- A muxer (with some optional settings) is used.
-
-(4) Outgoing encoded and muxed stream
-
-
-1.2.3 Steps involved for raw audio encoding
-
- This is roughly the same as for raw video, except for (1)
-
-(1) Transform raw audio feed (optional)
-
- We modify the various fundamental properties of a raw audio stream to
- be compatible with the intersection of:
- * The encoder GstCaps and
- * The specified "Stream Restriction" of the profile/target
-
- The fundamental properties that can be modified are:
- * Number of channels
- * Type of raw audio (integer or floating point)
- * Depth (number of bits required to encode one sample)
-
-
-1.2.4 Steps involved for encoded audio/video streams
-
- Steps (1) and (2) are replaced by a parser if a parser is available
- for the given format.
-
-
-1.2.5 Steps involved for other streams
-
- Other streams will just be forwarded as-is to the muxer, provided the
- muxer accepts the stream type.
-
-
-
-
-2. Encoding Profile System
---------------------------
-
- This work is based on:
- * The existing GstPreset system for elements [0]
- * The gnome-media GConf audio profile system [1]
- * The investigation done into device profiles by Arista and
- Transmageddon [2 and 3]
-
-2.2 Terminology
----------------
-
-* Encoding Target Category
- A Target Category is a classification of devices/systems/use-cases
- for encoding.
-
- Such a classification is required in order for:
- * Applications with a very-specific use-case to limit the number of
- profiles they can offer the user. A screencasting application has
- no use with the online services targets for example.
- * Offering the user some initial classification in the case of a
- more generic encoding application (like a video editor or a
- transcoder).
-
- Ex:
- Consumer devices
- Online service
- Intermediate Editing Format
- Screencast
- Capture
- Computer
-
-* Encoding Profile Target
- A Profile Target describes a specific entity for which we wish to
- encode.
- A Profile Target must belong to at least one Target Category.
- It will define at least one Encoding Profile.
-
- Ex (with category):
- Nokia N900 (Consumer device)
- Sony PlayStation 3 (Consumer device)
- Youtube (Online service)
- DNxHD (Intermediate editing format)
- HuffYUV (Screencast)
- Theora (Computer)
-
-* Encoding Profile
- A specific combination of muxer, encoders, presets and limitations.
-
- Ex:
- Nokia N900/H264 HQ
- Ipod/High Quality
- DVD/Pal
- Youtube/High Quality
- HTML5/Low Bandwidth
- DNxHD
-
-2.3 Encoding Profile
---------------------
-
-An encoding profile requires the following information:
-
- * Name
- This string is not translatable and must be unique.
- A recommendation to guarantee uniqueness of the naming could be:
- <target>/<name>
- * Description
- This is a translatable string describing the profile
- * Muxing format
- This is a string containing the GStreamer media-type of the
- container format.
- * Muxing preset
- This is an optional string describing the preset(s) to use on the
- muxer.
- * Multipass setting
- This is a boolean describing whether the profile requires several
- passes.
- * List of Stream Profile
-
-2.3.1 Stream Profiles
-
-A Stream Profile consists of:
-
- * Type
- The type of stream profile (audio, video, text, private-data)
- * Encoding Format
- This is a string containing the GStreamer media-type of the encoding
- format to be used. If encoding is not to be applied, the raw audio
- media type will be used.
- * Encoding preset
- This is an optional string describing the preset(s) to use on the
- encoder.
- * Restriction
- This is an optional GstCaps containing the restriction of the
- stream that can be fed to the encoder.
- This will generally contain restrictions on video
- width/height/framerate or audio depth.
- * presence
- This is an integer specifying how many streams can be used in the
- containing profile. 0 means that any number of streams can be
- used.
- * pass
- This is an integer which is only meaningful if the multipass flag
- has been set in the profile. If it has been set it indicates which
- pass this Stream Profile corresponds to.
-
-2.4 Example profile
--------------------
-
-The representation used here is XML only as an example. No decision is
-made as to which formatting to use for storing targets and profiles.
-
-<gst-encoding-target>
- <name>Nokia N900</name>
- <category>Consumer Device</category>
- <profiles>
- <profile>Nokia N900/H264 HQ</profile>
- <profile>Nokia N900/MP3</profile>
- <profile>Nokia N900/AAC</profile>
- </profiles>
-</gst-encoding-target>
-
-<gst-encoding-profile>
- <name>Nokia N900/H264 HQ</name>
- <description>
- High Quality H264/AAC for the Nokia N900
- </description>
- <format>video/quicktime,variant=iso</format>
- <streams>
- <stream-profile>
- <type>audio</type>
- <format>audio/mpeg,mpegversion=4</format>
- <preset>Quality High/Main</preset>
- <restriction>audio/x-raw,channels=[1,2]</restriction>
- <presence>1</presence>
- </stream-profile>
- <stream-profile>
- <type>video</type>
- <format>video/x-h264</format>
- <preset>Profile Baseline/Quality High</preset>
- <restriction>
- video/x-raw,width=[16, 800],\
- height=[16, 480],framerate=[1/1, 30000/1001]
- </restriction>
- <presence>1</presence>
- </stream-profile>
- </streams>
-
-</gst-encoding-profile>
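
For reference, a hedged sketch of how roughly the same profile could be built
in C, assuming the GstEncodingProfile API that later landed in GstPbUtils
(gst/pbutils/encoding-profile.h); the presets from the XML above are omitted
and the caps strings are taken from it:

    #include <gst/pbutils/encoding-profile.h>

    GstEncodingContainerProfile *container;
    GstCaps *caps, *restriction;

    caps = gst_caps_from_string ("video/quicktime,variant=iso");
    container = gst_encoding_container_profile_new ("Nokia N900/H264 HQ",
        "High Quality H264/AAC for the Nokia N900", caps, NULL);
    gst_caps_unref (caps);

    caps = gst_caps_from_string ("video/x-h264");
    restriction = gst_caps_from_string ("video/x-raw,width=[16,800],"
        "height=[16,480],framerate=[1/1,30000/1001]");
    gst_encoding_container_profile_add_profile (container,
        (GstEncodingProfile *)
        gst_encoding_video_profile_new (caps, NULL, restriction, 1));
    gst_caps_unref (caps);
    gst_caps_unref (restriction);

    caps = gst_caps_from_string ("audio/mpeg,mpegversion=4");
    restriction = gst_caps_from_string ("audio/x-raw,channels=[1,2]");
    gst_encoding_container_profile_add_profile (container,
        (GstEncodingProfile *)
        gst_encoding_audio_profile_new (caps, NULL, restriction, 1));
    gst_caps_unref (caps);
    gst_caps_unref (restriction);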
-
-2.5 API
--------
- A proposed C API is contained in the gstprofile.h file in this directory.
-
-
-2.6 Modifications required in the existing GstPreset system
------------------------------------------------------------
-
-2.6.1. Temporary preset.
-
- Currently a preset needs to be saved on disk in order to be
- used.
-
- This makes it impossible to have temporary presets (that exist only
- during the lifetime of a process), which might be required in the
- new proposed profile system
-
-2.6.2 Categorisation of presets.
-
- Currently presets are just aliases of a group of property/value
- without any meanings or explanation as to how they exclude each
- other.
-
- Take for example the H264 encoder. It can have presets for:
- * passes (1,2 or 3 passes)
- * profiles (Baseline, Main, ...)
- * quality (Low, medium, High)
-
- In order to programmatically know which presets exclude each other,
- we here propose the categorisation of these presets.
-
- This can be done in one of two ways
- 1. in the name (by making the name be [<category>:]<name>)
- This would give for example: "Quality:High", "Profile:Baseline"
- 2. by adding a new _meta key
- This would give for example: _meta/category:quality
-
-2.6.3 Aggregation of presets.
-
- There can be more than one choice of presets to be done for an
- element (quality, profile, pass).
-
- This means that one can not currently describe the full
- configuration of an element with a single string but with many.
-
- The proposal here is to extend the GstPreset API to be able to set
- all presets using one string and a well-known separator ('/').
-
- This change only requires changes in the core preset handling code.
-
- This would allow doing the following:
- gst_preset_load_preset (h264enc,
- "pass:1/profile:baseline/quality:high");
-
-2.7 Points to be determined
----------------------------
-
- This document hasn't determined yet how to solve the following
- problems:
-
-2.7.1 Storage of profiles
-
- One proposal for storage would be to use a system wide directory
- (like $prefix/share/gstreamer-0.10/profiles) and store XML files for
- every individual profile.
-
- Users could then add their own profiles in ~/.gstreamer-0.10/profiles
-
- This poses some limitations as to what to do if some applications
- want to have some profiles limited to their own usage.
-
-
-3. Helper library for profiles
-------------------------------
-
- These helper methods could also be added to existing libraries (like
- GstPreset, GstPbUtils, ..).
-
- The various API proposed are in the accompanying gstprofile.h file.
-
-3.1 Getting user-readable names for formats
-
- This is already provided by GstPbUtils.
-
-3.2 Hierarchy of profiles
-
- The goal is for applications to be able to present to the user a list
- of combo-boxes for choosing their output profile:
-
- [ Category ] # optional, depends on the application
- [ Device/Site/.. ] # optional, depends on the application
- [ Profile ]
-
- Convenience methods are offered to easily get lists of categories,
- devices, and profiles.
-
-3.3 Creating Profiles
-
- The goal is for applications to be able to easily create profiles.
-
- The application needs a fast/efficient way to:
- * select a container format and see all compatible streams that can be
- used with it.
- * select a codec format and see which container formats can be used
- with it.
-
- The remaining parts concern the restrictions to encoder
- input.
-
-3.4 Ensuring availability of plugins for Profiles
-
- When an application wishes to use a Profile, it should be able to
- query whether it has all the needed plugins to use it.
-
- This part will use GstPbUtils to query for, and if needed install, the
- missing plugins through the distribution's plugin installer.
-
-
-I. Use-cases researched
------------------------
-
- This is a list of various use-cases where encoding/muxing is being
- used.
-
-* Transcoding
-
- The goal is to convert with as minimal loss of quality any input
- file for a target use.
- A specific variant of this is transmuxing (see below).
-
- Example applications: Arista, Transmageddon
-
-* Rendering timelines
-
- The incoming streams are a collection of various segments that need
- to be rendered.
- Those segments can vary in nature (i.e. the video width/height can
- change).
- This requires the use of identity with the single-segment property
- activated to transform the incoming collection of segments to a
- single continuous segment.
-
- Example applications: PiTiVi, Jokosher
-
-* Encoding of live sources
-
- The major risk to take into account is the encoder not encoding the
- incoming stream fast enough. This is outside of the scope of
- encodebin, and should be solved by using queues between the sources
- and encodebin, as well as implementing QoS in encoders and sources
- (the encoders emitting QoS events, and the upstream elements
- adapting themselves accordingly).
-
- Example applications: camerabin, cheese
-
-* Screencasting applications
-
- This is similar to encoding of live sources.
- The difference being that due to the nature of the source (size and
- amount/frequency of updates) one might want to do the encoding in
- two parts:
- * The actual live capture is encoded with an 'almost-lossless' codec
- (such as huffyuv)
- * Once the capture is done, the file created in the first step is
- then rendered to the desired target format.
-
- Fixing sources to only emit region-updates and having encoders
- capable of encoding those streams would fix the need for the first
- step but is outside of the scope of encodebin.
-
- Example applications: Istanbul, gnome-shell, recordmydesktop
-
-* Live transcoding
-
- This is the case of an incoming live stream which will be
- broadcasted/transmitted live.
- One issue to take into account is to reduce the encoding latency to
- a minimum. This should mostly be done by picking low-latency
- encoders.
-
- Example applications: Rygel, Coherence
-
-* Transmuxing
-
- Given a certain file, the aim is to remux the contents WITHOUT
- decoding into either a different container format or the same
- container format.
- Remuxing into the same container format is useful when the file was
- not created properly (for example, the index is missing).
- Whenever available, parsers should be applied on the encoded streams
- to validate and/or fix the streams before muxing them.
-
- Metadata from the original file must be kept in the newly created
- file.
-
- Example applications: Arista, Transmageddon
-
-* Loss-less cutting
-
- Given a certain file, the aim is to extract a certain part of the
- file without going through the process of decoding and re-encoding
- that file.
- This is similar to the transmuxing use-case.
-
- Example applications: PiTiVi, Transmageddon, Arista, ...
-
-* Multi-pass encoding
-
- Some encoders allow doing a multi-pass encoding.
- The initial pass(es) are only used to collect encoding estimates and
- are not actually muxed and outputted.
- The final pass uses previously collected information, and the output
- is then muxed and outputted.
-
-* Archiving and intermediary format
-
- The requirement is to have lossless encoding.
-
-* CD ripping
-
- Example applications: Sound-juicer
-
-* DVD ripping
-
- Example application: Thoggen
-
-
-
-* Research links
-
- Some of these are still active documents, some other not
-
-[0] GstPreset API documentation
- http://gstreamer.freedesktop.org/data/doc/gstreamer/head/gstreamer/html/GstPreset.html
-
-[1] gnome-media GConf profiles
- http://www.gnome.org/~bmsmith/gconf-docs/C/gnome-media.html
-
-[2] Research on a Device Profile API
- http://gstreamer.freedesktop.org/wiki/DeviceProfile
-
-[3] Research on defining presets usage
- http://gstreamer.freedesktop.org/wiki/PresetDesign
-
diff --git a/docs/design/design-orc-integration.txt b/docs/design/design-orc-integration.txt
deleted file mode 100644
index a6a401dd4..000000000
--- a/docs/design/design-orc-integration.txt
+++ /dev/null
@@ -1,204 +0,0 @@
-
-Orc Integration
-===============
-
-Sections
---------
-
- - About Orc
- - Fast memcpy()
- - Normal Usage
- - Build Process
- - Testing
- - Orc Limitations
-
-
-About Orc
----------
-
-Orc code can be in one of two forms: .orc files that are converted
-by orcc to C code that calls liborc functions, or C code that calls
-liborc to create complex operations at runtime. The former is mostly
-for functions with predetermined functionality. The latter is for
-functionality that is determined at runtime, where writing .orc
-functions for all combinations would be prohibitive. Orc also has
-a fast memcpy and memset which are useful independently.
-
-
-Fast memcpy()
--------------
-
-*** This part is not integrated yet. ***
-
-Orc has built-in functions orc_memcpy() and orc_memset() that work
-like memcpy() and memset(). These are meant for large copies only.
-A reasonable cutoff for using orc_memcpy() instead of memcpy() is
-if the number of bytes is generally greater than 100. DO NOT use
-orc_memcpy() if the typical size is less than 20 bytes, especially
-if the size is known at compile time, as these cases are inlined by
-the compiler.
-
-(Example: sys/ximage/ximagesink.c)
-
-Add $(ORC_CFLAGS) to libgstximagesink_la_CFLAGS and $(ORC_LIBS) to
-libgstximagesink_la_LIBADD. Then, in the source file, add:
-
- #ifdef HAVE_ORC
- #include <orc/orc.h>
- #else
- #define orc_memcpy(a,b,c) memcpy(a,b,c)
- #endif
-
-Then switch relevant uses of memcpy() to orc_memcpy().
-
-The above example works whether or not Orc is enabled at compile
-time.
-
-
-Normal Usage
-------------
-
-The following lines are added near the top of Makefile.am for plugins
-that use Orc code in .orc files (this is for the volume plugin):
-
- ORC_BASE=volume
- include $(top_srcdir)/common/orc.mk
-
-Also add the generated source file to the plugin build:
-
- nodist_libgstvolume_la_SOURCES = $(ORC_SOURCES)
-
-And of course, add $(ORC_CFLAGS) to libgstvolume_la_CFLAGS, and
-$(ORC_LIBS) to libgstvolume_la_LIBADD.
-
-The value assigned to ORC_BASE does not need to be related to
-the name of the plugin.
-
-
-Advanced Usage
---------------
-
-The Holy Grail of Orc usage is to programmatically generate Orc code
-at runtime, have liborc compile it into binary code at runtime, and
-then execute this code. Currently, the best example of this is in
-Schroedinger. An example of how this would be used is audioconvert:
-given an input format, channel position manipulation, dithering and
-quantizing configuration, and output format, a Orc code generator
-would create an OrcProgram, add the appropriate instructions to do
-each step based on the configuration, and then compile the program.
-Successfully compiling the program would return a function pointer
-that can be called to perform the operation.
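
A very rough sketch, using only basic liborc calls, of what generating and
running a trivial program could look like; this is illustrative and not taken
from any existing element (n_samples, dest, src1 and src2 are assumed to be
set up by the caller):

    OrcProgram *p;
    OrcExecutor *ex;

    p = orc_program_new ();
    orc_program_add_destination (p, 2, "d1");
    orc_program_add_source (p, 2, "s1");
    orc_program_add_source (p, 2, "s2");
    /* saturated signed 16-bit add: d1 = clamp (s1 + s2) */
    orc_program_append_str (p, "addssw", "d1", "s1", "s2");

    if (ORC_COMPILE_RESULT_IS_SUCCESSFUL (orc_program_compile (p))) {
      ex = orc_executor_new (p);
      orc_executor_set_n (ex, n_samples);
      orc_executor_set_array_str (ex, "d1", dest);
      orc_executor_set_array_str (ex, "s1", src1);
      orc_executor_set_array_str (ex, "s2", src2);
      orc_executor_run (ex);
      orc_executor_free (ex);
    }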
-
-This sort of advanced usage requires structural changes to current
-plugins (e.g., audioconvert) and will probably be developed
-incrementally. Moreover, if such code is intended to be used without
-Orc as a strict build/runtime requirement, two codepaths would need to
-be developed and tested. For this reason, until GStreamer requires
-Orc, I think it's a good idea to restrict such advanced usage to the
-cog plugin in -bad, which requires Orc.
-
-
-Build Process
--------------
-
-The goal of the build process is to make Orc non-essential for most
-developers and users. This is not to say you shouldn't have Orc
-installed -- without it, you will get slow backup C code, just that
-people compiling GStreamer are not forced to switch from Liboil to
-Orc immediately.
-
-With Orc installed, the build process will use the Orc Compiler (orcc)
-to convert each .orc file into a temporary C source (tmp-orc.c) and a
-temporary header file (${base}orc.h if constructed from ${base}.orc).
-The C source file is compiled and linked to the plugin, and the header
-file is included by other source files in the plugin.
-
-If 'make orc-update' is run in the source directory, the files
-tmp-orc.c and ${base}orc.h are copied to ${base}orc-dist.c and
-${base}orc-dist.h respectively. The -dist.[ch] files are automatically
-disted via orc.mk. The -dist.[ch] files should be checked in to
-git whenever the .orc source is changed and checked in. Example
-workflow:
-
- edit .orc file
- ... make, test, etc.
- make orc-update
- git add volume.orc volumeorc-dist.c volumeorc-dist.h
- git commit
-
-At 'make dist' time, all of the .orc files are compiled, and then
-copied to their -dist.[ch] counterparts, and then the -dist.[ch]
-files are added to the dist directory.
-
-Without Orc installed (or --disable-orc given to configure), the
--dist.[ch] files are copied to tmp-orc.c and ${base}orc.h. When
-compiled with Orc disabled, DISABLE_ORC is defined in config.h, and
-the C backup code is compiled. This backup code is pure C, and
-does not include orc headers or require linking against liborc.
-
-The common/orc.mk build method is limited by the inflexibility of
-automake. The file tmp-orc.c must be a fixed filename; using ORC_BASE
-to generate the filename does not work because it conflicts with
-automake's dependency generation. Building multiple .orc files
-is not possible due to this restriction.
-
-
-Testing
--------
-
-If you create another .orc file, please add it to
-tests/orc/Makefile.am. This causes automatic test code to be
-generated and run during 'make check'. Each function in the .orc
-file is tested by comparing the results of executing the run-time
-compiled code and the C backup function.
-
-
-Orc Limitations
----------------
-
-audioconvert
-
- Orc doesn't have a mechanism for generating random numbers, which
- prevents its use as-is for dithering. One way around this is to
- generate suitable dithering values in one pass, then use those
- values in a second Orc-based pass.
-
- Orc doesn't handle 64-bit float, for no good reason.
-
- Irrespective of Orc handling 64-bit float, it would be useful to
- have a direct 32-bit float to 16-bit integer conversion.
-
- audioconvert is a good candidate for programmatically generated
- Orc code.
-
- audioconvert enumerates functions in terms of big-endian vs.
- little-endian. Orc's functions are "native" and "swapped".
- Programmatically generating code removes the need to worry about
- this.
-
- Orc doesn't handle 24-bit samples. Fixing this is not a priority
- (for ds).
-
-videoscale
-
- Orc doesn't handle horizontal resampling yet. The plan is to add
- special sampling opcodes, for nearest, bilinear, and cubic
- interpolation.
-
-videotestsrc
-
- Lots of code in videotestsrc needs to be rewritten to be SIMD
- (and Orc) friendly, e.g., stuff that uses oil_splat_u8().
-
- A fast low-quality random number generator in Orc would be useful
- here.
-
-volume
-
- Many of the comments on audioconvert apply here as well.
-
- There are a bunch of FIXMEs in here that are due to misapplied
- patches.
-
-
-
diff --git a/docs/design/draft-keyframe-force.txt b/docs/design/draft-keyframe-force.txt
deleted file mode 100644
index 14945f0b4..000000000
--- a/docs/design/draft-keyframe-force.txt
+++ /dev/null
@@ -1,91 +0,0 @@
-Forcing keyframes
------------------
-
-Consider the following use case:
-
- We have a pipeline that performs video and audio capture from a live source,
- compresses and muxes the streams and writes the resulting data into a file.
-
- Inside the uncompressed video data we have a specific pattern inserted at
- specific moments that should trigger a switch to a new file, meaning, we close
- the existing file we are writing to and start writing to a new file.
-
- We want the new file to start with a keyframe so that one can start decoding
- the file immediately.
-
-Components:
-
- 1) We need an element that is able to detect the pattern in the video stream.
-
- 2) We need to inform the video encoder that it should start encoding a keyframe
- starting from exactly the frame with the pattern.
-
- 3) We need to inform the demuxer that it should flush out any pending data and
- start creating the start of a new file with the keyframe as a first video
- frame.
-
- 4) We need to inform the sink element that it should start writing to the next
- file. This requires application interaction to instruct the sink of the new
- filename. The application should also be free to ignore the boundary and
- continue to write to the existing file. The application will typically use
- an event pad probe to detect the custom event.
-
-Implementation:
-
- The implementation would consist of generating a GST_EVENT_CUSTOM_DOWNSTREAM
- event that marks the keyframe boundary. This event is inserted into the
- pipeline by the application upon a certain trigger. In the above use case this
- trigger would be given by the element that detects the pattern, in the form of
- an element message.
-
- The custom event would travel further downstream to instruct encoder, muxer and
- sink about the possible switch.
-
- The information passed in the event consists of:
-
- name: GstForceKeyUnit
- (G_TYPE_UINT64)"timestamp" : the timestamp of the buffer that
- triggered the event.
- (G_TYPE_UINT64)"stream-time" : the stream position that triggered the
- event.
- (G_TYPE_UINT64)"running-time" : the running time of the stream when the
- event was triggered.
- (G_TYPE_BOOLEAN)"all-headers" : Send all headers, including those in
- the caps or those sent at the start of
- the stream.
-
- .... : optional other data fields.
-
- Note that this event is purely informational, no element is required to
- perform an action but it should forward the event downstream, just like any
- other event it does not handle.
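
A minimal sketch (not taken from any existing element) of how such an event
could be constructed and sent; the variable names are illustrative:

    GstStructure *s;
    GstEvent *event;

    s = gst_structure_new ("GstForceKeyUnit",
        "timestamp", G_TYPE_UINT64, timestamp,
        "stream-time", G_TYPE_UINT64, stream_time,
        "running-time", G_TYPE_UINT64, running_time,
        "all-headers", G_TYPE_BOOLEAN, TRUE, NULL);
    event = gst_event_new_custom (GST_EVENT_CUSTOM_DOWNSTREAM, s);
    gst_pad_send_event (sinkpad, event);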
-
- Elements understanding the event should behave as follows:
-
- 1) The video encoder receives the event before the next frame. Upon reception
- of the event it schedules to encode the next frame as a keyframe.
- Before pushing out the encoded keyframe it must push the GstForceKeyUnit
- event downstream.
-
- 2) The muxer receives the GstForceKeyUnit event and flushes out its current state,
- preparing to produce data that can be used as a keyunit. Before pushing out
- the new data it pushes the GstForceKeyUnit event downstream.
-
- 3) The application receives the GstForceKeyUnit event in a pad probe on the
- sink's sink pad and reconfigures the sink to make it perform new actions
- after receiving the next buffer.
-
-
-Upstream
---------
-
-When using RTP, packets can get lost and receivers can be added at any time;
-in either case a receiver may request a new keyframe.
-
-A downstream element sends an upstream "GstForceKeyUnit" event up the
-pipeline.
-
-When an element produces some kind of key unit in output, but has
-no such concept in its input (like an encoder that takes raw frames),
-it consumes the event (doesn't pass it upstream), and instead sends
-a downstream GstForceKeyUnit event and a new keyframe.
diff --git a/docs/design/draft-subtitle-overlays.txt b/docs/design/draft-subtitle-overlays.txt
deleted file mode 100644
index 87f2c2c61..000000000
--- a/docs/design/draft-subtitle-overlays.txt
+++ /dev/null
@@ -1,546 +0,0 @@
-===============================================================
- Subtitle overlays, hardware-accelerated decoding and playbin
-===============================================================
-
-Status: EARLY DRAFT / BRAINSTORMING
-
- === 1. Background ===
-
-Subtitles can be muxed in containers or come from an external source.
-
-Subtitles come in many shapes and colours. Usually they are either
-text-based (incl. 'pango markup'), or bitmap-based (e.g. DVD subtitles
-and the most common form of DVB subs). Bitmap based subtitles are
-usually compressed in some way, like some form of run-length encoding.
-
-Subtitles are currently decoded and rendered in subtitle-format-specific
-overlay elements. These elements have two sink pads (one for raw video
-and one for the subtitle format in question) and one raw video source pad.
-
-They will take care of synchronising the two input streams, and of
-decoding and rendering the subtitles on top of the raw video stream.
-
-Digression: one could theoretically have dedicated decoder/render elements
-that output an AYUV or ARGB image, and then let a videomixer element do
-the actual overlaying, but this is not very efficient, because it requires
-us to allocate and blend whole pictures (1920x1080 AYUV = 8MB,
-1280x720 AYUV = 3.6MB, 720x576 AYUV = 1.6MB) even if the overlay region
-is only a small rectangle at the bottom. This wastes memory and CPU.
-We could do something better by introducing a new format that only
-encodes the region(s) of interest, but we don't have such a format yet, and
-are not necessarily keen to rewrite this part of the logic in playbin
-at this point - and we can't change existing elements' behaviour, so would
-need to introduce new elements for this.
-
-Playbin2 supports outputting compressed formats, i.e. it does not
-force decoding to a raw format, but is happy to output to a non-raw
-format as long as the sink supports that as well.
-
-In case of certain hardware-accelerated decoding APIs, we will make use
-of that functionality. However, the decoder will not output a raw video
-format then, but some kind of hardware/API-specific format (in the caps)
-and the buffers will reference hardware/API-specific objects that
-the hardware/API-specific sink will know how to handle.
-
-
- === 2. The Problem ===
-
-In the case of such hardware-accelerated decoding, the decoder will not
-output raw pixels that can easily be manipulated. Instead, it will
-output hardware/API-specific objects that can later be used to render
-a frame using the same API.
-
-Even if we could transform such a buffer into raw pixels, we most
-likely would want to avoid that, in order to avoid the need to
-map the data back into system memory (and then later back to the GPU).
-It's much better to upload the much smaller encoded data to the GPU/DSP
-and then leave it there until rendered.
-
-Currently playbin only supports subtitles on top of raw decoded video.
-It will try to find a suitable overlay element from the plugin registry
-based on the input subtitle caps and the rank. (It is assumed that we
-will be able to convert any raw video format into any format required
-by the overlay using a converter such as videoconvert.)
-
-It will not render subtitles if the video sent to the sink is not
-raw YUV or RGB or if conversions have been disabled by setting the
-native-video flag on playbin.
-
-Subtitle rendering is considered an important feature. Enabling
-hardware-accelerated decoding by default should not lead to a major
-feature regression in this area.
-
-This means that we need to support subtitle rendering on top of
-non-raw video.
-
-
- === 3. Possible Solutions ===
-
-The goal is to keep knowledge of the subtitle format within the
-format-specific GStreamer plugins, and knowledge of any specific
-video acceleration API to the GStreamer plugins implementing
-that API. We do not want to make the pango/dvbsuboverlay/dvdspu/kate
-plugins link to libva/libvdpau/etc. and we do not want to make
-the vaapi/vdpau plugins link to all of libpango/libkate/libass etc.
-
-
-Multiple possible solutions come to mind:
-
- (a) backend-specific overlay elements
-
- e.g. vaapitextoverlay, vdpautextoverlay, vaapidvdspu, vdpaudvdspu,
- vaapidvbsuboverlay, vdpaudvbsuboverlay, etc.
-
- This assumes the overlay can be done directly on the backend-specific
- object passed around.
-
- The main drawback with this solution is that it leads to a lot of
- code duplication and may also lead to uncertainty about distributing
- certain duplicated pieces of code. The code duplication is pretty
- much unavoidable, since making textoverlay, dvbsuboverlay, dvdspu,
- kate, assrender, etc. available in form of base classes to derive
- from is not really an option. Similarly, one would not really want
- the vaapi/vdpau plugin to depend on a bunch of other libraries
- such as libpango, libkate, libtiger, libass, etc.
-
- One could add some new kind of overlay plugin feature though in
- combination with a generic base class of some sort, but in order
- to accommodate all the different cases and formats one would end
- up with quite convoluted/tricky API.
-
- (Of course there could also be a GstFancyVideoBuffer that provides
- an abstraction for such video accelerated objects and that could
- provide an API to add overlays to it in a generic way, but in the
- end this is just a less generic variant of (c), and it is not clear
- that there are real benefits to a specialised solution vs. a more
- generic one).
-
-
- (b) convert backend-specific object to raw pixels and then overlay
-
- Even where possible technically, this is most likely very
- inefficient.
-
-
- (c) attach the overlay data to the backend-specific video frame buffers
- in a generic way and do the actual overlaying/blitting later in
- backend-specific code such as the video sink (or an accelerated
- encoder/transcoder)
-
- In this case, the actual overlay rendering (i.e. the actual text
- rendering or decoding DVD/DVB data into pixels) is done in the
- subtitle-format-specific GStreamer plugin. All knowledge about
- the subtitle format is contained in the overlay plugin then,
- and all knowledge about the video backend in the video backend
- specific plugin.
-
- The main question then is how to get the overlay pixels (and
- we will only deal with pixels here) from the overlay element
- to the video sink.
-
- This could be done in multiple ways: One could send custom
- events downstream with the overlay data, or one could attach
- the overlay data directly to the video buffers in some way.
-
- Sending inline events has the advantage that it is fairly
- transparent to any elements between the overlay element and
- the video sink: if an effects plugin creates a new video
- buffer for the output, nothing special needs to be done to
- maintain the subtitle overlay information, since the overlay
- data is not attached to the buffer. However, it slightly
- complicates things at the sink, since it would also need to
- look for the new event in question instead of just processing
- everything in its buffer render function.
-
- If one attaches the overlay data to the buffer directly, any
- element between overlay and video sink that creates a new
- video buffer would need to be aware of the overlay data
- attached to it and copy it over to the newly-created buffer.
-
- One would have to implement a special kind of new query
- (e.g. FEATURE query) that is not passed on automatically by
- gst_pad_query_default() in order to make sure that all elements
- downstream will handle the attached overlay data. (This is only
- a problem if we want to also attach overlay data to raw video
- pixel buffers; for new non-raw types we can just make it
- mandatory and assume support and be done with it; for existing
- non-raw types nothing changes anyway if subtitles don't work)
- (we need to maintain backwards compatibility for existing raw
- video pipelines like e.g.: ..decoder ! suboverlay ! encoder..)
-
- Even though slightly more work, attaching the overlay information
- to buffers seems more intuitive than sending it interleaved as
- events. And buffers stored or passed around (e.g. via the
- "last-buffer" property in the sink when doing screenshots via
- playbin) always contain all the information needed.
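
For illustration, attaching such overlay data to a video buffer might look
roughly like this, assuming the GstVideoOverlayComposition API that later
landed in gst-plugins-base (the pixel-buffer setup is simplified here):

    GstVideoOverlayRectangle *rect;
    GstVideoOverlayComposition *comp;

    gst_buffer_add_video_meta (pixels, GST_VIDEO_FRAME_FLAG_NONE,
        GST_VIDEO_OVERLAY_COMPOSITION_FORMAT_RGB, width, height);
    rect = gst_video_overlay_rectangle_new_raw (pixels, x, y,
        render_width, render_height, GST_VIDEO_OVERLAY_FORMAT_FLAG_NONE);
    comp = gst_video_overlay_composition_new (rect);
    gst_video_overlay_rectangle_unref (rect);
    gst_buffer_add_video_overlay_composition_meta (video_buffer, comp);
    gst_video_overlay_composition_unref (comp);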
-
-
- (d) create a video/x-raw-*-delta format and use a backend-specific videomixer
-
- This possibility was hinted at already in the digression in
- section 1. It would satisfy the goal of keeping subtitle format
- knowledge in the subtitle plugins and video backend knowledge
- in the video backend plugin. It would also add a concept that
- might be generally useful (think ximagesrc capture with xdamage).
- However, it would require adding foorender variants of all the
- existing overlay elements, and changing playbin to that new
- design, which is somewhat intrusive. And given the general
- nature of such a new format/API, we would need to take a lot
- of care to be able to accommodate all possible use cases when
- designing the API, which makes it considerably more ambitious.
- Lastly, we would need to write videomixer variants for the
- various accelerated video backends as well.
-
-
-Overall (c) appears to be the most promising solution. It is the least
-intrusive and should be fairly straight-forward to implement with
-reasonable effort, requiring only small changes to existing elements
-and requiring no new elements.
-
-Doing the final overlaying in the sink as opposed to a videomixer
-or overlay in the middle of the pipeline has other advantages:
-
- - if video frames need to be dropped, e.g. for QoS reasons,
- we could also skip the actual subtitle overlaying and
- possibly the decoding/rendering as well, if the
- implementation and API allows for that to be delayed.
-
- - the sink often knows the actual size of the window/surface/screen
- the output video is rendered to. This *may* make it possible to
- render the overlay image in a higher resolution than the input
- video, solving a long standing issue with pixelated subtitles on
- top of low-resolution videos that are then scaled up in the sink.
- This would of course require the rendering to be delayed instead
- of just attaching an AYUV/ARGB/RGBA blob of pixels to the video buffer
- in the overlay, but that could all be supported.
-
- - if the video backend / sink has support for high-quality text
- rendering (clutter?) we could just pass the text or pango markup
- to the sink and let it do the rest (this is unlikely to be
- supported in the general case - text and glyph rendering is
- hard; also, we don't really want to make up our own text markup
- system, and pango markup is probably too limited for complex
- karaoke stuff).
-
-
- === 4. API needed ===
-
- (a) Representation of subtitle overlays to be rendered
-
- We need to pass the overlay pixels from the overlay element to the
- sink somehow. Whatever the exact mechanism, let's assume we pass
- a refcounted GstVideoOverlayComposition struct or object.
-
- A composition is made up of one or more overlays/rectangles.
-
- In the simplest case an overlay rectangle is just a blob of
- RGBA/ABGR [FIXME?] or AYUV pixels with positioning info and other
- metadata, and there is only one rectangle to render.
-
- We're keeping the naming generic ("OverlayFoo" rather than
- "SubtitleFoo") here, since this might also be handy for
- other use cases such as e.g. logo overlays or so. It is not
- designed for full-fledged video stream mixing though.
-
- // Note: don't mind the exact implementation details, they'll be hidden
-
- // FIXME: might be confusing in 0.11 though since GstXOverlay was
- // renamed to GstVideoOverlay in 0.11, but not much we can do,
- // maybe we can rename GstVideoOverlay to something better
-
- struct GstVideoOverlayComposition
- {
- guint num_rectangles;
- GstVideoOverlayRectangle ** rectangles;
-
- /* lowest rectangle sequence number still used by the upstream
- * overlay element. This way a renderer maintaining some kind of
- * rectangles <-> surface cache can know when to free cached
- * surfaces/rectangles. */
- guint min_seq_num_used;
-
- /* sequence number for the composition (same series as rectangles) */
- guint seq_num;
- }
-
- struct GstVideoOverlayRectangle
- {
- /* Position on video frame and dimension of output rectangle in
- * output frame terms (already adjusted for the PAR of the output
- * frame). x/y can be negative (overlay will be clipped then) */
- gint x, y;
- guint render_width, render_height;
-
- /* Dimensions of overlay pixels */
- guint width, height, stride;
-
- /* This is the PAR of the overlay pixels */
- guint par_n, par_d;
-
- /* Format of pixels, GST_VIDEO_FORMAT_ARGB on big-endian systems,
- * and BGRA on little-endian systems (i.e. pixels are treated as
- * 32-bit values and alpha is always in the most-significant byte,
- * and blue is in the least-significant byte).
- *
- * FIXME: does anyone actually use AYUV in practice? (we do
- * in our utility function to blend on top of raw video)
- * What about AYUV and endianness? Do we always have [A][Y][U][V]
- * in memory? */
- /* FIXME: maybe use our own enum? */
- GstVideoFormat format;
-
- /* Refcounted blob of memory, no caps or timestamps */
- GstBuffer *pixels;
-
- // FIXME: how to express source like text or pango markup?
- // (just add source type enum + source buffer with data)
- //
- // FOR 0.10: always send pixel blobs, but attach source data in
- // addition (reason: if downstream changes, we can't renegotiate
- // that properly, if we just do a query of supported formats from
- // the start). Sink will just ignore pixels and use pango markup
- // from source data if it supports that.
- //
- // FOR 0.11: overlay should query formats (pango markup, pixels)
- // supported by downstream and then only send that. We can
- // renegotiate via the reconfigure event.
- //
-
- /* sequence number: useful for backends/renderers/sinks that want
- * to maintain a cache of rectangles <-> surfaces. The value of
- * the min_seq_num_used in the composition tells the renderer which
- * rectangles have expired. */
- guint seq_num;
-
- /* FIXME: we also need a (private) way to cache converted/scaled
- * pixel blobs */
- }
-
- (a1) Overlay consumer API:
-
- How would this work in a video sink that supports scaling of textures:
-
- gst_foo_sink_render () {
- /* assume only one for now */
- if video_buffer has composition:
- composition = video_buffer.get_composition()
-
- for each rectangle in composition:
- if rectangle.source_data_type == PANGO_MARKUP
- actor = text_from_pango_markup (rectangle.get_source_data())
- else
- pixels = rectangle.get_pixels_unscaled (FORMAT_RGBA, ...)
- actor = texture_from_rgba (pixels, ...)
-
- .. position + scale on top of video surface ...
- }
-
- (a2) Overlay producer API:
-
- e.g. logo or subpicture overlay: got pixels, stuff into rectangle:
-
- if (logoverlay->cached_composition == NULL) {
- comp = composition_new ();
-
- rect = rectangle_new (format, pixels_buf,
- width, height, stride, par_n, par_d,
- x, y, render_width, render_height);
-
- /* composition adds its own ref for the rectangle */
- composition_add_rectangle (comp, rect);
- rectangle_unref (rect);
-
- /* buffer adds its own ref for the composition */
- video_buffer_attach_composition (comp);
-
- /* we take ownership of the composition and save it for later */
- logoverlay->cached_composition = comp;
- } else {
- video_buffer_attach_composition (logoverlay->cached_composition);
- }
-
- FIXME: also add some API to modify render position/dimensions of
- a rectangle (probably requires creation of new rectangle, unless
- we handle writability like with other mini objects).
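-
-     Note: purely for illustration, this is roughly what the producer side
-     looks like with the GstVideoOverlayComposition API that later landed
-     in libgstvideo (GStreamer 1.x names; x/y, width/height,
-     render_width/render_height, pixels and video_buf are placeholders,
-     not part of this draft):
-
-       GstVideoOverlayRectangle *rect;
-       GstVideoOverlayComposition *comp;
-
-       /* the 1.x API wants a GstVideoMeta on the pixel buffer */
-       gst_buffer_add_video_meta (pixels, GST_VIDEO_FRAME_FLAG_NONE,
-           GST_VIDEO_OVERLAY_COMPOSITION_FORMAT_RGB, width, height);
-
-       rect = gst_video_overlay_rectangle_new_raw (pixels, x, y,
-           render_width, render_height, GST_VIDEO_OVERLAY_FORMAT_FLAG_NONE);
-
-       comp = gst_video_overlay_composition_new (rect);
-       gst_video_overlay_rectangle_unref (rect);
-
-       /* attached as a meta rather than a new GstBuffer field */
-       gst_buffer_add_video_overlay_composition_meta (video_buf, comp);
-       gst_video_overlay_composition_unref (comp);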
-
- (b) Fallback overlay rendering/blitting on top of raw video
-
- Eventually we want to use this overlay mechanism not only for
- hardware-accelerated video, but also for plain old raw video,
- either at the sink or in the overlay element directly.
-
- Apart from the advantages listed earlier in section 3, this
- allows us to consolidate a lot of overlaying/blitting code that
- is currently repeated in every single overlay element in one
- location. This makes it considerably easier to support a whole
- range of raw video formats out of the box, add SIMD-optimised
- rendering using ORC, or handle corner cases correctly.
-
- (Note: side-effect of overlaying raw video at the video sink is
-   that if e.g. a screenshotter gets the last buffer via the last-buffer
- property of basesink, it would get an image without the subtitles
- on top. This could probably be fixed by re-implementing the
- property in GstVideoSink though. Playbin2 could handle this
- internally as well).
-
- void
-   gst_video_overlay_composition_blend (GstVideoOverlayComposition * comp,
-                                        GstBuffer * video_buf)
- {
- guint n;
-
- g_return_if_fail (gst_buffer_is_writable (video_buf));
- g_return_if_fail (GST_BUFFER_CAPS (video_buf) != NULL);
-
- ... parse video_buffer caps into BlendVideoFormatInfo ...
-
- for each rectangle in the composition: {
-
- if (gst_video_format_is_yuv (video_buf_format)) {
- overlay_format = FORMAT_AYUV;
- } else if (gst_video_format_is_rgb (video_buf_format)) {
- overlay_format = FORMAT_ARGB;
- } else {
- /* FIXME: grayscale? */
- return;
- }
-
- /* this will scale and convert AYUV<->ARGB if needed */
- pixels = rectangle_get_pixels_scaled (rectangle, overlay_format);
-
- ... clip output rectangle ...
-
- __do_blend (video_buf_format, video_buf->data,
- overlay_format, pixels->data,
- x, y, width, height, stride);
-
- gst_buffer_unref (pixels);
- }
- }
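-
-   For illustration, the per-pixel work hidden behind __do_blend could be
-   a plain "over" composite. A minimal sketch for ARGB source pixels
-   blended onto an opaque video frame (assumes non-premultiplied alpha;
-   the function name is made up here, the same byte layout also works for
-   AYUV since alpha is in byte 0 in both cases):
-
-     static inline void
-     blend_pixel_argb (guint8 * dest, const guint8 * src)
-     {
-       guint a = src[0];          /* alpha is in byte 0 for ARGB/AYUV */
-
-       /* the video frame is opaque, so only colour channels need blending */
-       dest[1] = (src[1] * a + dest[1] * (255 - a)) / 255;  /* R or Y */
-       dest[2] = (src[2] * a + dest[2] * (255 - a)) / 255;  /* G or U */
-       dest[3] = (src[3] * a + dest[3] * (255 - a)) / 255;  /* B or V */
-     }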
-
-
- (c) Flatten all rectangles in a composition
-
- We cannot assume that the video backend API can handle any
- number of rectangle overlays, it's possible that it only
- supports one single overlay, in which case we need to squash
- all rectangles into one.
-
- However, we'll just declare this a corner case for now, and
- implement it only if someone actually needs it. It's easy
- to add later API-wise. Might be a bit tricky if we have
- rectangles with different PARs/formats (e.g. subs and a logo),
- though we could probably always just use the code from (b)
- with a fully transparent video buffer to create a flattened
- overlay buffer.
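-
-   In pseudocode this could reuse the blend helper from (b): render the
-   whole composition onto a fully transparent ARGB buffer of the video
-   size and wrap the result in a new single-rectangle composition
-   (the helper names below are made up):
-
-     flat_buf  = create_transparent_argb_buffer (video_width, video_height);
-     gst_video_overlay_composition_blend (comp, flat_buf);
-     flat_rect = rectangle_new (FORMAT_ARGB, flat_buf,
-                                video_width, video_height, stride, 1, 1,
-                                0, 0, video_width, video_height);
-     flat_comp = composition_new ();
-     composition_add_rectangle (flat_comp, flat_rect);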
-
- (d) core API: new FEATURE query
-
- For 0.10 we need to add a FEATURE query, so the overlay element
- can query whether the sink downstream and all elements between
- the overlay element and the sink support the new overlay API.
- Elements in between need to support it because the render
- positions and dimensions need to be updated if the video is
- cropped or rescaled, for example.
-
- In order to ensure that all elements support the new API,
- we need to drop the query in the pad default query handler
- (so it only succeeds if all elements handle it explicitly).
-
- Might want two variants of the feature query - one where
- all elements in the chain need to support it explicitly
- and one where it's enough if some element downstream
- supports it.
-
- In 0.11 this could probably be handled via GstMeta and
- ALLOCATION queries (and/or we could simply require
- elements to be aware of this API from the start).
-
- There appears to be no issue with downstream possibly
- not being linked yet at the time when an overlay would
- want to do such a query.
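-
-   A rough sketch of the overlay element's side of such a query (assuming
-   1.x-style GstQuery API; the feature/structure name is made up and not
-   settled API):
-
-     GstStructure *s;
-     GstQuery *query;
-     gboolean have_support;
-
-     s = gst_structure_new_empty ("GstOverlayCompositionFeature");
-     query = gst_query_new_custom (GST_QUERY_CUSTOM, s);
-
-     /* only returns TRUE if every element downstream handled the query
-      * explicitly, since the default handler would drop it */
-     have_support = gst_pad_peer_query (overlay->srcpad, query);
-     gst_query_unref (query);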
-
-
-Other considerations:
-
- - renderers (overlays or sinks) may be able to handle only ARGB or only AYUV
- (for most graphics/hw-API it's likely ARGB of some sort, while our
- blending utility functions will likely want the same colour space as
- the underlying raw video format, which is usually YUV of some sort).
- We need to convert where required, and should cache the conversion.
-
- - renderers may or may not be able to scale the overlay. We need to
- do the scaling internally if not (simple case: just horizontal scaling
- to adjust for PAR differences; complex case: both horizontal and vertical
- scaling, e.g. if subs come from a different source than the video or the
- video has been rescaled or cropped between overlay element and sink).
-
- - renderers may be able to generate (possibly scaled) pixels on demand
- from the original data (e.g. a string or RLE-encoded data). We will
- ignore this for now, since this functionality can still be added later
- via API additions. The most interesting case would be to pass a pango
- markup string, since e.g. clutter can handle that natively.
-
- - renderers may be able to write data directly on top of the video pixels
- (instead of creating an intermediary buffer with the overlay which is
- then blended on top of the actual video frame), e.g. dvdspu, dvbsuboverlay
-
- However, in the interest of simplicity, we should probably ignore the
- fact that some elements can blend their overlays directly on top of the
- video (decoding/uncompressing them on the fly), even more so as it's
- not obvious that it's actually faster to decode the same overlay
-   70-90 times (say) (i.e. ca. 3 seconds of video frames) and then blend
- it 70-90 times instead of decoding it once into a temporary buffer
- and then blending it directly from there, possibly SIMD-accelerated.
- Also, this is only relevant if the video is raw video and not some
- hardware-acceleration backend object.
-
- And ultimately it is the overlay element that decides whether to do
- the overlay right there and then or have the sink do it (if supported).
- It could decide to keep doing the overlay itself for raw video and
- only use our new API for non-raw video.
-
- - renderers may want to make sure they only upload the overlay pixels once
- per rectangle if that rectangle recurs in subsequent frames (as part of
- the same composition or a different composition), as is likely. This caching
- of e.g. surfaces needs to be done renderer-side and can be accomplished
- based on the sequence numbers. The composition contains the lowest
- sequence number still in use upstream (an overlay element may want to
- cache created compositions+rectangles as well after all to re-use them
- for multiple frames), based on that the renderer can expire cached
- objects. The caching needs to be done renderer-side because attaching
- renderer-specific objects to the rectangles won't work well given the
- refcounted nature of rectangles and compositions, making it unpredictable
- when a rectangle or composition will be freed or from which thread
- context it will be freed. The renderer-specific objects are likely bound
- to other types of renderer-specific contexts, and need to be managed
- in connection with those.
-
- - composition/rectangles should internally provide a certain degree of
- thread-safety. Multiple elements (sinks, overlay element) might access
- or use the same objects from multiple threads at the same time, and it
- is expected that elements will keep a ref to compositions and rectangles
- they push downstream for a while, e.g. until the current subtitle
- composition expires.
-
- === 5. Future considerations ===
-
- - alternatives: there may be multiple versions/variants of the same subtitle
- stream. On DVDs, there may be a 4:3 version and a 16:9 version of the same
- subtitles. We could attach both variants and let the renderer pick the best
- one for the situation (currently we just use the 16:9 version). With totem,
- it's ultimately totem that adds the 'black bars' at the top/bottom, so totem
- also knows if it's got a 4:3 display and can/wants to fit 4:3 subs (which
- may render on top of the bars) or not, for example.
-
- === 6. Misc. FIXMEs ===
-
-TEST: should these look (roughly) alike (note text distortion) - needs fixing in textoverlay
-
-gst-launch-0.10 \
- videotestsrc ! video/x-raw,width=640,height=480,pixel-aspect-ratio=1/1 ! textoverlay text=Hello font-desc=72 ! xvimagesink \
- videotestsrc ! video/x-raw,width=320,height=480,pixel-aspect-ratio=2/1 ! textoverlay text=Hello font-desc=72 ! xvimagesink \
- videotestsrc ! video/x-raw,width=640,height=240,pixel-aspect-ratio=1/2 ! textoverlay text=Hello font-desc=72 ! xvimagesink
-
- ~~~ THE END ~~~
-
diff --git a/docs/design/part-interlaced-video.txt b/docs/design/part-interlaced-video.txt
deleted file mode 100644
index 4ac678e95..000000000
--- a/docs/design/part-interlaced-video.txt
+++ /dev/null
@@ -1,107 +0,0 @@
-Interlaced Video
-================
-
-Video buffers have a number of states identifiable through a combination of caps
-and buffer flags.
-
-Possible states:
-- Progressive
-- Interlaced
- - Plain
- - One field
- - Two fields
- - Three fields - this should be a progressive buffer with a repeated 'first'
- field that can be used for telecine pulldown
- - Telecine
- - One field
- - Two fields
- - Progressive
- - Interlaced (a.k.a. 'mixed'; the fields are from different frames)
- - Three fields - this should be a progressive buffer with a repeated 'first'
- field that can be used for telecine pulldown
-
-Note: It can be seen that the difference between the plain interlaced and
-telecine states is that in the telecine state, buffers containing two fields may
-be progressive.
-
-Tools for identification:
-- GstVideoInfo
- - GstVideoInterlaceMode - enum - GST_VIDEO_INTERLACE_MODE_...
- - PROGRESSIVE
- - INTERLEAVED
- - MIXED
-- Buffers flags - GST_VIDEO_BUFFER_FLAG_...
- - TFF
- - RFF
- - ONEFIELD
- - INTERLACED
-
-
-Identification of Buffer States
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-Note that the flags are not necessarily interpreted in the same way for all
-states, nor are they all required or meaningful in every case.
-
-
-Progressive
-...........
-
-If the interlace mode in the video info corresponding to a buffer is
-"progressive", then the buffer is progressive.
-
-
-Plain Interlaced
-................
-
-If the video info interlace mode is "interleaved", then the buffer is plain
-interlaced.
-
-GST_VIDEO_BUFFER_FLAG_TFF indicates whether the top or bottom field is to be
-displayed first. The timestamp on the buffer corresponds to the first field.
-
-GST_VIDEO_BUFFER_FLAG_RFF indicates that the first field (indicated by the TFF flag)
-should be repeated. This is generally only used for telecine purposes but as the
-telecine state was added long after the interlaced state was added and defined,
-this flag remains valid for plain interlaced buffers.
-
-GST_VIDEO_BUFFER_FLAG_ONEFIELD means that only the field indicated through the TFF
-flag is to be used. The other field should be ignored.
-
-
-Telecine
-........
-
-If video info interlace mode is "mixed" then the buffers are in some form of
-telecine state.
-
-The TFF and ONEFIELD flags have the same semantics as for the plain interlaced
-state.
-
-GST_VIDEO_BUFFER_FLAG_RFF in the telecine state indicates that the buffer contains
-only repeated fields that are present in other buffers and are as such
-unneeded. For example, in a sequence of three telecined frames, we might have:
-
-AtAb AtBb BtBb
-
-In this situation, we only need the first and third buffers as the second
-buffer contains fields present in the first and third.
-
-Note that the following state can have its second buffer identified using the
-ONEFIELD flag (and TFF not set):
-
-AtAb AtBb BtCb
-
-The telecine state requires one additional flag to be able to identify
-progressive buffers.
-
-The presence of the GST_VIDEO_BUFFER_FLAG_INTERLACED means that the buffer is an
-'interlaced' or 'mixed' buffer that contains two fields that, when combined
-with fields from adjacent buffers, allow reconstruction of progressive frames.
-The absence of the flag implies the buffer containing two fields is a
-progressive frame.
-
-For example in the following sequence, the third buffer would be mixed (yes, it
-is a strange pattern, but it can happen):
-
-AtAb AtBb BtCb CtDb DtDb
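-
-A rough sketch of how an element might classify incoming buffers with the
-GstVideoInfo / buffer-flag API listed above (1.x names; "info" is the
-GstVideoInfo parsed from the caps, "buf" the current buffer):
-
-  if (GST_VIDEO_INFO_INTERLACE_MODE (&info) ==
-      GST_VIDEO_INTERLACE_MODE_PROGRESSIVE) {
-    /* progressive: nothing to do */
-  } else if (GST_VIDEO_INFO_INTERLACE_MODE (&info) ==
-      GST_VIDEO_INTERLACE_MODE_MIXED &&
-      !GST_BUFFER_FLAG_IS_SET (buf, GST_VIDEO_BUFFER_FLAG_INTERLACED)) {
-    /* telecine buffer whose two fields already form a progressive frame */
-  } else {
-    gboolean tff = GST_BUFFER_FLAG_IS_SET (buf, GST_VIDEO_BUFFER_FLAG_TFF);
-    gboolean onefield =
-        GST_BUFFER_FLAG_IS_SET (buf, GST_VIDEO_BUFFER_FLAG_ONEFIELD);
-
-    /* interlaced content: deinterlace using tff/onefield/rff as above */
-  }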
diff --git a/docs/design/part-mediatype-audio-raw.txt b/docs/design/part-mediatype-audio-raw.txt
deleted file mode 100644
index 503ef63db..000000000
--- a/docs/design/part-mediatype-audio-raw.txt
+++ /dev/null
@@ -1,76 +0,0 @@
-Media Types
------------
-
- audio/x-raw
-
- format, G_TYPE_STRING, mandatory
- The format of the audio samples, see the Formats section for a list
- of valid sample formats.
-
- rate, G_TYPE_INT, mandatory
- The samplerate of the audio
-
- channels, G_TYPE_INT, mandatory
- The number of channels
-
- channel-mask, GST_TYPE_BITMASK, mandatory for more than 2 channels
- Bitmask of channel positions present. May be omitted for mono and
- stereo. May be set to 0 to denote that the channels are unpositioned.
-
- layout, G_TYPE_STRING, mandatory
- The layout of channels within a buffer. Possible values are
- "interleaved" (for LRLRLRLR) and "non-interleaved" (LLLLRRRR)
-
-Use GstAudioInfo and related helper API to create and parse raw audio caps.
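-
-For example (a minimal sketch using the 1.x helper API):
-
-  GstAudioInfo info;
-  GstCaps *caps;
-
-  gst_audio_info_init (&info);
-  gst_audio_info_set_format (&info, GST_AUDIO_FORMAT_S16LE, 44100, 2, NULL);
-  caps = gst_audio_info_to_caps (&info);
-  /* -> audio/x-raw, format=S16LE, rate=44100, channels=2,
-   *    layout=interleaved, ... */
-
-  /* ... and the reverse when handling a caps event */
-  if (!gst_audio_info_from_caps (&info, caps))
-    /* not valid/complete audio/x-raw caps */;
-  gst_caps_unref (caps);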
-
-
-Metadata
---------
-
- "GstAudioDownmixMeta"
-    A matrix for downmixing multichannel audio to a lower number of channels.
-
-
-Formats
--------
-
- The following values can be used for the format string property.
-
- "S8" 8-bit signed PCM audio
- "U8" 8-bit unsigned PCM audio
-
- "S16LE" 16-bit signed PCM audio
- "S16BE" 16-bit signed PCM audio
- "U16LE" 16-bit unsigned PCM audio
- "U16BE" 16-bit unsigned PCM audio
-
- "S24_32LE" 24-bit signed PCM audio packed into 32-bit
- "S24_32BE" 24-bit signed PCM audio packed into 32-bit
- "U24_32LE" 24-bit unsigned PCM audio packed into 32-bit
- "U24_32BE" 24-bit unsigned PCM audio packed into 32-bit
-
- "S32LE" 32-bit signed PCM audio
- "S32BE" 32-bit signed PCM audio
- "U32LE" 32-bit unsigned PCM audio
- "U32BE" 32-bit unsigned PCM audio
-
- "S24LE" 24-bit signed PCM audio
- "S24BE" 24-bit signed PCM audio
- "U24LE" 24-bit unsigned PCM audio
- "U24BE" 24-bit unsigned PCM audio
-
- "S20LE" 20-bit signed PCM audio
- "S20BE" 20-bit signed PCM audio
- "U20LE" 20-bit unsigned PCM audio
- "U20BE" 20-bit unsigned PCM audio
-
- "S18LE" 18-bit signed PCM audio
- "S18BE" 18-bit signed PCM audio
- "U18LE" 18-bit unsigned PCM audio
- "U18BE" 18-bit unsigned PCM audio
-
- "F32LE" 32-bit floating-point audio
- "F32BE" 32-bit floating-point audio
- "F64LE" 64-bit floating-point audio
- "F64BE" 64-bit floating-point audio
-
diff --git a/docs/design/part-mediatype-text-raw.txt b/docs/design/part-mediatype-text-raw.txt
deleted file mode 100644
index 82fbdd52e..000000000
--- a/docs/design/part-mediatype-text-raw.txt
+++ /dev/null
@@ -1,28 +0,0 @@
-Media Types
------------
-
- text/x-raw
-
- format, G_TYPE_STRING, mandatory
- The format of the text, see the Formats section for a list of valid format
- strings.
-
-Metadata
---------
-
- There are no common metas for this raw format yet.
-
-Formats
--------
-
- "utf8" plain timed utf8 text (formerly text/plain)
-
- Parsed timed text in utf8 format.
-
- "pango-markup" plain timed utf8 text with pango markup (formerly text/x-pango-markup)
-
- Same as "utf8", but text embedded in an XML-style markup language for
- size, colour, emphasis, etc.
-
- See http://developer.gnome.org/pango/stable/PangoMarkupFormat.html
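-
-  As an illustration, a "pango-markup" buffer might contain something like:
-
-    <span foreground="white" size="x-large">Hello <i>subtitle</i> world</span>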
-
diff --git a/docs/design/part-mediatype-video-raw.txt b/docs/design/part-mediatype-video-raw.txt
deleted file mode 100644
index c7fc837c7..000000000
--- a/docs/design/part-mediatype-video-raw.txt
+++ /dev/null
@@ -1,1258 +0,0 @@
-Media Types
------------
-
- video/x-raw
-
- width, G_TYPE_INT, mandatory
- The width of the image in pixels.
-
- height, G_TYPE_INT, mandatory
- The height of the image in pixels
-
- framerate, GST_TYPE_FRACTION, default 0/1
-    The framerate of the video; 0/1 for variable framerate
-
- max-framerate, GST_TYPE_FRACTION, default as framerate
- For variable framerates this would be the maximum framerate that
- is expected. This value is only valid when the framerate is 0/1
-
- views, G_TYPE_INT, default 1
-    The number of views for multiview video. Each buffer contains
-    multiple GstVideoMeta metas that describe each view. Use the frame id to
-    get access to the different views.
-
- interlace-mode, G_TYPE_STRING, default progressive
- The interlace mode. The following values are possible:
-
- "progressive" : all frames are progressive
- "interleaved" : 2 fields are interleaved in one video frame. Extra buffer
- flags describe the field order.
- "mixed" : progressive and interleaved frames, extra buffer flags describe
- the frame and fields.
- "fields" : 2 fields are stored in one buffer, use the frame ID
- to get access to the required field. For multiview (the
- 'views' property > 1) the fields of view N can be found at
- frame ID (N * 2) and (N * 2) + 1.
- Each view has only half the amount of lines as noted in the
- height property, pads specifying the "fields" property
- must be prepared for this. This mode requires multiple
- GstVideoMeta metadata to describe the fields.
-
- chroma-site, G_TYPE_STRING, default UNKNOWN
- The chroma siting of the video frames.
-
- "jpeg" : GST_VIDEO_CHROMA_SITE_JPEG
- "mpeg2": GST_VIDEO_CHROMA_SITE_MPEG2
- "dv" : GST_VIDEO_CHROMA_SITE_DV
-
- colorimetry, G_TYPE_STRING, default UNKNOWN
-    The colorimetry of the video frames. Predefined colorimetry is given with
-    the following values:
-
- "bt601"
- "bt709"
- "smpte240m"
-
- pixel-aspect-ratio, GST_TYPE_FRACTION, default 1/1
-    The pixel aspect ratio of the video
-
- format, G_TYPE_STRING, mandatory
- The format of the video, see the Formats section for a list of valid format
- strings.
-
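-As with raw audio, GstVideoInfo and its helper API are meant to be used to
-create and parse these caps (a minimal sketch, 1.x API):
-
-  GstVideoInfo info;
-  GstCaps *caps;
-
-  gst_video_info_init (&info);
-  gst_video_info_set_format (&info, GST_VIDEO_FORMAT_I420, 1280, 720);
-  info.fps_n = 30;
-  info.fps_d = 1;
-  caps = gst_video_info_to_caps (&info);
-  /* -> video/x-raw, format=I420, width=1280, height=720,
-   *    framerate=30/1, ... */
-  gst_caps_unref (caps);
-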
-Metadata
---------
-
- "GstVideoMeta"
- contains the description of one video field or frame. It has
- stride support and support for having multiple memory regions per frame.
-
- Multiple GstVideoMeta can be added to a buffer and can be identified with a
- unique id. This id can be used to select fields in interlaced formats or
- views in multiview formats.
-
- "GstVideoCropMeta"
- Contains the cropping region of the video.
-
-
-Formats
--------
-
- "I420" planar 4:2:0 YUV
-
- Component 0: Y
- depth: 8
- pstride: 1
- default offset: 0
- default rstride: RU4 (width)
- default size: rstride (component0) * RU2 (height)
-
- Component 1: U
- depth: 8
- pstride: 1
- default offset: size (component0)
- default rstride: RU4 (RU2 (width) / 2)
- default size: rstride (component1) * RU2 (height) / 2
-
- Component 2: V
- depth 8
- pstride: 1
- default offset: offset (component1) + size (component1)
- default rstride: RU4 (RU2 (width) / 2)
- default size: rstride (component2) * RU2 (height) / 2
-
- Image
- default size: size (component0) +
- size (component1) +
- size (component2)
-
- "YV12" planar 4:2:0 YUV
-
- Same as I420 but with U and V planes swapped
-
- Component 0: Y
- depth: 8
- pstride: 1
- default offset: 0
- default rstride: RU4 (width)
- default size: rstride (component0) * RU2 (height)
-
- Component 1: U
- depth 8
- pstride: 1
- default offset: offset (component2) + size (component2)
- default rstride: RU4 (RU2 (width) / 2)
- default size: rstride (component1) * RU2 (height) / 2
-
- Component 2: V
- depth: 8
- pstride: 1
- default offset: size (component0)
- default rstride: RU4 (RU2 (width) / 2)
- default size: rstride (component2) * RU2 (height) / 2
-
- Image
- default size: size (component0) +
- size (component1) +
- size (component2)
-
- "YUY2" packed 4:2:2 YUV
-
- +--+--+--+--+ +--+--+--+--+
- |Y0|U0|Y1|V0| |Y2|U2|Y3|V2| ...
- +--+--+--+--+ +--+--+--+--+
-
- Component 0: Y
- depth: 8
- pstride: 2
- offset: 0
-
- Component 1: U
- depth: 8
- offset: 1
- pstride: 4
-
- Component 2: V
- depth 8
- offset: 3
- pstride: 4
-
- Image
- default rstride: RU4 (width * 2)
- default size: rstride (image) * height
-
-
- "YVYU" packed 4:2:2 YUV
-
- Same as "YUY2" but with U and V planes swapped
-
- +--+--+--+--+ +--+--+--+--+
- |Y0|V0|Y1|U0| |Y2|V2|Y3|U2| ...
- +--+--+--+--+ +--+--+--+--+
-
- Component 0: Y
- depth: 8
- pstride: 2
- offset: 0
-
- Component 1: U
- depth: 8
- pstride: 4
- offset: 3
-
- Component 2: V
- depth 8
- pstride: 4
- offset: 1
-
- Image
- default rstride: RU4 (width * 2)
- default size: rstride (image) * height
-
-
- "UYVY" packed 4:2:2 YUV
-
- +--+--+--+--+ +--+--+--+--+
- |U0|Y0|V0|Y1| |U2|Y2|V2|Y3| ...
- +--+--+--+--+ +--+--+--+--+
-
- Component 0: Y
- depth: 8
- pstride: 2
- offset: 1
-
- Component 1: U
- depth: 8
- pstride: 4
- offset: 0
-
- Component 2: V
- depth 8
- pstride: 4
- offset: 2
-
- Image
- default rstride: RU4 (width * 2)
- default size: rstride (image) * height
-
-
- "AYUV" packed 4:4:4 YUV with alpha channel
-
- +--+--+--+--+ +--+--+--+--+
- |A0|Y0|U0|V0| |A1|Y1|U1|V1| ...
- +--+--+--+--+ +--+--+--+--+
-
- Component 0: Y
- depth: 8
- pstride: 4
- offset: 1
-
- Component 1: U
- depth: 8
- pstride: 4
- offset: 2
-
- Component 2: V
- depth 8
- pstride: 4
- offset: 3
-
- Component 3: A
- depth 8
- pstride: 4
- offset: 0
-
- Image
- default rstride: width * 4
- default size: rstride (image) * height
-
-
- "RGBx" sparse rgb packed into 32 bit, space last
-
- +--+--+--+--+ +--+--+--+--+
- |R0|G0|B0|X | |R1|G1|B1|X | ...
- +--+--+--+--+ +--+--+--+--+
-
- Component 0: R
- depth: 8
- pstride: 4
- offset: 0
-
- Component 1: G
- depth: 8
- pstride: 4
- offset: 1
-
- Component 2: B
- depth 8
- pstride: 4
- offset: 2
-
- Image
- default rstride: width * 4
- default size: rstride (image) * height
-
- "BGRx" sparse reverse rgb packed into 32 bit, space last
-
- +--+--+--+--+ +--+--+--+--+
- |B0|G0|R0|X | |B1|G1|R1|X | ...
- +--+--+--+--+ +--+--+--+--+
-
- Component 0: R
- depth: 8
- pstride: 4
- offset: 2
-
- Component 1: G
- depth: 8
- pstride: 4
- offset: 1
-
- Component 2: B
- depth 8
- pstride: 4
- offset: 0
-
- Image
- default rstride: width * 4
- default size: rstride (image) * height
-
- "xRGB" sparse rgb packed into 32 bit, space first
-
- +--+--+--+--+ +--+--+--+--+
- |X |R0|G0|B0| |X |R1|G1|B1| ...
- +--+--+--+--+ +--+--+--+--+
-
- Component 0: R
- depth: 8
- pstride: 4
- offset: 1
-
- Component 1: G
- depth: 8
- pstride: 4
- offset: 2
-
- Component 2: B
- depth 8
- pstride: 4
- offset: 3
-
- Image
- default rstride: width * 4
- default size: rstride (image) * height
-
- "xBGR" sparse reverse rgb packed into 32 bit, space first
-
- +--+--+--+--+ +--+--+--+--+
- |X |B0|G0|R0| |X |B1|G1|R1| ...
- +--+--+--+--+ +--+--+--+--+
-
- Component 0: R
- depth: 8
- pstride: 4
- offset: 3
-
- Component 1: G
- depth: 8
- pstride: 4
- offset: 2
-
- Component 2: B
- depth 8
- pstride: 4
- offset: 1
-
- Image
- default rstride: width * 4
- default size: rstride (image) * height
-
- "RGBA" rgb with alpha channel last
-
- +--+--+--+--+ +--+--+--+--+
- |R0|G0|B0|A0| |R1|G1|B1|A1| ...
- +--+--+--+--+ +--+--+--+--+
-
- Component 0: R
- depth: 8
- pstride: 4
- offset: 0
-
- Component 1: G
- depth: 8
- pstride: 4
- offset: 1
-
- Component 2: B
- depth 8
- pstride: 4
- offset: 2
-
- Component 3: A
- depth 8
- pstride: 4
- offset: 3
-
- Image
- default rstride: width * 4
- default size: rstride (image) * height
-
- "BGRA" reverse rgb with alpha channel last
-
- +--+--+--+--+ +--+--+--+--+
- |B0|G0|R0|A0| |B1|G1|R1|A1| ...
- +--+--+--+--+ +--+--+--+--+
-
- Component 0: R
- depth: 8
- pstride: 4
- offset: 2
-
- Component 1: G
- depth: 8
- pstride: 4
- offset: 1
-
- Component 2: B
- depth 8
- pstride: 4
- offset: 0
-
- Component 3: A
- depth 8
- pstride: 4
- offset: 3
-
- Image
- default rstride: width * 4
- default size: rstride (image) * height
-
- "ARGB" rgb with alpha channel first
-
- +--+--+--+--+ +--+--+--+--+
- |A0|R0|G0|B0| |A1|R1|G1|B1| ...
- +--+--+--+--+ +--+--+--+--+
-
- Component 0: R
- depth: 8
- pstride: 4
- offset: 1
-
- Component 1: G
- depth: 8
- pstride: 4
- offset: 2
-
- Component 2: B
- depth 8
- pstride: 4
- offset: 3
-
- Component 3: A
- depth 8
- pstride: 4
- offset: 0
-
- Image
- default rstride: width * 4
- default size: rstride (image) * height
-
- "ABGR" reverse rgb with alpha channel first
-
-   +--+--+--+--+ +--+--+--+--+
-   |A0|B0|G0|R0| |A1|B1|G1|R1| ...
-   +--+--+--+--+ +--+--+--+--+
-
-  Component 0: R
-    depth: 8
-    pstride: 4
-    offset: 3
-
-  Component 1: G
-    depth: 8
-    pstride: 4
-    offset: 2
-
-  Component 2: B
-    depth: 8
-    pstride: 4
-    offset: 1
-
-  Component 3: A
-    depth: 8
-    pstride: 4
-    offset: 0
-
- Image
- default rstride: width * 4
- default size: rstride (image) * height
-
- "RGB" rgb
-
- +--+--+--+ +--+--+--+
- |R0|G0|B0| |R1|G1|B1| ...
- +--+--+--+ +--+--+--+
-
- Component 0: R
- depth: 8
- pstride: 3
- offset: 0
-
- Component 1: G
- depth: 8
- pstride: 3
- offset: 1
-
- Component 2: B
- depth 8
- pstride: 3
- offset: 2
-
- Image
- default rstride: RU4 (width * 3)
- default size: rstride (image) * height
-
- "BGR" reverse rgb
-
- +--+--+--+ +--+--+--+
- |B0|G0|R0| |B1|G1|R1| ...
- +--+--+--+ +--+--+--+
-
- Component 0: R
- depth: 8
- pstride: 3
- offset: 2
-
- Component 1: G
- depth: 8
- pstride: 3
- offset: 1
-
- Component 2: B
- depth 8
- pstride: 3
- offset: 0
-
- Image
- default rstride: RU4 (width * 3)
- default size: rstride (image) * height
-
- "Y41B" planar 4:1:1 YUV
-
- Component 0: Y
- depth: 8
- pstride: 1
- default offset: 0
- default rstride: RU4 (width)
- default size: rstride (component0) * height
-
- Component 1: U
- depth 8
- pstride: 1
- default offset: size (component0)
- default rstride: RU16 (width) / 4
- default size: rstride (component1) * height
-
- Component 2: V
- depth: 8
- pstride: 1
- default offset: offset (component1) + size (component1)
- default rstride: RU16 (width) / 4
- default size: rstride (component2) * height
-
- Image
- default size: size (component0) +
- size (component1) +
- size (component2)
-
- "Y42B" planar 4:2:2 YUV
-
- Component 0: Y
- depth: 8
- pstride: 1
- default offset: 0
- default rstride: RU4 (width)
- default size: rstride (component0) * height
-
- Component 1: U
- depth 8
- pstride: 1
- default offset: size (component0)
- default rstride: RU8 (width) / 2
- default size: rstride (component1) * height
-
- Component 2: V
- depth: 8
- pstride: 1
- default offset: offset (component1) + size (component1)
- default rstride: RU8 (width) / 2
- default size: rstride (component2) * height
-
- Image
- default size: size (component0) +
- size (component1) +
- size (component2)
-
- "Y444" planar 4:4:4 YUV
-
- Component 0: Y
- depth: 8
- pstride: 1
- default offset: 0
- default rstride: RU4 (width)
- default size: rstride (component0) * height
-
- Component 1: U
- depth 8
- pstride: 1
- default offset: size (component0)
- default rstride: RU4 (width)
- default size: rstride (component1) * height
-
- Component 2: V
- depth: 8
- pstride: 1
- default offset: offset (component1) + size (component1)
- default rstride: RU4 (width)
- default size: rstride (component2) * height
-
- Image
- default size: size (component0) +
- size (component1) +
- size (component2)
-
- "v210" packed 4:2:2 10-bit YUV, complex format
-
- Component 0: Y
- depth: 10
-
- Component 1: U
- depth 10
-
- Component 2: V
- depth: 10
-
- Image
- default rstride: RU48 (width) * 128
- default size: rstride (image) * height
-
-
- "v216" packed 4:2:2 16-bit YUV, Y0-U0-Y1-V1 order
-
- +--+--+--+--+ +--+--+--+--+
- |U0|Y0|V0|Y1| |U1|Y2|V1|Y3| ...
- +--+--+--+--+ +--+--+--+--+
-
- Component 0: Y
- depth: 16 LE
- pstride: 4
- offset: 2
-
- Component 1: U
- depth 16 LE
- pstride: 8
- offset: 0
-
- Component 2: V
- depth: 16 LE
- pstride: 8
- offset: 4
-
- Image
- default rstride: RU8 (width * 2)
- default size: rstride (image) * height
-
- "NV12" planar 4:2:0 YUV with interleaved UV plane
-
- Component 0: Y
- depth: 8
- pstride: 1
- default offset: 0
- default rstride: RU4 (width)
- default size: rstride (component0) * RU2 (height)
-
- Component 1: U
- depth 8
- pstride: 2
- default offset: size (component0)
- default rstride: RU4 (width)
-
- Component 2: V
- depth: 8
- pstride: 2
- default offset: offset (component1) + 1
- default rstride: RU4 (width)
-
- Image
- default size: RU4 (width) * RU2 (height) * 3 / 2
-
-
- "NV21" planar 4:2:0 YUV with interleaved VU plane
-
- Component 0: Y
- depth: 8
- pstride: 1
- default offset: 0
- default rstride: RU4 (width)
- default size: rstride (component0) * RU2 (height)
-
- Component 1: U
- depth 8
- pstride: 2
-    default offset: offset (component2) + 1
- default rstride: RU4 (width)
-
- Component 2: V
- depth: 8
- pstride: 2
- default offset: size (component0)
- default rstride: RU4 (width)
-
- Image
- default size: RU4 (width) * RU2 (height) * 3 / 2
-
- "GRAY8" 8-bit grayscale
- "Y800" same as "GRAY8"
-
- Component 0: Y
- depth: 8
- offset: 0
- pstride: 1
- default rstride: RU4 (width)
- default size: rstride (component0) * height
-
- Image
- default size: size (component0)
-
- "GRAY16_BE" 16-bit grayscale, most significant byte first
-
- Component 0: Y
- depth: 16
- offset: 0
- pstride: 2
- default rstride: RU4 (width * 2)
- default size: rstride (component0) * height
-
- Image
- default size: size (component0)
-
- "GRAY16_LE" 16-bit grayscale, least significant byte first
- "Y16" same as "GRAY16_LE"
-
- Component 0: Y
- depth: 16 LE
- offset: 0
- pstride: 2
- default rstride: RU4 (width * 2)
- default size: rstride (component0) * height
-
- Image
- default size: size (component0)
-
- "v308" packed 4:4:4 YUV
-
- +--+--+--+ +--+--+--+
- |Y0|U0|V0| |Y1|U1|V1| ...
- +--+--+--+ +--+--+--+
-
- Component 0: Y
- depth: 8
- pstride: 3
- offset: 0
-
- Component 1: U
- depth 8
- pstride: 3
- offset: 1
-
- Component 2: V
- depth: 8
- pstride: 3
- offset: 2
-
- Image
- default rstride: RU4 (width * 3)
- default size: rstride (image) * height
-
- "IYU2" packed 4:4:4 YUV, U-Y-V order
-
- +--+--+--+ +--+--+--+
- |U0|Y0|V0| |U1|Y1|V1| ...
- +--+--+--+ +--+--+--+
-
- Component 0: Y
- depth: 8
- pstride: 3
- offset: 1
-
- Component 1: U
- depth 8
- pstride: 3
- offset: 0
-
- Component 2: V
- depth: 8
- pstride: 3
- offset: 2
-
- Image
- default rstride: RU4 (width * 3)
- default size: rstride (image) * height
-
- "RGB16" rgb 5-6-5 bits per component
-
- +--+--+--+ +--+--+--+
- |R0|G0|B0| |R1|G1|B1| ...
- +--+--+--+ +--+--+--+
-
- Component 0: R
- depth: 5
- pstride: 2
-
- Component 1: G
- depth 6
- pstride: 2
-
- Component 2: B
- depth: 5
- pstride: 2
-
- Image
- default rstride: RU4 (width * 2)
- default size: rstride (image) * height
-
- "BGR16" reverse rgb 5-6-5 bits per component
-
- +--+--+--+ +--+--+--+
- |B0|G0|R0| |B1|G1|R1| ...
- +--+--+--+ +--+--+--+
-
- Component 0: R
- depth: 5
- pstride: 2
-
- Component 1: G
- depth 6
- pstride: 2
-
- Component 2: B
- depth: 5
- pstride: 2
-
- Image
- default rstride: RU4 (width * 2)
- default size: rstride (image) * height
-
- "RGB15" rgb 5-5-5 bits per component
-
- +--+--+--+ +--+--+--+
- |R0|G0|B0| |R1|G1|B1| ...
- +--+--+--+ +--+--+--+
-
- Component 0: R
- depth: 5
- pstride: 2
-
- Component 1: G
- depth 5
- pstride: 2
-
- Component 2: B
- depth: 5
- pstride: 2
-
- Image
- default rstride: RU4 (width * 2)
- default size: rstride (image) * height
-
- "BGR15" reverse rgb 5-5-5 bits per component
-
- +--+--+--+ +--+--+--+
- |B0|G0|R0| |B1|G1|R1| ...
- +--+--+--+ +--+--+--+
-
- Component 0: R
- depth: 5
- pstride: 2
-
- Component 1: G
- depth 5
- pstride: 2
-
- Component 2: B
- depth: 5
- pstride: 2
-
- Image
- default rstride: RU4 (width * 2)
- default size: rstride (image) * height
-
- "UYVP" packed 10-bit 4:2:2 YUV (U0-Y0-V0-Y1 U2-Y2-V2-Y3 U4 ...)
-
- Component 0: Y
- depth: 10
-
- Component 1: U
- depth 10
-
- Component 2: V
- depth: 10
-
- Image
- default rstride: RU4 (width * 2 * 5)
- default size: rstride (image) * height
-
- "A420" planar 4:4:2:0 AYUV
-
- Component 0: Y
- depth: 8
- pstride: 1
- default offset: 0
- default rstride: RU4 (width)
- default size: rstride (component0) * RU2 (height)
-
- Component 1: U
- depth 8
- pstride: 1
- default offset: size (component0)
- default rstride: RU4 (RU2 (width) / 2)
- default size: rstride (component1) * (RU2 (height) / 2)
-
- Component 2: V
- depth: 8
- pstride: 1
- default offset: size (component0) + size (component1)
- default rstride: RU4 (RU2 (width) / 2)
- default size: rstride (component2) * (RU2 (height) / 2)
-
- Component 3: A
- depth: 8
- pstride: 1
- default offset: size (component0) + size (component1) +
- size (component2)
- default rstride: RU4 (width)
- default size: rstride (component3) * RU2 (height)
-
- Image
- default size: size (component0) +
- size (component1) +
- size (component2) +
- size (component3)
-
- "RGB8P" 8-bit paletted RGB
-
- Component 0: INDEX
- depth: 8
- pstride: 1
- default offset: 0
- default rstride: RU4 (width)
- default size: rstride (component0) * height
-
- Component 1: PALETTE
- depth 32
- pstride: 4
- default offset: size (component0)
- rstride: 4
- size: 256 * 4
-
- Image
- default size: size (component0) + size (component1)
-
- "YUV9" planar 4:1:0 YUV
-
- Component 0: Y
- depth: 8
- pstride: 1
- default offset: 0
- default rstride: RU4 (width)
- default size: rstride (component0) * height
-
- Component 1: U
- depth 8
- pstride: 1
- default offset: size (component0)
- default rstride: RU4 (RU4 (width) / 4)
- default size: rstride (component1) * (RU4 (height) / 4)
-
- Component 2: V
- depth: 8
- pstride: 1
- default offset: offset (component1) + size (component1)
- default rstride: RU4 (RU4 (width) / 4)
- default size: rstride (component2) * (RU4 (height) / 4)
-
- Image
- default size: size (component0) +
- size (component1) +
- size (component2)
-
- "YVU9" planar 4:1:0 YUV (like YUV9 but UV planes swapped)
-
- Component 0: Y
- depth: 8
- pstride: 1
- default offset: 0
- default rstride: RU4 (width)
- default size: rstride (component0) * height
-
- Component 1: U
- depth 8
- pstride: 1
- default offset: offset (component2) + size (component2)
- default rstride: RU4 (RU4 (width) / 4)
- default size: rstride (component1) * (RU4 (height) / 4)
-
- Component 2: V
- depth: 8
- pstride: 1
- default offset: size (component0)
- default rstride: RU4 (RU4 (width) / 4)
- default size: rstride (component2) * (RU4 (height) / 4)
-
- Image
- default size: size (component0) +
- size (component1) +
- size (component2)
-
- "IYU1" packed 4:1:1 YUV (Cb-Y0-Y1-Cr-Y2-Y3 ...)
-
-   +--+--+--+--+--+--+ +--+--+--+--+--+--+
-   |U0|Y0|Y1|V0|Y2|Y3| |U1|Y4|Y5|V1|Y6|Y7| ...
-   +--+--+--+--+--+--+ +--+--+--+--+--+--+
-
- Component 0: Y
- depth: 8
- offset: 1
- pstride: 2
-
- Component 1: U
-    depth 8
- offset: 0
- pstride: 2
-
- Component 2: V
-    depth: 8
- offset: 4
- pstride: 2
-
- Image
- default rstride: RU4 (RU4 (width) + RU4 (width) / 2)
- default size: rstride (image) * height
-
- "ARGB64" rgb with alpha channel first, 16 bits per channel
-
- +--+--+--+--+ +--+--+--+--+
- |A0|R0|G0|B0| |A1|R1|G1|B1| ...
- +--+--+--+--+ +--+--+--+--+
-
- Component 0: R
- depth: 16 LE
- pstride: 8
- offset: 2
-
- Component 1: G
- depth 16 LE
- pstride: 8
- offset: 4
-
- Component 2: B
- depth: 16 LE
- pstride: 8
- offset: 6
-
- Component 3: A
- depth: 16 LE
- pstride: 8
- offset: 0
-
- Image
- default rstride: width * 8
- default size: rstride (image) * height
-
- "AYUV64" packed 4:4:4 YUV with alpha channel, 16 bits per channel (A0-Y0-U0-V0 ...)
-
- +--+--+--+--+ +--+--+--+--+
- |A0|Y0|U0|V0| |A1|Y1|U1|V1| ...
- +--+--+--+--+ +--+--+--+--+
-
- Component 0: Y
- depth: 16 LE
- pstride: 8
- offset: 2
-
- Component 1: U
- depth 16 LE
- pstride: 8
- offset: 4
-
- Component 2: V
- depth: 16 LE
- pstride: 8
- offset: 6
-
- Component 3: A
- depth: 16 LE
- pstride: 8
- offset: 0
-
- Image
- default rstride: width * 8
- default size: rstride (image) * height
-
- "r210" packed 4:4:4 RGB, 10 bits per channel
-
- +--+--+--+ +--+--+--+
- |R0|G0|B0| |R1|G1|B1| ...
- +--+--+--+ +--+--+--+
-
- Component 0: R
- depth: 10
- pstride: 4
-
- Component 1: G
- depth 10
- pstride: 4
-
- Component 2: B
- depth: 10
- pstride: 4
-
- Image
- default rstride: width * 4
- default size: rstride (image) * height
-
- "I420_10LE" planar 4:2:0 YUV, 10 bits per channel LE
-
- Component 0: Y
- depth: 10 LE
- pstride: 2
- default offset: 0
- default rstride: RU4 (width * 2)
- default size: rstride (component0) * RU2 (height)
-
- Component 1: U
- depth: 10 LE
- pstride: 2
- default offset: size (component0)
- default rstride: RU4 (width)
- default size: rstride (component1) * RU2 (height) / 2
-
- Component 2: V
- depth 10 LE
- pstride: 2
- default offset: offset (component1) + size (component1)
- default rstride: RU4 (width)
- default size: rstride (component2) * RU2 (height) / 2
-
- Image
- default size: size (component0) +
- size (component1) +
- size (component2)
-
- "I420_10BE" planar 4:2:0 YUV, 10 bits per channel BE
-
- Component 0: Y
- depth: 10 BE
- pstride: 2
- default offset: 0
- default rstride: RU4 (width * 2)
- default size: rstride (component0) * RU2 (height)
-
- Component 1: U
- depth: 10 BE
- pstride: 2
- default offset: size (component0)
- default rstride: RU4 (width)
- default size: rstride (component1) * RU2 (height) / 2
-
- Component 2: V
- depth 10 BE
- pstride: 2
- default offset: offset (component1) + size (component1)
- default rstride: RU4 (width)
- default size: rstride (component2) * RU2 (height) / 2
-
- Image
- default size: size (component0) +
- size (component1) +
- size (component2)
-
- "I422_10LE" planar 4:2:2 YUV, 10 bits per channel LE
-
- Component 0: Y
- depth: 10 LE
- pstride: 2
- default offset: 0
- default rstride: RU4 (width * 2)
- default size: rstride (component0) * RU2 (height)
-
- Component 1: U
- depth: 10 LE
- pstride: 2
- default offset: size (component0)
- default rstride: RU4 (width)
- default size: rstride (component1) * RU2 (height)
-
- Component 2: V
- depth 10 LE
- pstride: 2
- default offset: offset (component1) + size (component1)
- default rstride: RU4 (width)
- default size: rstride (component2) * RU2 (height)
-
- Image
- default size: size (component0) +
- size (component1) +
- size (component2)
-
- "I422_10BE" planar 4:2:2 YUV, 10 bits per channel BE
-
- Component 0: Y
- depth: 10 BE
- pstride: 2
- default offset: 0
- default rstride: RU4 (width * 2)
- default size: rstride (component0) * RU2 (height)
-
- Component 1: U
- depth: 10 BE
- pstride: 2
- default offset: size (component0)
- default rstride: RU4 (width)
- default size: rstride (component1) * RU2 (height)
-
- Component 2: V
- depth 10 BE
- pstride: 2
- default offset: offset (component1) + size (component1)
- default rstride: RU4 (width)
- default size: rstride (component2) * RU2 (height)
-
- Image
- default size: size (component0) +
- size (component1) +
- size (component2)
-
- "Y444_10BE" planar 4:4:4 YUV, 10 bits per channel
- "Y444_10LE" planar 4:4:4 YUV, 10 bits per channel
-
- "GBR" planar 4:4:4 RGB, 8 bits per channel
- "GBR_10BE" planar 4:4:4 RGB, 10 bits per channel
- "GBR_10LE" planar 4:4:4 RGB, 10 bits per channel
-
- "NV16" planar 4:2:2 YUV with interleaved UV plane
- "NV61" planar 4:2:2 YUV with interleaved VU plane
- "NV24" planar 4:4:4 YUV with interleaved UV plane
-
-
- "NV12_64Z32" planar 4:2:0 YUV with interleaved UV plane in 64x32 tiles zigzag
-
- Component 0: Y
- depth: 8
- pstride: 1
- default offset: 0
- default rstride: RU128 (width)
- default size: rstride (component0) * RU32 (height)
-
- Component 1: U
- depth 8
- pstride: 2
- default offset: size (component0)
- default rstride: (y_tiles << 16) | x_tiles
- default x_tiles: RU128 (width) >> tile_width
- default y_tiles: RU32 (height) >> tile_height
-
- Component 2: V
- depth: 8
- pstride: 2
- default offset: offset (component1) + 1
- default rstride: (y_tiles << 16) | x_tiles
- default x_tiles: RU128 (width) >> tile_width
- default y_tiles: RU64 (height) >> (tile_height + 1)
-
- Image
- default size: RU128 (width) * (RU32 (height) + RU64 (height) / 2)
- tile mode: ZFLIPZ_2X2
- tile width: 6
- tile height: 5
-
diff --git a/docs/design/part-playbin.txt b/docs/design/part-playbin.txt
deleted file mode 100644
index 232ac0cd2..000000000
--- a/docs/design/part-playbin.txt
+++ /dev/null
@@ -1,69 +0,0 @@
-playbin
---------
-
-The purpose of this element is to decode and render the media contained in a
-given generic uri. The element extends GstPipeline and is typically used in
-playback situations.
-
-Required features:
-
- - accept and play any valid uri. This includes
- - rendering video/audio
- - overlaying subtitles on the video
- - optionally read external subtitle files
- - allow for hardware (non raw) sinks
- - selection of audio/video/subtitle streams based on language.
- - perform network buffering/incremental download
- - gapless playback
- - support for visualisations with configurable sizes
- - ability to reject files that are too big, or of a format that would require
- too much CPU/memory usage.
- - be very efficient with adding elements such as converters to reduce the
- amount of negotiation that has to happen.
- - handle chained oggs. This includes having support for dynamic pad add and
- remove from a demuxer.
-
-Components
-----------
-
-* decodebin2
-
- - performs the autoplugging of demuxers/decoders
-  - emits signals for steering the autoplugging
- - to decide if a non-raw media format is acceptable as output
- - to sort the possible decoders for a non-raw format
- - see also decodebin2 design doc
-
-* uridecodebin
-
- - combination of a source to handle the given uri, an optional queueing element
- and one or more decodebin2 elements to decode the non-raw streams.
-
-* playsink
-
- - handles display of audio/video/text.
-  - has request audio/video/text input pads. There is only one sinkpad per type.
- The requested pads define the configuration of the internal pipeline.
- - allows for setting audio/video sinks or does automatic sink selection.
- - allows for configuration of visualisation element.
- - allows for enable/disable of visualisation, audio and video.
-
-* playbin
-
- - combination of one or more uridecodebin elements to read the uri and subtitle
- uri.
- - support for queuing new media to support gapless playback.
- - handles stream selection.
- - uses playsink to display.
- - selection of sinks and configuration of uridecodebin with raw output formats.
-
-
-Gapless playback
-----------------
-
-playbin has an "about-to-finish" signal. The application should configure a new
-uri (and optional suburi) in the callback. When the current media finishes, this
-new media will be played next.
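-
-A minimal sketch of that pattern (the callback and the way the next uri is
-obtained are application-specific placeholders):
-
-  static void
-  on_about_to_finish (GstElement * playbin, gpointer user_data)
-  {
-    const gchar *next_uri = get_next_uri (user_data);  /* app-provided */
-
-    if (next_uri != NULL)
-      g_object_set (playbin, "uri", next_uri, NULL);
-  }
-
-  ...
-  g_signal_connect (playbin, "about-to-finish",
-      G_CALLBACK (on_about_to_finish), app);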
-
-
-
diff --git a/docs/design/part-stereo-multiview-video.markdown b/docs/design/part-stereo-multiview-video.markdown
deleted file mode 100644
index 838e6d8d8..000000000
--- a/docs/design/part-stereo-multiview-video.markdown
+++ /dev/null
@@ -1,278 +0,0 @@
-Design for Stereoscopic & Multiview Video Handling
-==================================================
-
-There are two cases to handle:
-
-* Encoded video output from a demuxer to parser / decoder or from encoders into a muxer.
-* Raw video buffers
-
-The design below is somewhat based on the proposals from
-[bug 611157](https://bugzilla.gnome.org/show_bug.cgi?id=611157)
-
-Multiview is used as a generic term to refer to handling both
-stereo content (left and right eye only) as well as extensions for videos
-containing multiple independent viewpoints.
-
-Encoded Signalling
-------------------
-This is regarding the signalling in caps and buffers from demuxers to
-parsers (sometimes) or out from encoders.
-
-For backward compatibility with existing codecs many transports of
-stereoscopic 3D content use normal 2D video with 2 views packed spatially
-in some way, and put extra new descriptions in the container/mux.
-
-Info in the demuxer seems to apply to stereo encodings only. For all
-MVC methods I know, the multiview encoding is in the video bitstream itself
-and therefore already available to decoders. Only stereo systems have been retro-fitted
-into the demuxer.
-
-Also, sometimes extension descriptions are in the codec (e.g. H.264 SEI FPA packets)
-and it would be useful to be able to put the info onto caps and buffers from the
-parser without decoding.
-
-To handle both cases, we need to be able to output the required details on
-encoded video for decoders to apply onto the raw video buffers they decode.
-
-*If there ever is a need to transport multiview info for encoded data the
-same system below for raw video or some variation should work*
-
-### Encoded Video: Properties that need to be encoded into caps
-1. multiview-mode (called "Channel Layout" in bug 611157)
- * Whether a stream is mono, for a single eye, stereo, mixed-mono-stereo
- (switches between mono and stereo - mp4 can do this)
- * Uses a buffer flag to mark individual buffers as mono or "not mono"
- (single|stereo|multiview) for mixed scenarios. The alternative (not
- proposed) is for the demuxer to switch caps for each mono to not-mono
-    change, and not use a 'mixed' caps variant at all.
- * _single_ refers to a stream of buffers that only contain 1 view.
- It is different from mono in that the stream is a marked left or right
- eye stream for later combining in a mixer or when displaying.
- * _multiple_ marks a stream with multiple independent views encoded.
- It is included in this list for completeness. As noted above, there's
- currently no scenario that requires marking encoded buffers as MVC.
-2. Frame-packing arrangements / view sequence orderings
- * Possible frame packings: side-by-side, side-by-side-quincunx,
- column-interleaved, row-interleaved, top-bottom, checker-board
- * bug 611157 - sreerenj added side-by-side-full and top-bottom-full but
- I think that's covered by suitably adjusting pixel-aspect-ratio. If
- not, they can be added later.
- * _top-bottom_, _side-by-side_, _column-interleaved_, _row-interleaved_ are as the names suggest.
- * _checker-board_, samples are left/right pixels in a chess grid +-+-+-/-+-+-+
- * _side-by-side-quincunx_. Side By Side packing, but quincunx sampling -
- 1 pixel offset of each eye needs to be accounted when upscaling or displaying
- * there may be other packings (future expansion)
- * Possible view sequence orderings: frame-by-frame, frame-primary-secondary-tracks, sequential-row-interleaved
- * _frame-by-frame_, each buffer is left, then right view etc
- * _frame-primary-secondary-tracks_ - the file has 2 video tracks (primary and secondary), one is left eye, one is right.
- Demuxer info indicates which one is which.
- Handling this means marking each stream as all-left and all-right views, decoding separately, and combining automatically (inserting a mixer/combiner in playbin)
- -> *Leave this for future expansion*
- * _sequential-row-interleaved_ Mentioned by sreerenj in bug patches, I can't find a mention of such a thing. Maybe it's in MPEG-2
- -> *Leave this for future expansion / deletion*
-3. view encoding order
- * Describes how to decide which piece of each frame corresponds to left or right eye
- * Possible orderings left, right, left-then-right, right-then-left
- - Need to figure out how we find the correct frame in the demuxer to start decoding when seeking in frame-sequential streams
- - Need a buffer flag for marking the first buffer of a group.
-4. "Frame layout flags"
- * flags for view specific interpretation
- * horizontal-flip-left, horizontal-flip-right, vertical-flip-left, vertical-flip-right
-    Indicates that one or more views have been encoded in a flipped orientation, usually due to a camera with a mirror or displays with mirrors.
- * This should be an actual flags field. Registered GLib flags types aren't generally well supported in our caps - the type might not be loaded/registered yet when parsing a caps string, so they can't be used in caps templates in the registry.
- * It might be better just to use a hex value / integer
-
-Buffer representation for raw video
------------------------------------
-* Transported as normal video buffers with extra metadata
-* The caps define the overall buffer width/height, with helper functions to
- extract the individual views for packed formats
-* pixel-aspect-ratio adjusted if needed to double the overall width/height
-* video sinks that don't know about multiview extensions yet will show the packed view as-is
- For frame-sequence outputs, things might look weird, but just adding multiview-mode to the sink caps
- can disallow those transports.
-* _row-interleaved_ packing is actually just side-by-side memory layout with half frame width, twice
- the height, so can be handled by adjusting the overall caps and strides
-* Other exotic layouts need new pixel formats defined (checker-board, column-interleaved, side-by-side-quincunx)
-* _Frame-by-frame_ - one view per buffer, but with alternating metas marking which buffer is which left/right/other view and using a new buffer flag as described above
- to mark the start of a group of corresponding frames.
-* New video caps addition as for encoded buffers
-
-### Proposed Caps fields
-Combining the requirements above and collapsing the combinations into mnemonics (an example caps string follows the list):
-
-* multiview-mode =
- mono | left | right | sbs | sbs-quin | col | row | topbot | checkers |
- frame-by-frame | mixed-sbs | mixed-sbs-quin | mixed-col | mixed-row |
-  mixed-topbot | mixed-checkers | mixed-frame-by-frame | multiview-frames | mixed-multiview-frames
-* multiview-flags =
- + 0x0000 none
- + 0x0001 right-view-first
- + 0x0002 left-h-flipped
- + 0x0004 left-v-flipped
- + 0x0008 right-h-flipped
- + 0x0010 right-v-flipped
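-
-As an illustration only (using the mnemonics proposed above, which are not
-necessarily the final field names/values), side-by-side content with the
-right view stored first might end up described as:
-
-    video/x-raw, format=(string)I420, width=(int)1920, height=(int)1080,
-        framerate=(fraction)30/1, multiview-mode=(string)sbs,
-        multiview-flags=(int)0x0001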
-
-### Proposed new buffer flags
-Add two new GST_VIDEO_BUFFER flags in video-frame.h and make it clear that those
-flags can apply to encoded video buffers too. wtay says that's currently the
-case anyway, but the documentation should say it.
-
-**GST_VIDEO_BUFFER_FLAG_MULTIPLE_VIEW** - Marks a buffer as representing non-mono content, although it may be a single (left or right) eye view.
-**GST_VIDEO_BUFFER_FLAG_FIRST_IN_BUNDLE** - for frame-sequential methods of transport, mark the "first" of a left/right/other group of frames
-
-### A new GstMultiviewMeta
-This provides a place to describe all provided views in a buffer / stream,
-and through Meta negotiation to inform decoders about which views to decode if
-not all are wanted.
-
-* Logical labels/names and mapping to GstVideoMeta numbers
-* Standard view labels LEFT/RIGHT, and non-standard ones (strings)
-
- GST_VIDEO_MULTIVIEW_VIEW_LEFT = 1
- GST_VIDEO_MULTIVIEW_VIEW_RIGHT = 2
-
- struct GstVideoMultiviewViewInfo {
- guint view_label;
- guint meta_id; // id of the GstVideoMeta for this view
-
- padding;
- }
-
- struct GstVideoMultiviewMeta {
- guint n_views;
- GstVideoMultiviewViewInfo *view_info;
- }
-
-The meta is optional, and probably only useful later for MVC
-
-
-Outputting stereo content
--------------------------
-The initial implementation for output will be stereo content in glimagesink
-
-### Output Considerations with OpenGL
-* If we have support for stereo GL buffer formats, we can output separate left/right eye images and let the hardware take care of display.
-* Otherwise, glimagesink needs to render one window with left/right in a suitable frame packing
- and that will only show correctly in fullscreen on a device set for the right 3D packing -> requires app intervention to set the video mode.
-* That could be done manually on the TV, or (with HDMI 1.4) by setting the right video mode for the screen so the TV is informed. A third option is to
-  render to two separate overlay areas on the screen - one for the left eye, one for the right - which could be done using the 'splitter' element and 2 output sinks or, better, by adding a 2nd window overlay for split stereo output
-* Intel hardware doesn't do stereo GL buffers - only nvidia and AMD, so initial implementation won't include that
-
-## Other elements for handling multiview content
-* videooverlay interface extensions
- * __Q__: Should this be a new interface?
- * Element message to communicate the presence of stereoscopic information to the app
- * App needs to be able to override the input interpretation - ie, set multiview-mode and multiview-flags
- * Most videos I've seen are side-by-side or top-bottom with no frame-packing metadata
- * New API for the app to set rendering options for stereo/multiview content
-  * This might be best implemented as a **multiview GstContext**, so that
-    the pipeline can share app preferences for content interpretation and downmixing
-    to mono for output, or set in the sink and passed as far upstream/downstream as possible.
-* Converter element
- * convert different view layouts
- * Render to anaglyphs of different types (magenta/green, red/blue, etc) and output as mono
-* Mixer element
-  * take 2 video streams and output as stereo
-  * later take n video streams
-  * share code with the converter; it just takes input from n pads instead of one
-* Splitter element
- * Output one pad per view
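-
-A minimal sketch of the **multiview GstContext** idea mentioned above. The
-context type string and field names are invented for illustration; the
-GstContext and GstStructure calls are existing API:
-
-    GstContext *ctx = gst_context_new ("gst.video.multiview.preferences", TRUE);
-    GstStructure *s = gst_context_writable_structure (ctx);
-
-    gst_structure_set (s, "downmix-mode", G_TYPE_STRING, "green-magenta", NULL);
-    gst_element_set_context (GST_ELEMENT (pipeline), ctx);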
-
-### Implementing MVC handling in decoders / parsers (and encoders)
-Things to do to implement MVC handling (a sketch of the SEI-to-caps mapping
-for item 1 follows this list):
-
-1. Parsing SEI in h264parse and setting caps (patches available in
-   bugzilla for parsing, see below)
-2. Integrate gstreamer-vaapi MVC support with this proposal
-3. Help with [libav MVC implementation](https://wiki.libav.org/Blueprint/MVC)
-4. Generating SEI in the H.264 encoder
-5. Support for MPEG2 MVC extensions
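-
-A sketch of that mapping: h264parse could translate the
-frame_packing_arrangement_type field of the frame packing SEI into the
-proposed multiview-mode strings roughly as below (the helper is
-hypothetical; the numeric values are those defined by the H.264 spec):
-
-    static const gchar *
-    fpa_type_to_multiview_mode (guint fpa_type)
-    {
-      switch (fpa_type) {
-        case 0: return "checkers";        /* checkerboard interleaving */
-        case 1: return "col";             /* column interleaving */
-        case 2: return "row";             /* row interleaving */
-        case 3: return "sbs";             /* side-by-side */
-        case 4: return "topbot";          /* top-bottom */
-        case 5: return "frame-by-frame";  /* temporal interleaving */
-        default: return "mono";
-      }
-    }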
-
-## Relevant bugs
-* [bug 685215](https://bugzilla.gnome.org/show_bug.cgi?id=685215) - codecparser h264: Add initial MVC parser
-* [bug 696135](https://bugzilla.gnome.org/show_bug.cgi?id=696135) - h264parse: Add mvc stream parsing support
-* [bug 732267](https://bugzilla.gnome.org/show_bug.cgi?id=732267) - h264parse: extract base stream from MVC or SVC encoded streams
-
-## Other Information
-[Matroska 3D support notes](http://www.matroska.org/technical/specs/notes.html#3D)
-
-## Open Questions
-
-### Representation for GstGL
-When uploading raw video frames to GL textures there are two possible
-representations: keep the views packed together in a single texture, or
-split them into separate textures. The goal is to implement the second
-option: split packed frames into separate GL textures when uploading, and
-attach multiple GstGLMemory's to the GstBuffer. The multiview-mode and
-multiview-flags fields in the caps should change to reflect the conversion
-from one incoming GstMemory to multiple GstGLMemory, and the width/height
-in the output info should change as needed.
-
-This is (currently) targeted as 2 render passes - upload as normal
-to a single stereo-packed RGBA texture, and then unpack into 2
-smaller textures, output with GST_VIDEO_MULTIVIEW_MODE_SEPARATED, as
-2 GstGLMemory attached to one buffer. We can optimise the upload later
-to go directly to 2 textures for common input formats.
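-
-Purely to illustrate the data movement of that unpack step, here is a sketch
-using glCopyImageSubData() to copy the two halves of a side-by-side packed
-RGBA texture into two half-width textures. The real unpack is a draw pass;
-the texture handles and sizes here are placeholders:
-
-    /* left half of the packed texture -> left_tex */
-    glCopyImageSubData (packed_tex, GL_TEXTURE_2D, 0, 0, 0, 0,
-        left_tex, GL_TEXTURE_2D, 0, 0, 0, 0, width / 2, height, 1);
-    /* right half of the packed texture -> right_tex */
-    glCopyImageSubData (packed_tex, GL_TEXTURE_2D, 0, width / 2, 0, 0,
-        right_tex, GL_TEXTURE_2D, 0, 0, 0, 0, width / 2, height, 1);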
-
-Separate output textures have a few advantages:
-
-* Filter elements can more easily apply filters in several passes to each
-texture without fundamental changes to our filters to avoid mixing pixels
-from separate views.
-* Centralises the sampling of input video frame packings in the upload code,
-which makes adding new packings in the future easier.
-* Sampling multiple textures to generate various output frame-packings
-for display is conceptually simpler than converting from any input packing
-to any output packing.
-* In implementations that support quad buffers, having separate textures
-makes it trivial to do GL_LEFT/GL_RIGHT output.
-
-For either option, we'll need new glsink output API to pass more
-information to applications about multiple views for the draw signal/callback.
-
-I don't know if it's desirable to support *both* methods of representing
-views. If so, that should be signalled in the caps too. That could be a
-new multiview-mode for passing views in separate GstMemory objects
-attached to a GstBuffer, which would not be GL specific.
-
-### Overriding frame packing interpretation
-Most sample videos available are frame packed, with no metadata
-to say so. How should we override that interpretation?
-
-* Simple answer: use capssetter + new properties on playbin to
-  override the multiview fields (see the sketch below).
-  *Basically implemented in playbin, using a pad probe. Needs more work for completeness.*
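-
-A sketch of the application side, assuming playbin gains
-video-multiview-mode / video-multiview-flags properties backed by the
-multiview enums (the property and enum names here follow the proposal and
-are not final):
-
-    /* Force an unmarked side-by-side file to be treated as stereo,
-     * right view stored first. */
-    g_object_set (playbin,
-        "video-multiview-mode", GST_VIDEO_MULTIVIEW_FRAME_PACKING_SIDE_BY_SIDE,
-        "video-multiview-flags", GST_VIDEO_MULTIVIEW_FLAGS_RIGHT_VIEW_FIRST,
-        NULL);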
-
-### Adding extra GstVideoMeta to buffers
-There should be one GstVideoMeta for the entire video frame in packed
-layouts, and one GstVideoMeta per GstGLMemory when views are attached
-to a GstBuffer separately. This should be done by the buffer pool,
-which knows the layout from the caps (see the sketch below).
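-
-A minimal sketch of the separate-views case, as a buffer pool might do it
-(gst_buffer_add_video_meta() is existing API; n_views, the format and the
-per-view sizes are placeholders):
-
-    guint i;
-
-    for (i = 0; i < n_views; i++) {
-      GstVideoMeta *vmeta = gst_buffer_add_video_meta (buf,
-          GST_VIDEO_FRAME_FLAG_NONE, GST_VIDEO_FORMAT_RGBA,
-          view_width, view_height);
-      /* give each view a distinct id so it can be looked up with
-       * gst_buffer_get_video_meta_id() */
-      vmeta->id = i;
-    }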
-
-### videooverlay interface extensions
-GstVideoOverlay needs (a hypothetical API sketch follows this list):
-
-* A way to announce the presence of multiview content when it is
- detected/signalled in a stream.
-* A way to tell applications which output methods are supported/available
-* A way to tell the sink which output method it should use
-* Possibly a way to tell the sink to override the input frame
- interpretation / caps - depends on the answer to the question
- above about how to model overriding input interpretation.
-
-### What's implemented
-* Caps handling
-* gst-plugins-base libgstvideo pieces
-* playbin caps overriding
-* conversion elements - glstereomix, gl3dconvert (needs a rename),
- glstereosplit.
-
-### Possible future enhancements
-* Make GLupload split to separate textures at upload time?
- * Needs new API to extract multiple textures from the upload. Currently only outputs 1 result RGBA texture.
-* Make GLdownload able to take 2 input textures, pack them and colorconvert / download as needed.
-  * Currently done by packing then downloading, which isn't acceptable overhead for RGBA download.
-* Think about how we integrate GLstereo - do we need to do anything special,
- or can the app just render to stereo/quad buffers if they're available?