author Edward Hervey <bilboed@bilboed.com> 2014-08-27 10:12:23 +0200
committer Edward Hervey <bilboed@bilboed.com> 2014-08-27 10:12:23 +0200
commit 0c1c99be429cff83f1e30a94671995bc1b1ae7eb
tree 68475fb942cce263f60ea3dae34bf997b17a657f
parent b38264a9cb6096f9a92ee45ea0c084c7a877c0c4
design: Proposal for handling SKIP flag in demuxers
-rw-r--r-- docs/design/part-demuxer-skip.txt 259
1 files changed, 259 insertions, 0 deletions
diff --git a/docs/design/part-demuxer-skip.txt b/docs/design/part-demuxer-skip.txt
new file mode 100644
index 000000000..5dfd0710b
--- /dev/null
+++ b/docs/design/part-demuxer-skip.txt
@@ -0,0 +1,259 @@
+Usage of SKIP seek flag in demuxers
+
+This document describes a potential way forward to support fast
+playback rates (forward and backward) in demuxers while lowering
+data throughput. It concentrates mostly on container formats that
+give full visibility of the stream (i.e. contain an index).
+
+Out of scope:
+ Specifying what actions should be taken by the various elements in
+ the pipeline when the SKIP flag is used (Should the decoder only
+ decode I-frames? At what rate? Should the demuxer skip data or not?
+ Should the demuxer not output anything on audio streams? ...).
+
+When dealing with fast forward/reverse rates, it would be ideal to
+have a way for demuxers to only process a smaller amount of data.
+
+Rationale:
+* Directly reduce the amount of data to be processed by downstream
+ elements (decoders, ...) since the demuxer knows where the decoding
+ start points (keyframes) are located.
+* Reduce the amount of data fetched from upstream (file, http
+ location, ...), thereby reducing the i/o and only fetching what is
+ needed.
+* Due to knowing the location of keyframes, a demuxer can handle trick
+ mode by skipping very quickly to such keyframes (and potentially a
+ few frames just after), which can be decoded straight away.
+
+The proposed logic is quite similar in intent to the proposal in
+gstreamer/docs/design/part-trickmodes.txt (client side forward
+trickmodes).
+
+The difference is that this proposal tries to:
+* Reduce processing complexity downstream (output data is always
+ sent with a forward rate of 1.0).
+* Avoid any changes in downstream elements.
+* Reduce the amount of changes required in demuxers and other elements.
+
+
+Mode of operation
+-----------------
+
+As stated in the GStreamer core design document, when the SKIP seek
+flag is used, the demuxer can choose to output only a part of the data.
+
+R : Requested rate (can be negative)
+S : Absolute value of R (always positive, called the Speedup).
+
+The global idea is to:
+1) Output at most the same average amount of data downstream to be
+ decoded in realtime (there are still <fps> frames decoded for each
+ second of wall-clock time, regardless of the rate).
+2) Fetch data from upstream at the same average bitrate. This ensures
+ that if the file was readable at normal rate from local or remote
+ storage, it can still be read at high rate.
+
+The first section presents a way to do this from the application
+side; the actual proposal follows.
+
+
+Option 0 : Achievable currently without any modification
+--------------------------------------------------------
+
+One way to handle this is just to ... skip through the pipeline from
+the application.
+
+In order to achieve this, a player application (controlling a
+GStreamer pipeline) _could_ regularly send a series of non-flushing
+KEY_UNIT SEGMENT seeks at rate 1.0 for small intervals (say
+500ms). Note that this is a feature already supported by existing
+GStreamer elements.
+
+ KEY_UNIT : We want to go to keyframes (decoding start point) so that
+ decoding can happen as quickly as possible
+ SEGMENT : We want to be informed asynchronously when the demuxer is
+ done handling the requested seek range so that we can
+ calculate and trigger the next range.
+ non-flushing (except for the initial seek) : we want the requested
+ segments to be decoded fully (without flushing out the
+ pending data).
+
+
+1) Player requests initial seek :
+ Start : Pos
+ Stop : Pos + 500ms
+ Rate : 1.0
+ Flags : (FLUSHING on first time) + KEY_UNIT + SEGMENT
+
+2) Demuxer fetches the closest keyframe (PosK1) and pushes out the
+following segment:
+ Segment
+ Rate : 1.0
+ Start : PosK1
+ Stop : Pos + 500ms
+ Time : PosK1
+ Base : 0
+
+ Note that the duration is potentially bigger than the requested seek
+ duration (Pos + 500ms - PosK1 >= 500ms), since the keyframe is
+ located at or before Pos.
+
+3) When the demuxer has pushed out that segment and the relevant data,
+it posts a SEGMENT_DONE message on the bus with the end position.
+
+4) The player knows how much data was pushed out (from SEGMENT_START
+and SEGMENT_DONE messages). It can then calculate by how much it
+should move:
+
+ Amount played: (Pos + 500ms) - PosK1
+ This is the amount of data being played back (and potentially
+ queued). We now want to skip forward (or backward).
+
+ Next pos : Pos2 = PosK1 +/- (Amount_played * S)
+ Using + for forward and - for reverse rates, we are essentially
+ moving forward/backward by the requested speedup.
+
+5) The player requests a new seek :
+ Start : Pos2
+ Stop : Pos2 + 500ms
+ Rate : 1.0
+ Flags : KEY_UNIT + SEGMENT
+
+6) Demuxer fetches the closest keyframe (PosK2) and pushes a segment
+(taking into account previously played data):
+ Segment
+ Rate : 1.0
+ Start : PosK2
+ Stop : Pos2 + 500ms
+ Time : PosK2
+ Base : Pos + 500ms - PosK1
+
+7) After playback, demuxer pushes SEGMENT_DONE
+8) ....
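+
+As a rough illustration only, the application-side loop of steps 1)
+to 7) can be modelled with plain integers (nanoseconds). This is a
+sketch and not GStreamer code: the names Segment, previous_keyframe(),
+play_chunk() and next_position() are hypothetical, and the keyframe
+list stands in for the index that only the demuxer has access to.
+
```python
from dataclasses import dataclass

NS_PER_MS = 1_000_000
CHUNK = 500 * NS_PER_MS  # interval requested per seek (500ms, as in the text)


@dataclass
class Segment:
    start: int  # keyframe the demuxer actually started from (PosK)
    stop: int   # requested stop position (Pos + 500ms)
    time: int   # stream time of `start`
    base: int   # running time accumulated by earlier segments


def previous_keyframe(keyframes, pos):
    """Closest keyframe at or before pos (hypothetical stand-in for the index)."""
    candidates = [k for k in keyframes if k <= pos]
    return max(candidates) if candidates else keyframes[0]


def play_chunk(keyframes, pos, base):
    """Steps 2 and 6: segment pushed for a seek to [pos, pos + CHUNK]."""
    posk = previous_keyframe(keyframes, pos)
    return Segment(start=posk, stop=pos + CHUNK, time=posk, base=base)


def next_position(seg, rate):
    """Step 4: Pos2 = PosK +/- (Amount_played * S)."""
    amount_played = seg.stop - seg.start
    speedup = abs(rate)
    direction = 1 if rate >= 0 else -1
    return seg.start + int(direction * amount_played * speedup)
```
+
+With keyframes every 2s, Pos = 2.3s and R = 4, the first segment runs
+from 2.0s to 2.8s (0.8s played), so the next seek position is
+2.0s + 0.8s * 4 = 5.2s and the next segment base is 0.8s.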
+
+Pros:
+ * No changes required in any elements
+
+Cons:
+ * Logic needs to be (re)implemented in every application
+ * The round trip between element and application for every segment
+ could cause delays.
+ * Application doesn't know optimal location of keyframes, so can't
+ push out only the requested amount of data (chunks played out will
+ be of variable length).
+ * There's a potential issue where the "next" position the
+ application requests is not beyond the next keyframe, resulting in
+ the demuxer pushing a very big segment again. This would require
+ the application to detect and handle such cases gracefully.
+
+
+
+Proposal : Move logic into demuxers
+-----------------------------------
+
+In order to avoid the various "Cons" of option 0, the proposed way
+forward is to move the logic of figuring out which segments are to be
+played back into the demuxers themselves, and to activate that mode
+when the application sends a seek with the SKIP flag enabled.
+
+1) Application sends a seek event with an initial position and rate:
+ Start : Pos
+ Stop : -1
+ Rate : R
+ Flags : SKIP
+
+2) Demuxer figures out the optimal initial previous and next
+ keyframes for Pos.
+
+          Pos
+           v
+ ------------------------------------------------------------------
+ ... Kn          Kn+1            Kn+2            Kn+3           ...
+ ------------------------------------------------------------------
+     ^           ^- PosKn+1
+     PosKn
+
+ Based on the location of previous and next keyframe, it has the
+ option of either outputting just the keyframe, or of also outputting
+ some of the frames from just after that initial keyframe. For the
+ sake of completeness, we will hereafter see the case where we take
+ the second path.
+
+ The demuxer knows where to start outputting from (PosKn), and can
+ also figure out how much it should output:
+
+ Amount to push (Dn) : Next keyframe - Current keyframe
+                       --------------------------------
+                                   Speedup
+                     : (PosKn+1 - PosKn) / S
+
+ It will therefore push out the following Segment:
+ Rate : 1.0
+ Start : PosKn
+ Stop : PosKn + Dn
+ Time : PosKn
+ Base : 0
+
+ And then all the buffers (first one with DISCONT) from PosKn to
+ PosKn + Dn.
+
+      PosKn + Dn
+           v
+ ------------------------------------------------------------------
+ ... Kn////|     Kn+1            Kn+2            Kn+3           ...
+ ------------------------------------------------------------------
+     ^- PosKn
+
+
+ Once the demuxer has pushed all that data, it goes back to
+ calculating what the next chunk is.
+
+ Figure out new current Keyframe : PosKn+1
+ Figure out new next Keyframe : PosKn+2
+ Figure out new duration to push : Dn+1 : (PosKn+2 - PosKn+1) / S
+
+                  PosKn+1 + Dn+1
+                         v
+ ------------------------------------------------------------------
+ ... Kn          Kn+1////|       Kn+2            Kn+3           ...
+ ------------------------------------------------------------------
+                 ^               ^- PosKn+2
+                 PosKn+1
+
+ And pushes out the following Segment:
+ Rate : 1.0
+ Start : PosKn+1
+ Stop : PosKn+1 + Dn+1
+ Time : PosKn+1
+ Base : Dn (previously elapsed duration)
+
+
+ Pros :
+ * Re-use segment-based algorithm already present in demuxers
+ * No re-timestamping of buffers is needed; the only requirement is to
+ calculate the proper (updated) base value so that the running
+ time is coherent.
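+
+ The chunk calculation above could be sketched as follows. This is
+ an illustration under stated assumptions, not demuxer code:
+ plan_chunks() is a hypothetical name, only forward rates are
+ handled, positions are plain integers, and the previously elapsed
+ duration is assumed to be carried in the segment base (as in
+ option 0 and as the Pros above suggest).
+
```python
from bisect import bisect_right


def plan_chunks(keyframes, pos, rate):
    """Return (start, stop, time, base) segments for forward SKIP playback.

    keyframes: sorted keyframe positions (the demuxer's index), integers.
    pos:       requested seek position.
    rate:      requested rate R; only R > 0 is handled in this sketch.
    """
    if rate <= 0:
        raise ValueError("this sketch only covers forward rates")
    speedup = rate  # S = |R|
    # Index of Kn: last keyframe at or before pos (clamped to the first one).
    n = max(bisect_right(keyframes, pos) - 1, 0)
    segments = []
    base = 0  # running time accumulated by the chunks already pushed
    while n + 1 < len(keyframes):
        start = keyframes[n]
        # Dn = (PosKn+1 - PosKn) / S
        duration = int((keyframes[n + 1] - start) / speedup)
        segments.append((start, start + duration, start, base))
        base += duration
        n += 1
    return segments
```
+
+ For keyframes at 0, 2000, 4000 and 6000 (ms), Pos = 500 and R = 4,
+ this gives three chunks of 500ms each, with base values 0, 500 and
+ 1000: 1500ms of output for 6000ms of media, i.e. 1/S of the data.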
+
+
+Possible variants and improvements
+----------------------------------
+
+1) Keyframe only
+ The demuxer could just push out the keyframe (and no following
+ frames). This is the only change required. It still needs to
+ calculate how long that keyframe will be displayed for, so that the
+ base values of the following Segments stay correct.
+
+ This might need to be activated for very high rates (> 8), where
+ displaying a few frames offers no visual benefit.
+
+2) Global speedup calculation
+ If the keyframe interval is non-constant, the demuxer would need to
+ figure out a better algorithm for which frames to output and how
+ much to skip (instead of blindly using the next keyframe and the
+ next keyframe interval divided by the speedup).
+
+3) Disable audio-track
+ At some higher rates, we might want to disable audio
+ altogether. Since the algorithms already know how much we're
+ skipping, the demuxer could push out GAP events on the audio tracks,
+ ensuring decoders/sinks move ahead properly.
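+
+ As a small sketch of variant 3: a GAP event carries a timestamp and
+ a duration, so one possible reading is that the demuxer emits one
+ gap per video chunk it pushes, covering the same range. The helper
+ name audio_gaps() and the one-gap-per-chunk policy are assumptions
+ made for illustration, not something this proposal specifies.
+
```python
def audio_gaps(video_chunks):
    """Return one (timestamp, duration) pair per pushed video chunk.

    video_chunks: iterable of (start, stop) positions pushed on the video
                  stream, in the same integer unit as the result.
    """
    gaps = []
    for start, stop in video_chunks:
        if stop <= start:
            continue  # nothing was pushed for this chunk, so no gap to signal
        gaps.append((start, stop - start))
    return gaps
```
+
+ Each pair would then be turned into a GAP event on the audio pad
+ instead of pushing audio buffers for that range.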