design: Proposal for handling SKIP flag in demuxers

author: Edward Hervey <bilboed@bilboed.com> 2014-08-27 10:12:23 +0200
committer: Edward Hervey <bilboed@bilboed.com> 2014-08-27 10:12:23 +0200
commit: 0c1c99be429cff83f1e30a94671995bc1b1ae7eb (patch)
tree: 68475fb942cce263f60ea3dae34bf997b17a657f
parent: b38264a9cb6096f9a92ee45ea0c084c7a877c0c4 (diff)
1 files changed, 259 insertions, 0 deletions
diff --git a/docs/design/part-demuxer-skip.txt b/docs/design/part-demuxer-skip.txt
new file mode 100644
index 000000000..5dfd0710b
--- /dev/null
+++ b/docs/design/part-demuxer-skip.txt
@@ -0,0 +1,259 @@
+Usage of SKIP seek flag in demuxers
+
+This document describes a potential way forward to support fast
+playback rates (forward and backward) in demuxers while lowering
+throughput. It concentrates mostly on container formats that have
+full visibility of the stream (i.e. contain an index).
+
+Out of scope:
+  Specifying what actions should be taken by the various elements in
+  the pipeline when the SKIP flag is used (should the decoder only
+  decode I frames ? At what rate ? Should the demuxer skip data or not
+  ? Should the demuxer not output anything on audio streams ? ...).
+
+When dealing with fast forward/reverse rates, it would be ideal to
+have a way for demuxers to only process a smaller amount of data.
+
+Rationale:
+* Directly reduce the amount of data to be processed by downstream
+  elements (decoders, ...) since the demuxer knows where decoding
+  start points (or keyframes) are located.
+* Reduce the amount of data fetched from upstream (file, http
+  location, ...), thereby reducing the I/O and only fetching what is
+  needed.
+* Due to knowing the location of keyframes, a demuxer can handle trick
+  mode by skipping very quickly to such keyframes (and potentially a
+  few frames just after), which can be decoded straight away.
+
+The proposed logic is quite similar in intent to the proposal in
+gstreamer/docs/design/part-trickmodes.txt (client side forward
+trickmodes).
+
+The difference is that this proposal tries to:
+* Reduce processing complexity downstream (data output is always
+  sent with a forward rate of 1.0).
+* Avoid any changes in downstream elements.
+* Reduce the amount of changes required in demuxers and other elements.
+
+
+Mode of operation
+-----------------
+
+As stated in the GStreamer core design document, when the SKIP seek
+flag is used, the demuxer can choose to output only a part of the data.
+
+R : Requested rate (can be negative)
+S : Absolute value of R (always positive, called the Speedup).
+
+The global idea is to:
+1) Output at most the same average amount of data downstream to be
+  decoded in realtime (there are still <fps> frames decoded for each
+  second of wall-clock time, regardless of the rate).
+2) Fetch from upstream at the same average bitrate. This ensures that if
+  the file was readable at normal rate from a local or remote storage,
+  it can still be read at high rate.
+
+The first section describes a way to do this from the application
+side; the actual proposal follows.
+
+
+Option 0 : Achievable currently without any modification
+--------------------------------------------------------
+
+One way to handle this is just to ... skip through the pipeline from
+the application.
+
+In order to achieve this, a player application (controlling a
+GStreamer pipeline) _could_ send on a regular basis a series of
+non-flushing KEY_UNIT SEGMENT seeks at rate 1.0 for small intervals
+(say 500ms). Note that this is a feature already supported by existing
+GStreamer elements.
+
+ KEY_UNIT : We want to go to keyframes (decoding start point) so that
+            decoding can happen as quickly as possible
+ SEGMENT  : We want to be informed asynchronously when the demuxer is
+            done handling the requested seek range so that we can
+            calculate and trigger the next range.
+ non-flushing (except for the initial seek) : we want the requested
+            segments to be decoded fully (without flushing out the
+            pending data).
+
+
+1) Player requests initial seek :
+  Start : Pos
+  Stop  : Pos + 500ms
+  Rate  : 1.0
+  Flags : (FLUSHING on first time) + KEY_UNIT + SEGMENT
+
+2) Demuxer fetches the closest keyframe (PosK1) and pushes out the
+following segment:
+  Segment
+   Rate  : 1.0
+   Start : PosK1
+   Stop  : Pos + 500ms
+   Time  : PosK1
+   Base  : 0
+
+  Note that the duration is potentially bigger than the requested seek
+  duration ( Pos + 500ms - PosK1 >= 500ms).
+
+3) When the demuxer has pushed out that segment and relevant data it
+will emit on the bus a SEGMENT_DONE message with the end position.
+
+4) The player knows how much data was pushed out (from SEGMENT_START
+and SEGMENT_DONE messages), so it can calculate by how much it
+should move forward:
+
+   Amount played: (Pos + 500ms) - PosK1
+      This is the amount of data being played back (and potentially
+      queued), from which the size of the next skip is derived.
+
+   Next pos : Pos2 = PosK1 +/- (Amount_played * S)
+      By this, we are essentially moving forward/backward by the
+      requested speedup.
+
+5) The player requests a new seek :
+  Start : Pos2
+  Stop  : Pos2 + 500ms
+  Rate  : 1.0
+  Flags : KEY_UNIT + SEGMENT
+
+6) Demuxer fetches closest keyframe (PosK2), and pushes a segment
+(Taking into account previously played data):
+  Segment
+   Rate  : 1.0
+   Start : PosK2
+   Stop  : Pos2 + 500ms
+   Time  : PosK2
+   Base  : Pos + 500ms - PosK1
+
+7) After playback, demuxer pushes SEGMENT_DONE
+8) ....
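
The application-side loop in steps 1) to 8) can be sketched as plain
arithmetic. This is only an illustration, not GStreamer code: the
keyframe list, the helper names (closest_keyframe, plan_seeks) and the
use of integer milliseconds are assumptions made for the example. In a
real player the keyframe positions are only learnt from the segment
the demuxer pushes out.

```python
from bisect import bisect_right


def closest_keyframe(keyframes, pos):
    """Return the last keyframe at or before pos (hypothetical index lookup)."""
    i = bisect_right(keyframes, pos) - 1
    return keyframes[max(i, 0)]


def plan_seeks(keyframes, pos, speedup, window, count, forward=True):
    """Simulate up to `count` rounds of the KEY_UNIT + SEGMENT seek loop.

    All values are in milliseconds. Returns a list of
    (segment_start, segment_stop, segment_base) tuples.
    """
    segments = []
    base = 0  # running time already played out
    for _ in range(count):
        # Stop once the requested position leaves the indexed range.
        if pos < keyframes[0] or pos > keyframes[-1]:
            break
        pos_k = closest_keyframe(keyframes, pos)  # PosK1, PosK2, ...
        stop = pos + window                       # Pos + 500ms
        segments.append((pos_k, stop, base))
        played = stop - pos_k                     # Amount played
        base += played                            # base of the next segment
        step = played * speedup                   # Amount_played * S
        pos = pos_k + step if forward else pos_k - step
    return segments
```

With keyframes every 2000ms, a start position of 2500ms, S = 4 and a
500ms window, the first chunk is 1000ms long instead of the requested
500ms, which is the variable chunk length listed in the Cons below.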
+
+Pros:
+  * No changes required in any elements
+
+Cons:
+  * Logic needs to be (re)implemented in every application
+  * The round trip between element and application for every
+    segment could introduce delays.
+  * Application doesn't know optimal location of keyframes, so can't
+    push out only the requested amount of data (chunks played out will
+    be of variable length).
+  * There's a potential issue where the "next" position the
+    application requests is not beyond the next keyframe, resulting in
+    the demuxer pushing a very big segment again. This would require
+    the application to detect and handle such cases gracefully.
+
+
+
+Proposal : Move logic in demuxers
+---------------------------------
+
+In order to avoid the various "Cons" from option 0, the proposed way
+forward is to move the logic of figuring out which segments to be
+played back into the demuxers themselves and activate that mode if the
+application sends a seek with the SKIP flag enabled.
+
+1) Application sends a seek event with an initial position and rate:
+  Start : Pos
+  Stop  : -1
+  Rate  : R
+  Flags : SKIP
+
+2) Demuxer figures out the optimal initial previous and next
+  keyframes for Pos.
+
+         Pos
+          v
+  ------------------------------------------------------------------
+ ... Kn                Kn+1              Kn+2             Kn+3  ...
+  ------------------------------------------------------------------
+     ^                  ^- PosKn+1
+    PosKn
+
+  Based on the location of previous and next keyframe, it has the
+  option of either outputting just the keyframe, or of also outputting
+  some of the frames from just after that initial keyframe. For the
+  sake of completeness, we will hereafter see the case where we take
+  the second path.
+
+  The demuxer knows where to start outputting from (PosKn), and can
+  also figure out how much it should output:
+
+    Amount to push (Dn) : Next keyframe - Current Keyframe
+                          --------------------------------
+                                     Speedup
+                        : (PosKn+1 - PosKn) / S.
+
+  It will therefore push out the following Segment:
+   Rate  : 1.0
+   Start : PosKn
+   Stop  : PosKn + Dn
+   Time  : PosKn
+   Base  : 0
+
+  And then all the buffers (first one with DISCONT) from PosKn to
+  PosKn + Dn.
+
+        PosKn + Dn
+           v
+  ------------------------------------------------------------------
+ ... Kn////|           Kn+1              Kn+2             Kn+3  ...
+  ------------------------------------------------------------------
+     ^- PosKn
+
+
+  Once the demuxer has pushed all that data, it goes back to
+  calculating what the next chunk is.
+
+    Figure out new current Keyframe : PosKn+1
+    Figure out new next Keyframe    : PosKn+2
+    Figure out new duration to push : Dn+1 : (PosKn+2 - PosKn+1) / S
+
+                          PosKn+1 + Dn+1
+                               v
+  ------------------------------------------------------------------
+ ... Kn                Kn+1////|         Kn+2             Kn+3  ...
+  ------------------------------------------------------------------
+                        ^                 ^- PosKn+2
+                       PosKn+1
+
+  And pushes out the following Segment:
+   Rate  : 1.0
+   Start : PosKn+1
+   Stop  : PosKn+1 + Dn+1
+   Time  : PosKn+1
+   Base  : Dn (previously elapsed duration)
+
+
+  Pros :
+   * Re-use segment-based algorithm already present in demuxers
+   * No re-timestamping of buffers is needed, only requirement is to
+     calculate the proper (updated) base value so that the running
+     time is coherent.
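
The chunk planning described above can be sketched as follows, for the
forward case only. This is a minimal illustration under stated
assumptions: plan_chunks and its dictionary keys are hypothetical
names, positions are integer milliseconds, and the base of each
segment is taken to be the sum of the durations already pushed, which
is how the note above about updating the base value is read here.

```python
def plan_chunks(keyframes, start_index, speedup, count):
    """Plan up to `count` forward chunks starting at keyframes[start_index].

    Each chunk starts on a keyframe and lasts
    (next keyframe - current keyframe) / speedup.
    """
    chunks = []
    base = 0  # running time accumulated by the previous chunks
    last = min(start_index + count, len(keyframes) - 1)
    for n in range(start_index, last):
        pos_k = keyframes[n]                          # PosKn
        d = (keyframes[n + 1] - pos_k) // speedup     # Dn
        chunks.append({
            "rate": 1.0,
            "start": pos_k,
            "stop": pos_k + d,
            "time": pos_k,
            "base": base,
        })
        base += d
    return chunks
```

Summing the chunk durations over a span of the stream gives that span
divided by S (up to integer rounding), so the amount of data sent
downstream per second of wall-clock time stays the same as at normal
rate, even when the keyframe interval varies.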
+
+
+Possible variants and improvements
+----------------------------------
+
+1) Keyframe only
+  The demuxer could just push out the keyframe (and no following
+  frames). This is the only change required. It still needs to
+  account for the duration that keyframe will be displayed for
+  when calculating the base values of the following Segments.
+
+  This might need to be activated for very high rates (> 8) where
+  displaying a few frames offers no visual benefits.
+
+2) Global speedup calculation
+  If the keyframe interval is non-constant, the demuxer would need to
+  figure out a better algorithm for which frames to output and how
+  much to skip (instead of blindly using the next keyframe and the
+  next keyframe interval divided by the speedup).
+
+3) Disable audio-track
+  At higher rates, we might want to disable audio
+  altogether. Since the algorithms already know how much we're
+  skipping, the demuxer could push out GAP events on the audio tracks,
+  ensuring decoders/sinks move ahead properly.