author | Edward Hervey <bilboed@bilboed.com> | 2014-08-27 10:12:23 +0200 |
---|---|---|
committer | Edward Hervey <bilboed@bilboed.com> | 2014-08-27 10:12:23 +0200 |
commit | 0c1c99be429cff83f1e30a94671995bc1b1ae7eb (patch) | |
tree | 68475fb942cce263f60ea3dae34bf997b17a657f | |
parent | b38264a9cb6096f9a92ee45ea0c084c7a877c0c4 (diff) |
design: Proposal for handling SKIP flag in demuxers
-rw-r--r-- | docs/design/part-demuxer-skip.txt | 259 |
1 files changed, 259 insertions, 0 deletions
diff --git a/docs/design/part-demuxer-skip.txt b/docs/design/part-demuxer-skip.txt
new file mode 100644
index 000000000..5dfd0710b
--- /dev/null
+++ b/docs/design/part-demuxer-skip.txt
@@ -0,0 +1,259 @@
+Usage of SKIP seek flag in demuxers
+
+This document describes a potential way forward to support fast
+rates (forward and backward) in demuxers while lowering
+throughput. This concentrates mostly on container formats that have
+full visibility of the stream (i.e. contain an index).
+
+Out of scope:
+  Specifying what actions should be taken by the various elements in
+  the pipeline when the SKIP flag is used (should the decoder only
+  decode I frames ? At what rate ? Should the demuxer skip data or not
+  ? Should the demuxer not output anything on audio streams ? ...).
+
+When dealing with fast forward/reverse rates, it would be ideal to
+have a way for demuxers to only process a smaller amount of data.
+
+Rationale:
+* Directly reduce the amount of data to be processed by downstream
+  elements (decoders, ...) since the demuxer knows where decoding
+  start points (or keyframes) are located.
+* Reduce the amount of data fetched from upstream (file, http
+  location,...), thereby reducing the i/o and only fetching what is
+  needed.
+* Due to knowing the location of keyframes, a demuxer can handle trick
+  mode by skipping very quickly to such keyframes (and potentially a
+  few frames just after), which can be decoded straight away.
+
+The proposed logic is quite similar in intent to the proposal in
+gstreamer/docs/design/part-trickmodes.txt (client side forward
+trickmodes).
+
+The difference is that this proposal tries to:
+* Reduce processing complexity downstream (data output is always
+  sent with a forward rate of 1.0).
+* Avoid any changes in downstream elements.
+* Reduce the amount of changes required in demuxers and other elements.
+
+
+Mode of operation
+-----------------
+
+As stated in the GStreamer core design document, when the SKIP seek
+flag is used, the demuxer can choose to output only a part of the data.
+
+R : Requested rate (can be negative)
+S : Absolute value of R (always positive, called the Speedup).
+
+The global idea is to:
+1) Output at most the same average amount of data downstream to be
+   decoded in realtime (There are still <fps> frames decoded for each
+   walltime second, regardless of the rate).
+2) Fetch from upstream the same average bitrate. This ensures that if
+   the file was readable at normal rate from a local or remote storage,
+   it can still be read at high rate.
+
+The first section exposes a way to do this from the application side,
+and then follows the actual proposal.
+
+
+Option 0 : Achievable currently without any modification
+--------------------------------------------------------
+
+One way to handle this is just to ... skip through the pipeline from
+the application.
+
+In order to achieve this, a player application (controlling a
+GStreamer pipeline) _could_ send on a regular basis a series of rate
+1.0 non-flushing KEY_UNIT SEGMENT seeks for small intervals (say
+500ms). Note that this is a feature already supported by existing
+GStreamer elements.
+
+  KEY_UNIT : We want to go to keyframes (decoding start point) so that
+             decoding can happen as quickly as possible.
+  SEGMENT : We want to be informed asynchronously when the demuxer is
+            done handling the requested seek range so that we can
+            calculate and trigger the next range.
+  non-flushing (except for the initial seek) : We want the requested
+            segments to be decoded fully (without flushing out the
+            pending data).
+
+
+1) Player requests initial seek :
+   Start : Pos
+   Stop : Pos + 500ms
+   Rate : 1.0
+   Flags : (FLUSHING on first time) + KEY_UNIT + SEGMENT
+
+2) Demuxer fetches the closest keyframe (PosK1) and pushes out the
+following segment:
+   Segment
+     Rate : 1.0
+     Start : PosK1
+     Stop : Pos + 500ms
+     Time : PosK1
+     Base : 0
+
+   Note that the duration is potentially bigger than the requested seek
+   duration ( Pos + 500ms - PosK1 >= 500ms).
+
+3) When the demuxer has pushed out that segment and relevant data it
+will emit on the bus a SEGMENT_DONE message with the end position.
+
+4) The player knows how much data was pushed out (from SEGMENT_START
+and SEGMENT_DONE messages), so it can then calculate by how much it
+should move forward:
+
+   Amount played: (Pos + 500ms) - PosK1
+     This is the amount of data being played back (and potentially
+     queued). We now want to skip forward.
+
+   Next pos : Pos2 = PosK1 +/- (Amount_played * S)
+     By this, we are essentially moving forward/backward by the
+     requested speedup.
+
+5) The player requests a new seek :
+   Start : Pos2
+   Stop : Pos2 + 500ms
+   Rate : 1.0
+   Flags : KEY_UNIT + SEGMENT
+
+6) Demuxer fetches closest keyframe (PosK2), and pushes a segment
+(taking into account previously played data):
+   Segment
+     Rate : 1.0
+     Start : PosK2
+     Stop : Pos2 + 500ms
+     Time : PosK2
+     Base : Pos + 500ms - PosK1
+
+7) After playback, demuxer pushes SEGMENT_DONE
+8) ....
+
+Pros:
+ * No changes required in any elements
+
+Cons:
+ * Logic needs to be (re)implemented in every application
+ * The back-and-forth between element and application for every
+   segment could potentially cause delays.
+ * The application doesn't know the optimal location of keyframes, so
+   it can't push out only the requested amount of data (chunks played
+   out will be of variable length).
+ * There's a potential issue where the "next" position the
+   application requests is not beyond the next keyframe, resulting in
+   the demuxer pushing a very big segment again. This would require
+   the application to detect and handle such cases gracefully.
+
+
+
+Proposal : Move logic in demuxers
+---------------------------------
+
+In order to avoid the various "Cons" from option 0, the proposed way
+forward is to move the logic of figuring out which segments to play
+back into the demuxers themselves, and to activate that mode if the
+application sends a seek with the SKIP flag enabled.
+
+1) Application sends a seek event with an initial position and rate:
+   Start : Pos
+   Stop : -1
+   Rate : R
+   Flags : SKIP
+
+2) Demuxer figures out the optimal initial previous and next
+   keyframes for Pos.
+
+              Pos
+               v
+  ------------------------------------------------------------------
+  ...   Kn             Kn+1           Kn+2           Kn+3    ...
+  ------------------------------------------------------------------
+        ^              ^- PosKn+1
+        PosKn
+
+   Based on the location of previous and next keyframe, it has the
+   option of either outputting just the keyframe, or of also outputting
+   some of the frames from just after that initial keyframe. For the
+   sake of completeness, the rest of this section covers the second
+   option.
+
+   The demuxer knows where to start outputting from (PosKn), and can
+   also figure out how much it should output:
+
+   Amount to push (Dn) : Next keyframe - Current Keyframe
+                         --------------------------------
+                                     Speedup
+                       : (PosKn+1 - PosKn) / S.
+
+   It will therefore push out the following Segment:
+     Rate : 1.0
+     Start : PosKn
+     Stop : PosKn + Dn
+     Time : PosKn
+     Base : 0
+
+   And then all the buffers (first one with DISCONT) from PosKn to
+   PosKn + Dn.
+
+              PosKn + Dn
+              v
+  ------------------------------------------------------------------
+  ...   Kn////|        Kn+1           Kn+2           Kn+3    ...
+  ------------------------------------------------------------------
+        ^- PosKn
+
+
+   Once the demuxer has pushed all that data, it goes back to
+   calculating what the next chunk is.
+
+   Figure out new current Keyframe : PosKn+1
+   Figure out new next Keyframe : PosKn+2
+   Figure out new duration to push : Dn+1 : (PosKn+2 - PosKn+1) / S
+
+                               PosKn+1 + Dn+1
+                               v
+  ------------------------------------------------------------------
+  ...   Kn             Kn+1////|      Kn+2           Kn+3    ...
+  ------------------------------------------------------------------
+                       ^              ^- PosKn+2
+                       PosKn+1
+
+   And pushes out the following Segment:
+     Rate : 1.0
+     Start : PosKn+1
+     Stop : PosKn+1 + Dn+1
+     Time : PosKn+1
+     Base : Dn (previously elapsed duration)
+
+
+  Pros :
+   * Re-use segment-based algorithm already present in demuxers
+   * No re-timestamping of buffers is needed; the only requirement is
+     to calculate the proper (updated) base value so that the running
+     time is coherent.
+
+
+Possible variants and improvements
+----------------------------------
+
+1) Keyframe only
+   The demuxer could just push out the keyframe (and no following
+   frames). This is the only change required. It still needs to add
+   the duration that keyframe will be displayed for to the base values
+   of the following Segments.
+
+   This might need to be activated for very high rates (> 8) where
+   displaying a few frames offers no visual benefits.
+
+2) Global speedup calculation
+   If the keyframe interval is non-constant, the demuxer would need to
+   figure out a better algorithm for which frames to output and how
+   much to skip (instead of blindly using the next keyframe and the
+   next keyframe interval divided by the speedup).
+
+3) Disable audio-track
+   At some higher rates, we might want to disable audio
+   altogether. Since the algorithms already know how much we're
+   skipping, the demuxer could push out GAP events on the audio tracks,
+   ensuring decoders/sinks move ahead properly.
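
Note (not part of the patch): the arithmetic described in the design text can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not GStreamer code: `Segment`, `next_seek_position` and `plan_skip_segments` are hypothetical names, positions are plain floats in seconds, only forward rates are planned, and each chunk's base is assumed to be the sum of the durations already pushed so that running time stays continuous.

```python
import bisect
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """Hypothetical stand-in for the segment fields listed in the text."""
    rate: float
    start: float
    stop: float
    time: float
    base: float


def next_seek_position(pos: float, pos_k: float, interval: float,
                       rate: float) -> float:
    """Option 0, step 4: Pos2 = PosK1 +/- (Amount_played * S)."""
    amount_played = (pos + interval) - pos_k
    direction = 1.0 if rate >= 0 else -1.0
    return pos_k + direction * amount_played * abs(rate)


def plan_skip_segments(keyframes: List[float], pos: float,
                       rate: float) -> List[Segment]:
    """Proposal: one chunk per keyframe interval, shortened by the speedup.

    Assumption: forward playback only (rate >= 1.0); reverse rates are
    not handled in this sketch.
    """
    if rate < 1.0:
        raise ValueError("this sketch only plans forward rates >= 1.0")
    speedup = abs(rate)
    # Index of the last keyframe at or before the requested position.
    n = max(bisect.bisect_right(keyframes, pos) - 1, 0)
    segments = []
    base = 0.0  # running time already covered by earlier chunks
    while n + 1 < len(keyframes):
        start = keyframes[n]
        duration = (keyframes[n + 1] - start) / speedup  # Dn
        segments.append(Segment(rate=1.0, start=start,
                                stop=start + duration,
                                time=start, base=base))
        base += duration
        n += 1
    return segments


if __name__ == "__main__":
    for seg in plan_skip_segments([0.0, 2.0, 4.0, 6.0], pos=1.0, rate=4.0):
        print(seg)
```

With keyframes every 2 seconds and a rate of 4.0, each chunk lasts 0.5 seconds, so 6 seconds of media are covered by 1.5 seconds of output pushed at rate 1.0.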