Age | Commit message (Collapse) | Author | Files | Lines |
|
Otherwise it will cause the incorrect intra-prediction for encoding on
Broadwell.
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
|
|
The optimization by which the VA surface storage is deallocated after
it is displayed and not used for reference or vaDeriveImage() purposes
cannot be implemented safely. We need to honour explicit lifetimes
defined by the upper codec layer.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
|
|
Keep the VA surface storage live until it is explicitly scheduled
for destruction through vaDestroySurfaces() interface. Otherwise,
subsequent vaPutSurface() calls would have no effect.
This fixes various use cases like: display of interlaced frames
that are not marked for reference, multiple rendering to Pixmap
for EXT_texture_from_pixmap and more precisely interlaced streams.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
|
|
H.264 MVC decoding support is defined as follows:
- Stereo High profile on Sandybridge and newer ;
- Multiview High profile on Haswell and newer.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
|
|
Fill and submit MFX_AVC_PICID_STATE commands to Gen7.5+ hardware.
This optimizes the management of the DPB as the binding array can
now contain entries in any order. This also makes it possible to
support H.264 MultiView High profiles, with any particular number
of views.
v2: added more comments for clarity, removed an assert [Yakui]
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
|
|
Add new avc_find_picture() helper function to search for a VAPictureH264
struct based on the supplied VA surface id.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
|
|
If the RefPicListX[] entry has no valid picture_id associated to it,
then set the resulting state to 0xff. If that entry has no surface
buffer storage either, then compose a valid state that maps to the
first item in the reference frames list, as mandated by the PRM.
v2: dropped the superfluous "found" variable [Yakui]
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
|
|
Simplify and optimize the update process of the reference frame store.
Use less iterations to look up existing objects. Use a cache to store
the free'd slots.
Prerequisite: the reference_objects[] array was previously arranged in
a way that the element at index i is exactly the object_surface that
corresponds to the VA surface identified by the VAPictureH264.picture_id
located at index i in the ReferenceFrames[] array.
Theory of operations:
1. Obsolete entries are removed first, i.e. entries in the internal DPB
that no longer have a match in the supplied ReferenceFrames[] array.
That obsolete entry index is stored in a local cache: free_slots[].
2. This cache is completed with entries considered as "invalid" or "not
present", sequentially while traversing the frame store for obsolete
entries. At the end of this removal process, the free_slots[] array
represents all possible indices in there that could be re-used for
new reference frames to track.
3. The list of ReferenceFrames[] objects is traversed for new entries
that are not already in the frame store. If an entry needs to be
added, it is placed at the index obtained from the next free_slots[]
element. There is no need to traverse the frame store array again,
the next available slot can be known from that free_slots[] cache.
v2: dropped the superfluous "found" variable [Yakui]
v3: renamed "free_slots" array to "free_refs", which now holds
GenFrameStore entries
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
|
|
Sometimes, a dummy frame comes from the codec layer and it is used
as a reference, per the comment in the existing code. Even though
this looks suspicious, keep this criterion but make sure to try
allocating the VA surface, if needed, earlier in the function that
sanity checks the parameters for decoding the current frame.
This makes it possible to fail at a much earlier time, and actually
make it possible to return a sensible error code to the upper layer.
Also fix the reference_objects[] array elements to be an exact 1:1
match for ReferenceFrames[] array elements, including possible but
unlikely holes in it. The former array holds object_surface structs
corresponding to the VA surfaces present in the ReferenceFrames[]
array and identified by VAPictureH264.picture_id.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
|
|
Drop the optimization whereby surfaces that are no longer marked as
reference and that were already displayed are to be destroyed. This
is wrong mainly for two reasons:
1. The surface was displayed... once but it may still be needed for
subsequent operations like displaying it again, using it for a
transcode pipeline (encode) for instance, etc.
2. The new set of ReferenceFrames[] correspond to the active set of
reference frames used for decoding the current slice. In presence
of Multiview Coding (MVC), that could correspond to the current
view, in view order index, but the surface may still be needed
for decoding the next view with the same view_id, while also
decoding other views with another set of reference frames for them.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
|
|
|
|
Broadwell now uses a unique DMV buffer, irrespective of any field
coding mode. The dmv_buffer is not used, so it doesn't need to be
allocated at all.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
(cherry picked from commit 60ea472b116a2e245fa8579355c47eb501bfa20a)
|
|
Don't allocate tiled surfaces on Ironlake platforms and earlier, stick
to linear surfaces.
This is a regression from 6d76944.
Reported-by: Haihao Xiang <haihao.xiang@intel.com>
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
(cherry picked from commit 628c958f4881900548ed80be1286060db68e0115)
|
|
Zero initialize the packed raw data index array and
packed slice header index array during each preallocation.
Signed-off-by: Sreerenj Balachandran <sreerenj.balachandran@intel.com>
Reviewed-by: Zhao, Yakui <yakui.zhao@intel.com>
|
|
VA_INTEL_DEBUG_ASSERT decides assert() is enabled or not
VA_INTEL_DEBUG_BENCH decides skipping swapbuffer in dri output
|
|
Optimize support for grayscale surfaces in two aspects: (i) space
by only allocating the luminance component ; (ii) speed by avoiding
initialization of the (now inexistent) chrominance planes.
Keep backward compatibility with older codec layers that only
supported YUV 4:2:0 and not grayscale formats properly.
v2: fix check for extra H.264 chroma formats [Haihao]
v3: add h264 chroma formats to SNB+ and not { ILK, IVB+ }
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
|
|
Add new avc_ensure_surface_bo() helper function to factor out the
allocatiion and initialization processes of the reconstructed VA
surface buffer stores.
Keep preferred native format (NV12) and initialize chroma values
to 0.0 (0x80) when needed for "fake" grayscale (Y800) surfaces
implemented on top of existing NV12.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
|
|
If the hardware supports JPEG decoding, then we have to expose the
right set of chroma formats for the output (decoded) VA surface. In
particular, we could support YUV 4:0:0, 4:1:0, 4:2:2 and 4:4:4.
v2: export support for YUV 4:0:0 (grayscale) too [Haihao]
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
|
|
Only validate the user-defined chroma format (VAConfigAttribRTFormat)
attribute, if any. Don't override it. i.e. append a pre-defined value
only if it was not defined by the user beforehand.
Propertly return VA_STATUS_ERROR_UNSUPPORTED_RT_FORMAT if the supplied
chroma format is not supported.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
|
|
Factor out code to validate profile/entrypoint per the underlying
hardware capabilities. Also fix vaGetConfigAttributes() to really
validate the profile/entrypoint pair.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
|
|
Introduce a new i965_destroy_surface_storage() helper function to
unreference the underlying GEM buffer object, and any associated
private data, if any.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
|
|
Fix size of the allocated buffer used to represent grayscale (Y800)
surfaces. Only the luminance component is needed, thus implying a
single plane.
Likewise, update render routines to only submit the first plane.
The existing render kernels readily only care about that single
plane.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
|
|
Scaling is done on each 16x16 block. The shader for scaling
might write pixels out-of-rectangle if the rectangle width/height
isn't aligned to 16.
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
|
|
v2: bpp[] is in unit of bits
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
|
|
and hold all supported fourcc in an array
v2: bpp[] in bit and fix the vertical factor for 411P (Yakui)
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
|
|
https://bugs.freedesktop.org/show_bug.cgi?id=79065
The regression is caused by commit 42258e1
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
|
|
Sometimes pending datas are added in slice data buffer, however
HW requires slice data length excludes pending datas, otherwise
the behavior is undefined
https://bugs.freedesktop.org/show_bug.cgi?id=77041
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
|
|
Signed-off-by: Sreerenj Balachandran <sreerenj.balachandran@intel.com>
Reviewed-by: Zhao, Yakui <yakui.zhao@intel.com>
|
|
slice_header data
Otherwise the slice qp is inconsistent and the encoding is incorrect.
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
|
|
Under some encoding scenario, the user hopes to generate the packed slice
header data by themself and then the driver can insert the passed slice
header packed data into the coded clip.
1.The VA_ENC_PACKED_HEADER_SLICE flag is exported and it is treated as optional.
This is to say: if packed slice header data is passed, it will be
inserted directly. If no packed slice header data is passed, the driver will
help to generate it.
2.Another restriction is that the packed slice header data is inserted after
the packed rawdata for one slice. That is to say: If it needs to insert the
packed rawdata and slice header data, the packed rawdata will be inserted
firstly(This is handled by the driver).
Signed-off-by: Zhao, Yakui <yakui.zhao@intel.com>
|
|
packed header type/data
After adding the support of inserting the packed rawdata, more group of packed header data
can be passed. In order to insert the packed rawdata correctly, the packed header type/
data should be paired.
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
|
|
Under some encoding scenario, the user-space application hopes that the driver
can insert the passed packed rawdata into the coded clip. This is to allow the
insertion of packed rawdata passed from user. As the position of packed rawdata
is related with the slice. So the following restrictions are added:
1. the packed rawdata header type/data should be paired.
2. the packed rawdata data is inserted by following the passed order
3. the packed rawdata header type/data is split by using VAEncSliceParameterBuffer.
That is to say: The packed rawdata for slice 0 should be passed before the first
VAEncSliceParameterBuffer. After one VAEncSliceParameterBuffer is parsed,
the subseuquent packed rawdata is for another new slice. The subsequent
packed rawdata after the last VAEncSliceParameterBuffer is ignored.
4. it does not change the rule for the packed data of SPS/PPS/MISC type.
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
|
|
Under some encoding scenario, the user-space application hopes that the driver
can insert the passed packed rawdata into the coded clip. But the insertion of
packed rawdata is related with the slice. So some data structures are added so
that it can store how the packed rawdata is inserted into the coded clip
per-slice.
Signed-off-by: Zhao, Yakui <yakui.zhao@intel.com>
|
|
of HW skip bytes
When the packed header data from user is inserted into the coded clip, it uses
the hacked code to check the number of HW skip emulation bytes. This is wrong.
So fix it.
Of course if the packed header data is generated by the driver, it is
unnecessary to check it and it can still use the pre-defined number of HW
skip bytes.
V1->V2: Based on Gwenole's comment more nal_unit_type is added.
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
|
|
Signed-off-by: Sebastian Ramacher <sramacher@debian.org>
Reviewed-by: Zhao, Yakui <yakui.zhao@intel.com>
|
|
Signed-off-by: Sebastian Ramacher <sramacher@debian.org>
Reviewed-by: Zhao, Yakui <yakui.zhao@intel.com>
|
|
Set the right surface states for reference, STMM and output surface,
fix the shader as well
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
Tested-By: Simon Farnsworth <simon.farnsworth@onelan.co.uk>
|
|
Some MPEG-2 videos set progressive_frame to 1 and set
frame_pred_frame_dct to 0, which is not conformed to MPEG-2 spec.
bottom field may be used to form prediction if frame_pred_frame_dct is
0. Previously the bottom field is excluded from the frame store list
https://bugs.freedesktop.org/show_bug.cgi?id=73424
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
|
|
pitch must be 64 at least for linear surface for most functions on IVB/HSW/BDW
such VEBOX, Data port media read/write
https://bugs.freedesktop.org/show_bug.cgi?id=72522
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
|
|
Directly check the flag of has_vpp in codec_info
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
|
|
If segmentation is enabled, then the segmentation map shall be live
across frames until the current frame updates the segment ids. This
means that the driver needs to maintain the segmentation map buffer
allocation and enable writes (resp. reads) whenever necessary.
This fixes decoding of 00-comprehensive-010.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
(cherry picked from commit 61fbb1bba1ad8a41ffae4fd1ba90391adf819b6e)
|
|
Ivy/Haswell/BDW
Currently zero is written to alpha channel when doing the conversion
from NV12 to RGBA(BGRA), which affects the following the rendering operation.
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
|
|
It is always true or false
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
|
|
It is to reduce the usage of IS_GENxxx() as well.
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
|
|
It is to reduce the usage of IS_GENxxx()
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
|
|
Now it can directly use the information in intel_device_info instead of
checking the pci id.
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
|
|
Instead directly use the value stored in intel_device_info
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
|
|
Instead directly use the value stored in intel_device_info
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
|
|
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
|
|
To store statically known device information
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
|