Age | Commit message (Collapse) | Author | Files | Lines |
|
1. Enable compiler_argument_structure_indirect.
2. Add compiler_argument_structure_indirect, which has select address and load argument instruction.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@intel.com>
|
|
v2: correct kernel to be suitable for simd_width both 8 and 16
Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@intel.com>
|
|
it is defined in https://www.khronos.org/registry/cl/extensions/intel/cl_intel_subgroups.txt
Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
|
|
Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
We add the test case for uniform when doing the bswap.
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
To generate SPIR binary, please refer to the page
https://github.com/KhronosGroup/SPIR.
For llvm3.2, the command is "clang -cc1 -emit-llvm-bc -triple
spir-unknown-unknown -cl-std=CL1.2 -include opencl_spir.h
compiler_ceil.cl -o compiler_ceil32.spir"
For llvm3.5, the option -cl-kernel-arg-info is required,
and option -fno-builtin is required to avoid warning.
v2: add missing load_program_from_spir.cpp file.
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
the SPIR header file requirs these functions to be overlable.
(https://github.com/KhronosGroup/SPIR-Tools/blob/master/headers/opencl_spir.h)
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
clCreateImageFromLibvaIntel.
This example reads a source nv12 file to a VASurface, and creates a
target
VASurface. Then creates corresponding cl image objects from them. After
using ocl to do mirror effect post-processing on source VASurface,
target
VASurface is shown on screen by default.
Code of loading nv12 file to VASurface are referenced from
libva/test/encode/avcenc.c.
v2:
Delete 1920x1080.nv12 and 640x480.nv12 because of large size, add
256_128.nv12 as default test image.
v3:
1. Re-org files: add libva as a submodule then use display related
files.
2. Show result on screen by default instead of saving as a file.
3. Fix warnings.
v4: Fix whitespace format warnings.
v5:
1. Modify upload_nv12_to_surface to read a nv12 file and then upload
it to an NV12 VASurface. Also modify store_surface_to_nv12.
2. Change the cl post-processing kernel from gray effect to mirror
effect, which make demo cooler.
3. Minor fix of other problems.
v6: Remove unnecessary OUTPUT_NV12_DEFAULT related code.
Signed-off-by: Chuanbo Weng <chuanbo.weng@intel.com>
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Reviewed-by: "Guo, Yejun" <yejun.guo@intel.com>
|
|
should use clz function instead of __builtin_clz.
add zero input check.
v2: add signed type test. remove redundant case.
v3: remove printf.
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
this kernl calls the llvm __builtin_clz to generate llvm.clz function
then call the gen instruction clz, different from the test
compiler_clz_int, which use the fbh to implement.
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Add there two benchmark to compare the buffer and image performance
V2: init the coord before read image.
V3: Correct the image's width and buffer's read index.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
two bswap call in one block would trigger nsetc failures.
the fail was fixed in backend already, just update the utest.
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: Ruiling Song <ruiling.song@intel.com>
|
|
Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
As we can't find the original license of these test cases, we
have to remove them from beignet's unit test cases.
Reported by "Rebecca N. Palmer" <rebecca_palmer@zoho.com>.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
|
|
This reverts commit 9e236b18542f2564e399bf13d4d1fbcc48a5ec9f.
|
|
Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Tested-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
|
|
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
This test case shows that assignment operation in if block seems
does not affect lvalue.
Signed-off-by: Chuanbo Weng <chuanbo.weng@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
use MACRO to define the corresponding function.
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
According to https://bugs.debian.org/758442, we should not use
Len(n)a standard test image in our package. I just select a picture
took by myself. Thanks Rebecca for pointing this out.
v2:
forgot to add sample.bmp.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
Change box_blur_image to read integer type surface thus
it could be totally identical to the box_blur thus they
can share the same reference image.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
this llvm instrincs bswap function is generated by calling
__builtin_bswap.
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
as llvm couldn't recognize the pattern of usub overflow, this usub
with is generated by calling the intrinsic function __builtin_usub_overflow;
also this type of uadd intrinsic funtion couldn't support short/byte type
overflow, we choose another way for the uadd kernel to generate short/byte
overflow.
will send patch to llvm later to fix the 2 issues.
v2: split the patch.
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
this case only runs for uadd_with_over_flow function so far.
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
The static analyse for printf can not totally work
when the printf inst is within the loop and the loop
can not be unrolled. This causes the printf just to
print one info for a loop and to lose all the others.
We now increment the exec number every time the printf
inst is triggered. The number is stored for output all
the message later.
The problem is that we can not caculate the exact loops
number for each printf inst. The wrong loop number will
cause the data overwritten. We now assume all the printf
inst are in loop and store the data like this:
| PRINTF1_DATA PRINTF2_DATA ... | PRINTF1_DATA PRINTF2_DATA ... | ...
| DATA_LOOP_ONE | DATA_LOOP_TWO | ...
Although this may cause some space wasted.
Another problem is that we need to decide the size of printf buffer
because the loop upbound can not be caculated. We just set
it yo 1M for small info slot request and 4M for big one.
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Tested-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
v2: add all types to test.
v3: fix signed type count bits error.
Signed-off-by: Luo Xionghu <xionghu.luo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Gen provide tm0 register for intra-kernel profiling.
Here we provide an API __gen_ocl_get_timestamp() to return
the timestamp in TM.
The return type is defined as:
struct time_stamp {
ulong tick;
uint event;
};
'tick' is a 64bit time tick. 'event' stores a value which means
whether a tmEvent has occured (non-zero) or not (0). tmEvent includes
time-impacting event such as context switch or frequency change
since last time tm0 was read.
I add a sample in the kernels/compiler_time_stamp.cl. Hope it
would help you understand how to use it.
V2:
Introduce ir::ARFRegister to avoid directly use of nr/subnr in Gen IR.
Rename __gen_ocl_extract_reg to __gen_ocl_region.
Rename beignet_get_time_stamp to __gen_ocl_get_timestamp.
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
V2:
Replace all the long and ulong to int64_t
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
v2:
refine the benchmark case and don't mix it with normal
unit test cases.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
|
|
If we use 3-component vector in a union, it may introduce
some complex constant expression as below:
float bitcast (i32 trunc (i128 bitcast (<4 x i32> <i32 1065353216, i32 1073741824, i32 1077936128, i32 undef> to i128) to i32) to float).
To test the constant expression processing function.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
|
|
Previously, we simply map 2G surface for memory access,
which has obvious security issue, user can easily read/write graphics
memory that does not belong to him. To prevent such kind of behaviour,
We bind each surface to a dedicated bti. HW provides automatic
bounds check. For out-of-bound write, it will be ignored. And for read
out-of-bound, hardware will simply return zero value.
The idea behind the patch is for a load/store instruction, it will search
through the LLVM use-def chain until finding out where the address
comes from. Then the bti is saved in ir::Instruction and used for
the later code generation. And for mixed pointer case, a load/store
will access more than one bti.
To simplify some code, '0' is reserved for constant address space,
'1' is reserved for private address space. Other btis are assigned
automatically by backend.
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Yang Rong <rong.r.yang@intel.com>
|
|
Still can't handle the sampler_t which is not used actually.
Access qualifier seems broken with llvm 3.3.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: Yang Rong <rong.r.yang@intel.com>
|
|
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Luo <xionghu.luo@intel.com>
Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
|
|
v2:
should not released the buffer which is handled by the utest helper.
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
|