path: root/kernels
Age, Commit message, Author, Files, Lines
2015-05-15Add struct argument indirect load test.Yang Rong2-2/+20
1. Enable compiler_argument_structure_indirect. 2. Add compiler_argument_structure_indirect, which has select address and load argument instruction. Signed-off-by: Yang Rong <rong.r.yang@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@intel.com>
2015-05-13add utest for intel_sub_group_shuffleGuo Yejun1-0/+18
v2: correct kernel to be suitable for simd_width both 8 and 16 Signed-off-by: Guo Yejun <yejun.guo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@intel.com>
2015-05-12rename __gen_ocl_simd_any/all to sub_group_any/allGuo Yejun4-27/+27
it is defined in https://www.khronos.org/registry/cl/extensions/intel/cl_intel_subgroups.txt Signed-off-by: Guo Yejun <yejun.guo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2015-05-12rename __gen_ocl_get_simd_id/size to get_sub_group_id/sizeGuo Yejun4-13/+13
Signed-off-by: Guo Yejun <yejun.guo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com> Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
2015-04-24add utest for __gen_ocl_get_simd_idGuo Yejun1-0/+8
Signed-off-by: Guo Yejun <yejun.guo@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2015-04-24add utest for __gen_ocl_get_simd_sizeGuo Yejun1-0/+5
Signed-off-by: Guo Yejun <yejun.guo@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2015-03-09Modify the utest case for bswap.Junyan He1-10/+14
We add the test case for uniform when doing the bswap. Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2015-03-09add utest for load spir binary.Luo Xionghu1-0/+0
To generate SPIR binary, please refer to the page https://github.com/KhronosGroup/SPIR. For llvm3.2, the command is "clang -cc1 -emit-llvm-bc -triple spir-unknown-unknown -cl-std=CL1.2 -include opencl_spir.h compiler_ceil.cl -o compiler_ceil32.spir" For llvm3.5, the option -cl-kernel-arg-info is required, and option -fno-builtin is required to avoid warning. v2: add missing load_program_from_spir.cpp file. Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2015-03-09change the workitem related api to OVERLOADABLE.Luo Xionghu2-4/+4
the SPIR header file requires these functions to be overloadable. (https://github.com/KhronosGroup/SPIR-Tools/blob/master/headers/opencl_spir.h) Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2015-02-06Add example to show libva buffer sharing with extension clCreateImageFromLibvaIntel.Chuanbo Weng1-0/+24
This example reads a source nv12 file to a VASurface, and creates a target VASurface. It then creates corresponding cl image objects from them. After using ocl to do mirror effect post-processing on the source VASurface, the target VASurface is shown on screen by default. The code for loading an nv12 file to a VASurface is taken from libva/test/encode/avcenc.c. v2: Delete 1920x1080.nv12 and 640x480.nv12 because of their large size, add 256_128.nv12 as the default test image. v3: 1. Re-org files: add libva as a submodule then use display related files. 2. Show result on screen by default instead of saving as a file. 3. Fix warnings. v4: Fix whitespace format warnings. v5: 1. Modify upload_nv12_to_surface to read an nv12 file and then upload it to an NV12 VASurface. Also modify store_surface_to_nv12. 2. Change the cl post-processing kernel from gray effect to mirror effect, which makes the demo cooler. 3. Minor fixes of other problems. v6: Remove unnecessary OUTPUT_NV12_DEFAULT related code. Signed-off-by: Chuanbo Weng <chuanbo.weng@intel.com> Signed-off-by: Zhao Yakui <yakui.zhao@intel.com> Reviewed-by: "Guo, Yejun" <yejun.guo@intel.com>
2015-01-28fix clz utest issue.Luo Xionghu3-12/+6
should use clz function instead of __builtin_clz. add zero input check. v2: add signed type test. remove redundant case. v3: remove printf. Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
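The zero input check matters because the two functions differ at zero: the compiler builtin `__builtin_clz` is undefined for a zero argument, while OpenCL C's `clz` returns the bit width of the type (stated explicitly from OpenCL 2.0 on). A minimal C sketch of a host-side reference follows; the helper name `ref_clz32` is hypothetical and is not taken from the utest.

```c
#include <stdint.h>

/* Hypothetical host-side reference for a 32-bit clz (name is illustrative). */
static int ref_clz32(uint32_t x)
{
    if (x == 0)
        return 32;  /* explicit zero case; __builtin_clz(0) would be undefined */

    int n = 0;
    /* Shift left until the most significant bit is set, counting the shifts. */
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}
```

A test could compare the kernel output against this reference for inputs such as 0, 1 and 0x80000000.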
2015-01-20Add test case for long bitcast.Junyan He1-0/+47
Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2015-01-20Add long NOT test case.Junyan He1-0/+6
Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2015-01-20Add test case for i64 div and rem.Junyan He1-0/+12
Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2015-01-20Add test case for long mul_sat and mul_hiJunyan He1-0/+19
Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2015-01-15add clz(count leading zero) utest.Luo Xionghu1-0/+12
this kernel calls the llvm __builtin_clz to generate the llvm.clz function and then calls the gen instruction clz, unlike the test compiler_clz_int, which uses fbh to implement it. Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2015-01-09Add read buffer/image benchmark.Yang Rong2-0/+40
Add these two benchmarks to compare the buffer and image performance. V2: init the coord before read image. V3: Correct the image's width and buffer's read index. Signed-off-by: Yang Rong <rong.r.yang@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-12-04refine bswap utest to cover nsetc fail cases.Luo Xionghu1-0/+1
two bswap call in one block would trigger nsetc failures. the fail was fixed in backend already, just update the utest. Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-12-04utests: Add const private array initialization test.Ruiling Song1-0/+9
Signed-off-by: Ruiling Song <ruiling.song@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-12-02add utest of CL_MEM_ALLOC_HOST_PTRGuo Yejun1-0/+6
Signed-off-by: Guo Yejun <yejun.guo@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2014-12-01utests: Add one case to test negative index array access.Zhigang Gong1-0/+9
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: Ruiling Song <ruiling.song@intel.com>
2014-11-27add test for clCreateImageFromLibvaIntelGuo Yejun1-0/+8
Signed-off-by: Guo Yejun <yejun.guo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-11-11utests: remove all shader toy test cases.Zhigang Gong17-1050/+0
As we can't find the original license of these test cases, we have to remove them from beignet's unit test cases. Reported by "Rebecca N. Palmer" <rebecca_palmer@zoho.com>. Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
2014-11-11Revert "add test for clCreateImageFromLibvaIntel"Zhigang Gong1-8/+0
This reverts commit 9e236b18542f2564e399bf13d4d1fbcc48a5ec9f.
2014-11-10add test for clCreateImageFromLibvaIntelGuo Yejun1-0/+8
Signed-off-by: Guo Yejun <yejun.guo@intel.com> Tested-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-11-07add test for cl buffer created with CL_MEM_USE_HOST_PTRGuo Yejun1-0/+6
Signed-off-by: Guo Yejun <yejun.guo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com> Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2014-11-06utests: add a test to trigger cl_float3 bug in clSetKernelArg.Ruiling Song1-0/+20
Signed-off-by: Ruiling Song <ruiling.song@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-11-06utest: add new test that trigger an assignment operation bug in if.Chuanbo Weng1-0/+12
This test case shows that an assignment operation in an if block does not seem to affect the lvalue. Signed-off-by: Chuanbo Weng <chuanbo.weng@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-11-05fix bswap kernel function type issue.Luo Xionghu1-5/+10
use MACRO to define the corresponding function. Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-11-04utests: replace the nondistributable picture.Zhigang Gong4-0/+0
According to https://bugs.debian.org/758442, we should not use the Len(n)a standard test image in our package. I just selected a picture taken by myself. Thanks Rebecca for pointing this out. v2: forgot to add sample.bmp. Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2014-11-04utest: change the box_blur_image to be identical to box_blur.Zhigang Gong1-3/+5
Change box_blur_image to read integer type surface thus it could be totally identical to the box_blur thus they can share the same reference image. Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2014-11-04add utest function bswap.Luo Xionghu1-0/+7
this llvm intrinsic bswap function is generated by calling __builtin_bswap. Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
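As a rough illustration of what such a test checks, the C sketch below compares a compiler byte-swap builtin against a hand-written swap. It assumes the 32-bit variant `__builtin_bswap32` (available in GCC and Clang); the commit message does not say which widths the kernel covers, and both function names are hypothetical.

```c
#include <stdint.h>

/* Reference byte swap written with shifts and masks. */
static uint32_t bswap32_manual(uint32_t x)
{
    return (x >> 24)
         | ((x >> 8) & 0x0000FF00u)
         | ((x << 8) & 0x00FF0000u)
         | (x << 24);
}

/* Same operation through the compiler builtin (GCC/Clang). */
static uint32_t bswap32_builtin(uint32_t x)
{
    return __builtin_bswap32(x);
}
```

Both functions should agree for every input, which is the property a host-side check can assert.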
2014-11-03add utest for llvm intrinsic call usub_with_overflow function.Luo Xionghu1-15/+40
as llvm couldn't recognize the pattern of usub overflow, this usub_with_overflow is generated by calling the intrinsic function __builtin_usub_overflow; also this type of uadd intrinsic function couldn't support short/byte type overflow, so we choose another way for the uadd kernel to generate short/byte overflow. will send patch to llvm later to fix the 2 issues. v2: split the patch. Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
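The two approaches mentioned here can be sketched in plain C. The 32-bit case uses `__builtin_usub_overflow` as named in the commit message (a GCC/Clang builtin); the 16-bit add is detected by hand. How the utest kernel actually produces short/byte overflow is not shown in this log, so the manual check below is only one plausible way, and both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* 32-bit unsigned subtract: the builtin reports wrap-around directly. */
static bool usub32_overflows(unsigned a, unsigned b, unsigned *out)
{
    return __builtin_usub_overflow(a, b, out);
}

/* 16-bit unsigned add: an assumed manual check, not the kernel's code.
 * The sum is computed in int, truncated to 16 bits, and an overflow
 * shows up as a result smaller than the first operand. */
static bool uadd16_overflows(uint16_t a, uint16_t b, uint16_t *out)
{
    *out = (uint16_t)(a + b);
    return *out < a;
}
```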
2014-10-28add utest compiler_overflow for llvm intrinsic function.Luo Xionghu1-0/+20
this case only runs for uadd_with_over_flow function so far. Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-10-23Add the test case for image 2d array fillJunyan He1-0/+13
Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-10-23Add the test case for image 1d array fillJunyan He1-0/+11
Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-10-15Fix the printf bug in loopJunyan He1-11/+16
The static analysis for printf does not fully work when the printf instruction is inside a loop that cannot be unrolled. This causes printf to print only one message for the loop and lose all the others. We now increment the exec number every time the printf instruction is triggered. The number is stored so that all the messages can be output later. The problem is that we cannot calculate the exact loop count for each printf instruction, and a wrong loop count would cause data to be overwritten. We now assume all printf instructions are in a loop and store the data like this: | PRINTF1_DATA PRINTF2_DATA ... | PRINTF1_DATA PRINTF2_DATA ... | ... | DATA_LOOP_ONE | DATA_LOOP_TWO | ... This may waste some space. Another problem is that we need to decide the size of the printf buffer, because the loop upper bound cannot be calculated. We just set it to 1M for a small info slot request and 4M for a big one. Signed-off-by: Junyan He <junyan.he@linux.intel.com> Tested-by: Zhigang Gong <zhigang.gong@linux.intel.com>
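One way to read the layout described above is that each execution gets a full record holding the data of every printf statement back to back. The C sketch below computes a byte offset under that reading only. The function name, the `arg_bytes` array and the absence of any per-work-item dimension are assumptions made for illustration, not details taken from the patch.

```c
#include <stddef.h>

/*
 * Byte offset of one printf's data, assuming one record per execution and
 * each record holding the data of all printf statements in order.
 * arg_bytes[i] is the (hypothetical) number of bytes printf i stores.
 */
static size_t printf_slot_offset(const size_t *arg_bytes, size_t n_printfs,
                                 size_t printf_idx, size_t exec_no)
{
    size_t record_bytes = 0;  /* size of one complete record */
    size_t within = 0;        /* offset of printf_idx inside a record */

    for (size_t i = 0; i < n_printfs; i++) {
        if (i < printf_idx)
            within += arg_bytes[i];
        record_bytes += arg_bytes[i];
    }
    return exec_no * record_bytes + within;
}
```

Under this reading a printf that never runs in a given iteration still reserves its bytes in that record, which matches the note about wasted space.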
2014-10-14add utest popcount for all types.Luo Xionghu1-0/+16
v2: add all types to test. v3: fix signed type count bits error. Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-09-18GBE/libocl: Add __gen_ocl_get_timestamp() to get timestamp.Ruiling Song1-0/+28
Gen provides the tm0 register for intra-kernel profiling. Here we provide an API __gen_ocl_get_timestamp() to return the timestamp in TM. The return type is defined as: struct time_stamp { ulong tick; uint event; }; 'tick' is a 64bit time tick. 'event' stores a value which indicates whether a tmEvent has occurred (non-zero) or not (0). tmEvent includes time-impacting events such as a context switch or frequency change since the last time tm0 was read. I add a sample in kernels/compiler_time_stamp.cl. Hope it helps you understand how to use it. V2: Introduce ir::ARFRegister to avoid direct use of nr/subnr in Gen IR. Rename __gen_ocl_extract_reg to __gen_ocl_region. Rename beignet_get_time_stamp to __gen_ocl_get_timestamp. Signed-off-by: Ruiling Song <ruiling.song@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
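A small host-side C model may help show how the two fields could be used together. The struct mirrors the definition quoted above, with `ulong` and `uint` mapped to fixed-width C types. The helper `elapsed_ticks` and its policy of rejecting an interval when the second read reports an event are assumptions, not part of the patch.

```c
#include <stdint.h>

/* Host-side mirror of the struct quoted in the commit message. */
struct time_stamp {
    uint64_t tick;   /* 64-bit time tick */
    uint32_t event;  /* non-zero if a tmEvent occurred since the last read */
};

/*
 * Hypothetical helper: returns 1 and stores the tick difference when the
 * interval looks reliable, or 0 when the second read reports a tmEvent
 * (for example a context switch or frequency change in between).
 */
static int elapsed_ticks(struct time_stamp start, struct time_stamp end,
                         uint64_t *out)
{
    if (end.event != 0)
        return 0;
    *out = end.tick - start.tick;
    return 1;
}
```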
2014-09-18Add long support for printfJunyan He1-0/+3
V2: Replace all the long and ulong to int64_t Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-09-03Add new vload benchmark/test case.Zhigang Gong1-0/+33
v2: refine the benchmark case and don't mix it with normal unit test cases. Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
2014-07-31utest: add new test for constant expression processing.Zhigang Gong1-0/+23
If we use 3-component vector in a union, it may introduce some complex constant expression as below: float bitcast (i32 trunc (i128 bitcast (<4 x i32> <i32 1065353216, i32 1073741824, i32 1077936128, i32 undef> to i128) to i32) to float). To test the constant expression processing function. Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
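To see what this constant expression should evaluate to, note that the three integer lanes are the IEEE-754 bit patterns of 1.0f, 2.0f and 3.0f, and that truncating the 128-bit value to 32 bits keeps the lowest bits, which on a little-endian target is lane 0. The C sketch below reproduces that by hand; the helper names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float (the final "bitcast ... to float"). */
static float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/*
 * Model of "bitcast <4 x i32> to i128, trunc to i32, bitcast to float" on a
 * little-endian target: the low 32 bits of the i128 are lane 0.
 */
static float fold_const_expr(const uint32_t lanes[4])
{
    return bits_to_float(lanes[0]);
}
```

With the lanes from the message, the folded value is 1.0f; the undef fourth lane never reaches the result.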
2014-07-30GBE: Refine bti usage in backend & runtime.Ruiling Song1-0/+23
Previously, we simply map 2G surface for memory access, which has obvious security issue, user can easily read/write graphics memory that does not belong to him. To prevent such kind of behaviour, We bind each surface to a dedicated bti. HW provides automatic bounds check. For out-of-bound write, it will be ignored. And for read out-of-bound, hardware will simply return zero value. The idea behind the patch is for a load/store instruction, it will search through the LLVM use-def chain until finding out where the address comes from. Then the bti is saved in ir::Instruction and used for the later code generation. And for mixed pointer case, a load/store will access more than one bti. To simplify some code, '0' is reserved for constant address space, '1' is reserved for private address space. Other btis are assigned automatically by backend. Signed-off-by: Ruiling Song <ruiling.song@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
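The bounds-check behaviour described above (out-of-bound writes ignored, out-of-bound reads returning zero) can be modelled in a few lines of C. This is only a software model of the described semantics. The `struct surface` type and the function names are hypothetical and do not correspond to beignet code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one surface bound to its own bti. */
struct surface {
    uint32_t *base;   /* start of the surface */
    size_t    count;  /* number of 32-bit elements in the surface */
};

/* Out-of-bound reads return zero instead of touching other memory. */
static uint32_t surface_load(const struct surface *s, size_t idx)
{
    return idx < s->count ? s->base[idx] : 0u;
}

/* Out-of-bound writes are silently dropped. */
static void surface_store(struct surface *s, size_t idx, uint32_t value)
{
    if (idx < s->count)
        s->base[idx] = value;
}
```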
2014-06-24Add the support for vector type in printf.Junyan He1-2/+8
Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-06-23Add the test cases for 1D Image ArrayJunyan He2-0/+38
Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-06-23Update the printf test case.Junyan He1-0/+19
Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Yang Rong <rong.r.yang@intel.com>
2014-06-23GBE: fix some get kernel arg info bugs.Zhigang Gong1-1/+1
Still can't handle the sampler_t which is not used actually. Access qualifier seems broken with llvm 3.3. Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: Yang Rong <rong.r.yang@intel.com>
2014-06-13Add the utest case for clGetKernelArgInfoJunyan He1-0/+8
Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-06-13add [opencl-1.2] test case runtime_compile_link.Luo4-0/+27
Signed-off-by: Luo <xionghu.luo@intel.com> Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
2014-06-13Add the test case for 1D image from bufferJunyan He1-0/+13
v2: should not release the buffer which is handled by the utest helper. Signed-off-by: Junyan He <junyan.he@linux.intel.com> Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>