~ldeks/beignet - Personal branch of beignet.

Age	Commit message (Collapse)	Author	Files	Lines
2015-05-15	Add struct argument indirect load test.	Yang Rong	2	-2/+20
	1. Enable compiler_argument_structure_indirect. 2. Add compiler_argument_structure_indirect, which has select address and load argument instruction. Signed-off-by: Yang Rong <rong.r.yang@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@intel.com>
2015-05-13	add utest for intel_sub_group_shuffle	Guo Yejun	1	-0/+18
	v2: correct kernel to be suitable for simd_width both 8 and 16 Signed-off-by: Guo Yejun <yejun.guo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@intel.com>
2015-05-12	rename __gen_ocl_simd_any/all to sub_group_any/all	Guo Yejun	4	-27/+27
	it is defined in https://www.khronos.org/registry/cl/extensions/intel/cl_intel_subgroups.txt Signed-off-by: Guo Yejun <yejun.guo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2015-05-12	rename __gen_ocl_get_simd_id/size to get_sub_group_id/size	Guo Yejun	4	-13/+13
	Signed-off-by: Guo Yejun <yejun.guo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com> Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
2015-04-24	add utest for __gen_ocl_get_simd_id	Guo Yejun	1	-0/+8
	Signed-off-by: Guo Yejun <yejun.guo@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2015-04-24	add utest for __gen_ocl_get_simd_size	Guo Yejun	1	-0/+5
	Signed-off-by: Guo Yejun <yejun.guo@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2015-03-09	Modify the utest case for bswap.	Junyan He	1	-10/+14
	We add the test case for uniform when doing the bswap. Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2015-03-09	add utest for load spir binary.	Luo Xionghu	1	-0/+0
	To generate SPIR binary, please refer to the page https://github.com/KhronosGroup/SPIR. For llvm3.2, the command is "clang -cc1 -emit-llvm-bc -triple spir-unknown-unknown -cl-std=CL1.2 -include opencl_spir.h compiler_ceil.cl -o compiler_ceil32.spir" For llvm3.5, the option -cl-kernel-arg-info is required, and option -fno-builtin is required to avoid warning. v2: add missing load_program_from_spir.cpp file. Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2015-03-09	change the workitem related api to OVERLOADABLE.	Luo Xionghu	2	-4/+4
	the SPIR header file requires these functions to be overloadable. (https://github.com/KhronosGroup/SPIR-Tools/blob/master/headers/opencl_spir.h) Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2015-02-06	Add example to show libva buffer sharing with extension clCreateImageFromLibvaIntel.	Chuanbo Weng	1	-0/+24
	This example reads a source nv12 file to a VASurface, and creates a target VASurface. Then creates corresponding cl image objects from them. After using ocl to do mirror effect post-processing on source VASurface, target VASurface is shown on screen by default. Code for loading an nv12 file to a VASurface is adapted from libva/test/encode/avcenc.c. v2: Delete 1920x1080.nv12 and 640x480.nv12 because of large size, add 256_128.nv12 as default test image. v3: 1. Re-org files: add libva as a submodule then use display related files. 2. Show result on screen by default instead of saving as a file. 3. Fix warnings. v4: Fix whitespace format warnings. v5: 1. Modify upload_nv12_to_surface to read a nv12 file and then upload it to an NV12 VASurface. Also modify store_surface_to_nv12. 2. Change the cl post-processing kernel from gray effect to mirror effect, which makes the demo cooler. 3. Minor fix of other problems. v6: Remove unnecessary OUTPUT_NV12_DEFAULT related code. Signed-off-by: Chuanbo Weng <chuanbo.weng@intel.com> Signed-off-by: Zhao Yakui <yakui.zhao@intel.com> Reviewed-by: "Guo, Yejun" <yejun.guo@intel.com>
2015-01-28	fix clz utest issue.	Luo Xionghu	3	-12/+6
	should use clz function instead of __builtin_clz. add zero input check. v2: add signed type test. remove redundant case. v3: remove printf. Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
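	A minimal host-side sketch of why the zero input check matters, assuming GCC or Clang: `__builtin_clz(0)` is undefined there, whereas OpenCL 2.0 and later state that `clz(0)` returns the bit width of the type. The helper name `ref_clz32` is hypothetical and is not taken from the beignet utests.

```c
#include <stdint.h>

/* Hypothetical reference for OpenCL clz() on a 32-bit unsigned value.
 * Assumes GCC or Clang, where __builtin_clz(0) is undefined, so the
 * zero case is handled explicitly before calling the builtin. */
static uint32_t ref_clz32(uint32_t x)
{
    if (x == 0)
        return 32;                      /* bit width of the type */
    return (uint32_t)__builtin_clz(x);  /* defined for non-zero input */
}
```

	A test can compare the kernel's output against such a reference for zero, one, and values with the top bit set.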
2015-01-20	Add test case for long bitcast.	Junyan He	1	-0/+47
	Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2015-01-20	Add long NOT test case.	Junyan He	1	-0/+6
	Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2015-01-20	Add test case for i64 div and rem.	Junyan He	1	-0/+12
	Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2015-01-20	Add test case for long mul_sat and mul_hi	Junyan He	1	-0/+19
	Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2015-01-15	add clz(count leading zero) utest.	Luo Xionghu	1	-0/+12
	this kernel calls the llvm __builtin_clz to generate the llvm.clz function and then calls the gen instruction clz, unlike the test compiler_clz_int, which uses fbh to implement it. Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2015-01-09	Add read buffer/image benchmark.	Yang Rong	2	-0/+40
	Add these two benchmarks to compare buffer and image performance. V2: init the coord before read image. V3: Correct the image's width and buffer's read index. Signed-off-by: Yang Rong <rong.r.yang@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-12-04	refine bswap utest to cover nsetc fail cases.	Luo Xionghu	1	-0/+1
	two bswap calls in one block would trigger nsetc failures. the failure was already fixed in the backend, so this just updates the utest. Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-12-04	utests: Add const private array initialization test.	Ruiling Song	1	-0/+9
	Signed-off-by: Ruiling Song <ruiling.song@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-12-02	add utest of CL_MEM_ALLOC_HOST_PTR	Guo Yejun	1	-0/+6
	Signed-off-by: Guo Yejun <yejun.guo@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2014-12-01	utests: Add one case to test negative index array access.	Zhigang Gong	1	-0/+9
	Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: Ruiling Song <ruiling.song@intel.com>
2014-11-27	add test for clCreateImageFromLibvaIntel	Guo Yejun	1	-0/+8
	Signed-off-by: Guo Yejun <yejun.guo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-11-11	utests: remove all shader toy test cases.	Zhigang Gong	17	-1050/+0
	As we can't find the original license of these test cases, we have to remove them from beignet's unit test cases. Reported by "Rebecca N. Palmer" <rebecca_palmer@zoho.com>. Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
2014-11-11	Revert "add test for clCreateImageFromLibvaIntel"	Zhigang Gong	1	-8/+0
	This reverts commit 9e236b18542f2564e399bf13d4d1fbcc48a5ec9f.
2014-11-10	add test for clCreateImageFromLibvaIntel	Guo Yejun	1	-0/+8
	Signed-off-by: Guo Yejun <yejun.guo@intel.com> Tested-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-11-07	add test for cl buffer created with CL_MEM_USE_HOST_PTR	Guo Yejun	1	-0/+6
	Signed-off-by: Guo Yejun <yejun.guo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com> Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2014-11-06	utests: add a test to trigger cl_float3 bug in clSetKernelArg.	Ruiling Song	1	-0/+20
	Signed-off-by: Ruiling Song <ruiling.song@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-11-06	utest: add new test that triggers an assignment operation bug in if.	Chuanbo Weng	1	-0/+12
	This test case shows that an assignment operation in an if block does not seem to affect the lvalue. Signed-off-by: Chuanbo Weng <chuanbo.weng@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-11-05	fix bswap kernel function type issue.	Luo Xionghu	1	-5/+10
	use MACRO to define the corresponding function. Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-11-04	utests: replace the non-distributable picture.	Zhigang Gong	4	-0/+0
	According to https://bugs.debian.org/758442, we should not use the Len(n)a standard test image in our package. I just selected a picture I took myself. Thanks Rebecca for pointing this out. v2: forgot to add sample.bmp. Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2014-11-04	utest: change the box_blur_image to be identical to box_blur.	Zhigang Gong	1	-3/+5
	Change box_blur_image to read an integer type surface so that it is totally identical to box_blur and the two can share the same reference image. Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2014-11-04	add utest function bswap.	Luo Xionghu	1	-0/+7
	this llvm intrinsic bswap function is generated by calling __builtin_bswap. Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-11-03	add utest for llvm intrinsic call usub_with_overflow function.	Luo Xionghu	1	-15/+40
	as llvm couldn't recognize the pattern of usub overflow, this usub with overflow is generated by calling the intrinsic function __builtin_usub_overflow; also, this type of uadd intrinsic function couldn't support short/byte type overflow, so we choose another way for the uadd kernel to generate short/byte overflow. will send a patch to llvm later to fix the 2 issues. v2: split the patch. Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
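	The two approaches mentioned above can be sketched on the host side as follows. This is only an illustration, assuming GCC 5 or later or Clang (which provide `__builtin_usub_overflow`); the type and function names are hypothetical, and the manual 16-bit check is one possible "other way", not necessarily the one the utest kernel uses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result pairs: wrapped value plus overflow flag. */
typedef struct { uint32_t value; bool overflow; } u32_result;
typedef struct { uint16_t value; bool overflow; } u16_result;

/* 32-bit unsigned subtract via the compiler builtin (GCC >= 5 or Clang). */
static u32_result usub32(uint32_t a, uint32_t b)
{
    unsigned int diff;
    u32_result r;
    r.overflow = __builtin_usub_overflow((unsigned int)a, (unsigned int)b, &diff);
    r.value = (uint32_t)diff;
    return r;
}

/* 16-bit unsigned add with a manual wrap check: the truncated sum is
 * smaller than an operand exactly when the addition wrapped around. */
static u16_result uadd16(uint16_t a, uint16_t b)
{
    u16_result r;
    r.value = (uint16_t)(a + b);   /* computed in int, then truncated */
    r.overflow = r.value < a;
    return r;
}
```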
2014-10-28	add utest compiler_overflow for llvm intrinsic function.	Luo Xionghu	1	-0/+20
	this case only runs for uadd_with_over_flow function so far. Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-10-23	Add the test case for image 2d array fill	Junyan He	1	-0/+13
	Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-10-23	Add the test case for image 1d array fill	Junyan He	1	-0/+11
	Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-10-15	Fix the printf bug in loop	Junyan He	1	-11/+16
	The static analysis for printf cannot fully work when the printf inst is within a loop and the loop cannot be unrolled. This causes printf to print just one message for the loop and to lose all the others. We now increment the exec number every time the printf inst is triggered. The number is stored so that all the messages can be output later. The problem is that we cannot calculate the exact loop count for each printf inst, and a wrong loop count will cause data to be overwritten. We now assume all the printf insts are in loops and store the data like this: | PRINTF1_DATA PRINTF2_DATA ... | PRINTF1_DATA PRINTF2_DATA ... | ... | DATA_LOOP_ONE | DATA_LOOP_TWO | ... This may waste some space. Another problem is that we need to decide the size of the printf buffer, because the loop upper bound cannot be calculated. We just set it to 1M for a small info slot request and 4M for a big one. Signed-off-by: Junyan He <junyan.he@linux.intel.com> Tested-by: Zhigang Gong <zhigang.gong@linux.intel.com>
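	One reading of the layout above is that every loop iteration owns a row containing one slot per printf statement. Under that assumption, the byte offset of a slot could be computed as in the sketch below; the function and parameter names are hypothetical and do not come from the beignet sources.

```c
#include <stddef.h>

/* Hypothetical offset computation for the row-per-iteration layout:
 * row k holds the data of every printf statement for loop iteration k.
 * slot_sizes[i] is the data size of printf statement i in bytes. */
static size_t printf_slot_offset(const size_t *slot_sizes, size_t n_stmts,
                                 size_t stmt, size_t iteration)
{
    size_t row_size = 0;   /* bytes used by one full iteration */
    size_t in_row = 0;     /* offset of statement `stmt` inside its row */
    for (size_t i = 0; i < n_stmts; i++) {
        if (i < stmt)
            in_row += slot_sizes[i];
        row_size += slot_sizes[i];
    }
    return iteration * row_size + in_row;
}
```

	With two statements of 4 and 8 bytes, the second statement in the third iteration (index 2) would start at 2 * 12 + 4 = 28 bytes.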
2014-10-14	add utest popcount for all types.	Luo Xionghu	1	-0/+16
	v2: add all types to test. v3: fix signed type count bits error. Signed-off-by: Luo Xionghu <xionghu.luo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-09-18	GBE/libocl: Add __gen_ocl_get_timestamp() to get timestamp.	Ruiling Song	1	-0/+28
	Gen provides the tm0 register for intra-kernel profiling. Here we provide an API __gen_ocl_get_timestamp() to return the timestamp in TM. The return type is defined as: struct time_stamp { ulong tick; uint event; }; 'tick' is a 64bit time tick. 'event' stores a value which indicates whether a tmEvent has occurred (non-zero) or not (0). tmEvent includes time-impacting events such as a context switch or frequency change since the last time tm0 was read. I added a sample in kernels/compiler_time_stamp.cl. Hope it helps you understand how to use it. V2: Introduce ir::ARFRegister to avoid direct use of nr/subnr in Gen IR. Rename __gen_ocl_extract_reg to __gen_ocl_region. Rename beignet_get_time_stamp to __gen_ocl_get_timestamp. Signed-off-by: Ruiling Song <ruiling.song@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
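	A host-side sketch of how two such timestamps might be consumed, mirroring the struct above with fixed-width C types. The helper `elapsed_ticks` is hypothetical, and treating a non-zero 'event' on the second sample as "measurement unreliable" is an assumption drawn from the description, not documented behaviour.

```c
#include <stdint.h>

/* C mirror of the OpenCL-side struct described in the commit message. */
struct time_stamp {
    uint64_t tick;   /* 64-bit time tick */
    uint32_t event;  /* non-zero if a tmEvent occurred since the last read */
};

/* Hypothetical helper: returns the tick delta, or -1 when the second
 * sample reports a time-impacting event (assumed to taint the interval). */
static int64_t elapsed_ticks(struct time_stamp start, struct time_stamp end)
{
    if (end.event != 0)
        return -1;
    return (int64_t)(end.tick - start.tick);
}
```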
2014-09-18	Add long support for printf	Junyan He	1	-0/+3
	V2: Replace all the long and ulong to int64_t Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-09-03	Add new vload benchmark/test case.	Zhigang Gong	1	-0/+33
	v2: refine the benchmark case and don't mix it with normal unit test cases. Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
2014-07-31	utest: add new test for constant expression processing.	Zhigang Gong	1	-0/+23
	If we use a 3-component vector in a union, it may introduce a complex constant expression such as: float bitcast (i32 trunc (i128 bitcast (<4 x i32> <i32 1065353216, i32 1073741824, i32 1077936128, i32 undef> to i128) to i32) to float). This test exercises the constant expression processing function. Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
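	The integers in that expression are IEEE-754 single-precision bit patterns. The sketch below reinterprets them on the host to show what the constant expression should fold to; the helper name `bits_to_float` is hypothetical, and the assumption that the truncation to i32 selects element 0 holds for a little-endian target.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float, the host-side equivalent
 * of the LLVM `bitcast i32 ... to float` in the expression above. */
static float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);   /* well-defined type punning in C */
    return f;
}
```

	Here 1065353216, 1073741824 and 1077936128 are 0x3F800000, 0x40000000 and 0x40400000, that is 1.0f, 2.0f and 3.0f, so the whole expression evaluates to 1.0f under the little-endian assumption.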
2014-07-30	GBE: Refine bti usage in backend & runtime.	Ruiling Song	1	-0/+23
	Previously, we simply mapped a 2G surface for memory access, which has an obvious security issue: a user can easily read or write graphics memory that does not belong to them. To prevent this kind of behaviour, we bind each surface to a dedicated bti. The hardware provides automatic bounds checking: an out-of-bound write is ignored, and an out-of-bound read simply returns zero. The idea behind the patch is that, for a load/store instruction, we search through the LLVM use-def chain until we find where the address comes from. The bti is then saved in ir::Instruction and used for later code generation. In the mixed pointer case, a load/store will access more than one bti. To simplify some code, '0' is reserved for the constant address space and '1' is reserved for the private address space. Other btis are assigned automatically by the backend. Signed-off-by: Ruiling Song <ruiling.song@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
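	The bounds-check semantics described above (out-of-bound writes dropped, out-of-bound reads returning zero) can be modelled in software as follows. This is only a sketch of the observable behaviour; `struct surface` and the helper names are hypothetical and say nothing about how the hardware or the backend implements it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model of one surface bound to a dedicated bti. */
struct surface {
    uint32_t *base;   /* start of the surface */
    size_t count;     /* number of 32-bit elements in the surface */
};

/* Out-of-bound reads return zero. */
static uint32_t surface_load(const struct surface *s, size_t index)
{
    return index < s->count ? s->base[index] : 0;
}

/* Out-of-bound writes are silently ignored. */
static void surface_store(struct surface *s, size_t index, uint32_t value)
{
    if (index < s->count)
        s->base[index] = value;
}

/* Small self-check: returns 1 if the model behaves as described. */
static int surface_model_selfcheck(void)
{
    uint32_t data[4] = {1, 2, 3, 4};
    struct surface s = {data, 4};
    surface_store(&s, 9, 99);   /* out of bounds: dropped */
    surface_store(&s, 1, 20);   /* in bounds: applied */
    return surface_load(&s, 1) == 20 &&
           surface_load(&s, 9) == 0 &&
           data[3] == 4;
}
```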
2014-06-24	Add the support for vector type in printf.	Junyan He	1	-2/+8
	Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-06-23	Add the test cases for 1D Image Array	Junyan He	2	-0/+38
	Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-06-23	Update the printf test case.	Junyan He	1	-0/+19
	Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Yang Rong <rong.r.yang@intel.com>
2014-06-23	GBE: fix some get kernel arg info bugs.	Zhigang Gong	1	-1/+1
	Still can't handle the sampler_t which is not used actually. Access qualifier seems broken with llvm 3.3. Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: Yang Rong <rong.r.yang@intel.com>
2014-06-13	Add the utest case for clGetKernelArgInfo	Junyan He	1	-0/+8
	Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-06-13	add [opencl-1.2] test case runtime_compile_link.	Luo	4	-0/+27
	Signed-off-by: Luo <xionghu.luo@intel.com> Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
2014-06-13	Add the test case for 1D image from buffer	Junyan He	1	-0/+13
	v2: should not release the buffer, which is handled by the utest helper. Signed-off-by: Junyan He <junyan.he@linux.intel.com> Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>