path: root/kernels
Age | Commit message | Author | Files | Lines
2014-09-12 | runtime: fix program binary type bug. | Zhigang Gong | 1 | -1/+1
If the binary is an executable type, the first byte is zero and we need to set the binary type correctly to CL_PROGRAM_BINARY_TYPE_EXECUTABLE. Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: He Junyan <junyan.he@inbox.com> Tested-by: "Meng, Mengmeng" <mengmeng.meng@intel.com> Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
2014-09-05 | GBE: Refine bti usage in backend & runtime. | Ruiling Song | 1 | -0/+23
Previously, we simply mapped a 2G surface for memory access, which has an obvious security issue: a user can easily read/write graphics memory that does not belong to them. To prevent this, we bind each surface to a dedicated bti (binding table index). The hardware provides automatic bounds checking: an out-of-bound write is ignored, and an out-of-bound read simply returns zero. The idea behind the patch is that, for a load/store instruction, the backend searches through the LLVM use-def chain until it finds where the address comes from. The bti is then saved in ir::Instruction and used for later code generation. In the mixed pointer case, a load/store will access more than one bti. To simplify some code, '0' is reserved for the constant address space and '1' is reserved for the private address space. Other btis are assigned automatically by the backend. Signed-off-by: Ruiling Song <ruiling.song@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
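The bounds-check behaviour described in this commit can be modelled in plain C. This is only a sketch of the semantics (out-of-bound reads return zero, out-of-bound writes are dropped); `surface_t`, `surface_read` and `surface_write` are hypothetical names and are not part of beignet, where the check is done by the hardware rather than in software.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one surface bound to its own bti. */
typedef struct {
    uint32_t *base;   /* start of the buffer backing this surface */
    size_t    count;  /* number of 32-bit elements in the surface */
} surface_t;

/* An out-of-bound read returns zero instead of touching foreign memory. */
static uint32_t surface_read(const surface_t *s, size_t index)
{
    return index < s->count ? s->base[index] : 0u;
}

/* An out-of-bound write is silently ignored. */
static void surface_write(surface_t *s, size_t index, uint32_t value)
{
    if (index < s->count)
        s->base[index] = value;
}
```

With a 4-element surface, reading index 4 yields 0 and writing index 4 leaves the neighbouring memory untouched, which is the isolation property the single 2G mapping could not give.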
2014-08-28 | utest: add new test for constant expression processing. | Zhigang Gong | 1 | -0/+23
If we use a 3-component vector in a union, it may introduce a complex constant expression such as: float bitcast (i32 trunc (i128 bitcast (<4 x i32> <i32 1065353216, i32 1073741824, i32 1077936128, i32 undef> to i128) to i32) to float). This test exercises the constant expression processing function. Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
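To see what that constant expression should fold to, the following C sketch reinterprets the three defined lanes as IEEE-754 floats. It assumes a little-endian target, where truncating the i128 to i32 keeps lane 0; `bits_to_float` and `folded_constant` are illustrative names, not functions from the test or the backend.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a 32-bit integer as a float (an LLVM "bitcast"). */
static float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Value of the whole expression: the <4 x i32> is viewed as one i128,
 * truncated to its low 32 bits (lane 0 on a little-endian target, which
 * is an assumption here), then bitcast to float. */
static float folded_constant(void)
{
    const uint32_t lanes[3] = { 1065353216u, 1073741824u, 1077936128u };
    return bits_to_float(lanes[0]);
}
```

The three lanes are the bit patterns of 1.0f, 2.0f and 3.0f, so under these assumptions the expression evaluates to 1.0f.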
2014-06-24 | Add the support for vector type in printf. | Junyan He | 1 | -2/+8
Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-06-23 | Add the test cases for 1D Image Array | Junyan He | 2 | -0/+38
Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-06-23 | Update the printf test case. | Junyan He | 1 | -0/+19
Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Yang Rong <rong.r.yang@intel.com>
2014-06-23 | GBE: fix some get kernel arg info bugs. | Zhigang Gong | 1 | -1/+1
Still can't handle a sampler_t argument that is not actually used. The access qualifier seems broken with LLVM 3.3. Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: Yang Rong <rong.r.yang@intel.com>
2014-06-13 | Add the utest case for clGetKernelArgInfo | Junyan He | 1 | -0/+8
Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-06-13 | add [opencl-1.2] test case runtime_compile_link. | Luo | 4 | -0/+27
Signed-off-by: Luo <xionghu.luo@intel.com> Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
2014-06-13 | Add the test case for 1D image from buffer | Junyan He | 1 | -0/+13
v2: should not release the buffer, which is handled by the utest helper. Signed-off-by: Junyan He <junyan.he@linux.intel.com> Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
2014-06-13 | Add test cases for 1d image fill and copy | Junyan He | 2 | -0/+17
Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-06-11 | fix utest simd_any for simd width 8 and 16 | Guo Yejun | 1 | -1/+1
Signed-off-by: Guo Yejun <yejun.guo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-06-11 | Add the utest case for printf | Junyan He | 1 | -0/+13
Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-06-09 | utests: add a double precision check test case. | Zhigang Gong | 1 | -0/+11
v2: fix some bugs in test case. Signed-off-by: Zhigang Gong <zhigang.gong@intel.com> Reviewed-by: Yang Rong <rong.r.yang@intel.com>
2014-05-29 | utests: disable double test case. | Ruiling Song | 2 | -2/+2
We cannot provide full support for double yet, and my patch to refine long support breaks double load/store, so we disable all double test cases. Signed-off-by: Ruiling Song <ruiling.song@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2014-04-22 | add test for __gen_ocl_simd_any and __gen_ocl_simd_all | Guo Yejun | 2 | -0/+27
Signed-off-by: Guo Yejun <yejun.guo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-02-28 | GBE: support getelementptr with ConstantExpr operand | Guo Yejun | 1 | -0/+18
Add support, during LLVM IR -> Gen IR translation, for the case where the first operand of getelementptr is a ConstantExpr. A utest is also added. Signed-off-by: Guo Yejun <yejun.guo@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-01-20 | Add utest compiler_private_data_overflow | Yongjia Zhang | 1 | -0/+10
utests: compiler_private_data_overflow is aimed at hitting a stack larger than 1KB. It will fail with the old beignet, which allocates a 1KB stack regardless of the kernel's actual stack usage. Signed-off-by: Yongjia Zhang <zhang_yong_jia@126.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2014-01-16 | Change compiler_function_argument3 to cover llvm.memcpy. | Yang Rong | 1 | -0/+2
We found that clang would emit llvm.memcpy when assigning one struct to another if sizeof(struct) > 64. Add an assignment to produce llvm.memcpy. Signed-off-by: Yang Rong <rong.r.yang@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
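The kind of source construct this commit refers to looks roughly like the C sketch below. The struct layout and the name `big_arg_t` are hypothetical; the only point is that the struct is larger than 64 bytes and is copied by plain assignment. Whether a given compiler lowers the copy to llvm.memcpy is the commit's observation about clang and cannot be checked from C itself.

```c
/* Hypothetical struct larger than 64 bytes (80 bytes with a 4-byte int). */
typedef struct {
    int values[20];
} big_arg_t;

/* Plain struct assignment. A compiler is free to lower this copy to a
 * memcpy-style operation instead of element-by-element moves. */
static big_arg_t copy_big(const big_arg_t *src)
{
    big_arg_t dst;
    dst = *src;
    return dst;
}
```

A test built around such an assignment only needs to compare the destination with the source afterwards.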
2013-11-13 | GBE: fix the constant data allocation. | Zhigang Gong | 1 | -1/+1
We need to keep the constant data allocation consistent with the constant register allocation, so we skip unused constant data at the constant data allocation stage. To avoid a possible mismatch, add a new assert in the constant register (address) allocation stage to make sure the address register matches the exact constant data. Also modify the constant utest slightly to hit this code path. Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com> Reviewed-by: Ruiling Song <ruiling.song@intel.com>
2013-11-07 | GBE: fix a 64bit scalar register issue. | Ruiling Song | 1 | -3/+4
A scalar register should use stride 0. Also change the unit test to hit this case. v2: fix h2() Signed-off-by: Ruiling Song <ruiling.song@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2013-11-06 | utests: add test case for structure argument | Lu Guanqun | 1 | -0/+69
Signed-off-by: Lu Guanqun <guanqun.lu@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2013-11-06 | utests: use mad which will get better precision. | Ruiling Song | 1 | -1/+1
Normal mul/add could not meet the precision requirement of this case. Previously it passed because the backend performs a mad optimization. Use mad directly, so the test case does not depend on backend optimization. Signed-off-by: Ruiling Song <ruiling.song@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
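One plausible reason for the precision difference (an assumption; the commit does not spell it out) is that a separate multiply rounds the product to float before the add, while a fused multiply-add rounds only once. The C sketch below shows the effect on one hand-picked input. `mul_then_add` and `fused_mad` are illustrative names, and `fused_mad` only emulates fusion by computing in double, which is exact for this input but is not a general fma implementation.

```c
/* a * b + c with the product rounded to float before the add. */
static float mul_then_add(float a, float b, float c)
{
    /* volatile forces the product to be stored as a float, so the
     * compiler cannot contract the expression into a fused operation. */
    volatile float product = a * b;
    return product + c;
}

/* Emulated fused multiply-add: the double product of two floats is
 * exact, and for the inputs used here the sum is exact as well, so the
 * only rounding is the final conversion back to float. */
static float fused_mad(float a, float b, float c)
{
    return (float)((double)a * (double)b + (double)c);
}
```

With a = b = 1 + 2^-12 and c = -(1 + 2^-11), the exact result is 2^-24. The separate multiply rounds the product to 1 + 2^-11, so `mul_then_add` returns 0, while `fused_mad` returns 2^-24.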
2013-10-29 | utest: add test case for builtin function exp/exp2/exp10/expm1. | Yi Sun | 1 | -0/+10
Signed-off-by: Yi Sun <yi.sun@intel.com> Signed-off-by: Yangwei Shui <yangweix.shui@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-10-29 | utest: Add test case for built-in function pow. | Yi Sun | 1 | -0/+7
Signed-off-by: Yi Sun <yi.sun@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-10-22 | Add a test for vector argument deallocate assert. | Yang Rong | 1 | -0/+12
V2: Add result compare. Signed-off-by: Yang Rong <rong.r.yang@intel.com> Reviewed-by: Ruiling Song <ruiling.song@intel.com>
2013-10-21 | Add more type for async copy test case. | Yang Rong | 1 | -15/+23
Signed-off-by: Yang Rong <rong.r.yang@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-10-21 | GBE: Handle all-zero constant. | Ruiling Song | 1 | -2/+13
Also refine Undef value support. Signed-off-by: Ruiling Song <ruiling.song@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2013-10-18 | support saturated converting from 64-bit int | Homer Hsing | 1 | -0/+16
This patch supports saturated conversion from 64-bit int to shorter int, and from 32-bit float to 64-bit int. This patch also contains a test case. version 2: ulong had already been declared on some platforms. Signed-off-by: Homer Hsing <homer.xing@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
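As a reference for what a test of these conversions would expect, here is a C sketch of the two saturating conversions. The function names are hypothetical stand-ins for convert_int_sat(long) and convert_long_sat(float); mapping NaN to 0 and truncating toward zero follow my reading of the OpenCL C conversion rules and should be treated as assumptions, not as the patch's implementation.

```c
#include <stdint.h>

/* Stand-in for convert_int_sat(long): clamp to the 32-bit signed range. */
static int32_t sat_long_to_int(int64_t x)
{
    if (x < INT32_MIN) return INT32_MIN;
    if (x > INT32_MAX) return INT32_MAX;
    return (int32_t)x;
}

/* Stand-in for convert_long_sat(float). */
static int64_t sat_float_to_long(float f)
{
    if (f != f) return 0;                               /* NaN (assumed to map to 0) */
    if (f >= 9223372036854775808.0f) return INT64_MAX;  /* at or above 2^63 */
    if (f <= -9223372036854775808.0f) return INT64_MIN; /* at or below -2^63 */
    return (int64_t)f;                                  /* in range: truncate toward zero */
}
```

The range checks must come before the cast, because converting an out-of-range float to an integer is undefined behaviour in C.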
2013-10-18 | Add test case for newValueProxy of InsertElementInst. | Yang Rong | 1 | -0/+11
Signed-off-by: Yang Rong <rong.r.yang@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-10-17 | utests: add test cases for function call. | Ruiling Song | 2 | -0/+233
Signed-off-by: Ruiling Song <ruiling.song@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2013-10-14 | GBE: Support local variable inside kernel function. | Ruiling Song | 1 | -7/+21
Because Clang treats a local variable much like a global constant (each is treated as a global variable in its own address space), we refine the previous constant implementation so that local variables and global constants share the same code. We allocate an address register for each GlobalVariable (constant or local) by calling newRegister(). In a later step, getRegister() returns a proper register derived from the allocated address register. Signed-off-by: Ruiling Song <ruiling.song@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2013-10-10 | saturated conversion of native GPU data type, larger to narrower | Homer Hsing | 1 | -0/+32
This patch supports saturated conversion of native GPU data types (char/short/int/float) from a larger-range data type to a narrower-range data type, for instance convert_uchar_sat(int). Several test cases are in this patch. v2: add uint->int, int->uint Signed-off-by: Homer Hsing <homer.xing@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
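Saturated narrowing means clamping to the destination range instead of wrapping. The C sketch below mirrors convert_uchar_sat(int) and the two v2 cases; the function names are illustrative and the code is a reference model, not the backend implementation from the patch.

```c
#include <limits.h>

/* Model of convert_uchar_sat(int): clamp to [0, UCHAR_MAX]. */
static unsigned char sat_int_to_uchar(int x)
{
    if (x < 0) return 0;
    if (x > UCHAR_MAX) return UCHAR_MAX;
    return (unsigned char)x;
}

/* v2 case uint -> int: values above INT_MAX clamp to INT_MAX. */
static int sat_uint_to_int(unsigned int x)
{
    return x > (unsigned int)INT_MAX ? INT_MAX : (int)x;
}

/* v2 case int -> uint: negative values clamp to 0. */
static unsigned int sat_int_to_uint(int x)
{
    return x < 0 ? 0u : (unsigned int)x;
}
```

For example, `sat_int_to_uchar(300)` gives 255 where a plain cast would wrap to 44.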
2013-09-26 | GBE: Refine the curbe entry allocation for sampler/image information. | Zhigang Gong | 1 | -1/+1
After the previous patch, we can move the image information curbe entry allocation to before instruction selection. Then we can concentrate all curbe allocation before the normal register allocation. This brings two advantages: 1. It avoids allocating the image information curbe entry among the normal registers. 2. The register interval analysis can handle the image/sampler information correctly. Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2013-09-25 | fix scalarizing of llvm phi node | Homer Hsing | 1 | -0/+13
An LLVM phi node can have an odd number of args. This patch also contains a test case. Signed-off-by: Homer Hsing <homer.xing@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-09-18 | Utests: refine the previous fake 3D test cases. | Zhigang Gong | 3 | -7/+24
All the previous 3D test cases only used depth 1 and did not really touch the 3D read/write code path. Now fix them. Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com> Reviewed-by: He Junyan <junyan.he@inbox.com>
2013-09-18 | utests: add more constant test cases for composite type. | Ruiling Song | 1 | -2/+57
Signed-off-by: Ruiling Song <ruiling.song@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-09-17 | change constant test case to cover short/long type. | Ruiling Song | 1 | -1/+12
Signed-off-by: Ruiling Song <ruiling.song@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-09-17 | support converting 64-bit integer to 32-bit float | Homer Hsing | 1 | -0/+5
version 2: improve algorithm to convert signed integer; fix source operand type in llvm_gen_backend; enable predicate in addWithCarry; change test case to test signed integer. Signed-off-by: Homer Hsing <homer.xing@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2013-09-11 | support converting 64-bit integer to shorter integer | Homer Hsing | 1 | -0/+7
Signed-off-by: Homer Hsing <homer.xing@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2013-09-11 | add built-in function "atan2" | Homer Hsing | 1 | -0/+4
Also improve the accuracy of built-in function "atan". Also add a test case. Signed-off-by: Homer Hsing <homer.xing@intel.com> Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
2013-09-04 | Change constant unit test to cover 4 byte data type. | Ruiling Song | 1 | -1/+1
Signed-off-by: Ruiling Song <ruiling.song@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-08-30 | Fix utest compiler_group_size4 error. | Ruiling Song | 1 | -3/+3
Per the OpenCL spec, bit-fields are not supported. Signed-off-by: Ruiling Song <ruiling.song@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-08-30 | add built-in function "lgamma", "lgamma_r" | Homer Hsing | 2 | -0/+8
also include test cases Signed-off-by: Homer Hsing <homer.xing@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-08-30 | add built-in function "tgamma" | Homer Hsing | 1 | -0/+4
also include a test case Signed-off-by: Homer Hsing <homer.xing@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-08-30 | improve built-in function "sinpi" | Homer Hsing | 1 | -0/+4
"sinpi" was calculated as "sin(pi * x)", but that was not a very good approach. This patch improves the function and includes a test case. v2: fix compile warning Signed-off-by: Homer Hsing <homer.xing@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-08-29 | Add a test case that trigger a known bug. | Chuanbo Weng | 1 | -0/+21
This unit test case triggers a known bug: ASSERTION FAILED: TODO Boolean values cannot escape their definition basic block. Signed-off-by: Chuanbo Weng <chuanbo.weng@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-08-29 | utests: Add a unit test for non-aligned group size. | Ruiling Song | 1 | -0/+17
To hit prediction logic. Signed-off-by: Ruiling Song <ruiling.song@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-08-19 | Utests: enable long/ulong for abs_diff test case. | Zhigang Gong | 1 | -0/+2
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2013-08-19 | fix a typo | Homer Hsing | 1 | -1/+1
Signed-off-by: Homer Hsing <homer.xing@intel.com> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>