Age | Commit message (Collapse) | Author | Files | Lines |
|
If the binary is a executable type, the first byte is zero and
we need to set the binary type correctly to CL_PROGRAM_BINARY_TYPE_EXECUTABLE.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: He Junyan <junyan.he@inbox.com>
Tested-by: "Meng, Mengmeng" <mengmeng.meng@intel.com>
Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
|
|
Previously, we simply map 2G surface for memory access,
which has obvious security issue, user can easily read/write graphics
memory that does not belong to him. To prevent such kind of behaviour,
We bind each surface to a dedicated bti. HW provides automatic
bounds check. For out-of-bound write, it will be ignored. And for read
out-of-bound, hardware will simply return zero value.
The idea behind the patch is for a load/store instruction, it will search
through the LLVM use-def chain until finding out where the address
comes from. Then the bti is saved in ir::Instruction and used for
the later code generation. And for mixed pointer case, a load/store
will access more than one bti.
To simplify some code, '0' is reserved for constant address space,
'1' is reserved for private address space. Other btis are assigned
automatically by backend.
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
If we use 3-component vector in a union, it may introduce
some complex constant expression as below:
float bitcast (i32 trunc (i128 bitcast (<4 x i32> <i32 1065353216, i32 1073741824, i32 1077936128, i32 undef> to i128) to i32) to float).
To test the constant expression processing function.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
|
|
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Yang Rong <rong.r.yang@intel.com>
|
|
Still can't handle the sampler_t which is not used actually.
Access qualifier seems broken with llvm 3.3.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: Yang Rong <rong.r.yang@intel.com>
|
|
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Luo <xionghu.luo@intel.com>
Reviewed-by: "Song, Ruiling" <ruiling.song@intel.com>
|
|
v2:
should not released the buffer which is handled by the utest helper.
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
|
|
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Junyan He <junyan.he@linux.intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
v2:
fix some bugs in test case.
Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
Reviewed-by: Yang Rong <rong.r.yang@intel.com>
|
|
As we could not provide full support of double now,
and my patch to refine long support breaks double load/store.
So, we disable all double test cases.
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Add support during LLVM IR -> Gen IR period when the
first operand of getelementptr is ConstantExpr.
utest is also added.
Signed-off-by: Guo Yejun <yejun.guo@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
utests: compiler_private_data_overflow is aimed to hit a larger than
1KB stack. It will fail with the old beignet which allocate 1KB stack
size no matter the actual usage of stack in the kernel.
Signed-off-by: Yongjia Zhang<zhang_yong_jia@126.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
We found clang wound emit llvm.memcpy when assign a stuct to another,
if sizeof(struct) > 64. Add a assignment to produce llvm.memcpy.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Need to keep consistency between the constant data
allocation and the constant register allocation.
So we need to skip the unused constant data at the
constant data allocation stage.
To avoid possible mismatching, add a new assert in
the constant register(address) allocation stage to
make sure the address register match the eaxct constant
data.
Also modify the constant utest slightly to hit this
code path.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: Ruiling Song <ruiling.song@intel.com>
|
|
For scalar register, should use stride 0.
also change the unit test to hit the point.
v2: fix h2()
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
Normal mul/add could not met the precision requirement of this case.
Previously it passed because we will do mad optimization in backend.
Use mad directly, so the test case does not depend on backend optimization.
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Yi Sun <yi.sun@intel.com>
Signed-off-by: Yangwei Shui <yangweix.shui@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Yi Sun <yi.sun@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
V2: Add result compare.
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Ruiling Song <ruiling.song@intel.com>
|
|
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Also refine Undef value support.
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
This patch supports saturated converting from 64-bit int to shorter int,
and from 32-bit float to 64-bit int.
This patch also contains test case.
version 2: ulong had been declared in some platform
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
Signed-off-by: Yang Rong <rong.r.yang@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
As Clang treat local variable in similar way like global constant,
(they are treated as Global variable in each own address space)
we refine the previous constant implementation in order to
share same code between local variable and global constant.
We will allocate an address register for each GlobalVariable
(constant or local) through calling newRegister().
In later step, through getRegister() we will get a proper
register derived from the allocated address register.
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
This patch supports saturated conversion of
native GPU data type (char/short/int/float),
from a larger-range data type to a narrower-range data type.
For instance, convert_uchar_sat(int)
Several test cases are in this patch.
v2: add uint->int, int->uint
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
After the previous patch, we can move the image infomation curbe
entry allocation to prior to the instruction selection.
Then we can concentrate all curbe allocation before we do the
normal register allocation. This way can bring two advantages:
1. Avoid the image information curbe entry is allocated among the normal registers.
2. The register interval analyzing could handle the image/sampler information correctly.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
llvm phi node can have odd number of args.
this patch also contains a test case.
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
All the previous 3D test cases are only using depth 1, and not
really touch the 3D read/write code path. Now fix them.
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Reviewed-by: He Junyan <junyan.he@inbox.com>
|
|
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
version 2:
improve algorithm to convert signed integer
fix source operand type in llvm_gen_backend
enable predicate in addWithCarry
change test case to test signed integer
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
also improve the accuracy of built-in function "atan"
also add a test case
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: "Yang, Rong R" <rong.r.yang@intel.com>
|
|
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Per opencl spec, bitfield is not supported.
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
also include test cases
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
also include a test case
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
"sinpi" was calculated as "sin(pi * x)".
But that was not a quite-good way.
This patch improved the function, also included a test case.
v2: fix compiling warning
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
This unit test case trigger a known bug:
ASSERTION FAILED: TODO Boolean values cannot escape their definition
basic block.
Signed-off-by: Chuanbo Weng <chuanbo.weng@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
To hit prediction logic.
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|
|
Signed-off-by: Homer Hsing <homer.xing@intel.com>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
|